About the project
Objective
Code generation with AI is making rapid progress. Nonetheless, it suffers from a fundamental drawback: it is unreliable, and developers cannot trust the generated code, which is a major blocker to adoption. Our project TRUST-SW aims to solve this problem: AI will generate code alongside verification materials using state-of-the-art formal verification. By doing so, TRUST-SW will enable developers to produce dependable, correct code with generative AI. The specific objectives are:
- Generate specifications from natural language using LLMs: Develop capabilities for LLMs to translate natural-language requirements into formal specifications with no or minimal human supervision.
- Empower LLMs with deductive, explainable reasoning: Equip LLMs with deductive reasoning skills to debug and validate code against natural-language instructions, using novel techniques.
- Embed and evaluate real-world constraints into code generation: Ensure that generated code aligns with practical requirements, handling complex constraints of real-world applications.
Background
Recent advances in generative AI, particularly large language models, have made it possible to automatically generate software from natural language prompts. While these tools significantly accelerate software development, the generated code may contain subtle bugs, security vulnerabilities, or logical errors that are difficult to detect. At the same time, formal verification techniques can mathematically guarantee software correctness but often require significant expertise and manual effort. TRUST-SW addresses this gap by exploring how AI systems can work together with formal reasoning and verification tools to automatically generate software that is not only functional but also provably correct and trustworthy.
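To make the idea of generating code alongside verification materials concrete, here is a minimal illustrative sketch in Python: a candidate function, as an LLM might produce it, paired with an executable postcondition that a checker validates automatically over a small input domain. All names here are hypothetical, and exhaustive checking over a finite domain stands in for the state-of-the-art formal verification the project actually targets, which proves correctness for all inputs.

```python
from itertools import product

def max_of_three(a: int, b: int, c: int) -> int:
    """Candidate implementation, as an LLM might generate it."""
    m = a if a > b else b
    return m if m > c else c

def spec_max_of_three(a: int, b: int, c: int, result: int) -> bool:
    """Postcondition (the 'verification material'): the result is one of
    the inputs, and no input exceeds it."""
    return result in (a, b, c) and result >= a and result >= b and result >= c

def check(domain=range(-3, 4)) -> bool:
    """Check the postcondition exhaustively over a small input domain;
    a formal verifier would instead prove it for all inputs."""
    return all(
        spec_max_of_three(a, b, c, max_of_three(a, b, c))
        for a, b, c in product(domain, repeat=3)
    )
```

The key point is the pairing: the specification is an artifact shipped with the code, so trust rests on checking the specification rather than on inspecting the generated implementation by hand.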
Cross-disciplinary collaboration
TRUST-SW brings together expertise from several research areas, including artificial intelligence, formal verification, programming languages, and software engineering. The project combines advances in large language models with rigorous methods from formal verification and automated reasoning. This interdisciplinary collaboration enables the development of novel approaches where AI-generated code can be systematically verified and validated, bridging the gap between machine learning–based code generation and mathematically grounded software assurance.
PIs: Marco Chiesa, Martin Monperrus, Mariano Scazzariello.

