-
Notifications
You must be signed in to change notification settings - Fork 43
Available Project Topics
This is not a list of elaborated project/thesis proposals. It is rather a list of possible topics together with ideas how to realize them.
For students of the University of Freiburg: Most of these topics can also result in a Bachelor Project, a Bachelor Thesis, a Master Laboratory, a Master Project, as well as a Master Thesis. If you are interested please contact us and we will discuss together how to adapt the topic accordingly.
Contact: Dominik
When we have verified a program, we would like to produce some kind of proof artifact that shows that the program is indeed correct. Such a proof should be easily checkable. This can be helpful within our tool (it can help us to find bugs in our verifier), and we can also output such proofs in a machine-readable format understood by other tools.
For concurrent programs, a useful proof format are so-called Owicki-Gries proofs (after Susan Owicki and David Gries). However, a naive translation from our own data structures to Owicki-Gries proofs produces exponentially large proof artifacts. This defeats the purpose, as checking such large proofs is very expensive. A previous MSc thesis has developed a better algorithm which should often be able to produce small proofs, but this algorithm has not yet been implemented.
In this project, we will implement this algorithm. Depending on the scope of the project, we may also improve the algorithm, add further optimizations, or output the computed Owicki-Gries proof in a format that can be checked by other tools.
Contact: Dominik
Ultimate implements different algorithms for automated program verification. These algorithms are proven to be theoretically sound, i.e., if they declare that a program is correct, then the program is truly correct. However, like any large software project, the implementation of these algorithms in Ultimate sometimes contains bugs. As a result, Ultimate might declare that a program is correct when it is actually incorrect. In order for us to truly trust a verification result, Ultimate should be able to output a detailed, formal proof of correctness (or certificate) that can be automatically checked by an independent tool.
In this project, we will take some first steps towards this goal. We will take Ultimate's internal notion of a proof (called a Floyd/Hoare annotation) and use it to produce proof scripts for the the interactive proof assistant coq. With these scripts, coq should be able to prove that a Floyd/Hoare annotation produced by Ultimate actually shows that the given program is correct. To control the complexity, we will for now focus on a class of simple programs (e.g. no concurrency, no recursion, no pointers).
Note: Some previous knowledge of coq or a related system (Isabelle, Agda, Lean) will be helpful for this project. If you have not seen such tools before, you should be willing to invest some time to acquaint yourself with interactive proof assistants.
Contact: Frank
Abstract Interpretation is a known technique to abstract the behavior of a program. There are different abstract domains that specify how to abstract. Abstract interpretation is already implemented in ULTIMATE with various domains.
However, there exists also a similar approach that is implemented in ULTIMATE called SIFA (Symbolic Interpretation with Fluid Abstractions). SIFA is based on the ideas of abstract interpretation (i.e. abstracts a programs), but it works slightly different and has proven to be more modular and precise. Therefore the functionality of the abstract domains slightly differs between SIFA and abstract interpretation.
The goal of this project is to implement new abstract domains in the SIFA framework of ULTIMATE.
Contact: Dominik
In a concurrent program, different possible schedulings of threads lead to a large number of different executions (so-called interleavings). To verify a program, we have to show that it behaves correctly for all interleavings. This can make automated verification very expensive or even impossible.
The key observation of commutativity in this context is that for some statements, the order in which they are executed does not matter (e.g. for x:=x+1
and y:=5
, the order does not matter, but for x:=x+1
and x:=0
it may).
If two interleavings differ only in the ordering of such "commuting" statements, they are equivalent, and it suffices to prove correctness for one of these interleavings.
We investigate how this observation can be exploited by so-called partial order reduction (POR) algorithms, in order to verify programs faster, and even verify programs that could not be verified (at all) without commutativity.
Projects in this area can either be more heavy on first-order logic and program semantics, or focus more on automata theory, or they can combine both aspects. An incomplete list of possible topic areas is given below.
-
What does "commutativity" mean? There are many possible definitions for this, which lead to different verification algorithms.
- Conditional commutativity uses information about the program derived from a separate analysis, or from the ongoing verification itself.
- Abstract commutativity focuses on a particular correctness proof of the program, and declares that statements commute if the order of their execution does not matter for this proof.
- Stratified commutativity combines multiple definitions of commutativity in a single analysis.
- How do I choose? I.e., if two interleavings are equivalent, which one should I verify? The answer has a significant influence on how complex a correctness proof needs to be, and how efficiently our automata algorithms behave.
- How do I reduce? Programs typically have infinitely many interleavings, which we represent with finite automata. In order to remove redundant interleavings, there exist many POR algorithms, with different strengths and limitations. Implementing more of these algorithms (and adapting them to our automata-based setting) can speed up the verification.
Literature: The following papers describe our research in this area. Feel free to take a look :)
- Ultimate GemCutter and the Axes of Generalization (2022): A short, high-level introduction to our verification tool
- Sound Sequentialization for Concurrent Program Verification (2022): A more-detailed description of the underlying theory, and two POR algorithms we use
- Stratified Commutativity in Verification Algorithms for Concurrent Programs (2023): A discussion of abstract and stratified commutativity, and a new POR algorithm for stratification
In Ultimate we use bounded Petri nets a acceptors of regular languages. For an introduction you can have a look at the Petri net seminar slides of Nico Bühler from the 2015/16 Automata theory seminar. Petri nets are very efficient on languages that occur while verifying concurrent programs. There are several topics available. The following list could (in principle) be extended by every Petri net operation that is not yet supported.
- Petri net minimization
- Fold the finite prefix of an branching process back to a Petri net.
- Optimizations of various operations (emptiness, difference, ...)
Contact: Matthias
Write interpolant automata to file. Use these interpolant automata for the verification of similar programs (e.g., the same program to which a patch was applied).
We already have a preliminary version of an Eclipse CDT interface that works with old versions of Eclipse. Adapt our interface to recent version of Eclipse and improve it.
Contact: Matthias
We have powerful tools for the logical reasoning over mathematical integers and hence we represent integer data types in computer programs by mathematical integers in our analyses.
However, using mathematical integers is unsound in general since primitive data types usually have a finite number of values. E.g., an unsigned int
in C can take only values from 0 to 4294967295. Our analysis can be made sound by using one of the following transformations.
-
The Falke Transformation
We transform a C program over primitive data types into a program over mathematical integers. After each arithmetic operation we apply a modulo operation. E.g., if x is an unsigned int the expression
x + 1
is translated to(x + 1) % 4294967296
. -
The Nutz Transformation
We transform a C program over primitive data types into a program over mathematical integers. Before each comparison operation we apply a modulo operation. E.g., if x and y are unsigned ints, the expression
x == y
is translated to(x % 4294967296) == (y % 4294967296)
. Check the following examples for more details NutzTransformation01.c NutzTransformation02.c NutzTransformation03.c NutzTransformation04.c NutzTransformation05.c
The latter transformation is already implemented. Task: implement the first transformation and evaluate both.
Contact: Matthias
Our x86 computers store data bytewise. E.g., if a program writes the unsigned int 259 at the address ADDR, the computer will store only the least significant eight bit (here 3 in decimal representation) at ADDR and the next eight bit (here 1 in decimal representation) will be stored at ADDR+1.
In a naive model for the memory of the computer we read and store date bytewise. This is very costly, especially if we use integers to represent data. If the program writes the value of an unsigned int variable x at address ADDR we have to use modulo and division operations to get the value of each byte and we store x%256 at ADDR, we store (x/256)%256 at ADDR+1, ...
The Hoenicke-Lindenmann memory model circumvents this problem. Here, the memory is modeled as an array whose indices are addresses and whose values are integers. If the program writes the value of x to ADDR we just store its value at this address in our array. This is unsound, e.g., in the case were we read data from ADDR+1 (because we stored nothing here). See HoenickeLindenmannConcessions01.c for an example
Do not store only the integer values in a data array. Use additionally an OrigAddress array and a TypeSize array. The indices of both arrays are addresses. The values of OrigAddress are non-positive integers, the values of TypeSize are positive integers. In the OrigAddress array we store how far the real data is displaced. E.g., if we store an int (bytesize four) at ADDR, we also store
OrigAddress[ADDR]:=0
OrigAddress[ADDR+1]:=-1
OrigAddress[ADDR+2]:=-2
OrigAddress[ADDR+3]:=-3
In the TypeSize array we store how many memory cells would be occupied by the naive memory model. E.g., if we store an int (bytesize four) at ADDR, we also store
TypeSize[ADDR]:=4
These three arrays will allow us to compute the exact data for each address in a way that does not require composition/decomposition to bytes in most cases.
Take care that the control flow graph has two parallel edges for reading data.
One for the simple case (OrigAddress[ADDR]==0)
and one for the case where we
have to compose bytes.
Take care that the control flow graph has two parallel edges for storing data.
One for the simple case (OrigAddress[ADDR]==0 || size >=TypeSize[ADDR] )
and
one for the case where we have to decompose into bytes.
Nice feature of traces that contain at least one simple-case-edge: Will be infeasible in most cases.
Nice feature of traces that contain only simple-case-edges: Additional arrays will be irrelevant in most cases and hence not occur in proofs.
In case the decomposition of integers to bytes would be too costly we could overapproximate data operations in the non-simple case edges. We would loose precision but we would still be sound.
We have to initialize the OrigAddress array with zeros. We can do this using the following assume statement at the beginning of the program.
forall i OrigAddress[i]==0
However, quantifiers are very expensive!
We have much more array operations than before.
Contact: Matthias
Once, we had a C2Boogie translation that was independent of type sizes (e.g., sizeof(long) is not fixed to 4 bytes). We discontinued the support for this setting for good reasons:
- Unsound in the SV-COMP
- Introduces nonlinear arithmetic
- ... ?
Once, our plan was to check use exactly the semantics of the C standard, and not the semantics of common compilers/architectures (e.g., if some operation is implementation defined we should just havoc the resulting variable) We discontinued the support for this for good reasons: A lot of nondeterminism is expensive.
It seems to me that other tool try to become more implementation specific.
Let us don't play (only) where the elephants dance.
Furthermore, other tools might not check portable code, because they have problems that we don't have.
- We can deal with nonlinear arithmetic.
- Our C2Boogie translation is quite flexible.
- Our model checker can deal with a lot of nondeterminism.
I think we can write a nice paper in which we explain how ultimate finds bug in code that should strictly follow the C standard and that was meant to be portable.
Contact: Daniel
Ultimate is still based on the old Eclipse 3.x API. We want to migrate completely to the Eclipse e4 API. Ths requires a comprehensive analysis of the current Eclipse API usage in Ultimate, the identification of necessary changes, and the implementation of those changes.
For an overview over the topic, see the description of RCP and a small migration tutorial.
Contact: Daniel
We are currently using IPC to comunicate with various external SMT solvers. We would like to explore whether using the Java bindings of various solvers would provide us with performance improvements or not.
To this end, we want to use JavaSMT, an SMTLib abstraction layer.
The project would involve integrating JavaSMT into Ultimate, implementing a wrapper from our own data structures to JavaSMT and back, and compare the performance with our IPC approach.
- Home
- Ultimate Development
- Ultimate Build System
- Documentation
- Project Topics