- Write a failing test
- Run the test (it should fail)
- Write the minimum code to pass the test
- Refactor the code
Need to solve the editing problem, i.e. how the LLM modifies the code. Full control over the code.
Lack of control. Easiest to implement.
OpenAI Codex is open source, I could modify it to support TDD. Both control and a lot of problems are already solved. By far the most complex option.