Intent Fulfillment Benchmark for Coding Agents
Does providing structured intent to coding agents improve implementation effectiveness?
Run the benchmark with your agent, model, or treatment. Submit results via pull request.