intent-bench

Intent Fulfillment Benchmark for Coding Agents

Does providing structured intent to coding agents improve implementation effectiveness?

Control (prompt only) Treatment (prompt + intent)

Treatment Comparison

Run the benchmark with your agent, model, or treatment. Submit results via pull request.

Reproduce