📂 Released Data and Outputs

We release both the processed benchmark data in data/processed and all outputs of the evaluated LLMs in data/outputs.
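
To check what a local checkout actually contains, the two directories can be listed with a short script. This is a minimal sketch, not part of the repository: the helper name `list_released_files` is hypothetical, and it makes no assumption about file formats or naming, since the text above does not specify them. It only assumes the script is run from the repository root.

```python
from pathlib import Path


def list_released_files(root):
    """Return sorted relative paths of all files under root (empty list if root is missing)."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    # Directory names come from the text above; everything else is illustrative.
    for directory in ("data/processed", "data/outputs"):
        files = list_released_files(directory)
        print(f"{directory}: {len(files)} file(s)")
        for name in files[:5]:  # show only the first few entries
            print(f"  {name}")
```

A missing directory is reported as zero files instead of raising an error, so the script can be run before the data has been downloaded.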