10 Models × 13 Benchmarks × 5 Frameworks — Complete evaluation results from the EffGen paper
From the ICML 2026 submission (under review)
EffGen consistently outperforms LangChain, AutoGen, and Smolagents across all 10 models and 13 benchmarks, with the largest gains on smaller models, where optimization matters most.