AlphaQT-Bench: Diagnosing the Gap between Financial Code Generation and Quantitative Reasoning in LLMs

Publication
In The 64th Annual Meeting of the Association for Computational Linguistics