SQLSpace: A Representation Space for Text-to-SQL to Discover and Mitigate Robustness Gaps.
Published in Findings of the Association for Computational Linguistics: EMNLP 2025
Recommended citation: Neha Srikanth, Victor S. Bursztyn, Puneet Mathur, and Ani Nenkova. 2025. SQLSpace: A Representation Space for Text-to-SQL to Discover and Mitigate Robustness Gaps. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 1533–1559, Suzhou, China. Association for Computational Linguistics. https://aclanthology.org/2025.findings-emnlp.81/
We introduce SQLSpace, a human-interpretable, generalizable, compact representation for text-to-SQL examples derived with minimal human intervention. We demonstrate the utility of these representations in evaluation with three use cases: (i) closely comparing and contrasting the composition of popular NL2SQL benchmarks to identify unique dimensions of examples they evaluate, (ii) understanding model performance at a granular level beyond overall accuracy scores, and (iii) improving model performance through targeted query rewriting based on learned correctness estimation. We show that SQLSpace enables analysis that would be difficult with raw examples alone: it reveals compositional differences between benchmarks, exposes performance patterns obscured by accuracy alone, and supports modeling of query success.
