Guidance on Using Spider2 for Local Text-to-SQL Benchmarking #28

shafqatjamil · 2024-11-29T10:52:29Z

Hi,

I am new to Spider2 and looking for guidance on how to use it effectively for benchmarking our existing Text-to-SQL model. While I have successfully set up Spider2 locally, I am unsure about the next steps to proceed with benchmarking.

Here is what I have done so far:

- Followed the steps mentioned in the Spider2 Lite documentation.
- Copied the SQLite files and executed `evaluate.py`.

However, I am not sure how to proceed with running Spider2 for evaluating my own Text-to-SQL model. Could you please guide me on:

- The steps required to configure Spider2 to evaluate a custom Text-to-SQL model.
- How to integrate my model into the Spider2 benchmarking pipeline.
- Any resources, tutorials, or documentation that would help me better understand and use Spider2 for this purpose.
- Best practices for benchmarking Text-to-SQL models using Spider2.

I would greatly appreciate any pointers or advice to get started.

Thank you in advance for your time and help!

antonio-veezoo · 2024-12-02T09:47:08Z

+1

lfy79001 · 2024-12-02T18:20:13Z

Hi,

For Spider2-Lite, we have reproduced three commonly used baselines: CodeS, DAIL-SQL, and DIN-SQL. You can find and run the code here: [Spider2-Lite Baselines](https://github.com/xlang-ai/Spider2/tree/main/spider2-lite/baselines). If you wish to run your own Text-to-SQL model, please refer to the data processing methods used in these baselines.

For Spider2, you can refer to the [Spider-Agent](https://github.com/xlang-ai/Spider2/tree/main/methods/spider-agent) implementation and build your own method based on it.

If you encounter any bugs or issues, feel free to contact us.

Thank you!
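As a rough illustration of plugging a custom model into a file-based evaluation flow, the sketch below reads a JSONL file of questions, calls a model function, and writes one `.sql` file per example. Everything specific here is an assumption rather than the confirmed Spider2 format: the field names `instance_id`, `db`, and `question`, the one-file-per-example output layout, and the `generate_sql` callable (a hypothetical stand-in for your own model). Check the baselines linked above for the layout that `evaluate.py` actually expects.

```python
import json
from pathlib import Path


def write_predictions(questions_path, out_dir, generate_sql):
    """Run a model over a JSONL question file and save one .sql file per example.

    generate_sql is a hypothetical callable: (question, db_name) -> SQL string.
    The JSONL field names used below are assumptions, not a confirmed schema.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(questions_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines in the JSONL file
            example = json.loads(line)
            sql = generate_sql(example["question"], example["db"])
            target = out / f"{example['instance_id']}.sql"
            target.write_text(sql.strip() + "\n", encoding="utf-8")
            written.append(target.name)
    return written
```

Keeping the model behind a single callable means the same loop can be reused for any model, and the output directory can then be passed to whatever evaluation script the benchmark provides.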

shafqatjamil · 2024-12-04T11:13:08Z

Thank you for your prompt response, I really appreciate it. I have installed the dependencies in dinsql and tried to run the `run.sh` script. I am getting a token limit exceeded error. It is more of a GPT issue, and I am now trying to process the prompt in chunks. I have a few more questions:

- Is there a minimum dataset size or number of SQL queries required for evaluation?
- Do we need to provide preprocessed JSON files, or will these be created from an SQL dump file if we provide a dump of our own database?

I am new to this field, so you might find my questions silly.
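For the token limit error, one way to process the prompt in chunks is to split the schema into groups of table definitions that each fit a budget. The sketch below is a minimal illustration under stated assumptions: it is not part of the DIN-SQL baseline, the names `estimate_tokens` and `chunk_schema` are hypothetical, and the token count is a crude heuristic of about 4 characters per token rather than a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token; swap in a real tokenizer if available.
    return max(1, len(text) // 4)


def chunk_schema(table_ddls: list[str], max_tokens: int) -> list[list[str]]:
    """Greedily group table definitions so each group stays within max_tokens.

    A single table larger than the budget still gets its own chunk.
    """
    chunks = []
    current = []
    used = 0
    for ddl in table_ddls:
        cost = estimate_tokens(ddl)
        # Start a new chunk when adding this table would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(ddl)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent in its own prompt, for example to pick out the tables relevant to a question before building the final SQL generation prompt.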
