Description
Is your feature request related to a problem or challenge?
Thanks to some great work from @Omega359, DataFusion runs many thousands of queries from the sqlite test suite as part of each commit to main ❤
This is documented here:
datafusion/datafusion/sqllogictest/README.md
Lines 218 to 252 in d47f7fb
## Running Tests: `sqlite`

Test files in `data/sqlite` directory of the datafusion-testing crate were
sourced from the [sqlite test suite](https://www.sqlite.org/sqllogictest/dir?ci=tip) and have been cleansed and updated to
run within DataFusion's sqllogictest runner.

To run the sqlite tests you need to increase the rust stack size and add
`INCLUDE_SQLITE=true` to run the sqlite tests:

```shell
export RUST_MIN_STACK=30485760;
INCLUDE_SQLITE=true cargo test --test sqllogictests
```

Note that there are well over 5 million queries in these tests and running the
sqlite tests will take a long time. You may wish to run them in release-nonlto mode:

```shell
INCLUDE_SQLITE=true cargo test --profile release-nonlto --test sqllogictests
```

The sqlite tests can also be run with the postgres runner to verify compatibility:

```shell
export RUST_MIN_STACK=30485760;
PG_COMPAT=true INCLUDE_SQLITE=true cargo test --features=postgres --test sqllogictests
```

To update the sqllite expected answers use the `datafusion/sqllogictest/regenerate_sqlite_files.sh` script.
Note this must be run with an empty postgres instance. For example

```shell
PG_URI=postgresql://postgres@localhost:5432/postgres bash datafusion/sqllogictest/regenerate_sqlite_files.sh
```
When expected output changes (for example, error messages), we currently use the script https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/regenerate_sqlite_files.sh, which:
- relies on a fork of sqllogictest
- relies on a modified driver program
This is problematic because:
- As the code in DataFusion is updated (for example, to a new sqllogictest version), the modified driver program may not work with the sqllogictest version DataFusion now uses
- The modified driver program may not work with new versions of sqllogictest

This happened with #14824, which made it hard to update the expected output.
Describe the solution you'd like
I would like to make sure that `regenerate_sqlite_files.sh` will always work and will not bitrot over time.
@Omega359 says:
That is exactly what I was thinking and hopefully will fix tonight. I think a decent short-term fix is to 'lock' the sqllogictest-rs dependency version and add a comment that any update to it will require a full run of the regenerate script before committing.
Long term ideally would be to improve my changes to my fork of sqllogictest-rs such that they would be suitable to submit a PR to that project. That is not an insignificant amount of work to be honest and I'm a bit thin on time for the next month or two.
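As a rough illustration of the short-term idea above (locking the dependency), here is a minimal sketch of a guard that checks whether a manifest pins a crate with Cargo's exact-version `=` requirement. The helper name `is_pinned`, the temporary manifest, and the `0.0.0` version are all hypothetical placeholders, not taken from the DataFusion repository, and the check only handles the simple one-line dependency forms.

```shell
# Hypothetical helper (not part of DataFusion): succeeds only if the given
# manifest declares the crate with an exact version requirement ("=x.y.z").
# Assumes a one-line declaration such as
#   name = "=x.y.z"   or   name = { version = "=x.y.z", ... }
is_pinned() {
    manifest="$1"
    crate="$2"
    grep -Eq "^${crate}[[:space:]]*=.*\"=[0-9]+\.[0-9]+\.[0-9]+" "$manifest"
}

# Demo against a throwaway manifest; 0.0.0 is a placeholder version.
demo_manifest="$(mktemp)"
printf 'sqllogictest = "=0.0.0"\n' > "$demo_manifest"
if is_pinned "$demo_manifest" sqllogictest; then
    echo "pinned"
else
    echo "not pinned: re-run regenerate_sqlite_files.sh before changing the version"
fi
rm -f "$demo_manifest"
```

A check like this could run in CI so that loosening or bumping the pin is a deliberate step paired with a full run of the regenerate script; whether that is worth the extra machinery over a plain comment in the manifest is a judgment call.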
Describe alternatives you've considered
No response
Additional context
See last time we had to update the scripts based on changes: