-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sqllogictets] Remove postgres container orchestration #5015
Conversation
@@ -232,9 +232,21 @@ jobs: | |||
name: "Run sqllogictest with Postgres runner" | |||
needs: [linux-build-lib] | |||
runs-on: ubuntu-latest | |||
services: | |||
postgres: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this starts the container using the code from integration-tests
(above in this file) rather than running docker from within the harness
@@ -1,21 +0,0 @@ | |||
CREATE TABLE aggregate_test_100_by_sql |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these statements are now run in the individual setup files
/// ``` | ||
/// | ||
/// And read the file locally. | ||
async fn run_copy_command(&mut self, sql: &str) -> Result<DBOutput> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was quite pleased with this -- it allows the tests to run COPY FROM 'file'
@@ -15,6 +15,28 @@ | |||
# specific language governing permissions and limitations | |||
# under the License. | |||
|
|||
|
|||
statement ok |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drive by clean up -- this can be run directly in the .slt file
## Setup test for postgres | ||
### | ||
|
||
onlyif postgres |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the onlyif
statement to run different setups for DataFusion and posgres
cc @melgenek |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the approach of having setup per file. It is more clear where data comes from this way.
Removing a container-per-file approach ditches isolation for tests. I would suggest having a schema per test file.
debug!("Using posgres dsn: {dsn}"); | ||
|
||
let config = tokio_postgres::Config::from_str(&dsn)?; | ||
|
||
let mut retry = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably get rid of the retry. It was sort of a hack because queries were executed immediately after a docker startup. And sometimes Postgres wasn't fully ready. Assuming that Postgres gets started in advance, you can assume that the connection succeeds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some of the options from the container setup (that I copied from @jimexist ) may help.
ports:
- 5432/tcp
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed retry in c41a0a8
datafusion/core/tests/sqllogictests/src/engines/postgres/mod.rs
Outdated
Show resolved
Hide resolved
datafusion/core/tests/sqllogictests/src/engines/postgres/mod.rs
Outdated
Show resolved
Hide resolved
`ORDER BY` will not match DataFusion and the tests will diff. | ||
|
||
```shell | ||
docker run \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docker run \ | |
docker run --rm \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think --rm
removes the container. I normally prefer to leave them around for debugging
|
||
onlyif postgres | ||
statement ok | ||
DROP TABLE IF EXISTS aggregate_test_100_by_sql; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropping a table seems like a hack to the fact that Postgres database is reused for multiple test files.
It would be nice to introduce isolation by, for example, having a schema per file. The schema could be either random or based on the test file name.
I drafted a very basic implementation of the approach. melgenek@3394c87
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks -- this is a good point @melgenek
At the very least the tests should clean up after themselves (as in I should put a drop_table
at the end). I like the idea of a separate schema, however. I will try and add that shortly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking more about the relevance of schemas and isolation.
If Datafusion's test suite ever becomes big enough, you would be able to parallelize it. And schemas would help run multiple files in parallel without interference.
Co-authored-by: Yevhenii Melnyk <melnyk.yevhenii@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks andrew
Benchmark runs are scheduled for baseline = 25ab1f9 and contender = 930c8de. 930c8de is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #5009
Rationale for this change
#4834 added a way to run sqllogictests using postgres 🦾 (kudos to @melgenek )
However, it currently orchestrates the postgres containers with rust test code which means running these tests requires docker to run the (full) datafusion test suit locally
What changes are included in this PR?
COPY
so it can be run from the postgres testsPG_DSN="postgresql://postgres@127.0.0.1/postgres"
environmentAre these changes tested?
Yes, in CI
Are there any user-facing changes?
Not really -- only to other devs