feat(api): make create_table uniform #5736

Merged 1 commit on Mar 15, 2023

Conversation

cpcloud
Member

@cpcloud cpcloud commented Mar 13, 2023

This PR addresses recent issues created around DDL.

Notes

  • I didn't deprecate ImpalaTable.load_data. I'm not entirely sure what should be done with it.
  • For clickhouse's create_table, since clickhouse itself doesn't have a default
    table engine, I preserved that: you have to pass engine="Something" to its
    create_table. It breaks the uniformity, but I'm not sure how else to handle
    it.
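
The ClickHouse divergence noted above can be pictured with a keyword-only parameter that has no default, so omitting it fails at call time. This is a minimal sketch, not the ibis implementation: the function name `create_table_sql`, its parameters, and the shape of the generated SQL are all assumptions made for illustration.

```python
def create_table_sql(name, columns, *, engine):
    """Build a CREATE TABLE statement (hypothetical helper, not ibis API).

    `engine` is keyword-only and deliberately has no default, mirroring
    the idea that the caller must always choose a table engine.
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns.items())
    return f"CREATE TABLE {name} ({cols}) ENGINE = {engine}"


# The caller has to spell out the engine explicitly.
statement = create_table_sql("events", {"id": "Int64"}, engine="Memory")
```

Calling `create_table_sql("events", {"id": "Int64"})` without `engine` raises `TypeError`, which is the price of not picking a default on the user's behalf.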

TODO

  • Implement create_view for pandas/dask/polars
  • Implement create_view/drop_table/truncate_table for clickhouse
  • Review/implement truncate_table (I don't think this makes sense for the in-memory backends, so those'll raise an exception)
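
A rough sketch of the last TODO item, assuming a dictionary-backed in-memory backend. The class `InMemoryBackend` is a hypothetical stand-in, and `NotImplementedError` is an assumed choice: the note only says that "an exception" will be raised.

```python
class InMemoryBackend:
    """Hypothetical stand-in for an in-memory backend (not ibis code)."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, obj):
        # Register the object under the given name and hand it back.
        self.tables[name] = obj
        return obj

    def truncate_table(self, name):
        # Assumption: NotImplementedError; the exact exception type is
        # not stated in the PR description.
        raise NotImplementedError(
            f"truncate_table is not supported by {type(self).__name__}"
        )
```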

Closes #5394
Closes #5392
Closes #5390
Closes #4316
Closes #2339

@cpcloud cpcloud added this to the 5.0 milestone Mar 13, 2023
@cpcloud cpcloud added feature Features or general enhancements backends Issues related to all backends labels Mar 13, 2023
@cpcloud cpcloud marked this pull request as draft March 13, 2023 22:01
@github-actions
Contributor

github-actions bot commented Mar 13, 2023

Test Results

         9 files           9 suites   11m 52s ⏱️
  3 833 tests   3 772 ✔️   61 💤 0
29 412 runs  28 981 ✔️ 431 💤 0

Results for commit dbfe647.

♻️ This comment has been updated with latest results.

@cpcloud cpcloud force-pushed the create-table-uniformity branch 6 times, most recently from a07a119 to 969e26b Compare March 14, 2023 13:09
@gforsyth
Member

Thoughts on forcing keyword arguments for a bunch of these DDL functions?
I was thinking everything after name and obj should be keyword-only (so schema, database, overwrite and temp)
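
For illustration, the suggestion above corresponds to Python's bare `*` separator in a signature. The sketch below is a stand-in, not the signature that was merged: the default values and the returned dictionary are assumptions.

```python
def create_table(name, obj=None, *, schema=None, database=None,
                 overwrite=False, temp=False):
    """Hypothetical signature: everything after `obj` is keyword-only."""
    # Return the arguments so the binding is easy to inspect.
    return {
        "name": name,
        "obj": obj,
        "schema": schema,
        "database": database,
        "overwrite": overwrite,
        "temp": temp,
    }


# Positional use stops after `obj`; the flags must be named.
call = create_table("t", None, overwrite=True)
```

With this shape, `create_table("t", None, None, "db", True)` raises `TypeError` instead of silently binding `True` to the wrong parameter.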

@cpcloud
Member Author

cpcloud commented Mar 14, 2023

SGTM

@gforsyth gforsyth force-pushed the create-table-uniformity branch 2 times, most recently from 00f2e31 to d4e9e9b Compare March 14, 2023 14:31
@cpcloud cpcloud added the ddl Issues related to creating or altering data definitions label Mar 14, 2023
@cpcloud cpcloud marked this pull request as ready for review March 14, 2023 16:34
Member

@gforsyth gforsyth left a comment

this looks good to me -- I'll leave it for a bit in case anyone else wants to take a reviewing pass.

@cpcloud
Member Author

cpcloud commented Mar 14, 2023

There are some snowflake things that are failing locally for me. #5741 should help with that

Member

@jcrist jcrist left a comment

Overall looks good to me. A few small questions/comments, but none should be blockers.

ibis/backends/base/sql/alchemy/__init__.py (outdated, resolved)
ibis/backends/bigquery/__init__.py (outdated, resolved)
ibis/backends/pandas/__init__.py (resolved)
@gforsyth gforsyth force-pushed the create-table-uniformity branch 3 times, most recently from a20b677 to 455c1a5 Compare March 15, 2023 13:16
@cpcloud cpcloud force-pushed the create-table-uniformity branch 2 times, most recently from bc60aee to b6f5df7 Compare March 15, 2023 17:42
@cpcloud
Member Author

cpcloud commented Mar 15, 2023

Cloud backends:

cloud in 🌐 albatross in …/ibis on  create-table-uniformity is 📦 v4.1.0 via 🐍 v3.10.10 via ❄️  impure (ibis-3.10.10-env)
❯ pytest -m 'bigquery or snowflake' -q -n auto --dist loadgroup
bringing up nodes...
xxxxxxx.xx.xx.x..xx...........xx...x.......xx..x.x...............x..x..x....x.x........x......x.x.....x.x..x.. [  5%]
..x................x.x..x.x.x.x..x..........xx.........x......x..x.x...x......x.x.x.x.xx.x...x.xxxx........... [ 10%]
x..x...xx.x.x.x.xx.....x.x..xx.........x.x..xx....xx.......x.....x.....x..x..........x.......x...............x [ 16%]
..x......................................x.................................................................... [ 21%]
.x.xx.............x.....x......x..xx.....xx.....x..........x..x..x........xx..........xx.x..x........x........ [ 27%]
...x...............................................x..............x....................................xxxxxxx [ 32%]
x.xxx.x.xxx.x.xxx.xxxx.xxxxxxxx.xx.xxxxxx..x.xxxx..xx.....x...x........xx......x...x.......x...x......x.x..... [ 38%]
........x...x......x.....x.xx.x.......x.....x.x.x.....x...x.x..x..x........x......x...x..x......x...x..x.s.x.. [ 43%]
...x....s......x................................x.....................x........x.............x........s.....x. [ 48%]
.x........x.........x.....x.x....xx.......x...x......x..xx...x..xxx.x..x..........xx.x.......x..xx..x.x..x.xx. [ 54%]
.x..x..xx.x.x....xx.....xx....x.....x.x............x...................x...x.....x..........x.....x....x...... [ 59%]
..............x...x........x..................x.xx......................................................x.xx.. [ 65%]
....x.x.x...........xx..xxx..x.x..x.x....xx......x........x..........x.....x...............x.x................ [ 70%]
....x....................................x..........................x...xx.xxxxxxxxxx..xx.x..........xx....... [ 76%]
................x...........x..............x.....................x.x.....x...........x.....xx..x.............. [ 81%]
...x......xx....x......x...x........x...s...x........x.x.x............x.....x..xxx.s......x................... [ 86%]
.........x.................................................................................................... [ 92%]
................................x....x...s...........x..................x..................................... [ 97%]
...........................x...............                                                                    [100%]
1688 passed, 6 skipped, 329 xfailed in 273.12s (0:04:33)

@cpcloud
Member Author

cpcloud commented Mar 15, 2023

I am authoring a longer commit message, hold off on merging please!

This commit addresses most outstanding DDL API discrepancy issues including:

- `create_table`/`create_view` for pandas, dask and polars
- making the various DDL APIs as uniform as possible (see clickhouse for
  an example of divergence)
- deprecation of `load_data` (except on impala, since it's significantly
  different from the others)
- add clickhouse implementations of `create_table`/`create_view`/`create_database`
- standardization of APIs for creating tables

During the process of getting all of this to work, I uncovered multiple
issues with `snowflake-sqlalchemy`'s quoting behavior and had to monkey
patch in `normalize_name` to avoid the broken heuristic they are using.
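
As a generic illustration of the monkey-patching approach mentioned above, here is a sketch using a stand-in class. `FakeDialect` and its case-folding heuristic are assumptions for demonstration only; this is neither the `snowflake-sqlalchemy` code nor the patch in this commit.

```python
class FakeDialect:
    """Hypothetical stand-in for a SQLAlchemy dialect class."""

    def normalize_name(self, name):
        # Assumed heuristic: fold all-uppercase names to lowercase.
        return name.lower() if name.isupper() else name


def _identity_normalize_name(self, name):
    # Replacement: return the identifier exactly as it was received.
    return name


before = FakeDialect().normalize_name("MY_TABLE")

# Monkey patch: replace the method on the class so that every instance,
# existing or future, picks up the new behavior.
FakeDialect.normalize_name = _identity_normalize_name

after = FakeDialect().normalize_name("MY_TABLE")
```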

Additionally, to avoid having to solve the "which case should I use?"
problem in multiple places, I decided to remove our backend-scoped
use of `sqlalchemy.MetaData`. Without removing it, we'd have to deal
with identifiers' case not matching. It's possible there's a performance
hit, but removing this maintenance burden until someone comes along
saying it's slow is worth it IMO.

BREAKING CHANGE: Snowflake identifiers are now kept **as is** from the database. Many table names and column names may now be in SHOUTING CASE. Adjust code accordingly.
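
One possible way to "adjust code accordingly" is to resolve names case-insensitively before using them. The helper below is a hypothetical sketch, not part of ibis, and the column names are made-up examples of what Snowflake might report.

```python
def resolve_column(columns, wanted):
    """Return the stored column name matching `wanted`, ignoring case."""
    matches = [c for c in columns if c.lower() == wanted.lower()]
    if len(matches) != 1:
        # Zero matches or an ambiguous match are both treated as errors.
        raise KeyError(
            f"expected exactly one match for {wanted!r}, found {matches}"
        )
    return matches[0]


# Hypothetical column names kept as is from the database.
columns = ["ID", "CUSTOMER_NAME"]
resolved = resolve_column(columns, "customer_name")
```

Raising on ambiguous matches matters here because quoted identifiers that differ only by case can coexist in the same table.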
@cpcloud
Member Author

cpcloud commented Mar 15, 2023

Ok, merging on green!

@cpcloud cpcloud enabled auto-merge (rebase) March 15, 2023 22:05
@cpcloud cpcloud merged commit 833c698 into ibis-project:master Mar 15, 2023
@cpcloud cpcloud deleted the create-table-uniformity branch March 15, 2023 22:12
Labels
backends Issues related to all backends ddl Issues related to creating or altering data definitions feature Features or general enhancements
3 participants