Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(hive): Update _latest_partition_from_df in HiveEngineSpec to work on tables with multiple indexes #14302

Merged
merged 4 commits into from
Nov 23, 2021

Conversation

codenamelxl
Copy link
Contributor

SUMMARY

Fix #14301

DataFrame.ix is deprecated so I replaced it with .iloc.
Since Hive return partition in form ds={partition name}/ds={partition name} for table with multiple indexes, I also change the logic to make it work for this case.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

BEFORE:
image

AFTER:
No more error

TEST PLAN

  • Choose a Hive partitioned table by multiple indexes.
  • Check if /superset/extra_table_metadata return correct result with latest value of indexes.
  • I tested this in Spark Thrift Server but should also be similar in Hive.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codenamelxl codenamelxl changed the title Fix _latest_partition_from_df in HiveEngineSpec (fix) Update _latest_partition_from_df in HiveEngineSpec to work on tables with multiple indexes Apr 22, 2021
@codenamelxl codenamelxl changed the title (fix) Update _latest_partition_from_df in HiveEngineSpec to work on tables with multiple indexes fix(hive): Update _latest_partition_from_df in HiveEngineSpec to work on tables with multiple indexes Apr 22, 2021
@codecov
Copy link

codecov bot commented Apr 22, 2021

Codecov Report

Merging #14302 (fc2480a) into master (d05c561) will increase coverage by 0.93%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #14302      +/-   ##
==========================================
+ Coverage   76.99%   77.92%   +0.93%     
==========================================
  Files        1047     1005      -42     
  Lines       56497    51904    -4593     
  Branches     7799     5703    -2096     
==========================================
- Hits        43499    40448    -3051     
+ Misses      12742    11256    -1486     
+ Partials      256      200      -56     
Flag Coverage Δ
hive ?
mysql 81.99% <100.00%> (+<0.01%) ⬆️
postgres 82.00% <100.00%> (+<0.01%) ⬆️
presto ?
python 82.09% <100.00%> (-0.41%) ⬇️
sqlite 81.68% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/db_engine_specs/hive.py 70.27% <100.00%> (-16.22%) ⬇️
superset-frontend/src/setup/setupPlugins.ts 0.00% <0.00%> (-100.00%) ⬇️
superset-frontend/src/setup/setupPluginsExtra.ts 0.00% <0.00%> (-100.00%) ⬇️
...src/filters/components/Range/RangeFilterPlugin.tsx 0.00% <0.00%> (-88.58%) ⬇️
superset/db_engines/hive.py 0.00% <0.00%> (-85.19%) ⬇️
...-frontend/src/visualizations/presets/MainPreset.js 0.00% <0.00%> (-80.00%) ⬇️
...end/src/filters/components/Range/transformProps.ts 20.00% <0.00%> (-80.00%) ⬇️
...c/visualizations/FilterBox/FilterBoxChartPlugin.js 0.00% <0.00%> (-75.00%) ⬇️
...c/visualizations/TimeTable/TimeTableChartPlugin.js 0.00% <0.00%> (-75.00%) ⬇️
...tend/src/visualizations/FilterBox/controlPanel.jsx 0.00% <0.00%> (-50.00%) ⬇️
... and 508 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d05c561...fc2480a. Read the comment docs.

@codenamelxl codenamelxl reopened this Apr 22, 2021
Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test please

@codenamelxl
Copy link
Contributor Author

Can you add a test please

Test added. Please let me know if there is anything else I can do.

@codenamelxl codenamelxl reopened this Apr 23, 2021
@dpgaspar dpgaspar closed this May 3, 2021
@dpgaspar dpgaspar reopened this May 3, 2021
Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but CI is failing with

black....................................................................Failed
- hook id: black
- files were modified by this hook

reformatted /home/runner/work/superset/superset/tests/db_engine_specs/hive_tests.py
All done! ✨ 🍰 ✨
1 file reformatted, 732 files left unchanged.

Use pre-commit for fix that automatically

@pull-request-size pull-request-size bot added size/S and removed size/M labels May 5, 2021
@codenamelxl codenamelxl requested a review from dpgaspar May 5, 2021 16:17
@codenamelxl
Copy link
Contributor Author

codenamelxl commented May 5, 2021

Thank you for the pointer. I have fixed formatting to pass black.

@codenamelxl
Copy link
Contributor Author

CI failing with:
superset/models/dashboard.py:174:0: W0125: Using a conditional statement with a constant value (using-constant-test)
seems not related to this PR.

@codenamelxl codenamelxl force-pushed the master branch 2 times, most recently from 77d19cb to 8266f8b Compare May 18, 2021 09:06
@codenamelxl
Copy link
Contributor Author

CI failing with:
superset/models/dashboard.py:174:0: W0125: Using a conditional statement with a constant value (using-constant-test)
seems not related to this PR.

I rebased to master to try make CI green.

@codenamelxl
Copy link
Contributor Author

@dpgaspar Sorry for direct tag.
Would you mind taking a look at this. I think the test failed is not related to this PR.
Thank you.

@codenamelxl codenamelxl closed this Jun 9, 2021
@codenamelxl codenamelxl reopened this Jun 9, 2021
@villebro
Copy link
Member

@codenamelxl sorry for the delayed review, but I agree with the proposed change here. Would you mind rebasing this PR so we can get it merged?

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the improvement!

@villebro villebro merged commit bc855f4 into apache:master Nov 23, 2021
@codenamelxl
Copy link
Contributor Author

Thank for the rebase. Sorry for the late reply.

AAfghahi pushed a commit that referenced this pull request Jan 10, 2022
… on tables with multiple indexes (#14302)

* Fix _latest_partition_from_df in HiveEngineSpec

* Add test HiveEngineSpec._latest_partition_from_df

* Fix formatting to pass black

Co-authored-by: Ville Brofeldt <ville.v.brofeldt@gmail.com>
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.5.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/S 🚢 1.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[connector:hive] extra_table_metadata fail for Hive partitioned table
4 participants