Proposing analysisTypes for Druid refresh metadata #1979

Closed
3 tasks done
noppanit opened this issue Jan 13, 2017 · 3 comments

Comments

@noppanit
Contributor

noppanit commented Jan 13, 2017

If you have a very large dataset in Druid, refreshing metadata can take a very long time and end in a timeout.

Could we pass analysisTypes with only cardinality, or make the list configurable? That should result in a much faster query.

https://github.com/airbnb/superset/blob/2d866e3ffa9bfedd3b3dad0d3463767aae879a14/superset/models.py#L2018

http://druid.io/docs/latest/querying/segmentmetadataquery.html#analysistypes

This is because by default it looks up all the analysis types, while it seems like we only care about the columns.
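To illustrate the idea, here is a minimal sketch of a segmentMetadata query body that restricts `analysisTypes`. It only builds the JSON documented by Druid and is not Superset's actual code. The helper name `build_segment_metadata_query`, the datasource name, and the interval are hypothetical, chosen for illustration.

```python
import json


def build_segment_metadata_query(datasource, intervals, analysis_types=("cardinality",)):
    """Build a Druid segmentMetadata query body limited to the given analysis types.

    Hypothetical helper for illustration; the field names follow the Druid
    segmentMetadata query documentation.
    """
    return {
        "queryType": "segmentMetadata",
        "dataSource": datasource,
        "intervals": list(intervals),
        # Only the listed analyses are computed, which keeps the query cheaper.
        "analysisTypes": list(analysis_types),
        # Merge per-segment results into a single result.
        "merge": True,
    }


# Placeholder datasource and interval, not taken from a real cluster.
query = build_segment_metadata_query("example_datasource", ["2017-01-01/2017-01-13"])
print(json.dumps(query, indent=2))
```

Making `analysis_types` a parameter keeps the list configurable, so a deployment that needs more than cardinality can still ask for it.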

  • I have checked the superset logs for python stacktraces and included it here as text if any
  • I have reproduced the issue with at least the latest released version of superset
  • I have checked the issue tracker for the same issue and I haven't found one similar

Superset version

Latest

Expected results

faster refresh druid metadata

Actual results

Refreshing Druid metadata takes a very long time and results in a timeout

Steps to reproduce

superset refresh_druid -m true

@noppanit
Contributor Author

analysisTypes is supported by pydruid.

@xrmx
Contributor

xrmx commented Jan 18, 2017

Has this been fixed in #1983?

@noppanit
Contributor Author

Yes, it's fixed in #1983. I'm closing the issue.
