Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer #936

Merged
merged 4 commits into from
Apr 11, 2019

Conversation

emtwo
Copy link

@emtwo emtwo commented Apr 8, 2019

These are a couple of small UI tweaks related to schema browser changes.

Edit: I've added some more changes.

  1. get_schema() is now faster with only two postgres queries
  2. Column metadata is available when Athena uses glue
  3. Fixing a bug in the metadata processing

@emtwo emtwo force-pushed the emtwo/schema_ui_fixes branch from d98c01b to 5fd1c54 Compare April 9, 2019 18:07
@emtwo emtwo requested review from jezdez and washort April 10, 2019 14:59
@emtwo emtwo force-pushed the emtwo/schema_ui_fixes branch from 5fd1c54 to fbf7ce3 Compare April 10, 2019 17:44
Copy link

@washort washort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

columns = ColumnMetadata.query.all()

for column in columns:
if column.exists == False:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can just be

Suggested change
if column.exists == False:
if not column.exists:

if column.exists == False:
continue

if column.table_id not in columns_by_table_id:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use .setdefault(column.table_id, {}).append({...}) here

@emtwo emtwo force-pushed the emtwo/schema_ui_fixes branch from fbf7ce3 to 6b369e2 Compare April 10, 2019 19:44
@emtwo emtwo force-pushed the emtwo/schema_ui_fixes branch from 6b369e2 to 42f691a Compare April 11, 2019 01:42
@emtwo emtwo merged this pull request into master Apr 11, 2019
emtwo pushed a commit that referenced this pull request Apr 11, 2019
…ty example column in schema drawer (#936)

* Closes #934, #935: Remove type from schema browser and don't show empty example column

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.
emtwo pushed a commit that referenced this pull request Apr 11, 2019
…ty example column in schema drawer (#936)

* Closes #934, #935: Remove type from schema browser and don't show empty example column

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.
@jezdez jezdez deleted the emtwo/schema_ui_fixes branch April 15, 2019 14:44
jezdez pushed a commit that referenced this pull request May 13, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.
@jezdez jezdez mentioned this pull request May 13, 2019
2 tasks
jezdez pushed a commit that referenced this pull request May 16, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>
washort pushed a commit that referenced this pull request Jun 10, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>
jezdez pushed a commit that referenced this pull request Jun 13, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>
washort pushed a commit that referenced this pull request Jun 27, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>
washort pushed a commit that referenced this pull request Jun 28, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
jezdez pushed a commit that referenced this pull request Aug 19, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
washort pushed a commit that referenced this pull request Sep 16, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
washort pushed a commit that referenced this pull request Sep 17, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
emtwo pushed a commit that referenced this pull request Nov 5, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
jezdez pushed a commit that referenced this pull request Jan 22, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Co-authored-by: Alison <github@bankofknowledge.net>
jezdez pushed a commit that referenced this pull request Feb 5, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Co-authored-by: Alison <github@bankofknowledge.net>
jezdez added a commit that referenced this pull request May 4, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
jezdez added a commit that referenced this pull request May 14, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Fix spacing issue with data scanned value in query execution metadata.

Increase schema refresh timeout.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
robhudson pushed a commit that referenced this pull request Jun 11, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Fix spacing issue with data scanned value in query execution metadata.

Increase schema refresh timeout.

Remove old migrations.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
emtwo pushed a commit that referenced this pull request Jul 15, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Fix spacing issue with data scanned value in query execution metadata.

Increase schema refresh timeout.

Remove old migrations.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
jezdez added a commit that referenced this pull request Oct 15, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Fix spacing issue with data scanned value in query execution metadata.

Increase schema refresh timeout.

Remove old migrations.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants