Regularly check for dataset updates by providers #3329

Closed
Marigold opened this issue Sep 30, 2024 · 2 comments

@Marigold
Collaborator

Problem

Providers release updates to their datasets, but we are not notified of them. It would be helpful to show on the ETL dashboard which datasets have newer versions and could be updated.

Solutions

Check data from the snapshot URL

Many snapshots contain a URL for the data. We could fetch the data from the URL, compare it to the data in our existing snapshot, and send a notification if they differ.
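
A minimal sketch of the comparison step, assuming the snapshot is a single local file and the URL serves the raw data directly. The function names (`sha256_of`, `fetch_bytes`, `snapshot_is_outdated`) are hypothetical and not part of the ETL codebase. The fetcher is passed in as an argument so the check can be tried without network access.

```python
import hashlib
import urllib.request
from pathlib import Path
from typing import Callable


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def fetch_bytes(url: str, timeout: float = 30.0) -> bytes:
    """Download the content behind a URL (hypothetical default fetcher)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def snapshot_is_outdated(
    local_path: Path,
    url: str,
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> bool:
    """Return True if the remote content differs from the local snapshot file."""
    local_digest = sha256_of(Path(local_path).read_bytes())
    remote_digest = sha256_of(fetch(url))
    return local_digest != remote_digest
```

A byte-level comparison like this would probably flag false positives when a provider regenerates a file without changing the data (for example, an embedded export timestamp), so a real check might need to compare parsed content instead.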

Ask ChatGPT to check for updates

Our snapshot metadata file contains all the information about the provider, dataset, and URLs. We could provide this information to ChatGPT and ask it to use web search to check for updates (e.g., by checking the provider's changelog). This might not work perfectly but is simple enough to try.
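
A rough sketch of how the prompt could be assembled from snapshot metadata. The field names (`producer`, `title`, `url_main`, `date_published`) and the function `build_update_check_prompt` are assumptions for illustration, not the actual metadata schema. The sketch only builds the prompt text and leaves the model call and web search out entirely.

```python
from typing import Mapping


def build_update_check_prompt(meta: Mapping[str, str]) -> str:
    """Build a prompt asking a model whether a dataset has a newer release.

    The metadata keys used here are hypothetical placeholders.
    """
    lines = [
        "Check whether the provider has released a newer version of this dataset.",
        f"Provider: {meta.get('producer', 'unknown')}",
        f"Dataset: {meta.get('title', 'unknown')}",
        f"Landing page: {meta.get('url_main', 'unknown')}",
        f"Version we have was published on: {meta.get('date_published', 'unknown')}",
        "Use web search (e.g., the provider's changelog) and answer with",
        "'UPDATED' or 'UNCHANGED', followed by a one-sentence justification.",
    ]
    return "\n".join(lines)
```

Asking for a fixed first word would make the answer easy to parse, though the model's judgment would still need spot-checking.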

@pabloarosado
Contributor

Hi @Marigold, thanks for writing this up, and sorry, I didn't ignore it! I just thought it would be good to create a more detailed issue to properly shape the project. If you want, you can add your proposal in #3339 , and feel free to close this issue. Thanks!

@Marigold
Collaborator Author

Marigold commented Oct 1, 2024

Closing in favour of #3339

@Marigold closed this as completed Oct 1, 2024
@larsyencken closed this as not planned Oct 10, 2024