🎉 wizard: anomalist (v2) #3388
Merged
Quick links (staging server):
Login: chart-diff: ❌
Edited: 2024-10-11 10:15:18 UTC
* 🎉 Add CLI for running anomaly detectors * merge with wizard-anomalist, pass ci/cd --------- Co-authored-by: lucasrodes <lucasrodes@users.noreply.github.com>
* Start a new staging server for branch 'variable-mapping' * add to_sql * define sqlite db name in variable * new methods to store variable mapping * force int if possible * fix infinite loop * save variable mapping * minor ui tweak * add undo capabilities * store var mapping
This was referenced Oct 7, 2024
* ✨ wizard: anomalist ui * rename file * rename + tweak UI * function to get variable uris from indicator list * tweak config * minor fixes * demo * org: folder for app * ci/cd
* 🎉 anomalist: Detect new datasets automatically * Add temporary duplicates of the energy and electricity mix datasets for testing purposes * Add another temporary step * Move common function to detect new datasets to utils cached * Fix wrong mapping of dataset ids in indicator upgrader * Edit dag and energy steps to be able to play around with mappings and anomalies * Improve map_datasets * Let anomalist detect new datasets and list them * Cache inputs * remove redundant code --------- Co-authored-by: lucasrodes <lucasrodes@users.noreply.github.com>
…cator (#3368) * ✨ wizard: anomalies * wip * bump streamlit * wip * wip: chart * wip * todo * plot indicator * re-structure * wip: loading indicators * fix API grapher_chart * deprecate chart_html * chart_html -> grapher_chart * clean * feature: Detect abrupt changes in consecutive versions of an indicator * Improve compare_tables * Add new BARD score and improve compare_tables * ci/cd * wip * wip * changed module name * custom components module * add methods to get uris * get dataset uris * update import * update gpt pricing * update import * wip * provide entity-context for anomaly * wip: anomalist v2 * Implement detection of different kinds of anomaly types * Rename script * Rename script * Rename script * Create a class AnomalyDetector, simplify code * Improve scores dataframe * Rename score column * wip * wip * Improve detection of abrupt changes in time series * Add population score * Create function to get views for a list of variables * Add analytics score * Improve anomaly aggregation * Align with master * Align with master * Fix minor bug * minor cleaning * map entities only if explicitly asked * reduce re-implemented functions * avoid usage of get_connection * Ignore formatting issues --------- Co-authored-by: lucasrodes <lucasrodes@users.noreply.github.com>
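The commit above mentions detecting abrupt changes in consecutive versions of an indicator. As a rough illustration of the idea only, here is a minimal sketch that scores each (entity, year) point by a bounded relative difference between the old and new version. The function names, the score formula, and the threshold are assumptions made for this example; they are not taken from the PR and are not claimed to match its BARD score or its `AnomalyDetector` class.

```python
from typing import Dict, List, Tuple

# Hypothetical key type for this sketch: (entity name, year).
Key = Tuple[str, int]


def version_change_scores(old: Dict[Key, float], new: Dict[Key, float]) -> Dict[Key, float]:
    """Score every point present in both versions by its relative change.

    The score |new - old| / (|new| + |old|) lies in [0, 1] and is 0 when
    the two values are equal (including the case where both are zero).
    """
    scores: Dict[Key, float] = {}
    for key in old.keys() & new.keys():
        a, b = old[key], new[key]
        denom = abs(a) + abs(b)
        scores[key] = 0.0 if denom == 0 else abs(b - a) / denom
    return scores


def flag_anomalies(scores: Dict[Key, float], threshold: float = 0.2) -> List[Key]:
    """Return the points whose score exceeds the (assumed) threshold, sorted."""
    return sorted(key for key, score in scores.items() if score > threshold)


# Example: the 2001 value doubles between versions, the 2000 value is unchanged.
old_version = {("France", 2000): 100.0, ("France", 2001): 110.0}
new_version = {("France", 2000): 100.0, ("France", 2001): 220.0}
flagged = flag_anomalies(version_change_scores(old_version, new_version))
print(flagged)
```

A bounded score is used here so that indicators with very different magnitudes can share one threshold; points that exist in only one version are ignored in this sketch.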
* 🎉 anomalist: Improve anomalist CLI * Allow for multiple anomalies, datasets and variable ids * Fix small issues and let data loading use maximum number of workers
* ✨ anomalist: ui flow * wip * wip * enable multiple indicator plot * allow full entity mapping load * bugfix * polish demo * ci/cd
* 🎉 anomalist: Improve Anomalist backend * Improve types of anomaly_detection and cli * Minor refactor and removing useless todo * Move anomaly detection to a separate module * Prevent Anomaly from failing if table already exists * Big refactor to be able to add version change anomalies * Rename anomalies * Move detectors to a separate module * Use entity_name instead of entity_id * Convert to long format afterwards * Pass data explicitly to generate scores df
* ✨ wizard: improve app flow * add option to drop table when creating * adapt to new api * new function to create tables in anomalist * improve comments * checkfirst flag when creating table * re-order code * bug fixes in app flow * improve pagination ui * tweak internal grapher_chart flow * entity selection * module for chart configs * adjust for indicator upgrades * enable re-scan * help text, anomaly types, upgrade anomalies
* ✨ Add GP outlier detector * drop anomalies with zero values
* ✨ anomalist: stop using mock * style * ✨ anomalist: stop using mock data * re-order mock data * replace mock data with real data * discard df if all-zero
* 🐛 anomalist: Fix unknown variable ids * Fix missing variable ids when detecting anomalies in multiple datasets * Update misleading comment
* ✨ anomalist: nits * abstract df parsing logic * add GP outlier * add dfReduced to table * reset index * incorporate GP * re-arrange functions, add link to indicator * stop reducing dfScore
* ✨ anomalist: stop using mock * style * ✨ anomalist: stop using mock data * re-order mock data * replace mock data with real data * 🎉 anomalist: Add population and analytics scores * Store scores with all years and combine them on app * Add anomaly and population score, as well as weighted score * Move get_scores to utils --------- Co-authored-by: lucasrodes <lucasrodes@users.noreply.github.com>
* ✨ anomalist: nits * abstract df parsing logic * add GP outlier * add dfReduced to table * reset index * incorporate GP * re-arrange functions, add link to indicator * ✨ anomalist: test llms for summary * stop reducing dfScore * wip * wip * llm summary button * add function to get variables from DB * tag: icon is optional * AI summary
* ✨ Add max_time and n_jobs to gp_outlier
* 🐛 Fix anomalist bugs
* 🎉 anomalist: Experiment with different anomaly detection methods * Improve script to visualize anomalies * Improve visualization of anomalies, and try different methods * Improve cli * Some refactoring * Add useful comment * ✨ anomalist: Improve automatic detection of new datasets (#3429) * ✨ anomalist: Improve automatic detection of new datasets * Create new functions to detect new datasets, and speed up anomalist * Infer variable mapping * Use inferred variable mapping in Anomalist * Move function to get datasets info
* ✨ Add anomalist to owidbot
…3434) * 🐛 anomalist: Fix bug with unknown indicators and long loading time * Stop storing dfScore, which takes a long time to load * Fix GP detecting anomalies on old variables (which is unnecessary)
* 🐛 anomalist: Fix bug with unknown indicators and long loading time * Stop storing dfScore, which takes a long time to load * Fix GP detecting anomalies on old variables (which is unnecessary) * ✨ anomalist: Small improvement in Anomalist filters * Show instead of hide detectors in filter
* ✨ persist filter values in URL
How to work with this PR: