Migrate external Huggingface data to 311 Data Huggingface repo #1714
Comments
This ticket is ready to be picked up |
ETA: Sunday 5/19 |
Updating the ETA to Sunday 6/1 |
Added a PR that enables 2022 data for now. Waiting for reviews to make sure there are no issues before adding the other years with the same implementation. |
The most recent PR only enables 2022 data; reopening this issue to continue migrating the older years. |
PR is approved, we're ok to shelve this ticket until we decide we need even earlier data. |
@Skydodle I just wanted to get a paper trail on our reasoning for fully closing this. Is it correct that data from 2019 and earlier would require a serious amount of data cleaning in order to integrate smoothly with the 2020-2024 data? Could you outline some of the technical hurdles that you encountered when looking at those datasets? |
What would consume the most time is that anomalies in the CSV would most likely not be detected until the file has been transformed to Parquet, uploaded to 311's HF repo, and configured for display on the UI; only then would we see some data displaying incorrectly or not at all. We would then have to backtrack, make the correction, and redo the entire process. I've created some tools for debugging in PR #1747. |
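One way to surface such anomalies earlier is to check the CSV before the Parquet conversion. The following is only a minimal sketch of that idea, not the tooling from PR #1747: the column names (`SRNumber`, `CreatedDate`, `Latitude`, `Longitude`), the date format, and the function name `find_csv_anomalies` are all assumptions made for illustration and would need to be adjusted to the actual datasets.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names and date format, chosen for illustration only.
REQUIRED_COLUMNS = ("SRNumber", "CreatedDate", "Latitude", "Longitude")
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def find_csv_anomalies(csv_text):
    """Return a list of (line_number, message) for rows that look malformed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return [(1, "missing columns: " + ", ".join(missing))]

    anomalies = []
    for row in reader:
        line = reader.line_num
        # DictReader stores surplus fields under the key None.
        if None in row:
            anomalies.append((line, "more fields than header columns"))
            continue
        # Short rows yield None values; treat them like empty strings.
        empty = [c for c in REQUIRED_COLUMNS if not (row[c] or "").strip()]
        if empty:
            anomalies.append((line, "empty value in: " + ", ".join(empty)))
            continue
        try:
            datetime.strptime(row["CreatedDate"].strip(), DATE_FORMAT)
        except ValueError:
            anomalies.append((line, "unparsable CreatedDate"))
        try:
            lat, lon = float(row["Latitude"]), float(row["Longitude"])
            if not (-90 <= lat <= 90 and -180 <= lon <= 180):
                anomalies.append((line, "coordinates out of range"))
        except ValueError:
            anomalies.append((line, "non-numeric coordinates"))
    return anomalies
```

Running a check like this on each year's CSV and reviewing the reported line numbers before converting and uploading could shorten the backtracking loop described above, though it only catches the kinds of anomalies it explicitly tests for.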
Overview
We need to port the 2016-2022 data into the 311-Data HF repo so that users have access to all available 311 request data.
Action Items
Resources/Instructions