This repository has been archived by the owner on Aug 21, 2024. It is now read-only.
Draft: Proposal to enable ZST encoded archives loaded by the transformer #574
The retriever has been switched to provide Zstandard-compressed raw data instead of plain JSON.
Because this is a compressed format, it cannot be read the way plain JSON data is.
To enable the transformer to work with the new data, we need to add the proper file conversion.
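
For discussion, here is a minimal sketch of what such a conversion step could look like. It is not the code in this PR: the helper names `is_zstd` and `load_records` are hypothetical, and it assumes the third-party `zstandard` package is available and that each file holds a single UTF-8 JSON document. It checks for the Zstandard frame magic number so that older plain-JSON files would still load.

```python
import io
import json

# Magic number at the start of a Zstandard frame (0xFD2FB528, little-endian).
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def is_zstd(raw: bytes) -> bool:
    """Return True if the payload starts with a Zstandard frame header."""
    return raw[:4] == ZSTD_MAGIC


def load_records(raw: bytes):
    """Parse a payload that is either plain JSON or Zstandard-compressed JSON.

    Hypothetical helper: assumes the decompressed payload is one UTF-8
    encoded JSON document.
    """
    if is_zstd(raw):
        # Imported lazily so plain-JSON inputs work without the dependency.
        import zstandard  # third-party: pip install zstandard

        decompressor = zstandard.ZstdDecompressor()
        # stream_reader works even when the frame omits the content size.
        with decompressor.stream_reader(io.BytesIO(raw)) as reader:
            raw = reader.read()
    return json.loads(raw.decode("utf-8"))
```

Detecting the format from the leading bytes rather than the file extension is a design choice in this sketch; if the retriever only ever emits compressed files, the branch could be dropped.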
These changes are up for debate. However, we need to fix this today; otherwise the transformer run scheduled for tomorrow morning will break the data again.
I've also removed the delete statement from our S3 upload, as it removes files that are present in the S3 bucket but not in the upload folder.
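
To illustrate why the delete step is risky, here is a small sketch of the set logic involved. It does not call S3 and does not reproduce our upload code; the function name and the object keys are made up for the example.

```python
def keys_removed_by_delete(local_keys, bucket_keys):
    """Return the bucket keys a delete-on-upload step would remove.

    Anything in the bucket that has no counterpart in the local upload
    folder counts as stale and gets deleted.
    """
    return sorted(set(bucket_keys) - set(local_keys))


# Hypothetical keys: the bucket still holds an earlier run that is no
# longer present in the local upload folder.
local = ["2024/run-b.json.zst"]
bucket = ["2024/run-a.json.zst", "2024/run-b.json.zst"]
print(keys_removed_by_delete(local, bucket))  # ['2024/run-a.json.zst']
```

Without the delete step, the earlier run stays in the bucket and the upload only adds or overwrites objects.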