I was chatting with @lelandrichardson about some of our plans for supporting multiple rendering targets (#73) and generating a review HTML page (#176), and we thought it might not be a bad idea to change the way we upload images.
Currently, we upload an HTML file and a set of images as a sort of bundle to a single "directory". Over many runs, this is likely to produce duplicate images. What if, instead, we put the images in a more general location where the filenames are hashes of the file contents? That way, if we have multiple images that are the same, we only upload them once.
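As a rough sketch of what the content-addressed naming could look like (using Node's built-in crypto module; the extension handling here is just illustrative):

```js
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Derive a content-addressed name for an image: identical contents
// always produce the same key, so duplicates collapse to one upload.
function contentHashKey(filePath) {
  const contents = fs.readFileSync(filePath);
  const hash = crypto.createHash('sha256').update(contents).digest('hex');
  return hash + path.extname(filePath); // e.g. "<sha256 hex>.png"
}
```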
This would have the advantage of allowing us to make all examples browsable on the review page with minimal overhead.
If we do this, the files should be stored in sub-directories so that the directory doesn't end up growing to contain millions of files over time.
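One common sharding layout (purely illustrative, including the `images/` prefix) is to use the first couple of hex characters of the hash as the sub-directory:

```js
// Shard by the first two hex characters of the hash, giving 256
// sub-directories; with an even hash distribution, a million files
// works out to roughly 4,000 per directory instead of one flat listing.
function shardedKey(hashName) {
  return `images/${hashName.slice(0, 2)}/${hashName}`;
}

// shardedKey('ab12cd34ef.png') => 'images/ab/ab12cd34ef.png'
```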
We could probably also apply the same technique to the JavaScript files that we need for the review page, to cut down on duplication, but that seems less valuable than doing it for the images.
This is a good idea. Do you have an idea for how it could be implemented as well? It could be as simple as using the uploader configuration option that I'm adding/thinking of adding as part of supporting #176.
It would be nice to avoid re-uploading assets that already exist, so we might want to build that check into the upload process. The imageDirectory option sounds okay to me, but I think we could probably just set it to something fixed and not make it configurable, right?
Ah, that's probably right, since people mostly just have to specify an S3 bucket name anyway. We can choose to make a special folder for images and check before upload.
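A minimal sketch of the check-before-upload, assuming the AWS SDK for JavaScript (v2); the bucket and key names are placeholders and error handling is simplified:

```js
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// Skip the upload if an object with this key already exists.
// Because keys are content hashes, "exists" implies "identical".
async function uploadIfMissing(bucket, key, body) {
  try {
    await s3.headObject({ Bucket: bucket, Key: key }).promise();
    return false; // already uploaded on a previous run
  } catch (err) {
    if (err.code !== 'NotFound') throw err;
  }
  await s3.putObject({ Bucket: bucket, Key: key, Body: body }).promise();
  return true;
}
```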
We might also try a potential optimization where we upload a JSON file for every sha that maps examples to their images. ~~Then we can potentially download all of the images for the base commit if they already exist. Not sure if that will be faster, but it might be, depending on how long the build takes.~~ Then we can compare the hash of the current snapshot to the hash in the JSON file, which should be a lot faster.
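The manifest shape and the lookup are just a sketch here (the example names and file layout are assumptions, not anything decided), but the comparison itself would be a simple diff of two maps:

```js
// Hypothetical per-commit manifest, e.g. one JSON file per sha,
// shaped like: { "<example name>": "<content hash of its image>" }.
function findChangedExamples(baseManifest, currentManifest) {
  // An example changed if its current hash differs from the hash
  // recorded for the base commit (or it didn't exist there at all).
  return Object.keys(currentManifest).filter(
    (example) => baseManifest[example] !== currentManifest[example]
  );
}
```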