You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Deletes content directly from this.cache which is the cache directory configured by npm and is shared across other modules, this is especially relevant for make-fetch-happen. When we remove content that could've been written by someone else, we're putting the other module in a position that it can have an up to date and correct index entry that directs to content that isn't there. Granted, make-fetch-happen should be more resilient to this, but I think the correct fix here is to be a little smarter about how pacote handles the cache.
Using the example of make-fetch-happen, data is already in cacache and will already be read from there. Right now pacote will read from the cache, create a new integrity stream to verify the data again (cacache already did this when it read the original content), and write back to the same cache while also returning the data to the user. This is an extra write and hash that we absolutely do not need to do, and it occurs for every single file that is retrieved through the RemoteFetcher and RegistryFetcher classes.
This module should be modified such that it only uses cacache for subclasses that are not already cached by their own means, which today is file, dir, and git. Arguably we shouldn't bother caching any of these things at the pacote level since it's reasonably likely git state will change without us knowning, and copying a file (or packing a directory and copying the result) to cacache isn't really of much benefit.
What / Why
Currently the
cleanupCached()
method inlib/fetcher.js
seen here: https://github.com/npm/pacote/blob/latest/lib/fetcher.js#L323Deletes content directly from
this.cache
which is the cache directory configured by npm and is shared across other modules, this is especially relevant for make-fetch-happen. When we remove content that could've been written by someone else, we're putting the other module in a position that it can have an up to date and correct index entry that directs to content that isn't there. Granted, make-fetch-happen should be more resilient to this, but I think the correct fix here is to be a little smarter about how pacote handles the cache.Using the example of make-fetch-happen, data is already in cacache and will already be read from there. Right now pacote will read from the cache, create a new integrity stream to verify the data again (cacache already did this when it read the original content), and write back to the same cache while also returning the data to the user. This is an extra write and hash that we absolutely do not need to do, and it occurs for every single file that is retrieved through the RemoteFetcher and RegistryFetcher classes.
This module should be modified such that it only uses cacache for subclasses that are not already cached by their own means, which today is file, dir, and git. Arguably we shouldn't bother caching any of these things at the pacote level since it's reasonably likely git state will change without us knowning, and copying a file (or packing a directory and copying the result) to cacache isn't really of much benefit.
This is related to npm/arborist#297
The text was updated successfully, but these errors were encountered: