Reduce cache initialization and operation times #4253
Which directory do you use for downloading? If you use the dir returned by Context.getExternalFilesDir(), I think it shouldn't be indexed by the system.
Yes, I am using Context.getExternalFilesDir(null), just like in the demo. Now I am thinking about creating a new folder for each media, which should help the file system. But I will lose the ability to make multiple downloads, because it forces me to instantiate a new download manager every time.
If the issue is that the system is scanning too many files, then even if you put each download into a different folder, the total number of files won't change. If the issue is that a single folder has too many files, then it might help. You're right that DownloadManager/Cache works with a single folder; if you want to use multiple folders then you need to create multiple DownloadManagers and Caches, and for playback you need to use the right Cache instance. Could you try putting an empty file named ".nomedia" into the parent folder of the cache folder? Not into the cache folder itself, as SimpleCache would delete it when the app starts. One more thing: please provide the information requested in the issue template as much as possible.
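The .nomedia suggestion above can be sketched as follows. The helper class and directory names are hypothetical, not ExoPlayer API; the key point from the comment is that the marker goes in the parent of the cache directory, since SimpleCache deletes unknown files inside its own folder on startup:

```java
import java.io.File;
import java.io.IOException;

/**
 * Illustrative sketch: place an empty ".nomedia" file in the PARENT of the
 * cache directory, not inside it (SimpleCache removes files it does not
 * recognize when it initializes).
 */
class NoMediaMarker {

  /** Creates ".nomedia" next to (not inside) the given cache directory. */
  static File createNoMediaMarker(File cacheDir) throws IOException {
    File parent = cacheDir.getParentFile();
    File noMedia = new File(parent, ".nomedia");
    noMedia.createNewFile(); // no-op if it already exists
    return noMedia;
  }
}
```

Whether the media scanner honors a .nomedia file in the parent for all sub-folders is device-dependent, as the follow-up comments in this thread suggest.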
If we cannot reduce the number of files, and if creating multiple folders won't help the system, I believe it could at least help SimpleCache and make the player launch faster. Yesterday I did try the .nomedia file, but in the cache folder: to avoid having it deleted before Android's indexing, I created it each time I start my service and each time I initialize my cache, but I was still not sure the file wouldn't be deleted before the Android indexing process ran. I did it that way because I was not sure creating it in the parent folder would affect the subfolder. Otherwise, thanks for your help.

Issue template:

Issue description
Android file system performance issue because of too many .exo files in the DownloadManager's cache folder. The process /system/bin/sdcard consumes a lot of resources at each Android boot and at each cache initialization (after force-killing and restarting the app, for example).

Reproduction steps
Describe how the issue can be reproduced, ideally using the ExoPlayer demo app.

Link to test content
The content is downloaded to local storage. It's DASH streams with 4-second segments and DRM encryption.

Version of ExoPlayer being used
2.8.0

Device(s) and version(s) of Android being used
Nvidia Shield, Android 7.0
Perhaps you can easily test whether multiple folders improve the sdcard issue by manually moving files to different folders.
And about SimpleCache: is there a way to generate an index for each media, so that the SimpleCache instance won't have to scan the whole cache? Are the methods getKeys() and getCachedSpans(key) what I need? For my use case, is the SimpleCache instance appropriate? About the .nomedia file: I retested it in the parent folder, and it does not help the sdcard process.
Multiple folders are the solution to this bug. @erdemguven I didn't need to create multiple DownloadManagers; I wanted the DownloadService to stay as functional as before. Details:
Thanks for investigating this. I'll look into this and decide whether to use a solution similar to what you have done, or to modify SimpleCache to place each media under a separate folder. Sorry, I just noticed I forgot to recommend another way to reduce the number of .exo files: you can increase maxCacheFileSize when you create CacheDataSource. By default it's 2MB, so downloaded media is divided into .exo files of at most 2MB. Passing a bigger number will make it create fewer .exo files. The problems with a bigger .exo file size are an increased risk of losing more data if the app crashes, and that you won't be able to read the currently downloading .exo file until it reaches the max size or the end of the stream.
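As a rough illustration of the maxCacheFileSize trade-off described above (the sizes and helper are hypothetical examples, not ExoPlayer code): the number of .exo files for a fully downloaded stream is at least the content size divided by maxCacheFileSize. It is "at least" because, as the next comments note, segments smaller than the limit are not combined:

```java
/**
 * Back-of-the-envelope estimate: minimum number of cache files for a content
 * of the given size, using ceiling division. Real counts can be higher,
 * since small segments are never merged into one file.
 */
class CacheFileCount {
  static long estimateFileCount(long contentBytes, long maxCacheFileBytes) {
    return (contentBytes + maxCacheFileBytes - 1) / maxCacheFileBytes; // ceil
  }
}
```

For a hypothetical 4GB download, the default 2MB limit yields at least 2048 files, while a 20MB limit yields at least 205, which is why raising the limit reduces file-system pressure.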
@erdemguven I followed your recommendation and changed maxCacheFileSize to 20MB for testing, but it seems it cannot help because of my source's segment size: the current code can divide chunks bigger than maxCacheFileSize, but it cannot combine chunks smaller than it. Is there anything else I can do to force building bigger files? I also noticed something else: in my download folder, alongside my chunks of 2MB or so, I have roughly as many files of 100KB~200KB. What are they? Is it audio? (They are also .exo files.)
Unfortunately, there is no way to combine segments into a single file.
Alright, thanks for your help @erdemguven.
@kvillnv Thank you for your tips. By the way, it is possible to apply maxCacheFileSize to DownloaderConstructorHelper by injecting a custom CacheDataSinkFactory.
@erdemguven I found a critical bug related to this issue. If the user downloads more than 65535 files to external storage, which is the maximum number of files per directory for the FAT32 format, newly downloaded chunks overwrite existing chunks. In my case, the average chunk size is 55KB, and the size of the cache folder is around 3.74GB (65535 × 55KB).
@KiminRyu, thanks for letting us know. It looks like we should definitely support multiple sub-cache folders.
@erdemguven I confirm multiple folders really help a lot; so far, performance and management are a lot better.
@kvillnv Do you have any plans to send a PR for sub-cache folders? Or could you share some details about your solution? This bug causes a lot of problems for my customers. It would be really helpful. Thank you!
@erdemguven Any update on sub-cache folders? PR?
My understanding from this thread is that splitting the cached files into multiple folders improves performance, even though the total number of files present is the same (actually slightly greater, if you count the additional directories as files). If that's the case then we can simply shard the cached files between a number of directories, or in an approximately balanced tree of directories. There's no particular need for each directory to represent anything logical, like a piece of content, and it's much simpler to implement without trying to do this. To make the change, I think we need some idea of how many files you can put in a directory before performance starts to degrade. Does anyone have good data for this? Obviously we need to stay under 65535 for the FAT32 limit, but it sounds like performance starts dropping off well before that. We could pick something arbitrary like 1000, but it would be preferable to make the decision based on actual data.
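The "approximately balanced, non-logical sharding" idea above could be sketched like this. The class, the hashing scheme, and the shard count of 1000 (the arbitrary figure floated in the comment) are assumptions for illustration, not ExoPlayer's actual layout:

```java
import java.io.File;

/**
 * Hypothetical sketch: map each cache file name to one of M sub-directories
 * by hashing, so no single directory grows beyond roughly N/M files. The
 * shards carry no logical meaning, matching the proposal above.
 */
class ShardedCacheLayout {
  private final File rootDir;
  private final int shardCount;

  ShardedCacheLayout(File rootDir, int shardCount) {
    this.rootDir = rootDir;
    this.shardCount = shardCount;
  }

  /** Returns the sub-directory a given cache file should live in. */
  File shardDirFor(String fileName) {
    // floorMod keeps the shard index non-negative even for negative hashes.
    int shard = Math.floorMod(fileName.hashCode(), shardCount);
    return new File(rootDir, Integer.toString(shard));
  }
}
```

Because the mapping is a pure function of the file name, reads can find a file again without any extra index, and the per-directory FAT32 entry limit is avoided as long as N/M stays well under 65535.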
@ojw28 I had up to 250 folders for a total of 500GB. The largest folders contained up to 5GB split across 2000 files. Performance is optimal. Also, having one piece of content per folder has another advantage: the folder can be deleted before launching a removeAction in the DownloadService, so the deletion is done instantly.
Thanks for the information!
We're aware of the benefit, but the approach doesn't fit nicely with (non-download) caching use cases. In particular:
So we're trying to avoid having to go down that route if possible. We believe most of the latency associated with content deletion is actually due to repeatedly re-writing the cache index every time a segment is removed, and we've already addressed this. As an aside: one content per folder probably works really nicely for apps that download HLS streams, but it's not going to work nicely for apps that download 10,000 small MP3 files :). In that case you'd end up with 10,000 directories containing one file each, which I suspect suffers from the same performance issues described in this thread. Approximately balanced (but otherwise arbitrary) sharding probably helps in both use cases.
It's also likely we can just make fewer cache files in the first place for some use cases. |
We have a pretty good understanding of the problem now. We think the issue is caused by per-file metadata queries (such as File.length()) costing O(N), where N is the number of files in the containing directory, on at least FAT32. Cache initialization requires querying file metadata for every file, which results in a complexity of:

O(N × N) = O(N²)

When the N files are instead split equally across M sub-directories, the cost becomes:

O(M × (N/M)²) = O(N²/M)

If you take slices through these expressions at a fixed M, the cost still grows quadratically with N, but it is divided by the constant factor M. Our plan to fix this is:
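The cost model above can be checked numerically. This is a sketch of the arithmetic only (an idealized model assuming every metadata query scans the whole containing directory), not ExoPlayer code:

```java
/**
 * Idealized initialization-cost model: N metadata queries, each O(n) in the
 * number of files n sharing the queried file's directory.
 */
class InitCostModel {
  /** All N files in one directory: N queries x O(N) each = N^2. */
  static long unshardedCost(long n) {
    return n * n;
  }

  /** N files split equally across M directories: M x (N/M)^2 = N^2 / M. */
  static long shardedCost(long n, long m) {
    return (n * n) / m;
  }
}
```

For example, with a hypothetical 10,000 cache files, sharding across 100 directories cuts the modeled cost by a factor of 100, while the growth in N remains quadratic either way.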
- Increase the default cache file size to 5MB
- Recommend a minimum cache file size of 2MB to discourage applications from specifying values small enough that unreasonably large numbers of cache files are generated
- Allow maxCacheFileSize=C.LENGTH_UNSET, equivalent to setting it to MAX_VALUE. This is just for consistency with other APIs we have that accept LENGTH_UNSET

Issue: #4253 PiperOrigin-RevId: 227524233
Calls to File.length() can be O(N) where N is the number of files in the containing folder. This is believed to be true for at least FAT32. Repeated calls for the same file tend to be faster, presumably due to caching in the file system, however are still surprisingly expensive. Hence minimizing the number of calls is preferable. Issue: #4253 PiperOrigin-RevId: 228179921
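The mitigation described in the commit message above (minimizing repeated File.length() calls) could look roughly like this. The class is an illustrative sketch, not ExoPlayer's actual implementation:

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch: since File.length() can be O(N) in the number of files in the
 * containing folder, query it once per file and memoize the result instead
 * of calling it repeatedly. Assumes the underlying files are not modified
 * between queries.
 */
class FileLengthCache {
  private final Map<String, Long> lengths = new HashMap<>();

  /** Returns the file's length, querying the file system at most once. */
  long length(File file) {
    return lengths.computeIfAbsent(file.getPath(), p -> file.length());
  }
}
```

The trade-off is staleness: once cached, the value never reflects later changes to the file, which is acceptable only for immutable cache files like fully written .exo spans.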
This is the initialization part of mitigating issue #4253. The remaining work is on the writing side, and is simply a case of having startFile return File instances that are sharded into sub-directories. We still need to decide what scheme we want to use for doing that. Issue: #4253 PiperOrigin-RevId: 228306327
DataSpec.FLAG_ALLOW_CACHE_FRAGMENTATION is added to indicate to the cache when fragmentation is allowed. This flag is set for progressive requests only. To avoid breaking changes, CacheDataSink defaults to ignoring the flag (and enabling fragmentation) for now. Respecting the flag can be enabled manually. DownloaderConstructorHelper enables respecting of the flag. Issue: #4253 PiperOrigin-RevId: 229176835
Issue: #4253 PiperOrigin-RevId: 230497544
Issue: #4253 PiperOrigin-RevId: 232659869
This should be much improved now. Please give it a try.
Hi,
I am working on an Nvidia Shield with an external USB 3 storage device; the storage is set as "This device's storage", and the app's data has been moved to it.
I am downloading DASH streams, using DownloadService and DownloadManager.
There is a performance issue because of the quantity of files generated by ExoPlayer's Downloaders.
After having downloaded a few GB of data, I now have thousands of .exo files in my download folder. Since then, every time I boot my device with the external storage mounted, I can see the process /system/bin/sdcard taking up to 40% of CPU for a while. It's probably indexing all these files.
The same thing happens when I start my app the first time I play a media: the same Android indexing process starts and affects my device's performance, and my media only starts playing after a few seconds.
Once it is indexed, every other media will play instantly.
Moreover, ExoPlayer's actionFile already has its own index, so this heavy system indexing process does not really help ExoPlayer.
Handling thousands of files on the storage is really tough for the system.
Is there any way to optimize this?
Once a media is downloaded, is there a way to merge or archive all its .exo files into a single file?
Or maybe there is another solution, please advise.
Thanks