random read access may raise "file not found" #78
Interesting use-case indeed. I quickly checked.
@hasse69
The problem is most likely the same, just a different signature. The RAR compression algorithm is not made to support random access patterns; it is built on a serialized technology in the form of fixed blocks. However, there are a few things you could try to work around this, depending on the size of the files you need to extract. You most likely also need a patch in rar2fs to avoid the detection of a probable video indexing, which would otherwise force a dummy read. But for that I need to know your exact version of rar2fs, and also whether you are building rar2fs yourself or picking it up from some pre-built package.
@hasse69
You can try this patch on master/HEAD, but you need to tweak your I/O buffer for this to work. By default the history size is 50% of the I/O buffer; you should leave it at that and only focus on the actual I/O buffer itself. You need to set it to at least twice the uncompressed size of the largest file in your archive. E.g. if your largest file (uncompressed) is 3.5MiB, you need to set your I/O buffer to 8 (2 × 3.5MiB = 7MiB, rounded up).
I will try, thank you.
@hasse69
I think you need to explain in more detail what you see, and also provide the command line arguments you gave to rar2fs, etc.
Btw, it is expected that you get 'defunct' processes if your I/O buffer is large and RAR extraction to it completes before the file is actually closed by the application. My guess here is that the file is still open while you see these zombies, because your application has not yet closed it.
@hasse69
Again: the defunct processes are there because the application using the file has not closed it properly (the I/O error is due to the I/O buffer not being big enough).
@hasse69 I need your help:
1. How can I keep most of the files in the memory cache?
2. How can I reduce the number of defunct processes? I have about 400 defunct subprocesses.
Let me get back to your questions later, but first try this patch instead.
@hasse69 How can I improve read performance/speed? Is it possible to cache all files when there is enough memory?
Can you elaborate a bit on this? Improved, how? A bus error cannot just disappear, so it would be interesting to understand why you got it in the first place. Was it rar2fs that caught a bus error, or the application?
What I did in the last patch is set SIGCHLD so that no wait is required after a child terminates. Since by design only a close will wait for the child, a lack of close calls will obviously increase the number of zombies. As I said before, this is expected really, but in your particular use-case not optimal.
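(For readers not familiar with the mechanism being described: the sketch below is not the actual rar2fs patch, only a minimal illustration of how SIGCHLD can be configured so that terminated children are reaped automatically and never show up as defunct, even when nobody calls wait() for them.)

```c
/*
 * Minimal sketch (not the rar2fs patch itself) of the SIGCHLD change
 * described above: tell the kernel that terminated children should be
 * reaped automatically so they never linger as zombies, even if the
 * parent never calls wait()/waitpid() for them.
 */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    struct sigaction sa;
    sa.sa_handler = SIG_DFL;      /* no handler needed, only the semantics change */
    sa.sa_flags = SA_NOCLDWAIT;   /* do not turn terminated children into zombies */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) == -1) {
        perror("sigaction");
        return EXIT_FAILURE;
    }

    /* Spawn a short-lived child, similar in spirit to an extraction worker. */
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);                 /* child terminates immediately */

    sleep(1);                     /* child is gone; no <defunct> entry remains */
    return 0;
}
```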
This is very hard to answer since I am not even sure what OS you are running! There is also something you need to understand here. A large I/O buffer gives rar2fs a chance to actually extract the entire file to RAM, but it is also going to waste RAM, and possibly a lot of it too! Every open() call made will allocate a new buffer of the specified size, used or not. So if you have many files, RAM will be wasted for sure. Add to that the page cache, which will also consume memory resources. Your use-case is somewhat different from regular usage of rar2fs. Why do you keep .so files in a compressed RAR archive in the first place? I think I need to understand more about your setup and the rationale behind your use-case to better explain what the current limitations of rar2fs are, if any.
@hasse69 Best Regards
Maybe, maybe not. An I/O buffer is connected to a file descriptor returned by an open() call. Multiple calls to open() equals multiple I/O buffers. Otherwise, imagine what would happen if the I/O buffer were shared: one access could trash the buffer for others, rendering it completely useless.
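(As an aside, the per-open allocation pattern described above can be sketched in plain C. The names and the 16 MiB size below are illustrative assumptions, not rar2fs code, but they show why memory grows with the number of simultaneously open files and only comes back at release/close.)

```c
/*
 * Plain-C sketch (hypothetical names, not rar2fs code) of the allocation
 * pattern described above: every open() gets its own private I/O buffer,
 * so memory use grows with the number of files open at the same time and
 * is only returned when the corresponding release/close happens.
 */
#include <stdio.h>
#include <stdlib.h>

#define IOB_SIZE (16 * 1024 * 1024)   /* assumed per-open buffer size, 16 MiB */

struct handle {
    unsigned char *iob;               /* private I/O buffer for this open() */
};

static struct handle *open_file(void)
{
    struct handle *h = malloc(sizeof(*h));
    if (!h)
        return NULL;
    h->iob = malloc(IOB_SIZE);        /* one buffer per open, shared with no one */
    if (!h->iob) {
        free(h);
        return NULL;
    }
    return h;
}

static void release_file(struct handle *h)
{
    if (!h)
        return;
    free(h->iob);                     /* the buffer is only freed on release/close */
    free(h);
}

int main(void)
{
    /* 100 files open at once: ~1.6 GiB of buffers stay allocated until released. */
    struct handle *open_handles[100];
    for (int i = 0; i < 100; i++)
        open_handles[i] = open_file();

    printf("buffers allocated: %zu MiB\n",
           (size_t)100 * IOB_SIZE / (1024 * 1024));

    for (int i = 0; i < 100; i++)     /* releasing returns the memory */
        release_file(open_handles[i]);
    return 0;
}
```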
That one is simple :) It happens when you run out of memory, the memory occupied by the I/O buffer is safe for the kernel to swap out, and there is a backing store available for it to do so. Again, this is nothing that rar2fs has control over; it is OS/MMU specific.
Btw, I need to look deeper into how to publish my last few changes and whether they could in any way affect other, more common use cases negatively. So an official patch will not be available for a while.
@hasse69 The rar2fs memory usage is increasing continually: 11GB resident and 18GB virtual.
My guess is that the number of open files follows the same pattern. Check with 'lsof' and grep for your application. The I/O buffer is released at close, and if there are no close calls made but new open calls keep arriving, you will eventually run out of memory or file descriptors, whatever comes first.
@hasse69
The size of the archive is irrelevant. It is the number of files in it that matters, since every file will allocate at least a 16M I/O buffer when opened. The pipes are created by the spawned child processes. Something is obviously not released properly, but I cannot tell what the root cause is here since I have never seen this problem myself. Having 180k entries in lsof simply does not make sense with what I currently know about your use case. Is it possible for me to run your use case / test application? Otherwise I am not sure I would ever be able to explain what is going on. I am really not convinced this has anything to do with rar2fs; it rather looks like a side effect of something else being broken somewhere.
Try to run rar2fs in the foreground using the -d flag and count the number of open calls and then compare against the number of release calls.
@hasse69
There is no release call on the I/O buffer. The FUSE "RELEASE" operation is what is called in rar2fs when a file is closed; it will perform free() on the buffer.
You say that resident memory is not released even after your application exits? Is that also true for the lsof entries? Do a 'lsof' after mounting but before accessing anything, then compare the number of entries after your program exits.
Yes, I see. Is it possible to allocate the I/O buffer size based on the size of the individual file, rather than the fixed iobuf setting?
Yes, it is possible to introduce a somewhat more clever allocation, but it will still need to be based on some specified max limit. It is not feasible to allocate simply based on the file size, since the file might be several gigabytes! And I do not understand how it would help in any way; it will not make the memory and open file descriptor leak go away! Did you check if lsof also shows a leak after your application terminates?
@hasse69
Ok, so it makes perfect sense that you see some release calls, since the number of open files is going down from 311 to 162. But I think if you count the number of open calls they will not match the number of release calls, which might explain why they do not drop down to 108. Could it be that after you terminate the main process there are still Python-related processes running that have not yet called close? If the numbers of open and release calls are the same, I really cannot figure out why there is such a huge resource leak :(
Please attach the output from -d here; you can redirect it to a file using 2>d.log. I need to have a look at it to see if there are any abnormalities.
I am labeling this as an enhancement, since the RAR compression algorithm is not made for random access. Using a more clever I/O buffer allocation together with the proper settings at mount time should, however, make it possible at least for smaller files.
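(As a rough illustration of what such a "more clever" allocation could look like, here is a hypothetical sketch. The sizing rule, twice the uncompressed size rounded up to a power of two and clamped to a configured maximum, is an assumption made for illustration; it is not rar2fs's actual implementation.)

```c
/*
 * Hypothetical sketch of a size-aware I/O buffer allocation policy, as
 * discussed above. Not rar2fs code: the rule "twice the uncompressed size,
 * rounded up to a power of two, but never above a configured maximum" is
 * an assumption used for illustration only.
 */
#include <stdint.h>
#include <stdio.h>

#define MIB (1024ULL * 1024ULL)
#define IOB_MAX (64 * MIB)            /* assumed user-configured upper limit */
#define IOB_MIN (1 * MIB)             /* never allocate less than 1 MiB */

/* Round v up to the next power of two. */
static uint64_t next_pow2(uint64_t v)
{
    uint64_t p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

/* Pick a buffer size for one open() based on the file's uncompressed size. */
static uint64_t iob_size_for(uint64_t uncompressed_size)
{
    uint64_t want = next_pow2(2 * uncompressed_size);
    if (want < IOB_MIN)
        want = IOB_MIN;
    if (want > IOB_MAX)
        want = IOB_MAX;               /* clamp: huge files still get only IOB_MAX */
    return want;
}

int main(void)
{
    /* 200 KiB, 3.5 MiB (the example from earlier in the thread), and 5 GiB. */
    uint64_t sizes[] = { 200 * 1024, 3500 * 1024, 5ULL * 1024 * MIB };
    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
        printf("file %llu bytes -> buffer %llu MiB\n",
               (unsigned long long)sizes[i],
               (unsigned long long)(iob_size_for(sizes[i]) / MIB));
    return 0;
}
```

With these assumed limits, the 3.5MiB example from earlier gets an 8MiB buffer, small files no longer pay for a full fixed-size buffer, and very large files are still capped at the configured maximum, which is why the sizing must remain tied to some specified limit.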
I've cloned the sources today and built rar2fs. If I mount a RAR file containing a PDF, zathura won't open the PDF, saying 'Document does not contain any pages'. If I copy the PDF to a regular file system, zathura can open it. I believe this is the same problem. Additionally, I put the same PDF into a zip archive, fuse-mounted it using archivemount, and zathura opened it normally too. So it really is rar2fs that produces this strange problem.
@flux242 Can you please file a new issue report, since I am not really convinced it is related to this very specific use-case? Thanks. Also, if possible, try to attach the problematic archive so that I can try it myself.
I put some .so libs in a compressed archive and let the program load them from the archive. Sometimes the program will encounter: dl-reloc.c: No such file or directory.