Error handling /proc/mounts with long lines #142
Comments
Hello, I knew this code would break one day :/ We need the relative openat() so that we can load the sys/proc filesystems gathered by hwloc-gather-topology for offline regression testing, instead of the local sys/proc files. openat() keeps that code generic and simple everywhere in the Linux backend. That said, in your case, we always open /proc/mounts, so the generic openat() code isn't that useful. I'll try to just call setmntent(filesystemroot+"/proc/mounts") directly. For the record, what makes your /proc/mounts line so long?
The lines are long because of the way Cray mounts their system images. Here is an example:
OK. Which hwloc or OMPI version are you using? (I want to make sure my upcoming patch will apply.)
Both ompi master and v2.x will need to be patched.
setmntent() doesn't support root_fd, but manual parsing of /proc/mounts is fragile, and actually buggy for very long mount lines (see open-mpi#142 (comment)). Since we only openat("/proc/mounts") there, just manually concatenate the fsroot_path and use setmntent(). Thanks to Nathan Hjelm for the report. Fixes open-mpi#142. (cherry picked from commit 41833e7)
Here's a patch against hwloc v1.11.2 (pretty close to what's in OMPI iirc)
setmntent() doesn't support root_fd, but manual parsing of /proc/mounts is fragile, and actually buggy for very long mount lines (see open-mpi/hwloc#142 (comment)). Since we only openat("/proc/mounts") there, just manually concatenate the fsroot_path and use setmntent(). Thanks to Nathan Hjelm for the report. (Cherry-picked from open-mpi/hwloc@d2d07b9) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
setmntent() doesn't support root_fd, but manual parsing of /proc/mounts is fragile, and actually buggy for very long mount lines (see open-mpi/hwloc#142 (comment)). Since we only openat("/proc/mounts") there, just manually concatenate the fsroot_path and use setmntent(). Thanks to Nathan Hjelm for the report. (Cherry-picked from open-mpi/hwloc@d2d07b9) (cherry picked from open-mpi/ompi@15007b4) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
setmntent() doesn't support root_fd, but manual parsing of /proc/mounts is fragile, and actually buggy for very long mount lines (see open-mpi/hwloc#142 (comment)). Since we only openat("/proc/mounts") there, just manually concatenate the fsroot_path and use setmntent(). Thanks to Nathan Hjelm for the report. Fixes chapel-lang#142. (cherry picked from commit 41833e7ac317d80d7cd7ce31f43d9508cea58522)
Fix hwloc /proc/mounts overflow [ok'd by @gbtitus] On some newer versions of CLE we were encountering an issue where long /proc/mounts entries were overflowing a 512 byte buffer. It turns out this issue was already reported and fixed on hwloc/master, but it won't make it into an official version before our release: open-mpi/hwloc#142 This just cherry-picks that commit and updates our hwloc README to mention it.
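To illustrate why a fixed 512-byte line buffer is fragile, here is a small sketch. It is an assumption about the general failure mode, not a copy of the former hwloc parser: when fgets() meets a line longer than its buffer, it returns the line in several chunks, and a naive parser treats each later chunk as if it were a new mount entry. The helper name `count_fgets_chunks` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read a file with fgets() and a buffer of bufsize
 * bytes. Returns the number of chunks fgets() produced, and stores in
 * *truncated how many chunks stopped before reaching a newline. */
static int count_fgets_chunks(const char *path, size_t bufsize, int *truncated)
{
    char buf[8192];
    assert(path != NULL && truncated != NULL);
    assert(bufsize >= 2 && bufsize <= sizeof(buf));
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int chunks = 0;
    *truncated = 0;
    while (fgets(buf, (int)bufsize, f) != NULL) {
        chunks++;
        /* No newline and not at end of file: the line did not fit. */
        if (strchr(buf, '\n') == NULL && !feof(f))
            (*truncated)++;
    }
    fclose(f);
    return chunks;
}
```

With a 512-byte buffer, a file holding one 1000-character line and one short line yields three chunks instead of two, so the tail of the long line is misread as a separate entry.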
This fix breaks thread safety because of getmntent(). By the way, getmntent() is internally limited to 4kB lines. getmntent_r() is thread-safe and supports larger buffers, but it cannot report truncation errors anyway. See #194 for more details.
getmntent() isn't thread safe, it uses a static internal struct mntent. Two concurrent hwloc_topology_load() would break if they call getmntent() (for finding the cgroup directory) at the same time. Use getmntent_r() instead. However we have to specify a buffer for storing mntent strings. And getmntent_r() doesn't actually report an error if the buffer is too small, it silently truncates output strings, so we can't dynamically realloc that buffer. The getmntent() that we used before this commit was internally limited to 4kB. And Linux actually limits mount options to 3 pages (during mount, not when reading /proc/mounts). So use 4 pages to be above both. Thanks to Corentin Rossignon for reporting the issue (using gcc -fsanitize=thread) Fixes #194 Refs #142
getmntent() isn't thread safe, it uses a static internal struct mntent. Two concurrent hwloc_topology_load() would break if they call getmntent() (for finding the cgroup directory) at the same time. Use getmntent_r() instead. However we have to specify a buffer for storing mntent strings. And getmntent_r() doesn't actually report an error if the buffer is too small, it silently truncates output strings, so we can't dynamically realloc that buffer. The getmntent() that we used before this commit was internally limited to 4kB. And Linux actually limits mount options to 3 pages (during mount, not when reading /proc/mounts). So use 4 pages to be above both. Thanks to Corentin Rossignon for reporting the issue (using gcc -fsanitize=thread) Fixes #194 Refs #142 (cherry picked from commit 2337361)
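The change described in this commit message can be sketched roughly as follows, assuming glibc's getmntent_r(). The helper name `find_mount_by_type_r` and its parameters are hypothetical, and the real hwloc code may differ. The buffer is sized at four pages, as the message suggests, because getmntent_r() gives no signal that would allow growing it on demand.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes getmntent_r() in glibc */
#endif
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: thread-safe lookup of the first mount point whose
 * filesystem type equals fstype. Returns a malloc'ed string or NULL. */
static char *find_mount_by_type_r(const char *mounts_path, const char *fstype)
{
    assert(mounts_path != NULL && fstype != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096; /* fallback if the page size is unavailable */
    size_t bufsize = 4 * (size_t)page; /* above the 4kB and 3-page limits */
    char *buf = malloc(bufsize);
    if (buf == NULL)
        return NULL;
    FILE *f = setmntent(mounts_path, "r");
    if (f == NULL) {
        free(buf);
        return NULL;
    }
    struct mntent ent; /* caller-owned, so no shared static state */
    char *result = NULL;
    while (getmntent_r(f, &ent, buf, (int)bufsize) != NULL) {
        if (strcmp(ent.mnt_type, fstype) == 0) {
            result = strdup(ent.mnt_dir); /* copy out before buf is freed */
            break;
        }
    }
    endmntent(f);
    free(buf);
    return result;
}
```

Each call owns its `struct mntent` and string buffer, so two threads can run the lookup concurrently without sharing state.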
I have a system with unusually long lines in /proc/mounts (longer than 512 characters). This is causing parsing errors in hwloc_find_linux_cpuset_mntpnt. Ideally this could be fixed by using setmntent/getmntent/etc. Is there a reason (as the comment states) that hwloc needs a relative open here? From what I understand, the behavior of open and openat is identical when given an absolute path (such as /proc/mounts).
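For context on the question, the following sketch shows how a relative openat() can redirect a lookup to an alternate root. It assumes a wrapper that strips leading slashes before calling openat(); the name `open_under_root` is hypothetical and this is not taken from the hwloc sources. With an absolute path, openat() ignores the directory descriptor and behaves like open(), which is why the path has to be made relative first.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path relative to the directory root_fd.
 * Leading slashes are skipped so that "/proc/mounts" resolves to
 * <root>/proc/mounts rather than the live /proc/mounts. */
static int open_under_root(int root_fd, const char *path)
{
    assert(path != NULL);
    while (*path == '/')
        path++; /* make the path relative to root_fd */
    return openat(root_fd, path, O_RDONLY);
}
```

If `root_fd` refers to `/`, this opens the local file; if it refers to a directory of saved files, the same call reads the saved copy instead.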