Very long cataloging process #1328
Comments
Hey @erik-bershel, I'm trying to reproduce this locally. Can you try running that same syft command with "-vv" for some extra verbosity? Maybe it will give us a hint as to where it is stopping.
Sure. I'll send you additional info a bit later today.
@tgerla
Hello @tgerla!
Nothing relevant was included in the verbose output, just regular info/debug messages: Which information from these VMs might be helpful?
Hi @erik-bershel, thanks for the details. A couple of questions for you:
We suspect that we're hitting some device files or some other special files on the VM that are slowing things down. We just added a "trace" level of verbosity to Syft (-vvv), which may help us identify where the slowdown is happening. I would be happy to do some experimentation on my side, if it is possible to get the VMs. Thanks! -Tim
Hi @tgerla, thanks for the response.
I have noticed some slowness too (albeit not on the order of days) and have created #1353. The summary is that each cataloger runs serially, and each cataloger seems to do glob pattern searches against the file system (which may be slow depending on the file indexer, which I don’t have the full details for). Judging from the elapsed time, the large SBOM sizes, the number of packages found, and the fact that syft is being run against the root directory, the OP’s VM may have a large file system, and this may just be the amount of time it takes to go through each cataloger serially? An unknown bug aside: issues:
I have a PR which attempts to address the serial running of catalogers: #1355 (not a maintainer)
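To make the serial-versus-concurrent point concrete, here is a minimal sketch in Python. It is not Syft's code (Syft is written in Go) and not what #1355 implements; the cataloger names, glob patterns, and paths are all hypothetical. Each cataloger makes one full pass over the file list, so a serial run costs roughly the sum of all passes, while a concurrent run can overlap them.

```python
from concurrent.futures import ThreadPoolExecutor
from fnmatch import fnmatchcase

# Hypothetical file listing standing in for an indexed file system.
PATHS = [
    "/usr/lib/python3/site-packages/requests-2.28.1.dist-info/METADATA",
    "/app/package.json",
    "/app/node_modules/left-pad/package.json",
    "/opt/tool/go.mod",
]

# Hypothetical catalogers: a name mapped to the glob pattern it searches for.
CATALOGERS = {
    "python": "*/METADATA",
    "javascript": "*/package.json",
    "go": "*/go.mod",
}


def run_cataloger(pattern, paths):
    """Scan every path for one glob pattern (one full pass per cataloger)."""
    return sorted(p for p in paths if fnmatchcase(p, pattern))


def catalog_serial(catalogers, paths):
    """Run catalogers one after another; total time is the sum of all passes."""
    return {name: run_cataloger(pat, paths) for name, pat in catalogers.items()}


def catalog_concurrent(catalogers, paths, workers=4):
    """Submit every cataloger to a thread pool, then collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            name: pool.submit(run_cataloger, pat, paths)
            for name, pat in catalogers.items()
        }
        return {name: fut.result() for name, fut in futures.items()}


if __name__ == "__main__":
    print(catalog_serial(CATALOGERS, PATHS))
    print(catalog_concurrent(CATALOGERS, PATHS))
```

Both functions return the same mapping; only the scheduling differs. In CPython, pure-Python matching like this will not get faster with threads, so the sketch shows the structure of the change rather than a measured speedup.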
Hey @tgerla @kzantow!
Logs and sbom-files: The same image was used to create both macOS 11 VMs, and another one was used for both macOS 12 VMs.
Nice. Out of interest, @erik-bershel, have you tried setting Would (personally) be interested in seeing comparisons if so. (#1355)
@Mikcl hmm. I'll try a couple of different options and will return with results in two or three days.
Coincidentally, I'm working on anchore/stereoscope#154 and #1510, which dramatically speed up searching the index (by leveraging more indexes). This won't help build the index any faster, so it will still be rather slow for something as large as the GitHub mac runners... but I've applied the same principle from #1510 to the directory resolvers by adding a stereoscope FileCatalog object and leveraging those new indexes instead of glob calls in a prototype branch: I'm seeing a dramatic speedup.
(this had been taking several... several hours beforehand) I'll try and polish this up and get it in after these two PRs I have open.
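The principle of replacing glob calls with index lookups can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the stereoscope FileCatalog API or the code in #1510; the function names and the file list are hypothetical. The index is built once in a single pass, after which a query by basename or extension is a dictionary lookup instead of a pattern match against every path.

```python
from collections import defaultdict
from fnmatch import fnmatchcase
import posixpath

# Hypothetical file listing; a real root file system would have millions of entries.
PATHS = [
    "/app/package.json",
    "/app/node_modules/left-pad/package.json",
    "/opt/lib/commons-io-2.11.0.jar",
    "/opt/lib/guava-31.1.jar",
    "/etc/hosts",
]


def find_by_glob(paths, pattern):
    """Glob approach: every query matches the pattern against every path."""
    return sorted(p for p in paths if fnmatchcase(p, pattern))


def build_index(paths):
    """Index approach: one pass that groups paths by basename and by extension."""
    by_basename = defaultdict(list)
    by_extension = defaultdict(list)
    for p in paths:
        base = posixpath.basename(p)
        by_basename[base].append(p)
        ext = posixpath.splitext(base)[1]
        if ext:  # files such as /etc/hosts have no extension
            by_extension[ext].append(p)
    return by_basename, by_extension


if __name__ == "__main__":
    by_basename, by_extension = build_index(PATHS)
    # Each lookup is a single dictionary access, independent of the path count.
    print(sorted(by_basename.get("package.json", [])))
    print(sorted(by_extension.get(".jar", [])))
    # The glob equivalents scan the whole list on every query.
    print(find_by_glob(PATHS, "*/package.json"))
    print(find_by_glob(PATHS, "*.jar"))
```

As the comment above notes, this kind of index speeds up searching but does nothing for the cost of building the index in the first place, which still requires walking the whole file system.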
Please provide a set of steps on how to reproduce the issue
What happened:
The process freezes at the "Cataloging packages" step. The longest-running process was started on November 5 at 2:46 PM (GMT) and is still ongoing.
What you expected to happen:
Any result or error message to figure out what we can do with that.
Anything else we need to know?:
The Syft tool runs on GitHub Actions macOS runner images.
Environment:
syft version: syft 0.60.3
OS (cat /etc/os-release or similar): macOS 10.15.7 (19H2026), macOS 11.7.1 (20G918), macOS 12.6.1 (21G217)