Fix stack overflow for large projects #1484
parser.parse_remaining_files
retries += 1
if globals.ordered_parser
retryable_file = parser.file == "(stdin)" ? StringIO.new("void Init_Foo() { #{statement.source} }") : parser.file
Forgive my sin here. I wrote it this way because of tests that fail when we only try and use `parser.file` like in `lib/yard/handlers/base.rb`.
An example test that fails is `it "resolves namespace variable names across multiple files" do`.
Here's what happens/reasoning behind this weird hack:
- `Foo::Bar` is not resolved and is instead retried in the first string of that test.
- Because this comes from stdin and not a file, `parser.file` is `(stdin)`.
- If we push the string `(stdin)` onto the `globals.ordered_parser.files_to_retry`, it's not able to be parsed by `OrderedParser#parse`. It should instead be a `StringIO` with the C code contents.
- The C code contents from `statement.source` are missing the wrapper `void Init_Foo() { ... }`, which needs to be manually added.
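For readers following the thread, the workaround described in the list above can be isolated roughly as follows. This is only a sketch of the idea under discussion, not the PR's actual code: `retryable_entry` and `STDIN_LABEL` are hypothetical names introduced for illustration, and the hard-coded `void Init_Foo() { ... }` wrapper is exactly the test-specific part being questioned.

```ruby
require "stringio"

# Placeholder label used when source is parsed from a string/IO rather
# than a file on disk (value taken from the discussion above).
STDIN_LABEL = "(stdin)"

# Hypothetical helper: decide what to push onto a retry list.
# A real path can be re-read later; the "(stdin)" label cannot, so the
# source text itself has to travel with the retry entry.
def retryable_entry(file_label, statement_source)
  return file_label unless file_label == STDIN_LABEL

  # Re-wrap the statement so it parses as a complete C function again.
  # The function name is hard-coded here, as in the workaround.
  StringIO.new("void Init_Foo() { #{statement_source} }")
end
```

A path passes through unchanged, while a `(stdin)` entry becomes a `StringIO` carrying the re-wrapped source.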
An alternative to this was to save the contents in globals but that felt hacky as well for a couple reasons:
- I was afraid pushing contents of files on globals could result in runaway mem usage (though now that I think about it, each new file probably clobbers the last one)
- This is only useful in the niche case of C code from STDIN, so having a global for that felt like overkill and potentially confusing code
This PR is definitely not mergeable with a hack like this, especially one that artificially causes a failing test to pass targeted specifically for that test.
Notably, (stdin) parsing is entirely common and not specific to C. What you're suggesting here is this PR does not support this use case. That would be a problem and highlights a possible breaking change.
Excellent feedback, thank you. I'll let this one simmer a bit and see if I can come up with a less-hacky patch.
Sorry to get around to this so late. To be honest, the use of the specific workaround to C parsing is definitely concerning and makes me wonder if this will cause a breaking change, and thus I have not looked too deeply at merging.
Unrolling the recursive loop might be useful here, but it would have to be done in a way that respects the existing order of operations and API capabilities. Supporting StringIOs is definitely one of those API capabilities.
I think this PR would need another pass to make it compatible.
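One way to picture the compatibility requirement raised here is a work queue that is drained strictly in its original order and accepts both file paths and IO-like objects. The sketch below is an assumption-laden illustration, not YARD's implementation: `read_entry` and `drain_in_order` are hypothetical helpers, and labelling IO input as `(stdin)` simply follows the convention mentioned earlier in this thread.

```ruby
require "stringio"

# Hypothetical helper: return [label, contents] for a queue entry that is
# either a filesystem path (String) or an IO-like object such as StringIO.
def read_entry(entry)
  if entry.respond_to?(:read)
    # Rewind so an entry that was already read once can be retried.
    entry.rewind if entry.respond_to?(:rewind)
    ["(stdin)", entry.read]
  else
    [entry, File.read(entry)]
  end
end

# Hypothetical helper: process entries first-in, first-out with a plain
# loop, so the call stack stays flat no matter how long the queue is.
def drain_in_order(entries)
  queue = entries.dup
  seen = []
  seen << read_entry(queue.shift) until queue.empty?
  seen
end
```

Because the entry itself (path or IO) is what sits in the queue, nothing needs to be reconstructed from a label when an item is retried.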
When large projects have a lot of missing objects in the first pass of parsing Ruby files we would run into stack overflows. This happens because, when there was a missing object in a parsed Ruby file, we would recursively call `#parse_remaining_files`. When this happens a lot the stack would get huge. We fix this by instead keeping a list of files that we want to retry and re-parse them in another pass. When we can no longer resolve any more files we break the loop.
ec5be9e to 8a14e13
Description
When large projects have a lot of missing objects in the first pass of parsing Ruby files we would run into stack overflows. This happens because, when there was a missing object in a parsed Ruby file, we would recursively call `#parse_remaining_files`. When this happens a lot the stack would get huge.
We fix this by instead keeping a list of files that we want to retry and re-parse them in another pass. When we can no longer resolve any more files we break the loop.
Fixes #1375
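The general shape of the change described above, replacing recursion with a retry list and stopping once a pass makes no progress, can be sketched as follows. This is a simplified model and not the PR's code: `SourceFile` and `parse_all` are hypothetical, and "parsing" is reduced to checking whether the names a file needs have already been defined.

```ruby
require "set"

# Hypothetical model of a source file: the names it defines and the
# names that must already exist before it can be parsed successfully.
SourceFile = Struct.new(:name, :defines, :needs)

# Parse files in passes. Files with missing dependencies go onto a retry
# list instead of triggering a recursive call, so stack depth stays
# constant regardless of how many files need retrying.
def parse_all(files)
  defined = Set.new
  pending = files.dup
  passes = 0

  until pending.empty?
    passes += 1
    retry_list = []

    pending.each do |file|
      if file.needs.all? { |n| defined.include?(n) }
        defined.merge(file.defines)
      else
        retry_list << file
      end
    end

    # No file was resolved in this pass: further passes cannot help.
    break if retry_list.size == pending.size

    pending = retry_list
  end

  { defined: defined, unresolved: pending.map(&:name), passes: passes }
end
```

Files left in `unresolved` are the ones whose dependencies never appeared. The trade-off in this model is extra passes over the retry list in exchange for a flat call stack.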
Completed Tasks
- `bundle exec rake` locally (if code is attached to PR).
It's difficult to write a test for this because it depended on a lot of files until it would break.
I tested this by running `yard doc` on my company's internal monolith and having it working, whereas the current release of yard would raise `SystemStackError`.