Detect infinite recursions in nickel doc
#2055
Conversation
```rust
/// Generate an update frame from this thunk and set the state to `Blackholed`. Return an
/// error if the thunk was already black-holed.
pub fn mk_update_frame(&mut self) -> Result<ThunkUpdateFrame, BlackholedError> {
    if self.data.borrow().state == ThunkState::Blackholed {
        let mut data_ref = self.data.borrow_mut();
```
The changes in this function are cosmetic (avoid two calls to `borrow()`/`borrow_mut()` in a row: just borrow mutably from the beginning).
```rust
    }
} else {
    Ok(None)
    // If the thunk is already evaluated, we don't return an update frame.
```
The changes are a bit cosmetic, but not only: previously, if `idx.should_update()` was `false`, we would never look at the state of the thunk and always proceed with evaluation. This is actually not correct, because if the thunk is blackholed, we should raise an infinite recursion error. For `eval_record_spine` in particular, the parent thunks are in weak head normal form (they are records), so `should_update()` is `false`, and without this additional change the detection of the infinite cycle fails.
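The point above can be sketched as follows (a hypothetical, simplified model, not Nickel's real `Thunk`: the `should_update` field and `mk_update_frame_if_needed` name are invented for illustration). The key property is that the blackhole check runs unconditionally, before deciding whether an update frame is needed:

```rust
use std::cell::RefCell;

#[derive(Debug, PartialEq, Clone, Copy)]
enum ThunkState {
    Suspended,
    Blackholed,
}

#[derive(Debug, PartialEq)]
struct BlackholedError;

struct Thunk {
    state: RefCell<ThunkState>,
    /// Whether forcing this thunk should push an update frame
    /// (false when the content is already in weak head normal form).
    should_update: bool,
}

impl Thunk {
    /// Returns Ok(true) if an update frame should be pushed, Ok(false)
    /// if the content is already WHNF, and an error if the thunk is
    /// blackholed, i.e. we have re-entered a thunk that is still being
    /// evaluated: an infinite recursion.
    fn mk_update_frame_if_needed(&self) -> Result<bool, BlackholedError> {
        let mut state = self.state.borrow_mut();
        if *state == ThunkState::Blackholed {
            // This check must come first: skipping it when
            // should_update is false is exactly the bug described above.
            return Err(BlackholedError);
        }
        if !self.should_update {
            return Ok(false);
        }
        *state = ThunkState::Blackholed;
        Ok(true)
    }
}
```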
I guess this code path is no longer triggered in `eval_record_spine`? But maybe it makes sense to do this anyway...
Yes, I suspect the previous code could miss some infinite recursion errors as well because of that.
Branch: `2055/merge`
Testbed: `ubuntu-latest`
⚠️ WARNING: The following Measure does not have a Threshold. Without a Threshold, no Alerts will ever be generated! For more information, see the Threshold documentation. To only post results if a Threshold exists, set the `--ci-only-thresholds` CLI flag.
Benchmark results:

| Benchmark | Latency (ns) |
|---|---|
| fibonacci 10 | 495,290.00 |
| pidigits 100 | 3,231,100.00 |
| product 30 | 847,830.00 |
| scalar 10 | 1,537,800.00 |
| sum 30 | 840,870.00 |
I think the recursion check has some false positives: with this PR, the input:

doesn't generate documentation for
I wonder if we can do much better than that, at least using the thunk machinery, because when

Which is looping. So either we think it's ok to not print documentation in this case because the structure of the configuration is a bit intricate - do we expect configurations from which we extract docs to exhibit this sort of shape? I honestly don't know. Or we can lower the detection power and accept more false negatives, but it becomes very ad hoc. For example, we could only detect unguarded recursion, that is, when a field directly hosts a naked variable as in

Another possible solution is to try to count the number of times we unwrap the same thunk, and instead of halting at one, halt at some other, higher empirical value (the idea being that if you recurse 10 times into the same thunk, there is a great chance that this is an infinite loop). It has to be baked into the interpreter though, because
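The counting idea could look something like the following hypothetical sketch (not Nickel code: the `RecursionCounter` name, the `usize` thunk identities and the cutoff value are all invented for illustration). Evaluation would call `enter` when unwrapping a thunk and `exit` when done, and only halt once the same thunk has been re-entered more than the empirical cutoff:

```rust
use std::collections::HashMap;

/// Counts how many times each thunk is concurrently entered, keyed by
/// some stable thunk identity (e.g. a pointer address).
struct RecursionCounter {
    counts: HashMap<usize, u32>,
    cutoff: u32,
}

impl RecursionCounter {
    fn new(cutoff: u32) -> Self {
        RecursionCounter { counts: HashMap::new(), cutoff }
    }

    /// Record one entry into the thunk `id`; returns true if the cutoff
    /// is exceeded and evaluation should halt with a recursion error.
    fn enter(&mut self, id: usize) -> bool {
        let count = self.counts.entry(id).or_insert(0);
        *count += 1;
        *count > self.cutoff
    }

    /// Record leaving the thunk `id` after its evaluation completed.
    fn exit(&mut self, id: usize) {
        if let Some(count) = self.counts.get_mut(&id) {
            *count -= 1;
        }
    }
}
```

The trade-off discussed above applies: a cutoff of 1 is the strict behavior with false positives, while a higher cutoff trades detection latency for fewer of them.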
What makes me wonder if this example is really a good working example is that you would rather refer to

However, while it might be ok for documentation, the false positives preclude the usage of this technique for deep seqing and forcing, I believe.
Maybe one last possible approach, which is I believe what you hinted at in #1815 (comment), would be to detect regular trees, that is, when a contract or a field evaluates to a parent thunk directly, rather than this PR's approach, which is more sensitive and detects mere usage. It would indeed require an additional stack/set of active thunks, but shouldn't be very hard to implement either. Let me give it a try.
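The extra set of active thunks mentioned above could be sketched like this (a hedged illustration under assumptions: the `ActiveThunks` name and the pointer-like `usize` identities are made up, not the actual implementation). A thunk is pushed before evaluating the field it backs and popped afterwards; re-entering a thunk that is still in the set means evaluation has looped back to an active parent:

```rust
use std::collections::HashSet;

/// Tracks the thunks of all parent fields currently under evaluation.
struct ActiveThunks {
    active: HashSet<usize>,
}

impl ActiveThunks {
    fn new() -> Self {
        ActiveThunks { active: HashSet::new() }
    }

    /// Mark a thunk as active. Returns false if it was already active,
    /// i.e. a cycle through a parent thunk was detected.
    fn push(&mut self, id: usize) -> bool {
        self.active.insert(id)
    }

    /// Unmark a thunk once its field's evaluation completed (or failed).
    fn pop(&mut self, id: usize) {
        self.active.remove(&id);
    }
}
```

Because only direct evaluation of a parent thunk trips the check, mere usage of a parent (the false-positive case above) goes through.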
Force-pushed from a0b0568 to cb65d79.
@jneem I implemented the last proposal, which seems to handle both your example and the original repro. I'm not sure it's a good idea to do that for

In any case, it's ready for a review.
Looks good, except that I think there are some left-overs from the previous version.
I haven't thought too much about the semantics, but if performance is a concern during normal eval then I think the state could be moved into the thunks instead of having an external hash set: give each thunk a bit that says whether or not it's active.
```rust
        data_ref.state = ThunkState::Evaluated;
    }
}
```
These lock/unlock are no longer used, right?
```rust
    }
} else {
    Ok(None)
    // If the thunk is already evaluated, we don't return an update frame.
```
> I guess this code path is no longer triggered in `eval_record_spine`? But maybe it makes sense to do this anyway...
This is a very good point; we could just "mix" the two solutions to get the best of both worlds (no false positives and a fast check). For now I'm going to proceed with the current version, but it's good to keep in mind for the
Force-pushed from cb65d79 to f774618.
@jneem Sorry, I changed my mind 😛 as the

The only drawback is that cleaning up after an eval error is a bit more involved (the external hashset was just dropped upon error, but here I reckon we have to make sure the thunk's state is properly reset, or it could interfere with future evaluation, e.g. when

The PR could deserve a new review though.
This commit fixes an issue where legit recursive configurations cause infinite recursion when trying to extract their documentation. Indeed, `nickel doc` calls `eval_record_spine`, which is unable to detect loops that span multiple evaluation steps (an evaluation step is evaluating one field to a weak head normal form). This commit changes `eval_record_spine` and related methods to lock (in practice, just flip a bit in the thunk's state) the active thunks, that is, the thunks of the parents of the field that we are currently evaluating. If we ever come across a thunk that is already locked, the field is left unevaluated, avoiding the infinite recursion.
Force-pushed from f774618 to ee0d914.
Closes #1967.
This commit fixes an issue where legit recursive configurations cause infinite recursion when trying to extract their documentation. Indeed, `nickel doc` calls `eval_record_spine`, which is unable to detect loops that span multiple evaluation steps (an evaluation step is evaluating a field to a WHNF).

This commit adds `lock` and `unlock` methods to the interface of thunks, which amount to blackholing, and uses them in the implementation of `eval_record_spine` such that the thunks of the parents of the field that we are currently evaluating - the active parents - are always locked. If there is a cycle, this will raise an infinite recursion error, which we ignore - we can still print the unevaluated value for the user - but which prevents the evaluation from going on forever.

This technique could be used to fix other similar issues with `force` and `deep_seq`, but this is left for future work.