Batched Mode : Issue with Conditional Assignment of Closure With Varying Weight #1637
9dca197 to e9e7707
OK, I think I have actually figured out the fix: in batched_llvm_gen, llvm_gen_closure explicitly loops through each lane of the closure pointers, but there didn't seem to be anything to prevent it from running on lanes that aren't part of the mask. I added a test_mask_lane check to the existing branch that checks for null closures, and that was enough to get the test passing (though definitely make sure that someone who knows this codebase better than I do looks at this before merging it).
src/liboslexec/batched_llvm_gen.cpp
Outdated
llvm::Value* cond = rop.ll.op_and(
    rop.ll.op_ne(rop.ll.ptr_cast(comp_ptr, rop.ll.type_void_ptr()),
                 rop.ll.void_ptr_null()),
    rop.ll.test_mask_lane(mask, rop.ll.constant(lane_index)));
"rop.ll.constant(lane_index)" isn't wrong, but you should be able to pass lane_index in directly without creating the constant.
This appears more correct. Initially, lanes may not have been allocated, so the nullptr test would have prevented execution, but as soon as the same symbol gets populated from different masks (as it does in this if/else case), nullptr is no longer enough. So yes, checking whether the lane is active is correct.
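The point about the null test no longer being sufficient can be illustrated outside of LLVM with a minimal sketch. Everything here is hypothetical and is not OSL code: `Component`, `WidePtr`, `kWidth`, `lane_active` and `scale_active_lanes` are stand-ins invented for illustration. It assumes an 8-wide batch whose execution mask is a plain bit field, and it shows a per-lane loop that writes only where the pointer is non-null and the lane is active.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical batch width; real batched code is built for a specific SIMD width.
constexpr int kWidth = 8;

// Hypothetical stand-in for a closure component; only a weight is modelled.
struct Component {
    float weight;
};

// One pointer per lane, standing in for a "wide" closure pointer such as Ci.
using WidePtr = std::array<Component*, kWidth>;

// True if bit `lane` is set in the execution mask.
inline bool lane_active(std::uint32_t mask, int lane)
{
    return ((mask >> lane) & 1u) != 0;
}

// Store `weight` into every lane that is both non-null and active in `mask`.
// Returns the number of lanes written.
int scale_active_lanes(WidePtr& comps, std::uint32_t mask, float weight)
{
    int written = 0;
    for (int lane = 0; lane < kWidth; ++lane) {
        const bool non_null = comps[lane] != nullptr;
        // The null test alone would also touch lanes populated by the other
        // branch of an if/else; the mask test restricts the write to lanes
        // owned by the branch currently executing.
        if (non_null && lane_active(mask, lane)) {
            comps[lane]->weight = weight;
            ++written;
        }
    }
    return written;
}
```

If lanes 0-3 were filled by the "then" branch and lanes 4-7 by the "else" branch, calling `scale_active_lanes` with the "else" mask leaves the first four components untouched even though their pointers are non-null.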
Looking at the test example and following the code, I see the underlying closure_component_allot would get called twice, once for each branch, so the resulting wide Ci symbol would have pointers to two different blocks, which I suppose is fine. Perhaps a bit wasteful, but fine. Might be worth reviewing who/how that closure memory gets cleaned up/reused.
Excellent find. Please add more complex unit tests to more fully exercise complex closure behaviors.
LGTM
e9e7707 to 2c92ef8
I made a very minor correction. I also tried to understand Alex's comment about checking up on how the memory gets released, and whether having different lanes pointing to different blocks would create issues. As far as I can tell, closures are just allocated from a SimplePool, which never frees any elements; it only has a clear() method to start back at the beginning, reusing all memory. Since there is no tracking of individual elements, I don't see any problems arising from pointing to two different blocks (though it is a little bit inefficient).
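To make the allocation pattern described above concrete, here is a minimal sketch of a bump-style pool that never frees individual elements and only supports a wholesale `clear()`. It is an assumption-laden illustration, not OSL's actual SimplePool: the class name `BumpPool`, its fixed capacity, and the `alloc`/`used` signatures are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bump allocator: hands out consecutive slices of one buffer.
// There is no per-element free, so it does not matter how many distinct
// blocks the lanes of a wide pointer end up referring to.
class BumpPool {
public:
    explicit BumpPool(std::size_t capacity) : m_storage(capacity), m_used(0) {}

    // Returns `size` bytes whose offset from the start of the buffer is a
    // multiple of `align`, or nullptr when the pool is exhausted (a
    // simplification; a real pool would more likely grow).
    void* alloc(std::size_t size,
                std::size_t align = alignof(std::max_align_t))
    {
        assert(align > 0);
        const std::size_t start = (m_used + align - 1) / align * align;
        if (start + size > m_storage.size())
            return nullptr;
        m_used = start + size;
        return m_storage.data() + start;
    }

    // Forget every allocation at once; the next alloc() starts from offset 0
    // and reuses the same memory.
    void clear() { m_used = 0; }

    std::size_t used() const { return m_used; }

private:
    std::vector<unsigned char> m_storage;
    std::size_t m_used;
};
```

Under this model, two allocations made for the two branches of a conditional simply occupy two slices of the buffer, and both are reclaimed together the next time `clear()` is called. The cost is only the extra space used in between.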
Is this fix likely to make it into an official release soonish? We'd love to include it in the next major version of Gaffer so we can enable batched mode for our OSL-based geometry and image processing nodes. Does anything else need to happen from our side?
Signed-off-by: Daniel Dresser <danield@image-engine.com>
2c92ef8 to 29a8ab5
The Mac CI failure appears unrelated to these changes, so I'm going to merge.
@johnhaddon Merged now, I will immediately backport to the release branch. I can cut a tagged release soon, but want to get a couple of other outstanding PRs finalized and merged first.
Actually, I just noticed that it's already the 25th (my, how time gets away from us). There's already a release scheduled for Feb 1. @johnhaddon is that adequate, or do you need a tagged release before then?
Thanks for the speedy response @lgritz! Feb 1st will be great for us...
…n#1637) Signed-off-by: Daniel Dresser <danield@image-engine.com>
I might need some help again to actually fix this one.
I've added a test where Ci is set to an emission closure with a varying weight on one side of a conditional, and a uniform weight on the other. In batched execution with optimization on, this results in corrupted closure pointers.
I've tracked the trigger down to the final case in RuntimeOptimizer::peephole2, which converts a closure multiplied by a weight into a closure-with-scale constructor. If I comment out this one optimization, the tests pass, but I don't think the bug is in this function. As far as I can tell, this optimization is the only way to call the closure-with-scale constructor, so the optimization is probably fine, but something else is failing to handle the closure-with-scale constructor, or perhaps the resulting wide closure pointer, where half of it points to closures with uniform weights and half points to closures with varying weights?
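For readers unfamiliar with this kind of two-instruction peephole, the rewrite described above can be sketched on a toy instruction list. This is not the RuntimeOptimizer code: the `Op` struct, the opcode names, the argument layout (weight placed before the closure's own arguments) and `fold_closure_mul` are all hypothetical, and the sketch assumes the temporary closure has no other uses, which a real optimizer would have to verify.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction: an opcode name, a result symbol, and
// argument symbols, all stored as strings for readability.
struct Op {
    std::string name;
    std::string result;
    std::vector<std::string> args;
};

// If ops[i] constructs a closure into a temporary T and ops[i+1] is
// "mul R, T, w", replace the pair with one weighted constructor
// "closure R, w, <original args>". Returns true if the pair was folded.
// Assumes T is not read anywhere else (not checked in this sketch).
bool fold_closure_mul(std::vector<Op>& ops, std::size_t i)
{
    if (i + 1 >= ops.size())
        return false;
    const Op& ctor = ops[i];
    const Op& mul  = ops[i + 1];
    if (ctor.name != "closure" || mul.name != "mul")
        return false;
    if (mul.args.size() != 2 || mul.args[0] != ctor.result)
        return false;

    // Build the fused op before touching the vector, because `ctor` and
    // `mul` refer to elements inside it.
    Op fused{"closure", mul.result, {mul.args[1]}};
    fused.args.insert(fused.args.end(), ctor.args.begin(), ctor.args.end());

    ops[i] = fused;
    ops.erase(ops.begin() + static_cast<std::ptrdiff_t>(i + 1));
    return true;
}
```

In this toy form, `closure $tmp1 "emission"` followed by `mul Ci $tmp1 w` collapses to `closure Ci w "emission"`. When `w` is varying in one branch and uniform in the other, the code generator downstream sees weighted constructors of both kinds writing into the same wide `Ci`, which is the situation the test in this PR exercises.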