Implement work-around for numba issue causing a segfault on M1 when using literal_unroll() with bools. #1027
Conversation
force-pushed from 191244f to ff82567
```diff
@@ -198,11 +198,14 @@ def makevector({", ".join(input_names)}):

 @numba_funcify.register(Rebroadcast)
 def numba_funcify_Rebroadcast(op, **kwargs):
-    op_axis = tuple(op.axis.items())
+    # GH issue https://github.com/numba/numba/issues/8215
+    op_axis = tuple((axis, int(value)) for axis, value in op.axis.items())
```
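To illustrate the effect of this change, here is a minimal sketch (not the Aesara source; the `axis` dict is made up): keeping the raw `bool` values from `op.axis` yields pairs that mix `int` and `bool`, which is the combination that hit numba/numba#8215 when fed through `literal_unroll()`, whereas casting the values to `int` makes every pair a plain `(int, int)` tuple.

```python
# Hypothetical Rebroadcast-style axis map: axis index -> broadcast flag.
axis = {0: True, 2: False}

# Before the change: each pair is (int, bool), i.e. mixed types inside the pair.
op_axis_old = tuple(axis.items())
print(op_axis_old)  # ((0, True), (2, False))

# After the change: bools cast to ints, so each pair is a homogeneous (int, int).
op_axis_new = tuple((a, int(v)) for a, v in axis.items())
print(op_axis_new)  # ((0, 1), (2, 0))
```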
If this is homogenizing the axis type as `(int, int)` repeated N times, is it necessary to version the loop body with the unroller? Is there perhaps a chance `axis` can be `None`, in which case this seems valid?
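(As a quick illustration of the "homogenizing" point, and not part of the PR, Numba's `typeof` shows how the two shapes of `op_axis` get typed; the tuples below are invented:)

```python
from numba import typeof

pairs_with_bools = ((0, True), (2, False))  # each pair mixes int and bool
pairs_with_ints = ((0, 1), (2, 0))          # each pair is (int, int)

# Expected output, roughly:
#   UniTuple(Tuple(int64, bool) x 2)    -- pairs are heterogeneous
#   UniTuple(UniTuple(int64 x 2) x 2)   -- (int, int) repeated N times
print(typeof(pairs_with_bools))
print(typeof(pairs_with_ints))
```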
> version the loop body with the unroller

I don't understand what you mean here.
> version the loop body with the unroller
>
> I don't understand what you mean here.

Apologies, I'll expand on this.
The need for doing something "special" to handle mixed type containers comes from the difficulties associated with working out the types of variables in the loop body in such a case, for example:
```python
for x in (1, 'a', 2j, 3):  # types = (int, char, cmplx, int)
    print(x)  # what type is `x`?! It changes on each iteration.
```

As a result, `literal_unroll` was created to help:

```python
for x in literal_unroll((1, 'a', 2j, 3)):  # types = (int, char, cmplx, int)
    print(x)
```

which translates to something like (it does all this in LLVM IR and isn't quite like this, but for the purposes of explanation it's close enough!):
```python
tup = (1, 'a', 2j, 3)
for idx in range(len(tup)):
    x = tup[idx]
    if idx in (0, 3):
        print(x)  # This is the 'int' print
    elif idx in (1,):
        print(x)  # This is the 'char' print
    elif idx in (2,):
        print(x)  # This is the 'cmplx' print
```
i.e. the loop body is "versioned" and this lets Numba "iterate" over mixed-type containers.
If you have a tuple that is already homogeneous in type, then you don't need to do this translation, as the loop is type stable.
For example:

```python
for x in (1, 2, 4, 5):  # types = (int, int, int, int)
    print(x)  # x is always int
```
Applying this to the present case: is `op_axis` now always going to contain some number of tuples which are of type `(int, int)`? If so, standard iteration should "just work".
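For illustration, a hedged sketch of what such "standard iteration" could look like inside a jitted function; the function and the `op_axis` values below are invented, not the Aesara implementation:

```python
from numba import njit

# Hypothetical (axis, value) pairs, all of type (int, int) after the cast.
op_axis = ((0, 1), (2, 0))

@njit
def count_broadcast_axes():
    # Plain iteration works because op_axis is a homogeneous tuple:
    # no literal_unroll(), no versioned loop body.
    n = 0
    for axis, value in op_axis:
        if value == 1:
            n += 1
    return n

print(count_broadcast_axes())  # 1
```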
I see, thanks for explaining!
> I see, thanks for explaining!

No problem. I wasn't sure in the case above whether you'll sometimes have e.g. a `None` for `axis`, or perhaps whether you need to preserve the literal nature of the values captured in `op_axis`, but if you don't, I think you might be able to get away with standard iteration, which will get you a faster loop (as there's no switch table in the body).
I removed the `literal_unroll` for now.
Sounds good. For performance, if you can invent a way to make a loop type stable, that's usually a good approach, as it means there's no need for a switch table and duplicated loop bodies. LLVM can often "see" through the "switch table" pattern, but it's better, if possible, to avoid the risk of it not being able to optimise the pattern away by simply not creating it in the program source.
Thanks for working through this issue. The Numba folks will take a look at the segfault(s) next week!
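As a toy illustration of that "make the loop type stable" advice (unrelated to the PR's own code; everything below is invented): instead of iterating a mixed-type tuple with `literal_unroll()`, cast the elements to a single type up front so a plain loop, with no switch table, suffices.

```python
from numba import njit

mixed = (1, 2.5, 3)  # int/float mix: iterating this directly would need literal_unroll()

# Homogenize once, outside the jitted code, so the loop below is type stable.
homogeneous = tuple(float(v) for v in mixed)

@njit
def total(values):
    s = 0.0
    for v in values:  # type-stable loop: no versioned body, no switch table
        s += v
    return s

print(total(homogeneous))  # 6.5
```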
@stuartarchibald Thank you for the super fast help!
> @stuartarchibald Thank you for the super fast help!
No problem! Thanks for using Numba!
…ault on M1 when using literal_unroll() with bools. Closes aesara-devs#1023. (force-pushed from ff82567 to b708213)
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1027      +/-   ##
==========================================
- Coverage   79.23%   79.23%   -0.01%
==========================================
  Files         152      152
  Lines       47954    47953       -1
  Branches    10918    10919       +1
==========================================
- Hits        37996    37995       -1
  Misses       7449     7449
  Partials     2509     2509
```
Closes #1023, which is actually several numba issues described in numba/numba#8215.
Implemented with help from @aseyboldt and @stuartarchibald.
Here are a few important guidelines and requirements to check before your PR can be merged:

- `pre-commit` is installed and set up.

Don't worry, your PR doesn't need to be in perfect order to submit it. As development progresses and/or reviewers request changes, you can always rewrite the history of your feature/PR branches.
If your PR is an ongoing effort and you would like to involve us in the process, simply make it a draft PR.