-
-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for doing a masked fill from a tensor #612
Add support for doing a masked fill from a tensor #612
Conversation
src/arraymancer/tensor/selectors.nim
Outdated
inc n | ||
except IndexDefect: | ||
raise newException(IndexDefect, "The size of the value tensor (" & $value.size & | ||
") is smaller than the number of true elements in the mask (" & $mask.size & ")") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have tried this on Nim 1.6 and Nim 2.0?
This for sure crashes, likely with an incomprehensible error message on Nim 1.6 because exceptions are allocated in a thread-local manner.
In general, OpenMP sections should not allocate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I had tried the "happy path" on both 2.0 and 1.6 but I had only tried the unhappy path in nim 2.0 (in which it works fine, at least with the small tensors that I tried it with). In 1.6 it does cause a crash.
What is the solution? I don't see any error handling on any of the existing OpenMP sections. Even the following (stupid) code in which I create the message and the exception beforehand crashes in 1.6:
let error_msg = "The size of the value tensor (" & $value.size &
") is smaller than the number of true elements in the mask (" & $mask.size & ")"
let indexDefectException = newException(IndexDefect, error_msg)
omp_parallel_blocks(block_offset, block_size, t.size):
var n = block_offset
try:
for tElem, maskElem in mzip(t, mask, block_offset, block_size):
if maskElem:
tElem = value[n]
inc n
except IndexDefect:
raise indexDefectException
I guess I could have 2 versions of the function, one for 2.0 with error checking and one for 1.6 without it. Is that the way to go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way to go is to have no exceptions in parallel sections.
And in general, no seq or strings allocation, using only preallocated stuff, otherwise the parallel section needs this: https://github.com/mratsim/laser/blob/e23b5d6/laser/openmp.nim#L88-L110.
template attachGC*(): untyped =
## If you are allocating reference types, sequences or strings
## in a parallel section, you need to attach and detach
## a GC for each thread. Those should be thread-local temporaries.
##
## This attaches the GC.
##
## Note: this creates too strange error messages
## when --threads is not on: https://github.com/nim-lang/Nim/issues/9489
if(omp_get_thread_num()!=0):
setupForeignThreadGc()
template detachGC*(): untyped =
## If you are allocating reference types, sequences or strings
## in a parallel section, you need to attach and detach
## a GC for each thread. Those should be thread-local temporaries.
##
## This detaches the GC.
##
## Note: this creates too strange error messages
## when --threads is not on: https://github.com/nim-lang/Nim/issues/9489
if(omp_get_thread_num()!=0):
teardownForeignThreadGc()
I'm not sure what's the overhead of attacjing/detaching the GC.
This is the main motivation behind Nim switching to arc/orc GCs.
In the future it will be fine to require Nim 2.0, but I need a better Cuda story (it only works on Nim v1.2 at the moment unless you rebuild a nim.cfg from scratch that replaces --std:gnu++14 by --std:stdc++14.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made a change that ensures that when non ARC/ORC memory management the exception error message is shown. I have also added a test for the exception code when using ARC or ORC.
if too_few_values: | ||
let error_msg = "masked_fill error: the size of the value tensor (" & $value.size & | ||
") is smaller than the number of true elements in the mask" | ||
when not(compileOption("mm", "arc") or compileOption("mm", "orc")): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alterative that could work for older Nim would be:
when not(compileOption("mm", "arc") or compileOption("mm", "orc")): | |
when not defined(gcDestructors): |
but I don't mind to just drop 1.4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try this change. In any case I think dropping 1.4 makes sense.
It looks fine to me now. One could clarify a bit more in the docstrings how different ranks are handled and specify that the mask and fill value tensors must have the same shape. The next thing 'missing' thing would then be a single element fill I guess (which could be convenient e.g. precisely for the type of NaN test case). Anyway, I think this fine to merge. Please just drop the Nim 1.4 from the CI. I don't see a purpose in continuing to support it officially. Most of it will work regardless. |
How would I drop Nim 1.4 from the CI? Is that something that I can do myself? |
Ah, sorry. I wanted to add that to my response, but I forgot. Yes, just remove it here in this line: https://github.com/mratsim/Arraymancer/blob/master/.github/workflows/ci.yml#L17 |
When using non ARC/ORC memory management modes, the exception message was not shown.
8d9cdec
to
61fb5f0
Compare
I've pushed a new commit that improves the documentation of all the masked_fill procedures (including the ones that fill with a scalar value).
Nim 1.4 has been dropped on a separate PR. |
Document that boolean masks can be used to access and change the elements of a tensor.
docs/tuto.slicing.rst
Outdated
foo[foo >. 27] = -arange(9) | ||
|
||
# Tensor[system.int] of shape "[5, 5]" on backend "Cpu" | ||
# |1 1 1 1 1| | ||
# |2 4 8 16 -1| | ||
# |3 9 27 -1 -1| | ||
# |4 16 -1 -1 -1| | ||
# |5 25 -1 -1 -1| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the output all -1
? Shouldn't the numbers decrease? Or is it picking the wrong overload and only uses the first element for some reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the latter might be happening, because this (on current master):
import arraymancer
var t = arange(25).reshape([5, 5])
t[t >=. 8] = 111
echo t
yields
Tensor[system.int] of shape "[5, 5]" on backend "Cpu"
|0 1 2 3 4|
|5 6 7 111 111|
|111 111 111 111 111|
|111 111 111 111 111|
|111 111 111 111 111|
(but note that the code in your new example of course does not even compile on master. So I'm not exactly sure why this behavior is as it is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I just checked out your branch. It seems to work fine on my end. I assume you maybe just copied the wrong output tensor?
echo error_msg | ||
raise newException(IndexDefect, error_msg) | ||
|
||
proc masked_fill*[T](t: var Tensor[T], mask: openArray, value: Tensor[T]) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it openArray
and not openArray[bool]
for a reason?
Really nice work! Could you update the tutorial vandermonde boolean mask example? I'll merge it after. 🥳 |
48eaab3
to
7f8eb76
Compare
This is supported by numpy but was not supported by Arraymancer yet.
I'm hoping I've implemented it in a sufficiently performant way. If I can do it in a better way, please let me know.