Add support for doing a masked fill from a tensor #612

Merged (4 commits, Feb 15, 2024)

Conversation

AngelEzquerra (Contributor)
This is supported by NumPy but was not yet supported by Arraymancer.

I hope I've implemented it in a sufficiently performant way. If there is a better way to do it, please let me know.
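Roughly, the new capability looks like this (a sketch pieced together from the examples later in this thread and the NumPy equivalent, not code taken from the PR itself):

import arraymancer

var t = [[1, 50],
         [60, 2]].toTensor
# Fill the positions selected by a boolean mask from a tensor of values,
# rather than with a single scalar (the NumPy counterpart would be
# something like a[a > 27] = values).
t.masked_fill(t >. 27, [-1, -2].toTensor)
# or, using the bracket syntax from the tutorial example discussed below:
# t[t >. 27] = [-1, -2].toTensor
echo t   # the 50 becomes -1 and the 60 becomes -2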

          inc n
    except IndexDefect:
      raise newException(IndexDefect, "The size of the value tensor (" & $value.size &
        ") is smaller than the number of true elements in the mask (" & $mask.size & ")")
mratsim (Owner)

Have you tried this on Nim 1.6 and Nim 2.0?

This for sure crashes on Nim 1.6, likely with an incomprehensible error message, because exceptions are allocated in a thread-local manner there.

In general, OpenMP sections should not allocate.
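For context, one allocation-free way to surface the error is to signal it with a plain bool inside the parallel section and build the string and exception afterwards, on the calling thread. This is only a sketch along the lines of what the PR later adopts (the flag name mirrors the later diff; the loop body is illustrative, not the final code):

var too_few_values = false
omp_parallel_blocks(block_offset, block_size, t.size):
  var n = block_offset
  for tElem, maskElem in mzip(t, mask, block_offset, block_size):
    if maskElem:
      if n >= value.size:
        too_few_values = true   # concurrent writes of `true` are harmless for a flag
        break
      tElem = value[n]
      inc n
if too_few_values:
  raise newException(IndexDefect, "The size of the value tensor (" & $value.size &
    ") is smaller than the number of true elements in the mask")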

AngelEzquerra (Contributor, PR author)

You are right. I had tried the "happy path" on both Nim 2.0 and 1.6, but I had only tried the unhappy path on Nim 2.0 (where it works fine, at least with the small tensors I tried it with). On 1.6 it does cause a crash.
What is the solution? I don't see any error handling in any of the existing OpenMP sections. Even the following (admittedly naive) code, in which I create the message and the exception beforehand, crashes on 1.6:

  let error_msg = "The size of the value tensor (" & $value.size &
        ") is smaller than the number of true elements in the mask (" & $mask.size & ")"
  let indexDefectException = newException(IndexDefect, error_msg)

  omp_parallel_blocks(block_offset, block_size, t.size):
    var n = block_offset
    try:
      for tElem, maskElem in mzip(t, mask, block_offset, block_size):
        if maskElem:
          tElem = value[n]
          inc n
    except IndexDefect:
      raise indexDefectException

I guess I could have two versions of the function: one for 2.0 with error checking and one for 1.6 without it. Is that the way to go?

mratsim (Owner)

The way to go is to have no exceptions in parallel sections.

And in general, no seq or string allocations; use only preallocated buffers. Otherwise the parallel section needs this: https://github.com/mratsim/laser/blob/e23b5d6/laser/openmp.nim#L88-L110.

template attachGC*(): untyped =
  ## If you are allocating reference types, sequences or strings
  ## in a parallel section, you need to attach and detach
  ## a GC for each thread. Those should be thread-local temporaries.
  ##
  ## This attaches the GC.
  ##
  ## Note: this creates too strange error messages
  ## when --threads is not on: https://github.com/nim-lang/Nim/issues/9489
  if(omp_get_thread_num()!=0):
    setupForeignThreadGc()

template detachGC*(): untyped =
  ## If you are allocating reference types, sequences or strings
  ## in a parallel section, you need to attach and detach
  ## a GC for each thread. Those should be thread-local temporaries.
  ##
  ## This detaches the GC.
  ##
  ## Note: this creates too strange error messages
  ## when --threads is not on: https://github.com/nim-lang/Nim/issues/9489
  if(omp_get_thread_num()!=0):
    teardownForeignThreadGc()

I'm not sure what the overhead of attaching/detaching the GC is.

This is the main motivation behind Nim switching to arc/orc GCs.

In the future it will be fine to require Nim 2.0, but I need a better CUDA story (it only works on Nim v1.2 at the moment unless you rebuild a nim.cfg from scratch that replaces --std:gnu++14 by --std:stdc++14).
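For illustration, a minimal sketch of how those templates might be wrapped around a parallel block (assuming Arraymancer's omp_parallel_blocks; the allocation shown is purely hypothetical):

omp_parallel_blocks(block_offset, block_size, t.size):
  attachGC()   # non-main threads get a GC before any string/seq allocation
  var scratch = newSeq[int](block_size)   # hypothetical thread-local temporary
  # ... per-block work using `scratch` ...
  detachGC()   # tear the foreign-thread GC down before leaving the section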

AngelEzquerra (Contributor, PR author)

I have made a change that ensures the exception error message is shown even when not using ARC/ORC memory management. I have also added a test for the exception code when using ARC or ORC.
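For reference, such a test might look roughly like this (a hypothetical sketch, not the exact test added in this PR; it assumes compilation with --mm:arc or --mm:orc so that the error is actually raised):

import arraymancer
import std/unittest

test "masked_fill from a too-small value tensor raises IndexDefect":
  var t = arange(6).reshape(2, 3)
  let mask = t >. 2                          # three true elements
  expect IndexDefect:
    t.masked_fill(mask, [10, 20].toTensor)   # only two fill values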

if too_few_values:
  let error_msg = "masked_fill error: the size of the value tensor (" & $value.size &
    ") is smaller than the number of true elements in the mask"
  when not(compileOption("mm", "arc") or compileOption("mm", "orc")):
Vindaar (Collaborator), Feb 11, 2024

An alternative that could work for older Nim would be:

Suggested change:

-when not(compileOption("mm", "arc") or compileOption("mm", "orc")):
+when not defined(gcDestructors):

but I don't mind just dropping 1.4.
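In other words, the suggestion relies on the fact that `gcDestructors` is defined under both ARC and ORC, so the following sketch (assumed to mirror the PR's print-then-raise structure, not the exact code) behaves like the `compileOption` check while also compiling on Nim versions that predate the "mm" option:

when not defined(gcDestructors):
  # classic GCs (refc, etc.): also print the message, as the PR does for
  # non-ARC/ORC builds, so that it stays visible even if the raise misbehaves
  echo error_msg
raise newException(IndexDefect, error_msg)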

AngelEzquerra (Contributor, PR author)

I can try this change. In any case, I think dropping 1.4 makes sense.

Vindaar (Collaborator) commented on Feb 11, 2024

It looks fine to me now. One could clarify a bit more in the docstrings how different ranks are handled and specify that the mask and fill value tensors must have the same shape. The next 'missing' thing would then be a single-element fill, I guess (which could be convenient, e.g. precisely for that kind of NaN test case).

Anyway, I think this is fine to merge. Please just drop Nim 1.4 from the CI. I don't see a purpose in continuing to support it officially. Most of it will work regardless.

AngelEzquerra (Contributor, PR author)

> It looks fine to me now. One could clarify a bit more in the docstrings how different ranks are handled and specify that the mask and fill value tensors must have the same shape. The next 'missing' thing would then be a single-element fill, I guess (which could be convenient, e.g. precisely for that kind of NaN test case).
>
> Anyway, I think this is fine to merge. Please just drop Nim 1.4 from the CI. I don't see a purpose in continuing to support it officially. Most of it will work regardless.

How would I drop Nim 1.4 from the CI? Is that something that I can do myself?

Vindaar (Collaborator) commented on Feb 12, 2024

>> It looks fine to me now. One could clarify a bit more in the docstrings how different ranks are handled and specify that the mask and fill value tensors must have the same shape. The next 'missing' thing would then be a single-element fill, I guess (which could be convenient, e.g. precisely for that kind of NaN test case).
>>
>> Anyway, I think this is fine to merge. Please just drop Nim 1.4 from the CI. I don't see a purpose in continuing to support it officially. Most of it will work regardless.
>
> How would I drop Nim 1.4 from the CI? Is that something that I can do myself?

Ah, sorry. I wanted to add that to my response, but I forgot. Yes, just remove it here in this line:

https://github.com/mratsim/Arraymancer/blob/master/.github/workflows/ci.yml#L17

AngelEzquerra force-pushed the multi_value_masked_fill branch from 8d9cdec to 61fb5f0 on February 12, 2024 at 20:43
AngelEzquerra (Contributor, PR author)

> It looks fine to me now. One could clarify a bit more in the docstrings how different ranks are handled and specify that the mask and fill value tensors must have the same shape. The next 'missing' thing would then be a single-element fill, I guess (which could be convenient, e.g. precisely for that kind of NaN test case).

I've pushed a new commit that improves the documentation of all the masked_fill procedures (including the ones that fill with a scalar value).

> Anyway, I think this is fine to merge. Please just drop Nim 1.4 from the CI. I don't see a purpose in continuing to support it officially. Most of it will work regardless.

Nim 1.4 has been dropped in a separate PR.

Document that boolean masks can be used to access and change the elements of a tensor.
Comment on lines 287 to 294
foo[foo >. 27] = -arange(9)

# Tensor[system.int] of shape "[5, 5]" on backend "Cpu"
# |1 1 1 1 1|
# |2 4 8 16 -1|
# |3 9 27 -1 -1|
# |4 16 -1 -1 -1|
# |5 25 -1 -1 -1|
Vindaar (Collaborator)

Why is the output all -1? Shouldn't the numbers decrease? Or is it picking the wrong overload and only using the first element for some reason?

Vindaar (Collaborator), Feb 15, 2024

I think the latter might be happening, because this (on current master):

import arraymancer

var t = arange(25).reshape([5, 5])
t[t >=. 8] = 111
echo t

yields

Tensor[system.int] of shape "[5, 5]" on backend "Cpu"
|0        1      2      3      4|
|5        6      7    111    111|
|111    111    111    111    111|
|111    111    111    111    111|
|111    111    111    111    111|

(But note that the code in your new example of course does not even compile on master, so I'm not exactly sure why this behavior is what it is.)

Vindaar (Collaborator)

Ah, I just checked out your branch. It seems to work fine on my end. I assume you maybe just copied the wrong output tensor?

echo error_msg
raise newException(IndexDefect, error_msg)

proc masked_fill*[T](t: var Tensor[T], mask: openArray, value: Tensor[T]) =
Vindaar (Collaborator)

Is it openArray and not openArray[bool] for a reason?

Vindaar (Collaborator) commented on Feb 15, 2024

Really nice work! Could you update the tutorial's Vandermonde boolean mask example? I'll merge it after. 🥳

AngelEzquerra force-pushed the multi_value_masked_fill branch from 48eaab3 to 7f8eb76 on February 15, 2024 at 11:13
Vindaar merged commit 40ceb53 into mratsim:master on Feb 15, 2024
6 checks passed
AngelEzquerra deleted the multi_value_masked_fill branch on February 15, 2024 at 12:00