
Default integers to 32-bit precision #1524

Merged — 2 commits merged into main from jk-s32 on Sep 3, 2024

Conversation

jonatanklosko (Member)

Nx and EXLA pass, but there are Torchx segfaults that I need to debug.

@josevalim (Collaborator)

Our torchx is also old. Those may be fixed if we update it.

@jonatanklosko (Member Author)

One VM crash was related to a specific dot product with u32:

u = Nx.tensor([[[1]], [[2]]])
v = Nx.tensor([[[3]], [[4]]])
Nx.dot(u, [2], [0], v, [2], [0])

I changed the default libtorch version from 2.0.0 to 2.1.0, and it's fixed.

The other crashes I fixed by casting in appropriate places.
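
As an illustration of that kind of casting fix (a sketch only, not the actual PR change; it assumes an operation that misbehaves for u32 on the Torchx backend), the input can be upcast with Nx.as_type/2 before the call and the result cast back afterwards:

t = Nx.tensor([1, 2, 3], type: :u32)

t
|> Nx.as_type({:s, 64})   # upcast so libtorch sees a natively supported type
|> Nx.sum()
|> Nx.as_type({:u, 32})   # restore the original type for the caller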

Comment on lines -802 to -803
defp maybe_broadcast_bin_args(_out_shape, %{shape: {}} = l, r), do: {from_nx(l), from_nx(r)}
defp maybe_broadcast_bin_args(_out_shape, l, %{shape: {}} = r), do: {from_nx(l), from_nx(r)}
jonatanklosko (Member Author)

@polvalente after upgrading to libtorch 2.1.0, I saw a bunch of warnings like this:

[W Resize.cpp:35] Warning: An output with one or more elements was resized since it had shape [], which does not match the required output shape [3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (function _resize_output_check)

An example where this happens is:

Nx.logical_and(Nx.tensor(1), Nx.tensor([1, 2, 3]))

The warning only happens if the first operand is a scalar. Doing the actual broadcasting removes the warning.
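
For reference, a minimal sketch of what doing the broadcasting explicitly looks like (illustrative only; the actual change lives in maybe_broadcast_bin_args/3 above): the scalar operand is broadcast to the other operand's shape before the binary op, so libtorch never has to resize an output with shape []:

l = Nx.tensor(1)
r = Nx.tensor([1, 2, 3])
Nx.logical_and(Nx.broadcast(l, Nx.shape(r)), r)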

Contributor

We do the broadcasting explicitly in StableHLO too, so that's fine I think.

@jonatanklosko (Member Author)

I noticed that Nx.tensor(0xFFFFFFFF) now crashes the VM with:

libc++abi: terminating due to uncaught exception of type std::runtime_error: value cannot be converted to type int without overflow

This was already the case with Nx.s32(0xFFFFFFFF); it's just that now it's more likely to happen by default.

Perhaps there's a way to catch it and raise an Elixir error instead, but that's not related to this PR.
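
For context (an illustrative note, not part of the PR): 0xFFFFFFFF is 4_294_967_295, which is above the s32 maximum of 2_147_483_647, so the conversion overflows; requesting a wider or unsigned type explicitly avoids the crash:

Nx.tensor(0xFFFFFFFF, type: :u32)
Nx.tensor(0xFFFFFFFF, type: :s64)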

@jonatanklosko (Member Author)

@polvalente feel free to merge, if you are ok with the torchx changes :)

@@ -220,7 +220,7 @@ defmodule EXLA do

Contributor

EXLA has a few intermediate tensors being built as s64 (see Value.eigh for example). We should also check if those can be changed to s32 as well. Not a blocker for this PR.

@jonatanklosko (Member Author), Sep 4, 2024

I think those are unrelated; there we just want to pass fixed integers as XLA inputs. We could actually make those unsigned (and change to uint on the C++ side), since all of those are non-negative sizes. Doesn't matter much in this case, your call!

jonatanklosko (Member Author)

I opened a PR to change those to u64: #1526.

@@ -65,7 +65,7 @@ defmodule Torchx.Nx.RandomTest do
   # Output does not match Nx because of the sign of the remainder.
   distribution_case(:randint_split,
     args: [0, 10, [shape: {5}]],
-    expected: Nx.tensor([3, 2, 6, 0, 0], type: :s64)
+    expected: Nx.tensor([1, 1, 4, 1, 9], type: :s64)
Contributor

I didn't really understand why these values changed, but that's ok.

jonatanklosko (Member Author)

It's because in args: [0, 10, [shape: {5}]] passed to defn, the numbers 0 and 10 are now passed as s32 rather than s64. We could maintain the old behaviour with args: [Nx.s64(0), Nx.s64(10), [shape: {5}]], but changing the test is equally fine.
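
A quick sketch of the default change being described (assuming Nx.type/1; the output comments are illustrative):

Nx.type(Nx.tensor(10))
#=> {:s, 32} with this PR (previously {:s, 64})

Nx.type(Nx.s64(10))
#=> {:s, 64}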

@polvalente merged commit 3ef1c7a into main on Sep 3, 2024 (8 checks passed)
@polvalente deleted the jk-s32 branch on September 3, 2024 at 20:19