`math_rem` test fails on the i686 target #243

tranzystorekk · 2024-12-02T23:49:39Z

When testing jaq 2.0.0 for Void Linux, the test run fails on the i686 platform: https://github.com/void-linux/void-packages/actions/runs/12129949574/job/33819327834?pr=50494

tranzystorekk · 2024-12-03T00:04:06Z

Backtrace after recompiling tests in debug:

---- math_rem stdout ----
thread 'math_rem' panicked at /builddir/jaq-2.0.0/jaq-core/src/lib.rs:221:9:
assertion failed: out.eq(ys)
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic
   3: jaq_core::Filter<F>::yields
             at /builddir/jaq-2.0.0/jaq-core/src/lib.rs:221:9
   4: funs::common::yields
             at ./tests/common/mod.rs:14:5
   5: funs::common::give
             at ./tests/common/mod.rs:22:5
   6: funs::math_rem
             at ./tests/funs.rs:134:5
   7: funs::math_rem::{{closure}}
             at ./tests/funs.rs:82:14
   8: core::ops::function::FnOnce::call_once
             at /builddir/rust-1.82.0/library/core/src/ops/function.rs:250:5

wader · 2024-12-03T13:32:23Z

Looks like this test

jaq/jaq-json/tests/funs.rs

Lines 134 to 138 in 1ac39f8

    
           give( 
        
               json!(null), 
        
               "try (4000000001 % 0) catch .", 
        
               json!("cannot calculate 4000000001 % 0"), 
        
           );

hmm guessing 4000000001 overflows etc 32 bit system? catches error message with different number?

tranzystorekk · 2024-12-03T13:38:48Z

I also suspected that, but 4000000001 should fit in 32 bits just fine, and the previous tests also deal with this value without failure;

so it seems this is about the logic behind div-by-zero error catching?

wader · 2024-12-03T14:36:14Z

jaq seems to use isize so i think 4000000001 won't fitt signed 32 bit, that might be the issue. Could you try change 4000000001 to something smaller? (and also change in the error string)

tranzystorekk · 2024-12-03T15:22:52Z

Ah, you're right, I didn't take signed integer into account.

I patched the numbers to make them much smaller in that test and now it succeeds, but it makes me worry that the other tests with that value are only passing by accident...

01mf02 · 2024-12-03T16:08:42Z

@tranzystorekk, I would be happy to accept a PR from you with the smaller numbers.

tranzystorekk · 2024-12-03T16:12:44Z

I'm wondering if switching the int repr in jaq-json to i64 would be a viable option (I saw the comment that isize allows ints to easily index arrays, but it seems the trade-off isn't as good if int is also comprehensively tested against arithmetic operations)

Reduces large isize to remain below maximum for signed 32 bit int, resolves 01mf02#243

GaetanLepage · 2024-12-10T10:48:12Z

It fails on x86_64-darwin too.

wader · 2024-12-10T11:37:59Z

@GaetanLepage does a smaller number work? could try #245

GaetanLepage · 2024-12-10T13:02:14Z

@GaetanLepage does a smaller number work? could try #245

Well, after more testing, I found out that the problematic line was give(json!(null), "2.1 % 0 | isnan", json!(true));.
Commenting this one made the whole test succeed.

wader · 2024-12-10T13:25:03Z

Oh ok, what does cargo run -- -n '2.1 % 0 | ., isnan, isinfinite' say?

GaetanLepage · 2024-12-10T13:34:28Z

Oh ok, what does cargo run -- -n '2.1 % 0 | ., isnan, isinfinite' say?

null
false
false

GaetanLepage · 2024-12-10T13:36:18Z

Whereas it is

null
true
false

on aarch64-darwin

wader · 2024-12-10T14:35:11Z

Huh, is this macOS or some other darwin based os?

@01mf02 any idea what is going on? known float behaviour?

GaetanLepage · 2024-12-10T14:36:44Z

Huh, is this macOS or some other darwin based os?

MacOS, running on an aarch64 M2 CPU, but with rosetta2 emulation of x86.

01mf02 · 2024-12-10T16:25:03Z

@01mf02 any idea what is going on? known float behaviour?

I'm sorry, I have no idea what is happening here. If it helps, the documentation for what should happen is here: https://doc.rust-lang.org/std/primitive.f64.html#impl-Rem-for-f64. Perhaps, @GaetanLepage, could you tell us what 2.1 / 0 | ., isnan, isinfinite yields on both architectures that you tested?

wader · 2024-12-10T18:44:16Z

Hmm confusing, i got i M3 macbook so i guess i could install rosetta somehow. I wonder what other language implementation does, could you try this with rosetta?

$ node -p '2.1 % 0'
NaN
$ node -p '2.1 / 0'
Infinity

GaetanLepage · 2024-12-11T08:26:26Z

Perhaps, @GaetanLepage, could you tell us what 2.1 / 0 | ., isnan, isinfinite yields on both architectures that you tested?

In the meantime, our darwin builder got updated to an M4 machine.
Here are the results on this new machine:

$ cargo run -- -n '2.1 % 0 | ., isnan, isinfinite

# aarch64-darwin
null
true
false

# x86_64-darwin (rosetta2)
null
false
false

and

$ cargo run -- -n '2.1 / 0 | ., isnan, isinfinite

# aarch64-darwin
null
false
false

# x86_64-darwin (rosetta2)
null
false
true

wader · 2024-12-11T14:25:41Z

Tried to expand and translate the jq code into what it will end up in rust:

2.1 % 0 | isnan
2.1 % 0 | . == nan
2.1 % 0 | . == 0 / 0
(2.1 % 0) == 0 / 0
(2.1 % 0.0) == 0.0 / 0.0

$ cargo run -- -n '(2.1 % 0.0) == 0.0 / 0.0'
true
$ cargo run --target x86_64-apple-darwin -- -n '(2.1 % 0.0) == 0.0 / 0.0'
false

But seems i'm missing something:

$ rustc -o test <(echo 'fn main() { dbg!(f64::total_cmp(&(2.1 % 0.0), &(0.0 / 0.0))); }') && ./test
[/dev/fd/11:1:13] f64::total_cmp(&(2.1 % 0.0), &(0.0 / 0.0)) = Equal
$ rustc --target x86_64-apple-darwin -o test <(echo 'fn main() { dbg!(f64::total_cmp(&(2.1 % 0.0), &(0.0 / 0.0))); }') && ./test
[/dev/fd/11:1:13] f64::total_cmp(&(2.1 % 0.0), &(0.0 / 0.0)) = Equal

01mf02 · 2024-12-13T09:56:41Z

@wader, this rustc <(...) pattern is really nice.

@GaetanLepage: Can you post the output of rustc -o test <(echo 'fn main() { dbg!(2.1 / 0.0); }') && ./test on your two platforms, just to make sure that the problem is on the Rust side and not on the jaq side? If they yield different outputs on your two platforms, could you perhaps file an issue on the Rust issue tracker, given that multiplication and division should be consistent across platforms?

GaetanLepage · 2024-12-13T11:18:20Z

@GaetanLepage: Can you post the output of rustc -o test <(echo 'fn main() { dbg!(2.1 / 0.0); }') && ./test on your two platforms, just to make sure that the problem is on the Rust side and not on the jaq side? If they yield different outputs on your two platforms, could you perhaps file an issue on the Rust issue tracker, given that multiplication and division should be consistent across platforms?

This gives me inf on all x86_64-linux, aarch64-linux, x86_64-darwin and aarch64-darwin.

wader · 2024-12-13T15:19:02Z

Dugg a bit more

With this float_cmp

fn float_cmp(left: f64, right: f64) -> Ordering {
    if left == 0. && right == 0. {
        Ordering::Equal
    } else {
        let lb = left.to_bits() as i64;
        let rb = right.to_bits() as i64;
        dbg!(&left, &right);
        dbg!(lb, rb);
        f64::total_cmp(&left, &right)
    }
}

I get this:

$cargo run --target x86_64-apple-darwin -- -n '0.0/0.0 == 0.0%0.0'
[jaq-json/src/lib.rs:892:9] &left = NaN
[jaq-json/src/lib.rs:892:9] &right = NaN
[jaq-json/src/lib.rs:893:9] lb = -2251799813685248 # NOTE: on aarch64 this is same value as below
[jaq-json/src/lib.rs:894:9] rb = 9221120237041090560
false

So i wonder if one of the NaN-values gets "normalized" etc? ex the NaN payload/flags gets zeroed or something?

Still can't reproduce with same-ish code as simple rust code.

- bump from 1.6.0 to 2.0.1 - remove cargo patch - add patch for tests on 32bit architecture, see: 01mf02/jaq#243 - package MIT license

01mf02 · 2024-12-16T08:44:15Z

@wader, thanks for digging! Taking on your approach, may I ask you, @GaetanLepage, to post the output of the following?

rustc -o test <(echo 'fn main() { dbg!(f64::total_cmp(&(2.1 / 0.0), &f64::INFINITY)); }') && ./test

I hope that this will yield different outputs for different architectures ... that would explain jaq's behaviour, and we could file an issue at rust-lang.

wader · 2024-12-16T09:02:00Z

One thought that struck me: could compiling a simple program with rustc and building via cargo differ some way that changes float behaviour? some jaq crates dependency pokes at something? different compile optimization levels?

GaetanLepage · 2024-12-16T09:48:06Z

may I ask you, @GaetanLepage, to post the output of the following?

On all architectures, I get:

[/dev/fd/63:1:13] f64::total_cmp(&(2.1 / 0.0), &f64::INFINITY) = Equal

wader · 2024-12-16T12:42:02Z

More digging

The two different NaNs as u64 are these:

[jaq-json/src/lib.rs:890:5] lb = 18444492273895866368
[jaq-json/src/lib.rs:891:5] rb = 9221120237041090560

If one inspect them using my favorite website they are:
https://float.exposed/0xfff8000000000000 signed NaN
https://float.exposed/0x7ff8000000000000 "positive" NaN

Update: seems float.exposed it buggy, for 18444492273895866368 one have to provide it in the "Raw Decimal Integer Value" input field.

I don't really follow where and how in the jaq code we end up with a signed NaN but f64::total_cmp will see them as no equal so maybe a solution to this and possible other NaN-differences is to do:

fn float_cmp(left: f64, right: f64) -> Ordering {
    if (left == 0. && right == 0.) || (left.is_nan() && right.is_nan()) {
        Ordering::Equal
    } else {
        f64::total_cmp(&left, &right)
    }
}

Are the more places to look for this? and it is ok to normalize NaN values like this? maybe only consider signness difference equal?

01mf02 · 2024-12-16T16:07:46Z

One thought that struck me: could compiling a simple program with rustc and building via cargo differ some way that changes float behaviour? some jaq crates dependency pokes at something? different compile optimization levels?

I am not aware of such behaviour. On the other hand, for integers, I know that compiling with --release disables overflow checks.

01mf02 · 2024-12-16T16:31:19Z

Are the more places to look for this? and it is ok to normalize NaN values like this? maybe only consider signness difference equal?

It may be OK to normalize NaN values like this, but first, I would like to understand what is going on here. I'd really like to reproduce the platform-specific difference without jaq-related code.

01mf02 · 2024-12-16T16:32:27Z

Let's get the big cannon out. Can you post the output of the following program?

fn main() {
    let t: f64 = "2".parse().unwrap();
    let z: f64 = "0".parse().unwrap();
    let vals = [t / z, z / z, t % z, z % z];
    for v in vals {
        println!("{v} (bytes = {})", v.to_bits());
    }
    for v1 in &vals {
        for v2 in &vals {
            print!("{:?}\t", v1.total_cmp(v2));
        }
        println!();
    }
}

On my computer, this yields:

inf (bytes = 9218868437227405312)
NaN (bytes = 18444492273895866368)
NaN (bytes = 18444492273895866368)
NaN (bytes = 18444492273895866368)
Equal	Greater	Greater	Greater	
Less	Equal	Equal	Equal	
Less	Equal	Equal	Equal	
Less	Equal	Equal	Equal

wader · 2024-12-16T17:35:23Z

On macbook M3

$ rustc --target x86_64-apple-darwin  -o test test.rs && ./test
inf (bytes = 9218868437227405312)
NaN (bytes = 18444492273895866368)
NaN (bytes = 9221120237041090560)
NaN (bytes = 9221120237041090560)
Equal	Greater	Less	Less
Less	Equal	Less	Less
Greater	Greater	Equal	Equal
Greater	Greater	Equal	Equal

$ rustc --target aarch64-apple-darwin  -o test test.rs && ./test
inf (bytes = 9218868437227405312)
NaN (bytes = 9221120237041090560)
NaN (bytes = 9221120237041090560)
NaN (bytes = 9221120237041090560)
Equal	Less	Less	Less
Greater	Equal	Equal	Equal
Greater	Equal	Equal	Equal
Greater	Equal	Equal	Equal

(not exactly sure how x86_64 gets executed on arm but i guess it's transparent thru rosetta, i get same result if do arch --x86_64 ./test which seems to the explicit way of doing it)

01mf02 · 2024-12-17T07:27:25Z

@wader: YES! I think we've got it. Just to be sure, can you check one last thing, namely whether replacing t and z with 2.0 and 0.0 yields the same results?

wader · 2024-12-17T08:17:37Z

You mean like this?

    let t: f64 = 2.0;
    let z: f64 = 0.0;

hmm then i get:

$  rust_float rustc --target x86_64-apple-darwin  -o test test.rs && ./test
inf (bytes = 9218868437227405312)
NaN (bytes = 9221120237041090560)
NaN (bytes = 9221120237041090560)
NaN (bytes = 9221120237041090560)
Equal	Less	Less	Less
Greater	Equal	Equal	Equal
Greater	Equal	Equal	Equal
Greater	Equal	Equal	Equal

$  rust_float rustc --target aarch64-apple-darwin  -o test test.rs && ./test
inf (bytes = 9218868437227405312)
NaN (bytes = 9221120237041090560)
NaN (bytes = 9221120237041090560)
NaN (bytes = 9221120237041090560)
Equal	Less	Less	Less
Greater	Equal	Equal	Equal
Greater	Equal	Equal	Equal
Greater	Equal	Equal	Equal

Now i'm confused. After some poking around it seems this is what differs:

$ rustc --target x86_64-apple-darwin -o test <(echo 'fn main() { dbg!(("0".parse::<f64>().unwrap() / "0".parse::<f64>().unwrap()).to_bits()); }') && ./test
[/dev/fd/11:1:13] ("0".parse::<f64>().unwrap() / "0".parse::<f64>().unwrap()).to_bits() = 18444492273895866368

$ rustc --target x86_64-apple-darwin -o test <(echo 'fn main() { dbg!((0.0f64 / 0.0f64).to_bits()); }') && ./test
[/dev/fd/11:1:13] (0.0f64 / 0.0f64).to_bits() = 9221120237041090560

but i don't see any difference in float etc:

$ rustc --target aarch64-apple-darwin -o test <(echo 'fn main() { dbg!("0".parse::<f64>().unwrap().to_bits()); }') && ./test
[/dev/fd/11:1:13] "0".parse::<f64>().unwrap().to_bits() = 0

$t rustc --target aarch64-apple-darwin -o test <(echo 'fn main() { dbg!(0.0f64.to_bits()); }') && ./test
[/dev/fd/11:1:13] 0.0f64.to_bits() = 0

Hmm is this the compiler doing some transformations etc?

01mf02 · 2024-12-17T11:14:20Z

@wader, great! So parsing seems to have an impact here.

I opened a bug report on rust-lang. Let's see what people have to say there.

workingjubilee · 2024-12-17T19:36:42Z

The aarch64 and x86-64 CPUs use different NaNs.

workingjubilee · 2024-12-17T20:59:39Z

This is hidden from you by LLVM doing constant propagation, as LLVM normalizes to a particular NaN, irrespective of architecture, if you use the literal float. Otherwise, it is not.

saethlin · 2024-12-17T22:28:35Z

I think this was misdiagnosed/mis-minimized, and the actual bug here is that jaq is using total_cmp (not that parsing is funky, the examples here are just poking codegen in ways that induce different NaN generation). NaN generation in Rust is effectively nondeterminstic (https://rust-lang.github.io/rfcs/3514-float-semantics.html or https://doc.rust-lang.org/stable/std/primitive.f32.html#nan-bit-patterns) and since total_cmp cares about which NaN is being compared, you're going to have problems like this one.

I don't know what your floating point comparison semantics are supposed to be, but I think you want all NaNs to compare less than all NaNs, or something like that.

workingjubilee · 2024-12-17T23:29:03Z

It seems you attempted to use total_cmp as a way of getting out of checking for NaN, but you should probably just handle cases explicitly. It's possible there's some other handling of the following missing truth table entries that you may want, but I don't know what they would be without knowing your other constraints.

fn float_cmp(left: f64, right: f64) -> Ordering {
    match (left.partial_cmp(&right), left.is_nan(), right.is_nan()) {
        (None, true, true) => Ordering::Equal,
        (None, false, true) => todo!(),
        (None, true, false) => todo!(),
        (Some(v), false, false) => v,
    }
}

01mf02 · 2024-12-19T12:09:16Z

I don't know what your floating point comparison semantics are supposed to be, but I think you want all NaNs to compare less than all NaNs, or something like that.

This is actually what happens in jq. I have just made a PR (#248) that proposes to adopt this behaviour after all.

01mf02 · 2024-12-19T17:24:33Z

@workingjubilee, thanks for your suggestion, that's what I settled on in the end.

simonrupf pushed a commit to simonrupf/jaq that referenced this issue Dec 9, 2024

Changing test to pass on 32 bit architecture.

0505fae

Reduces large isize to remain below maximum for signed 32 bit int, resolves 01mf02#243

simonrupf mentioned this issue Dec 9, 2024

Changing test to pass on 32 bit architecture. #245

Merged

01mf02 closed this as completed in #245 Dec 13, 2024

01mf02 closed this as completed in 18aa0a8 Dec 13, 2024

01mf02 reopened this Dec 13, 2024

algitbot pushed a commit to alpinelinux/aports that referenced this issue Dec 14, 2024

testing/jaq: upgrade to 2.0.1

977b957

- bump from 1.6.0 to 2.0.1 - remove cargo patch - add patch for tests on 32bit architecture, see: 01mf02/jaq#243 - package MIT license

01mf02 mentioned this issue Dec 17, 2024

LLVM is allowed to... be creative... with NANs according to Rust float semantics rust-lang/rust#134417

Closed

01mf02 mentioned this issue Dec 19, 2024

NaN compatibility #248

Merged

01mf02 closed this as completed in ba24e38 Dec 19, 2024

01mf02 mentioned this issue Dec 20, 2024

BigInt support #249

Open

math_rem test fails on the i686 target #243

math_rem test fails on the i686 target #243

Comments

tranzystorekk commented Dec 2, 2024

tranzystorekk commented Dec 3, 2024

wader commented Dec 3, 2024

tranzystorekk commented Dec 3, 2024

wader commented Dec 3, 2024

tranzystorekk commented Dec 3, 2024

01mf02 commented Dec 3, 2024

tranzystorekk commented Dec 3, 2024 • edited Loading

GaetanLepage commented Dec 10, 2024

wader commented Dec 10, 2024 • edited Loading

GaetanLepage commented Dec 10, 2024 • edited Loading

wader commented Dec 10, 2024

GaetanLepage commented Dec 10, 2024

GaetanLepage commented Dec 10, 2024

wader commented Dec 10, 2024

GaetanLepage commented Dec 10, 2024

01mf02 commented Dec 10, 2024

wader commented Dec 10, 2024

GaetanLepage commented Dec 11, 2024 • edited Loading

wader commented Dec 11, 2024 • edited Loading

01mf02 commented Dec 13, 2024

GaetanLepage commented Dec 13, 2024 • edited Loading

wader commented Dec 13, 2024

01mf02 commented Dec 16, 2024

wader commented Dec 16, 2024 • edited Loading

GaetanLepage commented Dec 16, 2024

wader commented Dec 16, 2024 • edited Loading

01mf02 commented Dec 16, 2024

01mf02 commented Dec 16, 2024

01mf02 commented Dec 16, 2024 • edited Loading

wader commented Dec 16, 2024

01mf02 commented Dec 17, 2024 • edited Loading

wader commented Dec 17, 2024

01mf02 commented Dec 17, 2024

workingjubilee commented Dec 17, 2024

workingjubilee commented Dec 17, 2024 • edited Loading

saethlin commented Dec 17, 2024 • edited Loading

workingjubilee commented Dec 17, 2024 • edited Loading

01mf02 commented Dec 19, 2024 • edited Loading

01mf02 commented Dec 19, 2024

`math_rem` test fails on the i686 target #243

`math_rem` test fails on the i686 target #243

tranzystorekk commented Dec 3, 2024 •

edited

Loading

wader commented Dec 10, 2024 •

edited

Loading

GaetanLepage commented Dec 10, 2024 •

edited

Loading

GaetanLepage commented Dec 11, 2024 •

edited

Loading

wader commented Dec 11, 2024 •

edited

Loading

GaetanLepage commented Dec 13, 2024 •

edited

Loading

wader commented Dec 16, 2024 •

edited

Loading

wader commented Dec 16, 2024 •

edited

Loading

01mf02 commented Dec 16, 2024 •

edited

Loading

01mf02 commented Dec 17, 2024 •

edited

Loading

workingjubilee commented Dec 17, 2024 •

edited

Loading

saethlin commented Dec 17, 2024 •

edited

Loading

workingjubilee commented Dec 17, 2024 •

edited

Loading

01mf02 commented Dec 19, 2024 •

edited

Loading