Support aarch64-unknown-linux-gnu #422

Closed
epgts opened this issue May 16, 2022 · 6 comments

epgts commented May 16, 2022

We're seeing increased demand for aarch64 support on Linux platforms, both privately and publicly (e.g. issue #382). We plan to support a binary release (e.g. a Docker image), but the first step is to make the toolkit work on aarch64-unknown-linux-gnu at all.

Acceptance criteria:

  • GitHub Action installed to test every merge request under Ubuntu 20.04 / aarch64

Out of scope:

  • Production of any binary release
@epgts epgts self-assigned this May 16, 2022

epgts commented May 16, 2022

Current status:

Many tests crash.

I arbitrarily picked uddsketch::tests::pg_uddsketch_io_test to start with. Backtrace:

#0  core::fmt::Formatter::alternate () at library/core/src/fmt/mod.rs:1880
#1  core::fmt::Formatter::pad_integral () at library/core/src/fmt/mod.rs:1354
#2  0x0000fffc8116d30c in core::fmt::num::fmt_u128 () at library/core/src/fmt/num.rs:641
#3  0x0000fffc8116da60 in core::fmt::write () at library/core/src/fmt/mod.rs:1190
#4  0x0000fffc811ed5c0 in std::io::Write::write_fmt (self=0xffffe2581998, fmt=...)
    at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/io/mod.rs:1657
#5  0x0000fffc80b1f53c in <&mut ron::ser::Serializer<W> as serde::ser::Serializer>::serialize_u128 (self=0xffffe2581998, v=1)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/ron-0.7.0/src/ser/mod.rs:409
#6  0x0000fffc80b1f40c in <&mut ron::ser::Serializer<W> as serde::ser::Serializer>::serialize_u8 (self=0xffffe2581998, v=1)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/ron-0.7.0/src/ser/mod.rs:393
#7  0x0000fffc80b94bc8 in serde::ser::impls::<impl serde::ser::Serialize for u8>::serialize (
    self=0xffffe2581ea0 "\001)X\342\377\377\000", serializer=0xffffe2581998)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.137/src/ser/impls.rs:15
#8  0x0000fffc80b25678 in <ron::ser::Compound<W> as serde::ser::SerializeStruct>::serialize_field (self=0xffffe2581398, key=..., 
    value=0xffffe2581ea0 "\001)X\342\377\377\000")
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/ron-0.7.0/src/ser/mod.rs:876
#9  0x0000fffc81380e00 in timescaledb_toolkit::uddsketch::_::<impl serde::ser::Serialize for timescaledb_toolkit::uddsketch::ReadableUddSketch>::serialize (self=0xffffe2581e60, __serializer=0xffffe2581998) at extension/src/uddsketch.rs:197
#10 0x0000fffc80b159cc in ron::ser::to_string (value=0xffffe2581e60)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/ron-0.7.0/src/ser/mod.rs:44
#11 0x0000fffc80d77b18 in <timescaledb_toolkit::uddsketch::UddSketch as pgx::inoutfuncs::InOutFuncs>::output (self=0xffffe2582168, 
    buffer=0xffffe2581f38) at extension/src/uddsketch.rs:264
#12 0x0000fffc80d89090 in timescaledb_toolkit::uddsketch::uddsketch_out (input=...) at extension/src/type_builder.rs:102
#13 0x0000fffc80d896ec in timescaledb_toolkit::uddsketch::uddsketch_out_wrapper::uddsketch_out_wrapper_inner (fcinfo=0x397635c0)
    at extension/src/type_builder.rs:102
#14 0x0000fffc81380a7c in timescaledb_toolkit::uddsketch::uddsketch_out_wrapper::{{closure}} () at extension/src/uddsketch.rs:176
#15 0x0000fffc8100c64c in std::panicking::try::do_call (data=0xffffe2582340 "\200$X\342\377\377\000")
    at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:492
#16 0x0000fffc8102dbfc in __rust_try.llvm.16474678935699062497 ()
   from /home/epg/.pgx/14.3/pgx-install/lib/postgresql/timescaledb_toolkit.so
#17 0x0000fffc80ff8284 in std::panicking::try (f=...)
    at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:456
#18 0x0000fffc80a6c2e4 in std::panic::catch_unwind (f=...)
    at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panic.rs:137
#19 0x0000fffc810a7590 in pgx_pg_sys::submodules::guard::try_guard (try_func=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-pg-sys-0.4.4/src/submodules/guard.rs:243
#20 0x0000fffc8109bbe0 in pgx_pg_sys::submodules::guard::pg_try (try_func=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-pg-sys-0.4.4/src/submodules/guard.rs:233
#21 0x0000fffc81095a7c in pgx_pg_sys::submodules::guard::guard (f=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-pg-sys-0.4.4/src/submodules/guard.rs:224
#22 0x0000fffc80d89684 in timescaledb_toolkit::uddsketch::uddsketch_out_wrapper (fcinfo=0x397635c0)
    at extension/src/type_builder.rs:102
#23 0x000000000066ff08 in ExecInterpExpr (state=0x39762c68, econtext=0x397625c8, isnull=<optimized out>) at execExprInterp.c:1162
#24 0x0000000000689a3c in ExecEvalExprSwitchContext (isNull=0xffffe258259f, econtext=0x397625c8, state=0x39762c68)
    at ../../../src/include/executor/executor.h:339
#25 ExecProject (projInfo=0x39762c60) at ../../../src/include/executor/executor.h:373
#26 project_aggregates (aggstate=<optimized out>) at nodeAgg.c:1385
#27 project_aggregates (aggstate=<optimized out>) at nodeAgg.c:1372
#28 0x000000000068a6d8 in agg_retrieve_direct (aggstate=0x39762008) at nodeAgg.c:2527
#29 ExecAgg (pstate=0x39762008) at nodeAgg.c:2179
#30 0x0000000000674b04 in ExecProcNode (node=0x39762008) at ../../../src/include/executor/executor.h:257
#31 ExecutePlan (execute_once=<optimized out>, dest=0xabc3b8 <spi_printtupDR>, direction=<optimized out>, numberTuples=0, 
    sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x39762008, estate=0x39761dd0)
    at execMain.c:1551
#32 standard_ExecutorRun (queryDesc=0x39771708, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:361
#33 0x00000000006b696c in _SPI_pquery (tcount=0, fire_triggers=true, queryDesc=0x39771708) at spi.c:2808
#34 _SPI_execute_plan (plan=plan@entry=0xffffe2582870, options=options@entry=0xffffe2582848, snapshot=snapshot@entry=0x0, 
    crosscheck_snapshot=crosscheck_snapshot@entry=0x0, fire_triggers=fire_triggers@entry=true) at spi.c:2580
#35 0x00000000006b6c9c in SPI_execute (src=0x3968b520 "SELECT uddsketch(10, 0.01, value)::text FROM io_test\000", read_only=false, 
    tcount=0) at spi.c:525
#36 0x0000fffc81312538 in pgx_pg_sys::pg14::SPI_execute::{{closure}} ()
    at /home/epg/timescaledb-toolkit/.o/extension/debug/build/pgx-pg-sys-03a30099fe406bdc/out/pg14.rs:58595
#37 0x0000fffc813124cc in pgx_pg_sys::submodules::setjmp::pg_guard_ffi_boundary (f=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-pg-sys-0.4.4/src/submodules/setjmp.rs:31
#38 pgx_pg_sys::pg14::SPI_execute (arg_src=0x3968b520 "SELECT uddsketch(10, 0.01, value)::text FROM io_test\000", 
    arg_read_only=false, arg_tcount=0)
    at /home/epg/timescaledb-toolkit/.o/extension/debug/build/pgx-pg-sys-03a30099fe406bdc/out/pg14.rs:58587
#39 0x0000fffc812fa350 in pgx::spi::SpiClient::execute (query=..., read_only=false, limit=..., args=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-0.4.4/src/spi.rs:366
#40 0x0000fffc812f9eec in pgx::spi::SpiClient::select (self=0xffffe2582ecf, query=..., limit=..., args=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-0.4.4/src/spi.rs:305
#41 0x0000fffc80acab70 in timescaledb_toolkit::uddsketch::tests::uddsketch_io_test::{{closure}} (client=...)
    at extension/src/uddsketch.rs:841
#42 0x0000fffc81060b5c in pgx::spi::Spi::execute::{{closure}} (client=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-0.4.4/src/spi.rs:193
#43 0x0000fffc810664e0 in pgx::spi::Spi::connect (f=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-0.4.4/src/spi.rs:235
#44 0x0000fffc810605b0 in pgx::spi::Spi::execute (f=...)
    at /home/epg/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-0.4.4/src/spi.rs:192
#45 0x0000fffc809abff8 in timescaledb_toolkit::uddsketch::tests::uddsketch_io_test () at extension/src/uddsketch.rs:837
#46 0x0000fffc809ac260 in timescaledb_toolkit::uddsketch::tests::uddsketch_io_test_wrapper::uddsketch_io_test_wrapper_inner (
    fcinfo=0x397564b8) at extension/src/uddsketch.rs:835
#47 0x0000fffc80acb190 in timescaledb_toolkit::uddsketch::tests::uddsketch_io_test_wrapper::{{closure}} ()
    at extension/src/uddsketch.rs:835
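
For reference, frames #2 through #6 show serialize_u8 forwarding to serialize_u128, which then formats the value through std::io::Write::write_fmt. Below is a minimal std-only sketch of that call chain. SketchSerializer and format_u8_via_u128 are hypothetical names, and this is not ron's actual code; it only approximates the shape of the path that crashes, which may be useful as a starting point for a standalone reproducer.

```rust
use std::io::Write;

/// Hypothetical stand-in for the serializer seen in frames #5 and #6.
/// It is not ron's implementation, only the same call shape.
struct SketchSerializer<W: Write> {
    output: W,
}

impl<W: Write> SketchSerializer<W> {
    /// Formats the integer with `write!`, which goes through
    /// `std::io::Write::write_fmt` (frame #4) and `core::fmt` (frames #0 to #3).
    fn serialize_u128(&mut self, v: u128) -> std::io::Result<()> {
        write!(self.output, "{}", v)
    }

    /// Widens the `u8` and forwards it, as frame #6 calling frame #5 suggests.
    fn serialize_u8(&mut self, v: u8) -> std::io::Result<()> {
        self.serialize_u128(u128::from(v))
    }
}

/// Serializes one `u8` into an in-memory buffer and returns the text.
fn format_u8_via_u128(v: u8) -> String {
    let mut ser = SketchSerializer { output: Vec::new() };
    ser.serialize_u8(v).expect("writing to a Vec<u8> does not fail");
    String::from_utf8(ser.output).expect("integer formatting yields ASCII")
}
```

Calling format_u8_via_u128(1) mirrors the v=1 argument in frame #6. Nothing here is known to reproduce the crash on its own; it only isolates the formatting path.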

epgts commented May 16, 2022

The crash goes away when I remove the non-standard lto = "thin" flag from the dev profile, introduced in commit d18ebe6 (though the commit log doesn't say why).

Can we just remove it?

I'm currently rebuilding the world with --release, where we use lto = "fat" (which dates to the initial commit; again, can we just remove it?), to see if that works.

This also makes me realize that we publish binaries built with --release but don't test that way. We should probably always run the GitHub Actions tests that way; I'll look at making that change next.

epgts commented May 17, 2022

epgts commented May 17, 2022

Filed ron-rs/ron#379

epgts commented May 17, 2022

With lto flipped back to the default of false, I'm down to just one test failure:
stats_agg::tests::pg_stats_agg_fuzz

Output didn't match between postgres command: SELECT covar_pop(test_y, test_x) FROM test_table
and stats_agg command: SELECT covariance(stats_agg(test_y, test_x), 'population'),
 stats_agg(test_y, test_x)->toolkit_experimental.covariance('population') FROM test_table
        postgres result: 2829642594277869.5
        statsagg result: 2829642594277930.5
        relative difference:         0.000000000000021557492852049044
        allowed relative difference: 0.0000000000000002220446049250313
Failed after 1 successful iterations, run using 10000 values generated from seed 114482102091803095
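
The numbers above can be checked by hand. The sketch below assumes the relative difference is the absolute difference divided by the larger magnitude; the exact formula used by the fuzz test is not shown in this thread, and relative_difference and within_tolerance are hypothetical helper names. The allowed value printed in the output matches f64::EPSILON.

```rust
/// Relative difference between two results, scaled by the larger magnitude.
/// Assumption: the real test may normalize differently.
fn relative_difference(a: f64, b: f64) -> f64 {
    let scale = a.abs().max(b.abs());
    if scale == 0.0 {
        0.0
    } else {
        (a - b).abs() / scale
    }
}

/// True when the two results agree to within `allowed` relative difference.
fn within_tolerance(a: f64, b: f64, allowed: f64) -> bool {
    relative_difference(a, b) <= allowed
}

// Values copied from the failure output above.
const POSTGRES_RESULT: f64 = 2829642594277869.5;
const STATSAGG_RESULT: f64 = 2829642594277930.5;
```

With these values, within_tolerance(POSTGRES_RESULT, STATSAGG_RESULT, f64::EPSILON) is false: the results differ by 61 in absolute terms, which is roughly 97 times f64::EPSILON in relative terms.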

epgts added a commit that referenced this issue May 26, 2022
It became "thin" in commit d18ebe6.

Fixes segfaults on aarch64-unknown-linux-gnu .

Also drop redundant default overrides from profiles.

for issue #422
epgts added a commit that referenced this issue May 26, 2022
aarch64 varies a lot more on "covar_pop" and "covar_samp".
We investigated a bit and found that the postgresql equivalents
we were comparing against varied further from amd64 behavior than
toolkit's.  Possibly differences between gcc and llvm code-gen?

closes issue #422
epgts added a commit that referenced this issue May 26, 2022
- Raise timeout from the default of 6 hours to 12.
  aarch64 build seems to have been almost done when it got killed at 6 hours:
  it had gotten all the way to the doctester build which is almost the last
  step (and probably the last compiler run).  A later run took only 4 hours.
  Still, we're emulating a CPU; cut it some slack!
- Only setup QEMU for arm64 not a bunch of other platforms we don't use.
- Stop setting load: true .
- Stop building doctester.
  We've built it from source since commit 0b92688 .

for issue #422
epgts added a commit that referenced this issue May 26, 2022
Nightly because even a null build with a warm cache takes 30 minutes
just to start running the first test.

for issue #422
epgts added a commit that referenced this issue May 26, 2022
aarch64 varies a lot more on "covar_pop" and "covar_samp".
We investigated a bit and found that the postgresql equivalents
we were comparing against varied further from amd64 behavior than
toolkit's.  Possibly differences between gcc and llvm code-gen?

for issue #422
epgts added a commit that referenced this issue May 26, 2022
It became "thin" in commit d18ebe6.

Fixes segfaults on aarch64-unknown-linux-gnu
(ron-rs/ron#379)

Also drop redundant default overrides from profiles.

for issue #422
epgts added a commit that referenced this issue May 27, 2022
- Raise timeout from the default of 6 hours to 12.
  aarch64 build seems to have been almost done when it got killed at 6 hours:
  it had gotten all the way to the doctester build which is almost the last
  step (and probably the last compiler run).  A later run took only 4 hours.
  Still, we're emulating a CPU; cut it some slack!
- Only setup QEMU for arm64 not a bunch of other platforms we don't use.
- Stop setting load: true .
- Stop building doctester.
  We've built it from source since commit 0b92688 .

for issue #422
bors bot added a commit that referenced this issue Jun 1, 2022
434: Make some changes to support aarch64-unknown-linux-gnu r=epgts a=epgts

- Change `lto` back to default of `false`.
  - It became "thin" in commit d18ebe6 .
  - Fixes segfaults on aarch64-unknown-linux-gnu
    (ron-rs/ron#379)
- Change stats_agg::tests::pg_stats_agg_fuzz to allow more variance.
  - aarch64 varies a lot more on "covar_pop" and "covar_samp".
    We investigated a bit and found that the postgresql equivalents
    we were comparing against varied further from amd64 behavior than
    toolkit's.  Possibly differences between gcc and llvm code-gen?
- Also drop redundant default overrides from profiles in Cargo.toml .

for issue #422


Co-authored-by: Eric Gillespie <epg@timescale.com>

epgts commented Sep 29, 2022

@epgts epgts closed this as completed Sep 29, 2022