SVD test segfaults on Apple M1 #41440

Closed
Keno opened this issue Jul 1, 2021 · 32 comments · Fixed by #43664
Labels
system:apple silicon, system:arm

Comments

@Keno
Member

Keno commented Jul 1, 2021

      From worker 6:
      From worker 6:	signal (11): Segmentation fault: 11
      From worker 6:	in expression starting at /Users/keno/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/test/svd.jl:68
      From worker 6:	ntuple at ./ntuple.jl:0
      From worker 6:	unknown function (ip: 0x115d6f2f3)
      From worker 6:	_jl_invoke at /Users/keno/julia/src/gf.c:0 [inlined]
      From worker 6:	jl_apply_generic at /Users/keno/julia/src/gf.c:2427
      From worker 6:	getindex at ./range.jl:373
      From worker 6:	_hvcat_rows at /Users/keno/julia/usr/share/julia/stdlib/v1.8/SparseArrays/src/sparsevector.jl:1114
      From worker 6:	_jl_invoke at /Users/keno/julia/src/gf.c:0 [inlined]
      From worker 6:	jl_apply_generic at /Users/keno/julia/src/gf.c:2427
Keno added the system:arm and system:apple silicon labels Jul 1, 2021
Keno mentioned this issue Jul 1, 2021 (31 tasks)
@Keno
Member Author

Keno commented Jul 1, 2021

Sigh, this is some sort of nasty something. Goes away under LLDB. Also no rr available 😢 .

@giordano
Contributor

giordano commented Jul 2, 2021

This might be related: JuliaStats/Distributions.jl#1344

@giordano
Contributor

giordano commented Jul 2, 2021

Also @chriselrod found some weird segmentation faults referencing ntuple at ./ntuple.jl:0.

@Keno
Member Author

Keno commented Jul 2, 2021

Yes, ntuple seems to be a recurring theme in these.

@giordano
Contributor

giordano commented Jul 2, 2021

What I found most fun in JuliaStats/Distributions.jl#1344 is that it isn't deterministic: one more reason to miss rr 🙃

@giordano
Contributor

giordano commented Jul 4, 2021

For the record, I just ran

julia stdlib/LinearAlgebra/test/runtests.jl

on latest master and all tests are successful for me. Also julia stdlib/LinearAlgebra/test/svd.jl wouldn't reproduce the issue, but I can still reproduce JuliaStats/Distributions.jl#1344. This bug is really annoying to track down.


Edit: I can reproduce the issue by running the tests with

% julia runtests.jl LinearAlgebra
Test                          (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
LinearAlgebra/special              (7) |        started at 2021-07-05T00:19:07.868
LinearAlgebra/dense                (4) |        started at 2021-07-05T00:19:07.903
LinearAlgebra/eigen                (8) |        started at 2021-07-05T00:19:07.954
LinearAlgebra/bunchkaufman         (9) |        started at 2021-07-05T00:19:07.954
LinearAlgebra/qr                   (3) |        started at 2021-07-05T00:19:07.954
LinearAlgebra/triangular           (2) |        started at 2021-07-05T00:19:07.955
LinearAlgebra/schur                (6) |        started at 2021-07-05T00:19:07.955
LinearAlgebra/matmul               (5) |        started at 2021-07-05T00:19:07.956
LinearAlgebra/schur                (6) |    36.94 |   1.18 |  3.2 |    3824.11 |   411.69
LinearAlgebra/svd                  (6) |        started at 2021-07-05T00:19:45.171
LinearAlgebra/bunchkaufman         (9) |    40.64 |   1.84 |  4.5 |    5932.33 |   481.34
LinearAlgebra/lapack               (9) |        started at 2021-07-05T00:19:48.722
      From worker 6:
      From worker 6:	signal (11): Segmentation fault: 11
      From worker 6:	in expression starting at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/test/svd.jl:68
      From worker 6:	ntuple at ./ntuple.jl:0
      From worker 6:	unknown function (ip: 0x1154ddbb3)
      From worker 6:	_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
      From worker 6:	jl_apply_generic at /Users/mose/repo/julia/src/gf.c:2427
      From worker 6:	getindex at ./range.jl:373
      From worker 6:	_hvcat_rows at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/SparseArrays/src/sparsevector.jl:1117
      From worker 6:	_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
      From worker 6:	jl_apply_generic at /Users/mose/repo/julia/src/gf.c:2427
      From worker 6:	jl_apply at /Users/mose/repo/julia/src/./julia.h:1787 [inlined]
      From worker 6:	do_apply at /Users/mose/repo/julia/src/builtins.c:713
      From worker 6:	hvcat at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/SparseArrays/src/sparsevector.jl:1107
      From worker 6:	getproperty at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/svd.jl:482
      From worker 6:	unknown function (ip: 0x1154d928f)
      From worker 6:	_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
      From worker 6:	jl_apply_generic at /Users/mose/repo/julia/src/gf.c:2427
      From worker 6:	macro expansion at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:445 [inlined]
      From worker 6:	macro expansion at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/test/svd.jl:106 [inlined]
      From worker 6:	macro expansion at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:1282 [inlined]
      From worker 6:	macro expansion at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/test/svd.jl:103 [inlined]
      From worker 6:	top-level scope at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
      From worker 6:	top-level scope at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/test/svd.jl:0
      From worker 6:	jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:876
      From worker 6:	jl_eval_module_expr at /Users/mose/repo/julia/src/toplevel.c:196 [inlined]
      From worker 6:	jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:673
      From worker 6:	jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:830
      From worker 6:	jl_toplevel_eval at /Users/mose/repo/julia/src/toplevel.c:894 [inlined]
      From worker 6:	jl_toplevel_eval_in at /Users/mose/repo/julia/src/toplevel.c:944
      From worker 6:	eval at ./boot.jl:373 [inlined]
      From worker 6:	include_string at ./loading.jl:1196
      From worker 6:	_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
      From worker 6:	jl_apply_generic at /Users/mose/repo/julia/src/gf.c:2427
      From worker 6:	_include at ./loading.jl:1253
      From worker 6:	include at ./Base.jl:417 [inlined]
      From worker 6:	macro expansion at /Users/mose/repo/julia/test/testdefs.jl:24 [inlined]
      From worker 6:	macro expansion at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:1282 [inlined]
      From worker 6:	macro expansion at /Users/mose/repo/julia/test/testdefs.jl:23 [inlined]
      From worker 6:	macro expansion at ./timing.jl:368 [inlined]
      From worker 6:	#runtests#1 at /Users/mose/repo/julia/test/testdefs.jl:21
      From worker 6:	runtests##kw at /Users/mose/repo/julia/test/testdefs.jl:6 [inlined]
      From worker 6:	runtests##kw at /Users/mose/repo/julia/test/testdefs.jl:6
      From worker 6:	unknown function (ip: 0x1152e7fb7)
      From worker 6:	_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
      From worker 6:	jl_apply_generic at /Users/mose/repo/julia/src/gf.c:2427
      From worker 6:	jl_apply at /Users/mose/repo/julia/src/./julia.h:1787 [inlined]
      From worker 6:	do_apply at /Users/mose/repo/julia/src/builtins.c:713
      From worker 6:	#106 at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:278
      From worker 6:	run_work_thunk at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:63
      From worker 6:	macro expansion at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:278 [inlined]
      From worker 6:	#105 at ./task.jl:411
      From worker 6:	unknown function (ip: 0x1152dc69f)
      From worker 6:	_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
      From worker 6:	jl_apply_generic at /Users/mose/repo/julia/src/gf.c:2427
      From worker 6:	jl_apply at /Users/mose/repo/julia/src/./julia.h:1787 [inlined]
      From worker 6:	start_task at /Users/mose/repo/julia/src/task.c:880
      From worker 6:	Allocations: 118139621 (Pool: 118071927; Big: 67694); GC: 129

which I guess is what Keno did.

@gbaraldi
Member

gbaraldi commented Jul 6, 2021

OK, I was running this test and got a kernel panic. I haven't tried to reproduce it yet; the panicked task was WindowServer, but I don't know whether that was just a coincidence.
Edit:
I tried to reproduce it and failed. The panic happened while running the tests like this: julia stdlib/LinearAlgebra/test/runtests.jl. I tried running it that way again and it just worked. Running the tests the way Mose showed also passed normally.

@ghost

ghost commented Jul 31, 2021

could this be related?

dmlc/xgboost#7039

@giordano
Contributor

I don't think so, since we don't use LLVM libomp anywhere.

@gbaraldi
Member

hvcat seems to be a common denominator: https://gist.github.com/gbaraldi/56f2d34fe841a182d6f29a0078830f83
I was testing a PR and got 5 ntuple errors, 4 of them related to hvcat.

@anandijain
Contributor

Is there anything I can do to help get to the bottom of these errors?

These crashes are very common and significantly limit how much of the ecosystem is usable on M1.

@Keno
Member Author

Keno commented Oct 1, 2021

Yes, it's the same issue as #42295. Next step is for somebody to file an upstream issue about it. Probably ping Lang Hames and Tim Northover on the Apple LLVM teams there. We may also need an extension to the MachO spec.

@anandijain
Contributor

Do you know these people? Does it make more sense for you to contact them, or should I just send them links to these issues?

@Keno
Member Author

Keno commented Oct 1, 2021

I'd start by filing a bug at https://bugs.llvm.org/, complaining that the large code model is not properly implemented on Darwin AArch64, causing Orc JIT to crash if the memory allocator does not allocate sections within 4GB of each other (which the default allocator does not). CC the two of them on the issue. Lang is the Orc JIT maintainer, Tim is the AArch64 backend maintainer. Apple has in general said that they're happy to help with M1 issues, so if the issue is filed there and you get no response, I can send it through those channels. If you need more help understanding what the issue is, I can walk you through it.
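
To make the 4GB constraint concrete, here is a minimal sketch. It assumes the limit comes from ADRP-style page-relative addressing, whose signed 21-bit page immediate reaches roughly ±4 GiB from the instruction. The helper name `adrp_can_reach` and the addresses are made up for illustration; this is not code from LLVM or Julia.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not an LLVM or Julia API.
// ADRP encodes a signed 21-bit count of 4 KiB pages relative to the page
// containing the instruction, so it can reach targets within about +/-4 GiB.
inline bool adrp_can_reach(std::uint64_t pc, std::uint64_t target) {
    const std::int64_t page_delta =
        static_cast<std::int64_t>(target >> 12) - static_cast<std::int64_t>(pc >> 12);
    const std::int64_t limit = std::int64_t{1} << 20;  // 2^20 pages = 4 GiB
    return page_delta >= -limit && page_delta < limit;
}

// Example: two sections placed by an allocator that gives no proximity guarantee.
inline void adrp_example() {
    const std::uint64_t code = 0x115d6f000;               // made-up code address
    const std::uint64_t near_data = code + (1ull << 30);  // 1 GiB away: reachable
    const std::uint64_t far_data = code + (1ull << 33);   // 8 GiB away: out of range
    assert(adrp_can_reach(code, near_data));
    assert(!adrp_can_reach(code, far_data));
}
```

If the JIT emits page-relative references while the allocator places sections like `far_data` above, the relocation cannot be encoded, which would be consistent with the nondeterministic crashes reported here.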

@anandijain
Contributor

Okay, I think I did this right: https://bugs.llvm.org/show_bug.cgi?id=52029

@gbaraldi
Member

gbaraldi commented Oct 5, 2021

Couldn't we just change the code model to Small? What would be the implications of doing so? i.e., adding

#elif _CPU_AARCH64_ && _OS_DARWIN_
        CodeModel::Small;
#else

at:

julia/src/codegen.cpp

Lines 8275 to 8298 in 690517a

    if (!targetFeatures.empty()) {
        SubtargetFeatures Features;
        for (unsigned i = 0; i != targetFeatures.size(); ++i)
            Features.AddFeature(targetFeatures[i]);
        FeaturesStr = Features.getString();
    }
    // Allocate a target...
    Optional<CodeModel::Model> codemodel =
#ifdef _P64
        // Make sure we are using the large code model on 64bit
        // Let LLVM pick a default suitable for jitting on 32bit
        CodeModel::Large;
#else
        None;
#endif
    auto optlevel = CodeGenOptLevelFor(jl_options.opt_level);
    jl_TargetMachine = TheTarget->createTargetMachine(
            TheTriple.getTriple(), TheCPU, FeaturesStr,
            options,
            Reloc::Static, // Generate simpler code for JIT
            codemodel,
            optlevel,
            true // JIT
            );

I tried adding that and ran the linalg tests and #42295 and didn't get any test failures.

@Keno
Member Author

Keno commented Oct 5, 2021

No, that's effectively what it does now.

@vtjnash
Member

vtjnash commented Oct 5, 2021

That was true of the old MCJIT, but OrcJIT is supposed to be better able to allocate farcall stubs on-demand if we used the small code model.

@gbaraldi
Member

gbaraldi commented Oct 5, 2021

It looks like LLVM sets the CodeModel to Large by default, unless I misunderstood:

static CodeModel::Model
getEffectiveAArch64CodeModel(const Triple &TT, Optional<CodeModel::Model> CM,
                             bool JIT) {
  if (CM) {
    if (*CM != CodeModel::Small && *CM != CodeModel::Tiny &&
        *CM != CodeModel::Large) {
      report_fatal_error(
          "Only small, tiny and large code models are allowed on AArch64");
    } else if (*CM == CodeModel::Tiny && !TT.isOSBinFormatELF())
      report_fatal_error("tiny code model is only supported on ELF");
    return *CM;
  }
  // The default MCJIT memory managers make no guarantees about where they can
  // find an executable page; JITed code needs to be able to refer to globals
  // no matter how far away they are.
  // We should set the CodeModel::Small for Windows ARM64 in JIT mode,
  // since with large code model LLVM generating 4 MOV instructions, and
  // Windows doesn't support relocating these long branch (4 MOVs).
  if (JIT && !TT.isOSWindows())
    return CodeModel::Large;
  return CodeModel::Small;
}
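
As a rough, standalone model of that selection logic (C++17, no LLVM headers), the sketch below uses a stand-in enum and a hypothetical function `effective_code_model`; the validity checks and fatal errors of the real function are left out. It only illustrates which value comes back for a given request, not what the backend then does with it.

```cpp
#include <cassert>
#include <optional>

// Simplified stand-in; this is not the LLVM CodeModel type.
enum class Model { Tiny, Small, Large };

// Mirrors only the selection logic quoted above: an explicit request wins,
// otherwise JIT on non-Windows targets defaults to Large and everything
// else defaults to Small. Error handling is intentionally omitted.
inline Model effective_code_model(std::optional<Model> requested, bool jit, bool is_windows) {
    if (requested)
        return *requested;
    if (jit && !is_windows)
        return Model::Large;
    return Model::Small;
}

// Example: passing Large explicitly and passing nothing give the same
// result for a non-Windows JIT, while an explicit Small is honored.
inline void code_model_example() {
    assert(effective_code_model(Model::Large, true, false) == Model::Large);
    assert(effective_code_model(std::nullopt, true, false) == Model::Large);
    assert(effective_code_model(Model::Small, true, false) == Model::Small);
}
```

So in this model, the default only matters when no code model is passed at all; an explicit value is returned as requested.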

@chriselrod
Contributor

chriselrod commented Oct 9, 2021

For fun, confirming it doesn't help the ./ntuple crashes.

diff --git a/src/codegen.cpp b/src/codegen.cpp
index 754499d502..5517365f9d 100644
--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -8289,6 +8289,8 @@ extern "C" void jl_init_llvm(void)
         // Make sure we are using the large code model on 64bit
         // Let LLVM pick a default suitable for jitting on 32bit
         CodeModel::Large;
+#elif _CPU_AARCH64_ && _OS_DARWIN_
+       CodeModel::Small;
 #else
         None;
 #endif
] test Distributions
....
Test mvnormal | 5883   5883

signal (11): Segmentation fault: 11
in expression starting at /Users/chriselrod/.julia/packages/Distributions/1WSG5/test/mvlognormal.jl:114
ntuple at ./ntuple.jl:0

LinearAlgebra tests passed.

@tpgillam

tpgillam commented Oct 9, 2021

@chriselrod Probably a silly question, but is _P64 defined on aarch64? (If so, we'd still select CodeModel::Large from the previous block before reaching the lines you added.) From looking at platform.h I suspect it is, though I haven't checked.
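
A small self-contained sketch of the ordering concern, assuming the 64-bit macro is defined on this target. The `DEMO_*` macros and both function names are hypothetical stand-ins for `_P64`, `_CPU_AARCH64_` and `_OS_DARWIN_`; nothing here is taken from Julia's headers.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for _P64, _CPU_AARCH64_ and _OS_DARWIN_,
// all defined here to mimic a 64-bit Apple Silicon build.
#define DEMO_P64 1
#define DEMO_CPU_AARCH64 1
#define DEMO_OS_DARWIN 1

// Same branch order as the diff above: the generic 64-bit check comes first,
// so the platform-specific branch is never selected when DEMO_P64 is defined.
inline std::string patched_order() {
#ifdef DEMO_P64
    return "Large";
#elif DEMO_CPU_AARCH64 && DEMO_OS_DARWIN
    return "Small";
#else
    return "None";
#endif
}

// Platform-specific check first, so it takes precedence over the generic one.
inline std::string specific_first() {
#if defined(DEMO_CPU_AARCH64) && defined(DEMO_OS_DARWIN)
    return "Small";
#elif defined(DEMO_P64)
    return "Large";
#else
    return "None";
#endif
}

inline void branch_order_example() {
    assert(patched_order() == "Large");   // the added #elif has no effect
    assert(specific_first() == "Small");  // reordering changes the selection
}
```

This only shows which preprocessor branch is selected; it says nothing about whether the small code model actually works on this platform.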

@chriselrod
Contributor

Ah, yes, simply setting CodeModel::Small; results in a segfault during compilation:

    JULIA usr/lib/julia/corecompiler.ji
/bin/sh: line 1: 21207 Segmentation fault: 11  /Users/chriselrod/Documents/languages/julia/usr/bin/julia -C "armv8.5-a,+crc" --output-ji /Users/chriselrod/Documents/languages/julia/usr/lib/julia/corecompiler.ji.tmp --startup-file=no --warn-overwrite=yes -g0 -O0 compiler/compiler.jl
make[1]: *** [/Users/chriselrod/Documents/languages/julia/usr/lib/julia/corecompiler.ji] Error 139
make: *** [julia-sysimg-ji] Error 2

@vtjnash
Member

vtjnash commented Oct 12, 2021

I believe we need to switch to the upcoming llvm::JITLink module (from the old RTDyldMemoryManager code) before that option is feasible.

@aterenin

Hi all, I wanted to report that this problem also occurs with WGLMakie and Pluto, producing the following stack trace.

      From worker 3:	signal (11): Segmentation fault: 11
      From worker 3:	in expression starting at none:1
      From worker 3:	^ at ./math.jl:0 [inlined]
      From worker 3:	bounding_order_of_magnitude at /Users/aterenin/.julia/packages/PlotUtils/VgXdq/src/ticks.jl:14
      From worker 3:	optimize_ticks_typed at /Users/aterenin/.julia/packages/PlotUtils/VgXdq/src/ticks.jl:161
      From worker 3:	#optimize_ticks#42 at /Users/aterenin/.julia/packages/PlotUtils/VgXdq/src/ticks.jl:139 [inlined]
      From worker 3:	optimize_ticks##kw at /Users/aterenin/.julia/packages/PlotUtils/VgXdq/src/ticks.jl:138 [inlined]
      From worker 3:	get_tickvalues at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/ticklocators/wilkinson.jl:21 [inlined]
      From worker 3:	get_tickvalues at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/ticklocators/wilkinson.jl:17 [inlined]
      From worker 3:	get_tickvalues at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/lineaxis.jl:459
      From worker 3:	get_ticks at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/lineaxis.jl:453
      From worker 3:	unknown function (ip: 0x10fd20efb)
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	#191 at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/lineaxis.jl:187
      From worker 3:	unknown function (ip: 0x10fd11e7f)
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	do_apply at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	#lift#61 at /Users/aterenin/.julia/packages/Makie/gQOQF/src/interaction/nodes.jl:13
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	do_apply at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	lift at /Users/aterenin/.julia/packages/Makie/gQOQF/src/interaction/nodes.jl:10
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	#LineAxis#181 at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/lineaxis.jl:185
      From worker 3:	Type##kw at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/lineaxis.jl:3
      From worker 3:	unknown function (ip: 0x10fcd11eb)
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	#layoutable#251 at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/layoutables/axis.jl:211
      From worker 3:	layoutable at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/layoutables/axis.jl:10 [inlined]
      From worker 3:	#_layoutable#11 at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/layoutables.jl:69 [inlined]
      From worker 3:	_layoutable at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/layoutables.jl:69 [inlined]
      From worker 3:	#_#9 at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/layoutables.jl:49 [inlined]
      From worker 3:	Layoutable at /Users/aterenin/.julia/packages/Makie/gQOQF/src/makielayout/layoutables.jl:49 [inlined]
      From worker 3:	#plot#948 at /Users/aterenin/.julia/packages/Makie/gQOQF/src/figureplotting.jl:31
      From worker 3:	plot##kw at /Users/aterenin/.julia/packages/Makie/gQOQF/src/figureplotting.jl:18 [inlined]
      From worker 3:	#scatter#43 at /Users/aterenin/.julia/packages/MakieCore/S8PkO/src/recipes.jl:31
      From worker 3:	unknown function (ip: 0x10fc339db)
      From worker 3:	scatter##kw at /Users/aterenin/.julia/packages/MakieCore/S8PkO/src/recipes.jl:31
      From worker 3:	unknown function (ip: 0x10fc183af)
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	##function_wrapped_cell#285 at /Users/aterenin/Documents/Code/Multiscale/notebooks/notebook2.jl#==#100636b4-3a6b-4633-a279-c0cc8194bc3c:3 [inlined]
      From worker 3:	##function_wrapped_cell#285 at ./none:0
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	jl_f__call_latest at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	do_apply at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	#invokelatest#2 at ./essentials.jl:716
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	do_apply at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	invokelatest at ./essentials.jl:714
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	do_apply at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	compute at /Users/aterenin/.julia/packages/Pluto/7TMtD/src/runner/PlutoRunner.jl:380
      From worker 3:	#27 at /Users/aterenin/.julia/packages/Pluto/7TMtD/src/runner/PlutoRunner.jl:535
      From worker 3:	run_inside_trycatch at /Users/aterenin/.julia/packages/Pluto/7TMtD/src/runner/PlutoRunner.jl:420
      From worker 3:	#run_expression#25 at /Users/aterenin/.julia/packages/Pluto/7TMtD/src/runner/PlutoRunner.jl:535
      From worker 3:	run_expression##kw at /Users/aterenin/.julia/packages/Pluto/7TMtD/src/runner/PlutoRunner.jl:450
      From worker 3:	unknown function (ip: 0x10fbfa0cf)
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	do_call at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	eval_body at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	jl_interpret_toplevel_thunk at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	jl_toplevel_eval_flex at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	jl_toplevel_eval_in at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	eval at ./boot.jl:373
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	do_apply at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	#103 at /Users/administrator/src/julia/usr/share/julia/stdlib/v1.7/Distributed/src/process_messages.jl:274
      From worker 3:	run_work_thunk at /Users/administrator/src/julia/usr/share/julia/stdlib/v1.7/Distributed/src/process_messages.jl:63
      From worker 3:	unknown function (ip: 0x10fabc98b)
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	run_work_thunk at /Users/administrator/src/julia/usr/share/julia/stdlib/v1.7/Distributed/src/process_messages.jl:72
      From worker 3:	#96 at ./task.jl:423
      From worker 3:	unknown function (ip: 0x10fabc36f)
      From worker 3:	jl_apply_generic at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	start_task at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.7.dylib (unknown line)
      From worker 3:	Allocations: 172690609 (Pool: 172643111; Big: 47498); GC: 125

staticfloat added a commit that referenced this issue Dec 17, 2021
… Darwin

This change is inspired by this comment [0], switching our linking layer
to the newer one as recommended by [1].  With these changes, I can pass
the `Distributions` test suite on aarch64 darwin, and so it appears it
fixes at least one of the segfault issues noted on apple silicon.

[0] #41440 (comment)
[1] https://llvm.org/docs/JITLink.html#jitlink-and-objectlinkinglayer
@dnadlinger
Member

I've got a WIP patch that ports Julia to LLVM Git main and ObjectLinkingLayer/CodeModel::Small – while I'm still working on debug info integration, the Distributions.jl tests pass with it on darwin-aarch64.

@staticfloat
Member

I will note that Distributions.jl tests work on the current Julia master for me..... so that may not be the best test case. I haven't found a reliable way to trigger the issues yet, other than running the entire Julia test suite.

@dnadlinger
Member

dnadlinger commented Dec 28, 2021

Distributions.jl and LinearAlgebra/svd previously both failed for me and pass now – working on debuginfo so I can get all the backtrace-related failures out of the way.

(I'm assuming "entire Julia test suite" refers to make test.)

@aterenin

> I will note that Distributions.jl tests work on the current Julia master for me..... so that may not be the best test case. I haven't found a reliable way to trigger the issues yet, other than running the entire Julia test suite.

I have not either, but I have also not been able to use Julia on M1 ARM for more than 15 minutes without this crash occurring somewhere. The failure is generally some low level operation in math.jl or tuple.jl. It is very, very common, which should make it easy to notice when the problem is fixed.

@giordano
Contributor

giordano commented Dec 28, 2021

> I will note that Distributions.jl tests work on the current Julia master for me

@staticfloat try running them multiple times; they still crash for me:

julia> versioninfo()
Julia Version 1.8.0-DEV.1183
Commit 693b4471fa (2021-12-28 10:36 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin20.6.0)
  CPU: Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, cyclone)
Environment:
  JULIA_PKG_USE_CLI_GIT = true

(tmp) pkg> test Distributions
     Testing Distributions
      Status `/private/var/folders/v2/hmy3kzgj4tb3xsy8qkltxd0r0000gn/T/jl_q5qHGj/Project.toml`
  [49dc2e85] Calculus v0.5.1
  [d360d2e6] ChainRulesCore v1.11.2
  [cdddcdb0] ChainRulesTestUtils v1.3.1
  [b429d917] DensityInterface v0.4.0
  [31c24e10] Distributions v0.25.37
  [1a297f60] FillArrays v0.12.7
  [26cc04aa] FiniteDifferences v0.12.20
  [f6369f11] ForwardDiff v0.10.24
  [682c06a0] JSON v0.21.2
  [90014a1f] PDMats v0.11.5
  [1fd47b50] QuadGK v2.4.2
  [276daf66] SpecialFunctions v2.0.0
  [860ef19b] StableRNGs v1.0.0
  [90137ffa] StaticArrays v1.2.13
  [2913bbd2] StatsBase v0.33.14
  [4c63d2b9] StatsFuns v0.9.14
  [8ba89e20] Distributed `@stdlib/Distributed`
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`
  [de0858da] Printf `@stdlib/Printf`
  [9a3f8284] Random `@stdlib/Random`
  [2f01184e] SparseArrays `@stdlib/SparseArrays`
  [10745b16] Statistics `@stdlib/Statistics`
  [8dfed614] Test `@stdlib/Test`
      Status `/private/var/folders/v2/hmy3kzgj4tb3xsy8qkltxd0r0000gn/T/jl_q5qHGj/Manifest.toml`
  [49dc2e85] Calculus v0.5.1
  [d360d2e6] ChainRulesCore v1.11.2
  [cdddcdb0] ChainRulesTestUtils v1.3.1
  [9e997f8a] ChangesOfVariables v0.1.2
  [bbf7d656] CommonSubexpressions v0.3.0
  [34da2185] Compat v3.41.0
  [9a962f9c] DataAPI v1.9.0
  [864edb3b] DataStructures v0.18.11
  [b429d917] DensityInterface v0.4.0
  [163ba53b] DiffResults v1.0.3
  [b552c78f] DiffRules v1.9.0
  [31c24e10] Distributions v0.25.37
  [ffbed154] DocStringExtensions v0.8.6
  [1a297f60] FillArrays v0.12.7
  [26cc04aa] FiniteDifferences v0.12.20
  [f6369f11] ForwardDiff v0.10.24
  [3587e190] InverseFunctions v0.1.2
  [92d709cd] IrrationalConstants v0.1.1
  [692b3bcd] JLLWrappers v1.3.0
  [682c06a0] JSON v0.21.2
  [2ab3a3ac] LogExpFunctions v0.3.6
  [1914dd2f] MacroTools v0.5.9
  [e1d29d7a] Missings v1.0.2
  [77ba4419] NaNMath v0.3.6
  [bac558e1] OrderedCollections v1.4.1
  [90014a1f] PDMats v0.11.5
  [69de0a69] Parsers v2.1.3
  [21216c6a] Preferences v1.2.3
  [1fd47b50] QuadGK v2.4.2
  [189a3867] Reexport v1.2.2
  [708f8203] Richardson v1.4.0
  [79098fc4] Rmath v0.7.0
  [a2af1166] SortingAlgorithms v1.0.1
  [276daf66] SpecialFunctions v2.0.0
  [860ef19b] StableRNGs v1.0.0
  [90137ffa] StaticArrays v1.2.13
  [82ae8749] StatsAPI v1.2.0
  [2913bbd2] StatsBase v0.33.14
  [4c63d2b9] StatsFuns v0.9.14
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [f50d1b31] Rmath_jll v0.3.0+0
  [0dad84c5] ArgTools v1.1.1 `@stdlib/ArgTools`
  [56f22d72] Artifacts `@stdlib/Artifacts`
  [2a0f44e3] Base64 `@stdlib/Base64`
  [ade2ca70] Dates `@stdlib/Dates`
  [8bb1440f] DelimitedFiles `@stdlib/DelimitedFiles`
  [8ba89e20] Distributed `@stdlib/Distributed`
  [f43a241f] Downloads v1.5.1 `@stdlib/Downloads`
  [7b1f6079] FileWatching `@stdlib/FileWatching`
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`
  [b27032c2] LibCURL v0.6.3 `@stdlib/LibCURL`
  [76f85450] LibGit2 `@stdlib/LibGit2`
  [8f399da3] Libdl `@stdlib/Libdl`
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`
  [56ddb016] Logging `@stdlib/Logging`
  [d6f4376e] Markdown `@stdlib/Markdown`
  [a63ad114] Mmap `@stdlib/Mmap`
  [ca575930] NetworkOptions v1.2.0 `@stdlib/NetworkOptions`
  [44cfe95a] Pkg v1.8.0 `@stdlib/Pkg`
  [de0858da] Printf `@stdlib/Printf`
  [3fa0cd96] REPL `@stdlib/REPL`
  [9a3f8284] Random `@stdlib/Random`
  [ea8e919c] SHA v0.7.0 `@stdlib/SHA`
  [9e88b42a] Serialization `@stdlib/Serialization`
  [1a1011a3] SharedArrays `@stdlib/SharedArrays`
  [6462fe0b] Sockets `@stdlib/Sockets`
  [2f01184e] SparseArrays `@stdlib/SparseArrays`
  [10745b16] Statistics `@stdlib/Statistics`
  [4607b0f0] SuiteSparse `@stdlib/SuiteSparse`
  [fa267f1f] TOML v1.0.0 `@stdlib/TOML`
  [a4e569a6] Tar v1.10.0 `@stdlib/Tar`
  [8dfed614] Test `@stdlib/Test`
  [cf7118a7] UUIDs `@stdlib/UUIDs`
  [4ec0a83e] Unicode `@stdlib/Unicode`
  [e66e0078] CompilerSupportLibraries_jll v0.5.0+0 `@stdlib/CompilerSupportLibraries_jll`
  [deac9b47] LibCURL_jll v7.73.0+4 `@stdlib/LibCURL_jll`
  [29816b5a] LibSSH2_jll v1.9.1+2 `@stdlib/LibSSH2_jll`
  [c8ffd9c3] MbedTLS_jll v2.24.0+2 `@stdlib/MbedTLS_jll`
  [14a3606d] MozillaCACerts_jll v2020.7.22 `@stdlib/MozillaCACerts_jll`
  [4536629a] OpenBLAS_jll v0.3.17+2 `@stdlib/OpenBLAS_jll`
  [05823500] OpenLibm_jll v0.7.5+0 `@stdlib/OpenLibm_jll`
  [83775a58] Zlib_jll v1.2.12+1 `@stdlib/Zlib_jll`
  [8e850b90] libblastrampoline_jll v3.1.0+0 `@stdlib/libblastrampoline_jll`
  [8e850ede] nghttp2_jll v1.41.0+1 `@stdlib/nghttp2_jll`
  [3f19e933] p7zip_jll v16.2.1+1 `@stdlib/p7zip_jll`
     Testing Running tests...
Running tests:
Test Summary:   | Pass  Total  Time
Test loguniform |  123    123  1.2s
Test Summary: | Pass  Total  Time
Test arcsine  |   29     29  0.2s
Test Summary: | Pass  Total  Time
Test dirac    |  138    138  0.7s
    [Discrete]
    ------------
    testing truncated(BetaBinomial(10, 0.2, 0.25),3,5)
    testing truncated(BetaBinomial(10, 2, 2.5),3,5)
    testing truncated(BetaBinomial(10, 60, 40),3,5)
    testing truncated(Binomial(5, 0.4),3,5)
    testing truncated(Binomial(6, 0.8),3,5)
    testing truncated(Binomial(100, 0.1),3,5)
    testing truncated(Binomial(100, 0.9),3,5)
    testing truncated(Binomial(10, 0.0),3,5)
    testing truncated(Binomial(10, 1.0),3,5)
    testing truncated(DiscreteUniform(6),3,5)
    testing truncated(DiscreteUniform(7),3,5)
    testing truncated(DiscreteUniform(2, 8),3,5)
    testing truncated(Geometric(),3,5)
    testing truncated(Geometric(0.02),3,5)
    testing truncated(Geometric(0.1),3,5)
    testing truncated(Geometric(0.5),3,5)
    testing truncated(Geometric(0.9),3,5)
    testing truncated(NegativeBinomial(),3,5)
    testing truncated(NegativeBinomial(6),3,5)
    testing truncated(NegativeBinomial(1, 0.5),3,5)
    testing truncated(NegativeBinomial(5, 0.6),3,5)
    testing truncated(NegativeBinomial(0.5, 0.5),3,5)
    testing truncated(Poisson(),3,5)
    testing truncated(Poisson(0.5),3,5)
    testing truncated(Poisson(2.0),3,5)
    testing truncated(Poisson(10.0),3,5)
    testing truncated(Poisson(80.0),3,5)

    [Continuous]
    ------------
    testing truncated(BetaPrime(),3,5)
    testing truncated(BetaPrime(3.0),3,5)
    testing truncated(BetaPrime(3.0, 5.0),3,5)
    testing truncated(BetaPrime(5.0, 3.0),3,5)
    testing truncated(Cauchy(),3,5)
    testing truncated(Cauchy(2.0),3,5)
    testing truncated(Cauchy(0.0, 1.0),3,5)
    testing truncated(Cauchy(10.0, 1.0),3,5)
    testing truncated(Cauchy(2.0, 10.0),3,5)
    testing truncated(Chi(1),3,5)
    testing truncated(Chi(2),3,5)
    testing truncated(Chi(3),3,5)
    testing truncated(Chi(12),3,5)
    testing truncated(Chisq(1),3,5)
    testing truncated(Chisq(8),3,5)
    testing truncated(Chisq(20),3,5)
    testing truncated(Erlang(),3,5)
    testing truncated(Erlang(3),3,5)
    testing truncated(Erlang(3, 1.0),3,5)
    testing truncated(Erlang(5, 2.0),3,5)
    testing truncated(Exponential(),3,5)
    testing truncated(Exponential(2.0),3,5)
    testing truncated(Exponential(6.5),3,5)
    testing truncated(FDist(6.0, 8.0),3,5)
    testing truncated(FDist(8.0, 6.0),3,5)
    testing truncated(FDist(30, 40),3,5)
    testing truncated(Frechet(),3,5)
    testing truncated(Frechet(0.5),3,5)
    testing truncated(Frechet(3.0),3,5)
    testing truncated(Frechet(20.0),3,5)
    testing truncated(Frechet(60.0),3,5)
    testing truncated(Frechet(0.5, 2.0),3,5)
    testing truncated(Frechet(3.0, 2.0),3,5)
    testing truncated(Gamma(),3,5)
    testing truncated(Gamma(2.0),3,5)
    testing truncated(Gamma(1.0, 1.0),3,5)
    testing truncated(Gamma(3.0, 1.0),3,5)
    testing truncated(Gamma(3.0, 2.0),3,5)
    testing truncated(GeneralizedExtremeValue(1.0, 1.0, 1.0),3,5)
    testing truncated(GeneralizedExtremeValue(0.0, 1.0, 0.0),3,5)
    testing truncated(GeneralizedExtremeValue(0.0, 1.0, 1.1),3,5)
    testing truncated(GeneralizedExtremeValue(0.0, 1.0, 0.6),3,5)
    testing truncated(GeneralizedExtremeValue(0.0, 1.0, 0.3),3,5)
    testing truncated(GeneralizedExtremeValue(-1.0, 0.5, 0.6),3,5)
    testing truncated(GeneralizedPareto(),3,5)
    testing truncated(GeneralizedPareto(1.0, 1.0),3,5)
    testing truncated(GeneralizedPareto(0.1, 2.0),3,5)
    testing truncated(GeneralizedPareto(1.0, 1.0, 1.0),3,5)
    testing truncated(GeneralizedPareto(-1.5, 0.5, 2.0),3,5)
    testing truncated(Gumbel(),3,5)
    testing truncated(Gumbel(3.0),3,5)
    testing truncated(Gumbel(3.0, 5.0),3,5)
    testing truncated(Gumbel(5.0, 3.0),3,5)
    testing truncated(InverseGamma(),3,5)
    testing truncated(InverseGamma(2.0),3,5)
    testing truncated(InverseGamma(1.0, 1.0),3,5)
    testing truncated(InverseGamma(1.0, 2.0),3,5)
    testing truncated(InverseGamma(2.0, 1.0),3,5)
    testing truncated(InverseGamma(2.0, 3.0),3,5)
    testing truncated(InverseGaussian(),3,5)
    testing truncated(InverseGaussian(0.8),3,5)
    testing truncated(InverseGaussian(2.0),3,5)
    testing truncated(InverseGaussian(1.0, 1.0),3,5)
    testing truncated(InverseGaussian(2.0, 1.5),3,5)
    testing truncated(InverseGaussian(2.0, 7.0),3,5)
    testing truncated(Laplace(),3,5)
    testing truncated(Laplace(2.0),3,5)
    testing truncated(Laplace(0.0, 1.0),3,5)
    testing truncated(Laplace(5.0, 1.0),3,5)
    testing truncated(Laplace(5.0, 1.5),3,5)
    testing truncated(Levy(),3,5)
    testing truncated(Levy(2),3,5)
    testing truncated(Levy(2, 8),3,5)
    testing truncated(Levy(3.0, 3),3,5)
    testing truncated(Logistic(),3,5)
    testing truncated(Logistic(2.0),3,5)
    testing truncated(Logistic(0.0, 1.0),3,5)
    testing truncated(Logistic(5.0, 1.0),3,5)
    testing truncated(Logistic(2.0, 1.5),3,5)
    testing truncated(Logistic(5.0, 1.5),3,5)
    testing truncated(LogNormal(),3,5)
    testing truncated(LogNormal(1.0),3,5)
    testing truncated(LogNormal(0.0, 2.0),3,5)
    testing truncated(LogNormal(1.0, 2.0),3,5)
    testing truncated(LogNormal(3.0, 0.5),3,5)
    testing truncated(LogNormal(3.0, 1.0),3,5)
    testing truncated(LogNormal(3.0, 2.0),3,5)
    testing truncated(NoncentralChisq(2, 2),3,5)
    testing truncated(NoncentralChisq(2, 5),3,5)
    testing truncated(NoncentralF(2, 2, 2),3,5)
    testing truncated(NoncentralF(8, 10, 5),3,5)
    testing truncated(NoncentralT(2, 2),3,5)
    testing truncated(NoncentralT(10, 2),3,5)
    testing truncated(Normal(),3,5)
    testing truncated(Normal(2.0),3,5)
    testing truncated(Normal(-3.0, 2.0),3,5)
    testing truncated(Normal(1.0, 10.0),3,5)
    testing truncated(NormalCanon(),3,5)
    testing truncated(NormalCanon(0.0, 1.0),3,5)
    testing truncated(NormalCanon(-1.0, 2.5),3,5)
    testing truncated(NormalCanon(2.0, 0.8),3,5)
    testing truncated(Pareto(),3,5)
    testing truncated(Pareto(2.0),3,5)
    testing truncated(Pareto(2.0, 1.5),3,5)
    testing truncated(Pareto(3.0, 2.0),3,5)
    testing truncated(Rayleigh(),3,5)
    testing truncated(Rayleigh(3.0),3,5)
    testing truncated(Rayleigh(8.0),3,5)
    testing truncated(Rician(1.0, 1.0),3,5)
    testing truncated(Rician(5.0, 1.0),3,5)
    testing truncated(Rician(10.0, 1.0),3,5)
    testing truncated(StudentizedRange(2.0, 2.0),3,5)
    testing truncated(StudentizedRange(5.0, 10.0),3,5)
    testing truncated(StudentizedRange(10.0, 5.0),3,5)
    testing truncated(SymTriangularDist(3.0, 2.0),3,5)
    testing truncated(SymTriangularDist(10.0, 8.0),3,5)
    testing truncated(TDist(1.2),3,5)
    testing truncated(TDist(5.0),3,5)
    testing truncated(TDist(28.0),3,5)
    testing truncated(TriangularDist(0, 5),3,5)
    testing truncated(TriangularDist(-4, 14, 3),3,5)
    testing truncated(TriangularDist(2, 2000, 500),3,5)
    testing truncated(truncated(Normal(27, 3), 0, Inf),3,5)
    testing truncated(Uniform(3.0, 17.0),3,5)
    testing truncated(Weibull(),3,5)
    testing truncated(Weibull(0.5),3,5)
    testing truncated(Weibull(5.0),3,5)
    testing truncated(Weibull(20.0, 1.0),3,5)
    testing truncated(Weibull(1.0, 2.0),3,5)
    testing truncated(Weibull(5.0, 2.0),3,5)

┌ Warning: `@_inline_meta` is deprecated, use `@inline` instead.
│   caller = get_staged(mi::Core.MethodInstance) at utilities.jl:110
└ @ Core.Compiler ./compiler/utilities.jl:110
Test Summary: | Pass  Total   Time
Test truncate | 5318   5318  14.0s
Test Summary:    | Pass  Total  Time
Test truncnormal |  328    328  0.7s
Test Summary:              | Pass  Total  Time
Test truncated_exponential |    8      8  0.0s
Test Summary: | Pass  Total  Time
Test normal   |  197    197  1.0s
Test Summary: | Pass  Total  Time
Test laplace  |   24     24  0.1s
Test Summary: | Pass  Total  Time
Test cauchy   |   24     24  0.2s
Test Summary: | Pass  Total  Time
Test uniform  |   24     24  0.1s
Test Summary:  | Pass  Total  Time
Test lognormal |  227    227  0.6s
Test Summary: | Pass  Total   Time
Test mvnormal | 5976   5976  18.1s

signal (11): Segmentation fault: 11
in expression starting at /Users/mose/.julia/packages/Distributions/vR2pk/test/mvlognormal.jl:114
ntuple at ./ntuple.jl:0
unknown function (ip: 0x117f8f7df)
_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/mose/repo/julia/src/gf.c:2486
#mapslices#174 at ./abstractarray.jl:2799
mapslices##kw at ./abstractarray.jl:2755 [inlined]
_median at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Statistics/src/Statistics.jl:873 [inlined]
#median#47 at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Statistics/src/Statistics.jl:871 [inlined]
median##kw at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Statistics/src/Statistics.jl:871 [inlined]
test_mvlognormal at /Users/mose/.julia/packages/Distributions/vR2pk/test/mvlognormal.jl:53
test_mvlognormal at /Users/mose/.julia/packages/Distributions/vR2pk/test/mvlognormal.jl:12
unknown function (ip: 0x117f8d84f)
_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/mose/repo/julia/src/gf.c:2486
jl_apply at /Users/mose/repo/julia/src/./julia.h:1789 [inlined]
do_call at /Users/mose/repo/julia/src/interpreter.c:126
eval_body at /Users/mose/repo/julia/src/interpreter.c:0
eval_body at /Users/mose/repo/julia/src/interpreter.c:522
eval_body at /Users/mose/repo/julia/src/interpreter.c:522
jl_interpret_toplevel_thunk at /Users/mose/repo/julia/src/interpreter.c:744
jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:888
jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:832
ijl_toplevel_eval at /Users/mose/repo/julia/src/toplevel.c:897 [inlined]
ijl_toplevel_eval_in at /Users/mose/repo/julia/src/toplevel.c:947
eval at ./boot.jl:368 [inlined]
include_string at ./loading.jl:1251
_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/mose/repo/julia/src/gf.c:2486
_include at ./loading.jl:1308
include at ./client.jl:460 [inlined]
macro expansion at /Users/mose/.julia/packages/Distributions/vR2pk/test/runtests.jl:85 [inlined]
macro expansion at /Users/mose/repo/julia/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:1375 [inlined]
top-level scope at /Users/mose/.julia/packages/Distributions/vR2pk/test/runtests.jl:85
jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:879
jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:832
ijl_toplevel_eval at /Users/mose/repo/julia/src/toplevel.c:897 [inlined]
ijl_toplevel_eval_in at /Users/mose/repo/julia/src/toplevel.c:947
eval at ./boot.jl:368 [inlined]
include_string at ./loading.jl:1251
_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/mose/repo/julia/src/gf.c:2486
_include at ./loading.jl:1308
include at ./client.jl:460
unknown function (ip: 0x117bac067)
_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/mose/repo/julia/src/gf.c:2486
jl_apply at /Users/mose/repo/julia/src/./julia.h:1789 [inlined]
do_call at /Users/mose/repo/julia/src/interpreter.c:126
eval_body at /Users/mose/repo/julia/src/interpreter.c:0
jl_interpret_toplevel_thunk at /Users/mose/repo/julia/src/interpreter.c:744
jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:888
jl_toplevel_eval_flex at /Users/mose/repo/julia/src/toplevel.c:832
ijl_toplevel_eval at /Users/mose/repo/julia/src/toplevel.c:897 [inlined]
ijl_toplevel_eval_in at /Users/mose/repo/julia/src/toplevel.c:947
eval at ./boot.jl:368 [inlined]
exec_options at ./client.jl:280
_start at ./client.jl:506
jfptr__start_46562 at /Users/mose/repo/julia/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/mose/repo/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/mose/repo/julia/src/gf.c:2486
jl_apply at /Users/mose/repo/julia/src/./julia.h:1789 [inlined]
true_main at /Users/mose/repo/julia/src/jlapi.c:562
jl_repl_entrypoint at /Users/mose/repo/julia/src/jlapi.c:706
Allocations: 185990432 (Pool: 185919860; Big: 70572); GC: 216
ERROR: Package Distributions errored during testing (exit code: 139)

This is not reliably reproducible: it crashes most of the time, but every now and then the tests do pass.

Now that we print timings of testsets, I noticed a correlation between the mvnormal testset taking about 18 seconds and the segfault; when it takes about 5 seconds, the tests carry on. Edit: after running the tests more times, I'm not sure this correlation is real.
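
For reference, the `exit code: 139` at the end of the log is consistent with the `signal (11)` line, assuming the common convention that a process killed by a signal is reported as 128 plus the signal number. A minimal Python sketch of that arithmetic follows; the helper name `describe_exit_code` is hypothetical and not part of Julia or Pkg.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit code, assuming the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited normally with status {code}"


if __name__ == "__main__":
    # 139 is the exit code reported by the failing test run above.
    print(describe_exit_code(139))
```

This only decodes the number; it says nothing about why the segfault happens.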

@dnadlinger
Member

My WIP branch is here: https://github.com/dnadlinger/julia/commits/aarch64-darwin

This isn't usable for end users yet: debug info registration (backtraces, …) isn't working (just using DebuggerSupportPlugin didn't work out as smoothly as I had hoped; see the commit message), and it currently requires building against a recent LLVM from Git. I'm posting it for reference in case @staticfloat is planning to look into this while it's night time here across the pond.

@dnadlinger
Member

Now with eh frame and debug info registration fixed: dnadlinger@6feb722. See commit message for a few caveats re LLVM patches – the code is also rather janky and uncommented, but passes almost all of the main test suite again.

Note that this replaces the RTDyldMemoryManagerJL codegen memory manager with the regular LLVM JITLink in-process one on darwin-aarch64, so at least jl_jit_total_bytes is still broken. Somebody more familiar with the memory manager implementation would need to work out the best way to port the memory overhead optimisations over.

dnadlinger added a commit to dnadlinger/julia that referenced this issue Jan 5, 2022
…ll code model

This fixes JuliaLang#41440, JuliaLang#43285 and similar issues, which stem from
CodeModel::Large not being correctly implemented on MachO/ARM64.

Requires LLVM 13.x or Git main (tested: 1dd5e6fed5db with patches
from the JuliaLang/llvm-project julia-release/13.x branch, available
at https://github.com/dnadlinger/llvm-project/commits/julia-main).

Requires an LLVM patch to pass through __eh_frame unwind information,
without which backtraces silently won't work:
llvm/llvm-project#52921

```
diff --git a/llvm/lib/ExecutionEngine/JITLink/MachO_arm64.cpp b/llvm/lib/ExecutionEngine/JITLink/MachO_arm64.cpp
index f2a029d35cd5..4d958b302ff9 100644
--- a/llvm/lib/ExecutionEngine/JITLink/MachO_arm64.cpp
+++ b/llvm/lib/ExecutionEngine/JITLink/MachO_arm64.cpp
@@ -705,6 +705,10 @@ void link_MachO_arm64(std::unique_ptr<LinkGraph> G,
     Config.PrePrunePasses.push_back(
         CompactUnwindSplitter("__LD,__compact_unwind"));

+    Config.PrePrunePasses.push_back(EHFrameSplitter("__TEXT,__eh_frame"));
+    Config.PrePrunePasses.push_back(EHFrameEdgeFixer("__TEXT,__eh_frame",
+        8, Delta64, Delta32, NegDelta32));
+
     // Add an in-place GOT/Stubs pass.
     Config.PostPrunePasses.push_back(
         PerGraphGOTAndPLTStubsBuilder_MachO_arm64::asPass);
```
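
As background on why the code model matters for JIT-compiled code on AArch64: the `ADRP` instruction encodes a signed 21-bit count of 4 KiB pages, so a PC-relative reference built on it can only reach targets within roughly ±4 GiB of the instruction. The sketch below illustrates that range check in Python. It assumes (this is not stated in the commit message) that the replacement code model relies on such PC-relative addressing; the function names are hypothetical and the addresses are purely illustrative.

```python
PAGE_SHIFT = 12        # ADRP operates on 4 KiB pages
ADRP_IMM_BITS = 21     # signed immediate, counted in pages


def adrp_page_delta(pc: int, target: int) -> int:
    """Number of 4 KiB pages between the instruction's page and the target's page."""
    return (target >> PAGE_SHIFT) - (pc >> PAGE_SHIFT)


def adrp_can_reach(pc: int, target: int) -> bool:
    """True if the page delta fits in ADRP's signed 21-bit immediate."""
    delta = adrp_page_delta(pc, target)
    limit = 1 << (ADRP_IMM_BITS - 1)
    return -limit <= delta < limit


if __name__ == "__main__":
    pc = 0x117F8F7DF  # illustrative address only
    for offset in (1 << 30, 1 << 33):
        print(hex(offset), adrp_can_reach(pc, pc + offset))
```

Under this assumption, a JIT has to place code and the data it references close enough together for every such reference to pass this check.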
@dnadlinger
Copy link
Member

dnadlinger commented Jan 5, 2022

PR now at #43664.

dnadlinger added a commit to dnadlinger/julia that referenced this issue Jan 10, 2022
…ll code model

This fixes JuliaLang#41440, JuliaLang#43285 and similar issues, which stem from
CodeModel::Large not being correctly implemented on MachO/ARM64.

Requires LLVM 13.x or Git main (tested: 1dd5e6fed5db with patches
from the JuliaLang/llvm-project julia-release/13.x branch, available
at https://github.com/dnadlinger/llvm-project/commits/julia-main).

Requires an LLVM patch to pass through __eh_frame unwind information,
without which backtraces silently won't work (already applied on
JuliaLang/llvm-project@julia-release/13.x):
llvm/llvm-project#52921
MilesCranmer pushed a commit to MilesCranmer/julia that referenced this issue Jan 14, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Mar 8, 2022