Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

miscompiled TLS wrapper, likely coroutine related, with sanitizer #91312

Closed
avikivity opened this issue May 7, 2024 · 28 comments · Fixed by #91483
Closed

miscompiled TLS wrapper, likely coroutine related, with sanitizer #91312

avikivity opened this issue May 7, 2024 · 28 comments · Fixed by #91483
Assignees
Labels
ipo Interprocedural optimizations miscompilation release:backport

Comments

@avikivity
Copy link
Contributor

avikivity commented May 7, 2024

Prior to 533b7c1, compiling some coroutine that references a TLS variable generates this wrapper:

0000000000029d80 <_ZTWN2db13schema_tablesL14the_merge_lockE>:
   29d80: 50                           	pushq	%rax
   29d81: e8 4a b3 10 00               	callq	0x1350d0 <__tls_init>
   29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000029d92:  R_X86_64_TPOFF32	_ZN2db13schema_tablesL14the_merge_lockE
   29d96: 59                           	popq	%rcx
   29d97: c3                           	retq
   29d98: 0f 1f 84 00 00 00 00 00      	nopl	(%rax,%rax)

It correctly calls __tls_init.

With 533b7c1 and later, up to 18.1.1, the following TLS wrapper is generated:

0000000000029d80 <_ZTWN2db13schema_tablesL14the_merge_lockE>:
   29d80: 50                           	pushq	%rax
   29d81: e8 00 00 00 00               	callq	0x29d86 <_ZTWN2db13schema_tablesL14the_merge_lockE+0x6>
		0000000000029d82:  R_X86_64_PLT32	_ZTH15data_type_for_vIN7seastar13basic_sstringIcjLj15ELb1EEEE-0x4
   29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000029d92:  R_X86_64_TPOFF32	_ZN2db13schema_tablesL14the_merge_lockE
   29d96: 59                           	popq	%rcx
   29d97: c3                           	retq
   29d98: 0f 1f 84 00 00 00 00 00      	nopl	(%rax,%rax)

The call to __tls_init was replaced by a call to some random function. When the coroutine is then called, it does not initialize the object.

I will follow up with a full reproducer.

@avikivity
Copy link
Contributor Author

Copying @dianqk, the author of the patch.

@avikivity avikivity changed the title miscompile TLS wrapper, likely coroutine related, with sanitizer miscompiled TLS wrapper, likely coroutine related, with sanitizer May 7, 2024
@tchaikov
Copy link
Contributor

tchaikov commented May 7, 2024

another data point is that if the source is compiled with -O0, the issue goes away. i guess the reason is that -O0 disables the globalopt pass.

@avikivity
Copy link
Contributor Author

Reproducer:

wget https://scratch.scylladb.com/schema_tables.ii


clang++ -MD -MT build/debug/db/schema_tables.o -MF build/debug/db/schema_tables.o.d -std=c++20 -I/home/avi/scylla/seastar/include -I/home/avi/scylla/build/debug/seastar/gen/include -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -DSEASTAR_API_LEVEL=7 -DSEASTAR_BUILD_SHARED_LIBS -DSEASTAR_SSTRING -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_DEBUG -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEBUG_PROMISE -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_TYPE_ERASE_MORE -DFMT_SHARED -I/usr/include/p11-kit-1 -ffile-prefix-map=/home/avi/scylla=. -march=westmere -DDEBUG -DSANITIZE -DDEBUG_LSA_SANITIZER -DSCYLLA_ENABLE_ERROR_INJECTION -Og -DSCYLLA_BUILD_MODE=debug -g -gz -iquote. -iquote build/debug/gen -std=gnu++20  -ffile-prefix-map=/home/avi/scylla=. -march=westmere -DBOOST_ALL_DYN_LINK    -fvisibility=hidden -isystem abseil -Wall -Werror -Wextra -Wimplicit-fallthrough -Wno-mismatched-tags -Wno-c++11-narrowing -Wno-overloaded-virtual -Wno-unused-parameter -Wno-unsupported-friend -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-psabi -Wno-enum-constexpr-conversion -Wno-error=deprecated-declarations -DXXH_PRIVATE_API -DSEASTAR_TESTING_MAIN  -S  schema_tables.ii -o schema_tables.s

grep -A10 '_ZTWN2db13schema_tablesL14the_merge_lockE.*:' schema_tables.s

@dianqk
Copy link
Member

dianqk commented May 7, 2024

Thanks. I can reproduce it. The bisected outcome might not be the issue itself, but I'll investigate this first.

@avikivity
Copy link
Contributor Author

Simplified command line:

clang++    -fsanitize=address       -Og    -std=gnu++20       -Wno-c++11-narrowing -Wno-unsupported-friend  -S  schema_tables.ii -o schema_tables.s && grep -A10 '_ZTWN2db13schema_tablesL14the_merge_lockE.*:' schema_tables.s

I verified that -fsanitize=address is required.

@avikivity
Copy link
Contributor Author

Yes it's probably coroutines not emitting llvm.used. I remember bugs in this area.

@EugeneZelenko EugeneZelenko added coroutines C++20 coroutines and removed new issue labels May 7, 2024
@llvmbot
Copy link
Member

llvmbot commented May 7, 2024

@llvm/issue-subscribers-coroutines

Author: Avi Kivity (avikivity)

Prior to 533b7c1, compiling some coroutine that references a TLS variable generates this wrapper:
0000000000029d80 &lt;_ZTWN2db13schema_tablesL14the_merge_lockE&gt;:
   29d80: 50                           	pushq	%rax
   29d81: e8 4a b3 10 00               	callq	0x1350d0 &lt;__tls_init&gt;
   29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000029d92:  R_X86_64_TPOFF32	_ZN2db13schema_tablesL14the_merge_lockE
   29d96: 59                           	popq	%rcx
   29d97: c3                           	retq
   29d98: 0f 1f 84 00 00 00 00 00      	nopl	(%rax,%rax)

It correctly calls __tls_init.

With 533b7c1 and later, up to 18.1.1, the following TLS wrapper is generated:

0000000000029d80 &lt;_ZTWN2db13schema_tablesL14the_merge_lockE&gt;:
   29d80: 50                           	pushq	%rax
   29d81: e8 00 00 00 00               	callq	0x29d86 &lt;_ZTWN2db13schema_tablesL14the_merge_lockE+0x6&gt;
		0000000000029d82:  R_X86_64_PLT32	_ZTH15data_type_for_vIN7seastar13basic_sstringIcjLj15ELb1EEEE-0x4
   29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000029d92:  R_X86_64_TPOFF32	_ZN2db13schema_tablesL14the_merge_lockE
   29d96: 59                           	popq	%rcx
   29d97: c3                           	retq
   29d98: 0f 1f 84 00 00 00 00 00      	nopl	(%rax,%rax)

The call to __tls_init was replaced by a call to some random function. When the coroutine is then called, it does not initialize the object.

I will follow up with a full reproducer.

@dianqk
Copy link
Member

dianqk commented May 7, 2024

Simplified command line:

clang++    -fsanitize=address       -Og    -std=gnu++20       -Wno-c++11-narrowing -Wno-unsupported-friend  -S  schema_tables.ii -o schema_tables.s && grep -A10 '_ZTWN2db13schema_tablesL14the_merge_lockE.*:' schema_tables.s

I verified that -fsanitize=address is required.

It should also need -fvisibility=hidden.

@dianqk
Copy link
Member

dianqk commented May 7, 2024

I haven't found any obvious issues, unless I missed something.

_ZTH15data_type_for_vIN7seastar13basic_sstringIcjLj15ELb1EEEE is an alias for __tls_init:

@_ZTHN2db13schema_tablesL14the_merge_lockE = internal alias void (), ptr @__tls_init
@_ZTH15data_type_for_vIN7seastar13basic_sstringIcjLj15ELb1EEEE = linkonce_odr alias void (), ptr @__tls_init
@_ZTH15data_type_for_vIiE = linkonce_odr alias void (), ptr @__tls_init

Perhaps we need a runtime issue reproduction?

@avikivity
Copy link
Contributor Author

Aha. The runtime reproducer is quite heavy. If you want, I can provide a pre-built docker image and instructions to expose the problem with gdb.

@avikivity
Copy link
Contributor Author

My guess is that it's related. The compiler lost some information about TLS, so it started aliasing __tls_init to some other function. I'm single-stepping through it now, and I'm seeing that it's not initializing the variable.

@avikivity
Copy link
Contributor Author

I think what happens is that the guard variable is set when __tls_init (or its alias) is called, so the initializers aren't run.

I set a watchpoint on the variable:

(gdb) bt
#0  0x0000000003fa2dc7 in __tls_init ()
#1  0x0000000003fc2ff6 in TLS wrapper function for data_type_for_v<utils::UUID> ()
#2  0x0000000003fc16c4 in data_type_for<utils::UUID> () at ./types/types.hh:728
#3  cql3::untyped_result_set_row::get_as<utils::UUID> (this=<optimized out>, name=...) at ./cql3/untyped_result_set.hh:71
#4  0x00000000062f5a13 in db::system_keyspace::load_local_info (this=<optimized out>) at ./db/system_keyspace.cc:1431
#5  0x000000000634e487 in std::__n4861::coroutine_handle<seastar::internal::coroutine_traits_base<db::system_keyspace::local_info>::promise_type>::resume (this=<optimized out>)
    at /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/coroutine:242
#6  seastar::internal::coroutine_traits_base<db::system_keyspace::local_info>::promise_type::run_and_dispose (this=<optimized out>) at ./seastar/include/seastar/core/coroutine.hh:83
#7  0x00007ffff616cd9f in seastar::reactor::run_tasks (this=0x52400001c100, tq=...) at ./build/debug/seastar/./seastar/src/core/reactor.cc:2690
#8  0x00007ffff6178c76 in seastar::reactor::run_some_tasks (this=0x52400001c100) at ./build/debug/seastar/./seastar/src/core/reactor.cc:3152
#9  0x00007ffff617e0ba in seastar::reactor::do_run (this=0x52400001c100) at ./build/debug/seastar/./seastar/src/core/reactor.cc:3320
#10 0x00007ffff617c14a in seastar::reactor::run (this=0x52400001c100) at ./build/debug/seastar/./seastar/src/core/reactor.cc:3210
#11 0x00007ffff5d0e3e1 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) (this=0x7fffecc110f0, ac=7, av=0x7fffffffd618, func=...) at ./build/debug/seastar/./seastar/src/core/app-template.cc:276
#12 0x00007ffff5d0bd6b in seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) (this=0x7fffecc110f0, ac=7, av=0x7fffffffd618, func=...) at ./build/debug/seastar/./seastar/src/core/app-template.cc:167
#13 0x0000000002e8f2ba in scylla_main (ac=<optimized out>, av=<optimized out>) at ./main.cc:677
#14 0x0000000002e8d348 in std::function<int (int, char**)>::operator()(int, char**) const (this=0x7fffeca0a510, __args=<optimized out>, __args=<optimized out>)
    at /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:591
#15 main (ac=7, av=<optimized out>) at ./main.cc:2162

So some unrelated __tls_init set it.

@avikivity
Copy link
Contributor Author

Confirmed. So a local __tls_init is generated, but then another gets called, perhaps due to the aliasing.

@avikivity
Copy link
Contributor Author

I verified that a binary built before the patch works, and after the patch fails. So the bisection is still accurate.

@avikivity
Copy link
Contributor Author

diff of the generated assembly:

$ diff -u schema_tables.s.full.{good,bad}
--- schema_tables.s.full.good	2024-05-07 20:02:55.038613957 +0300
+++ schema_tables.s.full.bad	2024-05-07 20:50:43.162254156 +0300
@@ -396,7 +396,6 @@
 0000000000000f40 l     O .rodata	0000000000000060 __PRETTY_FUNCTION__._ZN2db13schema_tables15hold_merge_lockEv
 0000000000000000 l    d  .text._ZN7seastar9get_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEENS_6futureINS_15semaphore_unitsIT_T0_EEEERNS_15basic_semaphoreIS7_S8_EEm	0000000000000000 .text._ZN7seastar9get_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEENS_6futureINS_15semaphore_unitsIT_T0_EEEERNS_15basic_semaphoreIS7_S8_EEm
 0000000000003c5a l     O .rodata.str1.1	0000000000000039 .L___asan_gen_.4441
-00000000001350d0 l     F .text	000000000000011c __tls_init
 0000000000000138 l       .tbss	0000000000000050 db::schema_tables::the_merge_lock
 0000000000003c93 l     O .rodata.str1.1	000000000000006b .L___asan_gen_.4442
 000000000018d900 l     F .text	0000000000000778 db::schema_tables::merge_schema(seastar::sharded<db::system_keyspace>&, seastar::sharded<service::storage_proxy>&, gms::feature_service&, std::vector<mutation, std::allocator<mutation>>, bool) (.resume)
@@ -7220,6 +7219,7 @@
 0000000000000000  w    F .text._ZN7seastar20noncopyable_functionIFNS_6futureINS_15semaphore_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEEEEvEED2Ev	00000000000000ce .hidden seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>::~noncopyable_function()
 0000000000000000  w    F .text._ZN7seastar20noncopyable_functionIFNS_6futureINS_15semaphore_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEEEEvEEC2EOSA_	0000000000000112 .hidden seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>::noncopyable_function(seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>&&)
 0000000000000000  w    F .text._ZN7seastar6futureIvE14then_impl_nrvoINS_20noncopyable_functionIFNS0_INS_15semaphore_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEEEEvEEESA_EET0_OT_	00000000000001cd .hidden seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> seastar::future<void>::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>, seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>>>(seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>&&)
+00000000001350d0  w    F .text	000000000000011c .hidden thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>
 0000000000029da0 g     F .text	00000000000019a7 .hidden db::schema_tables::merge_schema(seastar::sharded<db::system_keyspace>&, seastar::sharded<service::storage_proxy>&, gms::feature_service&, std::vector<mutation, std::allocator<mutation>>, bool)
 0000000000000000         *UND*	0000000000000000 freeze(std::vector<mutation, std::allocator<mutation>> const&)
 0000000000000000  w    F .text._ZN7seastar8internal13futurize_baseIvE21make_exception_futureINSt15__exception_ptr13exception_ptrEEENS_6futureIvEEOT_	00000000000000ca .hidden seastar::future<void> seastar::internal::futurize_base<void>::make_exception_future<std::__exception_ptr::exception_ptr>(std::__exception_ptr::exception_ptr&&)
@@ -57984,7 +57984,8 @@
 
 0000000000029d80 <thread-local wrapper routine for db::schema_tables::the_merge_lock>:
    29d80: 50                           	pushq	%rax
-   29d81: e8 4a b3 10 00               	callq	0x1350d0 <__tls_init>
+   29d81: e8 00 00 00 00               	callq	0x29d86 <thread-local wrapper routine for db::schema_tables::the_merge_lock+0x6>
+		0000000000029d82:  R_X86_64_PLT32	thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>-0x4
    29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
    29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
 		0000000000029d92:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
@@ -336033,11 +336034,11 @@
   1350c5: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
   1350cf: 90                           	nop
 
-00000000001350d0 <__tls_init>:
+00000000001350d0 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>>:
   1350d0: 53                           	pushq	%rbx
   1350d1: 64 80 3c 25 00 00 00 00 00   	cmpb	$0x0, %fs:0x0
 		00000000001350d5:  R_X86_64_TPOFF32	__tls_guard
-  1350da: 74 02                        	je	0x1350de <__tls_init+0xe>
+  1350da: 74 02                        	je	0x1350de <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xe>
   1350dc: 5b                           	popq	%rbx
   1350dd: c3                           	retq
   1350de: 64 c6 04 25 00 00 00 00 01   	movb	$0x1, %fs:0x0
@@ -336047,7 +336048,7 @@
 		00000000001350f3:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
   1350f7: 48 c1 e8 03                  	shrq	$0x3, %rax
   1350fb: 80 b8 00 80 ff 7f 00         	cmpb	$0x0, 0x7fff8000(%rax)
-  135102: 0f 85 a5 00 00 00            	jne	0x1351ad <__tls_init+0xdd>
+  135102: 0f 85 a5 00 00 00            	jne	0x1351ad <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xdd>
   135108: 64 48 c7 04 25 00 00 00 00 01 00 00 00       	movq	$0x1, %fs:0x0
 		000000000013510d:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
   135115: 64 48 8b 1c 25 00 00 00 00   	movq	%fs:0x0, %rbx
@@ -336055,13 +336056,13 @@
 		0000000000135121:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x8
   135125: ba 38 00 00 00               	movl	$0x38, %edx
   13512a: 31 f6                        	xorl	%esi, %esi
-  13512c: e8 00 00 00 00               	callq	0x135131 <__tls_init+0x61>
+  13512c: e8 00 00 00 00               	callq	0x135131 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x61>
 		000000000013512d:  R_X86_64_PLT32	__asan_memset-0x4
   135131: 48 8d 83 00 00 00 00         	leaq	(%rbx), %rax
 		0000000000135134:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x40
   135138: 48 c1 e8 03                  	shrq	$0x3, %rax
   13513c: 80 b8 00 80 ff 7f 00         	cmpb	$0x0, 0x7fff8000(%rax)
-  135143: 75 7d                        	jne	0x1351c2 <__tls_init+0xf2>
+  135143: 75 7d                        	jne	0x1351c2 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xf2>
   135145: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   13514e: 48 8d 88 00 00 00 00         	leaq	(%rax), %rcx
 		0000000000135151:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
@@ -336071,37 +336072,37 @@
 		0000000000135161:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x48
   135165: 48 c1 e8 03                  	shrq	$0x3, %rax
   135169: 80 b8 00 80 ff 7f 00         	cmpb	$0x0, 0x7fff8000(%rax)
-  135170: 75 65                        	jne	0x1351d7 <__tls_init+0x107>
+  135170: 75 65                        	jne	0x1351d7 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x107>
   135172: 64 48 c7 04 25 00 00 00 00 00 00 00 00       	movq	$0x0, %fs:0x0
 		0000000000135177:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x48
   13517f: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   135188: 48 8d b0 00 00 00 00         	leaq	(%rax), %rsi
 		000000000013518b:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
-  13518f: 48 8d 3d 00 00 00 00         	leaq	(%rip), %rdi            # 0x135196 <__tls_init+0xc6>
+  13518f: 48 8d 3d 00 00 00 00         	leaq	(%rip), %rdi            # 0x135196 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xc6>
 		0000000000135192:  R_X86_64_PC32	seastar::basic_semaphore<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>::~basic_semaphore()-0x4
-  135196: 48 8d 15 00 00 00 00         	leaq	(%rip), %rdx            # 0x13519d <__tls_init+0xcd>
+  135196: 48 8d 15 00 00 00 00         	leaq	(%rip), %rdx            # 0x13519d <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xcd>
 		0000000000135199:  R_X86_64_PC32	__dso_handle-0x4
-  13519d: e8 00 00 00 00               	callq	0x1351a2 <__tls_init+0xd2>
+  13519d: e8 00 00 00 00               	callq	0x1351a2 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xd2>
 		000000000013519e:  R_X86_64_PLT32	__cxa_thread_atexit-0x4
-  1351a2: e8 00 00 00 00               	callq	0x1351a7 <__tls_init+0xd7>
+  1351a2: e8 00 00 00 00               	callq	0x1351a7 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xd7>
 		00000000001351a3:  R_X86_64_PLT32	.text.startup+0x28c
   1351a7: 5b                           	popq	%rbx
-  1351a8: e9 00 00 00 00               	jmp	0x1351ad <__tls_init+0xdd>
+  1351a8: e9 00 00 00 00               	jmp	0x1351ad <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xdd>
 		00000000001351a9:  R_X86_64_PLT32	.text.startup+0x2fc
   1351ad: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   1351b6: 48 8d b8 00 00 00 00         	leaq	(%rax), %rdi
 		00000000001351b9:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
-  1351bd: e8 00 00 00 00               	callq	0x1351c2 <__tls_init+0xf2>
+  1351bd: e8 00 00 00 00               	callq	0x1351c2 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xf2>
 		00000000001351be:  R_X86_64_PLT32	__asan_report_store8-0x4
   1351c2: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   1351cb: 48 8d b8 00 00 00 00         	leaq	(%rax), %rdi
 		00000000001351ce:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x40
-  1351d2: e8 00 00 00 00               	callq	0x1351d7 <__tls_init+0x107>
+  1351d2: e8 00 00 00 00               	callq	0x1351d7 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x107>
 		00000000001351d3:  R_X86_64_PLT32	__asan_report_store8-0x4
   1351d7: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   1351e0: 48 8d b8 00 00 00 00         	leaq	(%rax), %rdi
 		00000000001351e3:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x48
-  1351e7: e8 00 00 00 00               	callq	0x1351ec <__tls_init+0x11c>
+  1351e7: e8 00 00 00 00               	callq	0x1351ec <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x11c>
 		00000000001351e8:  R_X86_64_PLT32	__asan_report_store8-0x4
   1351ec: 0f 1f 40 00                  	nopl	(%rax)
 
@@ -763956,7 +763957,7 @@
 0000000000000000 <thread-local wrapper routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>>:
        0: 50                           	pushq	%rax
        1: e8 00 00 00 00               	callq	0x6 <thread-local wrapper routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x6>
-		0000000000000002:  R_X86_64_PLT32	.text+0x1350cc
+		0000000000000002:  R_X86_64_PLT32	thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>-0x4
        6: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
 		000000000000000b:  R_X86_64_TPOFF32	data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>
        f: 59                           	popq	%rcx
@@ -763967,7 +763968,7 @@
 0000000000000000 <thread-local wrapper routine for data_type_for_v<int>>:
        0: 50                           	pushq	%rax
        1: e8 00 00 00 00               	callq	0x6 <thread-local wrapper routine for data_type_for_v<int>+0x6>
-		0000000000000002:  R_X86_64_PLT32	.text+0x1350cc
+		0000000000000002:  R_X86_64_PLT32	thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>-0x4
        6: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
 		000000000000000b:  R_X86_64_TPOFF32	data_type_for_v<int>
        f: 59                           	popq	%rcx

@avikivity
Copy link
Contributor Author

My explanation: the patch changed a local symbol __tls_init to a global weak symbol thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>. Another translation unit also emitted the same symbol (but with different contents), and it got linked into the executable in preference to ours.

@avikivity
Copy link
Contributor Author

Is there a way to disable the pass as a workaround? I tried -mllvm -disablepass=globalopt, but it didn't work.

@dianqk
Copy link
Member

dianqk commented May 8, 2024

Is there a way to disable the pass as a workaround? I tried -mllvm -disablepass=globalopt, but it didn't work.

AFAIK, there is no way to disable arbitrary passes.

@dianqk
Copy link
Member

dianqk commented May 8, 2024

My explanation: the patch changed a local symbol __tls_init to a global weak symbol thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>. Another translation unit also emitted the same symbol (but with different contents), and it got linked into the executable in preference to ours.

Thank you for your analysis. I will continue to look into this issue tonight.

@avikivity
Copy link
Contributor Author

My explanation: the patch changed a local symbol __tls_init to a global weak symbol thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>. Another translation unit also emitted the same symbol (but with different contents), and it got linked into the executable in preference to ours.

Thank you for your analysis. I will continue to look into this issue tonight.

Thanks!

@dianqk
Copy link
Member

dianqk commented May 8, 2024

My explanation: the patch changed a local symbol __tls_init to a global weak symbol thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>. Another translation unit also emitted the same symbol (but with different contents), and it got linked into the executable in preference to ours.

Agreed, I believe this is key to the issue. This is a minimal reproducer: https://llvm.godbolt.org/z/n5berxjfo.

@f2 = linkonce_odr hidden alias void (), ptr @f1

define void @foo() {
  call void @f1()
  ret void
}

define void @bar() {
  call void @f2()
  ret void
}

define internal void @f1() {
  ret void
}

IIUC, both LLVM 16 and LLVM 17 miscompiled. I expect that the call to f1 in foo can be overridden at link time, while the f2 in bar cannot.

I've created a draft PR #91483, but I still need some time to confirm this PR. Feel free giving it a try to see if everything is working as expected. :)

@avikivity
Copy link
Contributor Author

The original reproducer now generates the call to __tls_init. I'll test the executable later.

@avikivity
Copy link
Contributor Author

The executable also works.

@dianqk dianqk self-assigned this May 9, 2024
@dianqk dianqk added miscompilation and removed coroutines C++20 coroutines labels May 9, 2024
@dianqk dianqk added this to the LLVM 18.X Release milestone May 16, 2024
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status May 16, 2024
@tstellar tstellar moved this from Needs Triage to Needs Fix in LLVM Release Status May 16, 2024
dianqk added a commit that referenced this issue May 16, 2024
…91483)

Fixes #91312.

Don't perform the transform if the alias may be replaced at link time.
@github-project-automation github-project-automation bot moved this from Needs Fix to Done in LLVM Release Status May 16, 2024
@EugeneZelenko EugeneZelenko added ipo Interprocedural optimizations and removed llvm:optimizations labels May 16, 2024
@dianqk
Copy link
Member

dianqk commented May 16, 2024

/cherry-pick c796900

llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue May 16, 2024
…lvm#91483)

Fixes llvm#91312.

Don't perform the transform if the alias may be replaced at link time.

(cherry picked from commit c796900)
@llvmbot
Copy link
Member

llvmbot commented May 16, 2024

/pull-request #92468

@MaskRay
Copy link
Member

MaskRay commented May 17, 2024

Simplified command line:

clang++    -fsanitize=address       -Og    -std=gnu++20       -Wno-c++11-narrowing -Wno-unsupported-friend  -S  schema_tables.ii -o schema_tables.s && grep -A10 '_ZTWN2db13schema_tablesL14the_merge_lockE.*:' schema_tables.s

I verified that -fsanitize=address is required.

#92468 should be safe, but I have a question about suppressing the optimization for linkonce_odr/weak_odr, so I want to compile this example myself. However, trunk clang currently crashes

 #9 0x000055bcabb61095 clang::NestedNameSpecifier::print(llvm::raw_ostream&, clang::PrintingPolicy const&, bool) const (/tmp/Rel/bin/clang+++0xab55095)
#10 0x000055bcabb60d40 clang::NestedNameSpecifier::print(llvm::raw_ostream&, clang::PrintingPolicy const&, bool) const (/tmp/Rel/bin/clang+++0xab54d40)
#11 0x000055bcab381dd8 (anonymous namespace)::FailedBooleanConditionPrinterHelper::handledStmt(clang::Stmt*, llvm::raw_ostream&) SemaTemplate.cpp:0:0
#12 0x000055bcabc0be88 (anonymous namespace)::StmtPrinter::PrintExpr(clang::Expr*) StmtPrinter.cpp:0:0
#13 0x000055bcabbf85ef (anonymous namespace)::StmtPrinter::VisitUnaryOperator(clang::UnaryOperator*) StmtPrinter.cpp:0:0
#14 0x000055bcabbf7172 clang::Stmt::printPretty(llvm::raw_ostream&, clang::PrinterHelper*, clang::PrintingPolicy const&, unsigned int, llvm::StringRef, clang::ASTContext const*) const (/tmp/Rel/bin/clang+++0xabeb1
72)
#15 0x000055bcab2e558f clang::Sema::findFailedBooleanCondition[abi:cxx11](clang::Expr*) (/tmp/Rel/bin/clang+++0xa2d958f)
#16 0x000055bcab2e5d60 clang::Sema::CheckTemplateIdType(clang::TemplateName, clang::SourceLocation, clang::TemplateArgumentListInfo&) (/tmp/Rel/bin/clang+++0xa2d9d60)
#17 0x000055bcab4817ab clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformTemplateSpecializationType(clang::TypeLocBuilder&, clang::TemplateSpecializationTypeLoc, clang::TemplateName) Sema
TemplateInstantiate.cpp:0:0
#18 0x000055bcab489ab0 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformTemplateSpecializationType(clang::TypeLocBuilder&, clang::TemplateSpecializationTypeLoc) SemaTemplateInstantiate.c
pp:0:0
#19 0x000055bcab45a5ac clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeLocBuilder&, clang::TypeLoc) SemaTemplateInstantiate.cpp:0:0
#20 0x000055bcab486503 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformElaboratedType(clang::TypeLocBuilder&, clang::ElaboratedTypeLoc) SemaTemplateInstantiate.cpp:0:0
#21 0x000055bcab45a726 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeLocBuilder&, clang::TypeLoc) SemaTemplateInstantiate.cpp:0:0
#22 0x000055bcab45a136 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeSourceInfo*) SemaTemplateInstantiate.cpp:0:0
#23 0x000055bcab459d89 clang::Sema::SubstType(clang::TypeSourceInfo*, clang::MultiLevelTemplateArgumentList const&, clang::SourceLocation, clang::DeclarationName, bool) (/tmp/Rel/bin/clang+++0xa44dd89)
#24 0x000055bcab2f1e5e SubstDefaultTemplateArgument(clang::Sema&, clang::TemplateDecl*, clang::SourceLocation, clang::SourceLocation, clang::TemplateTypeParmDecl*, llvm::ArrayRef<clang::TemplateArgument>, llvm::Ar
rayRef<clang::TemplateArgument>) SemaTemplate.cpp:0:0
#25 0x000055bcab2f182f clang::Sema::SubstDefaultTemplateArgumentIfAvailable(clang::TemplateDecl*, clang::SourceLocation, clang::SourceLocation, clang::Decl*, llvm::ArrayRef<clang::TemplateArgument>, llvm::ArrayRef
<clang::TemplateArgument>, bool&) (/tmp/Rel/bin/clang+++0xa2e582f)

@MaskRay MaskRay reopened this May 17, 2024
@github-project-automation github-project-automation bot moved this from Done to Needs Triage in LLVM Release Status May 17, 2024
tstellar pushed a commit to llvmbot/llvm-project that referenced this issue May 17, 2024
…lvm#91483)

Fixes llvm#91312.

Don't perform the transform if the alias may be replaced at link time.

(cherry picked from commit c796900)
@avikivity
Copy link
Contributor Author

Thanks a lot for the fix and backport.

@dianqk
Copy link
Member

dianqk commented May 20, 2024

I have created an issue for the assertion at #92757. Per #92619, we should be able to close this PR again.

@dianqk dianqk closed this as completed May 20, 2024
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in LLVM Release Status May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ipo Interprocedural optimizations miscompilation release:backport
Projects
Development

Successfully merging a pull request may close this issue.

7 participants