Poor optimization of thread local globals on OSX

|  |  |
| --- | --- |
| Bugzilla Link | [41722](https://llvm.org/bz41722) |
| Version | 8.0 |
| OS | MacOS X |
| Reporter | LLVM Bugzilla Contributor |
| CC | @TNorthover |

## Extended Description 
On OSX, multiple calls to tlv_get_addr are often generated for a single use of a thread local variable. I found this by looking at the assembly generated by rustc, and it is discussed in more detail here:

https://github.com/rust-lang/rust/pull/60341#issuecomment-487982828

I know very little about LLVM, so hopefully this all makes sense. The linked IR [1] demonstrates the issue. The optimizer often emits IR that references thread_local globals multiple times even though the unoptimized IR references them only once. The extra references are often associated with getelementptr.

In the final assembly, the following tlv_get_addr dance is performed twice:

    movq    _foo@TLVP(%rip), %rdi
    callq   *(%rdi)

For larger structures with multiple members, the problem gets worse, resulting in many redundant calls to tlv_get_addr. In contrast, when targeting Linux, __tls_get_addr@PLT is invoked only once.
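
To make the "structure with multiple members" case concrete, here is a minimal Rust sketch. The `Counters` type, its fields, and the `record` function are hypothetical names made up for illustration; they do not come from the linked IR or the rustc PR. I have not confirmed that this exact code reproduces the redundant lookups, since that depends on the compiler version and target.

```rust
use std::cell::Cell;

// Hypothetical thread-local structure with more than one member.
struct Counters {
    hits: Cell<u64>,
    misses: Cell<u64>,
}

thread_local! {
    // One thread-local global; each thread gets its own copy.
    static COUNTERS: Counters = Counters {
        hits: Cell::new(0),
        misses: Cell::new(0),
    };
}

// Touches both members inside a single `with` call, so the source asks for
// the thread-local's address only once. How many address lookups end up in
// the generated code is up to the optimizer and the target.
pub fn record(hit: bool) -> (u64, u64) {
    COUNTERS.with(|c| {
        if hit {
            c.hits.set(c.hits.get() + 1);
        } else {
            c.misses.set(c.misses.get() + 1);
        }
        (c.hits.get(), c.misses.get())
    })
}
```

One way to compare targets would be to build this with `rustc -O --crate-type lib --emit asm` for a macOS target and a Linux target, then count the thread-local address lookups in `record`.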

Maybe there's a good reason the address isn't cached on OSX, but I'm hoping there isn't :).

[1]: https://gist.github.com/alexcrichton/a9a90412152d04caa7011042aa89b6bf

Poor optimization of thread local globals on OSX #41067
