-
Hi. I'm containerizing a workerd-based application and am generally struggling with llvm-symbolizer. workerd is nice to containerize in general because it's a mostly static binary that you can dump into a distroless base image and end up with a really simple/stupid/cheap-to-build application image.

But this breaks down with the requirement for llvm-symbolizer. Excepting one semi-abandoned, herculean "man versus society" effort, every distribution of LLVM I'm aware of is a constellation of dynamically linked executables that collectively weigh hundreds of megabytes. That erodes the premise of a simple/easy/cheap containerized model for workerd applications if you want symbolized stack traces. It seems you really need to reach for a particular Linux distribution's package manager to easily materialize a functional LLVM toolchain.

I speculate that Cloudflare doesn't have these kinds of concerns because it has a more tailored packaging/deployment/sandboxing model for workerd. But for the rest of us, containers are often the standard deployment medium, even for things like workerd that hardly need containerization to start with.

I have the great privilege of being almost completely clueless about native tooling, so the title question is pretty literal. Why does workerd need llvm-symbolizer when e.g. nodejs does not? Is workerd making a different tradeoff than nodejs in this regard, or is it some kind of architectural constraint? And is there potential for workerd to statically support what it gets dynamically from llvm-symbolizer today? Thanks!

Edit: now that I think about it, maybe it's the case that nodejs simply never exposes native stack traces...
Replies: 1 comment 1 reply
-
You don't need to symbolize in production. If llvm-symbolizer is not available, then the stack trace will be dumped as a list of instruction addresses (relative to the binary base). If you log those addresses, you can feed them into the symbolizer offline, as long as you have the same binary available. In fact, the version of the binary in production can be stripped of debug symbols, which makes it a lot smaller. You only need the full version of the binary when you're doing the offline symbolization.
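To make the offline workflow concrete, here's a minimal sketch of that step. It assumes (these details are not from the answer above) that the logged frames appear as one hex, module-relative address per line, that the unstripped build is kept at a hypothetical path `./workerd-with-symbols`, and that llvm-symbolizer is on PATH on the machine doing the symbolization:

```python
#!/usr/bin/env python3
"""Offline symbolization sketch: pipe logged frame addresses into
llvm-symbolizer against the unstripped binary, on a non-production machine."""
import subprocess
import sys

# Hypothetical path to the unstripped build matching the production binary.
BINARY = "./workerd-with-symbols"

def symbolize(addresses):
    """Run llvm-symbolizer once, passing module-relative addresses on stdin."""
    proc = subprocess.run(
        ["llvm-symbolizer", f"--obj={BINARY}"],
        input="\n".join(addresses),
        capture_output=True,
        text=True,
        check=True,
    )
    # llvm-symbolizer prints "function" and "file:line:col" for each address,
    # with a blank line between addresses.
    return proc.stdout

if __name__ == "__main__":
    addrs = [line.strip() for line in sys.stdin if line.strip().startswith("0x")]
    print(symbolize(addrs))
```

Usage would be something like `grep -o '0x[0-9a-f]*' crash.log | python3 symbolize.py`, with the log format adjusted to whatever your deployment actually emits.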