-
Hi. I'm containerizing a workerd-based application and am generally struggling with llvm-symbolizer. workerd is nice to containerize in general because it's a mostly static binary that you can dump into a distroless base image and end up with a really simple/stupid/cheap-to-build application image.

But this breaks down with the requirement for llvm-symbolizer. Excepting one semi-abandoned, herculean "man versus society" effort, every distribution of LLVM I'm aware of is a constellation of dynamically linked executables that collectively weigh hundreds of megabytes. That erodes the premise of a simple/easy/cheap containerized model for workerd applications if you want symbolized stack traces. It seems you really need to reach for a particular Linux distribution's package manager to easily materialize a functional LLVM toolchain.

I speculate that Cloudflare doesn't have these kinds of concerns because it has a more tailored packaging/deployment/sandboxing model for workerd. But for the rest of us, containers are often the standard deployment medium, even for things like workerd that hardly need containerization to start with.

I have the great privilege of being almost completely clueless about native tooling, so the title question is pretty literal. Why does workerd need llvm-symbolizer when e.g. nodejs does not? Is workerd making a different tradeoff than nodejs in this regard, or is it some kind of architectural constraint? And is there potential for workerd to statically support what it gets dynamically from llvm-symbolizer today? Thanks!

Edit: now that I think about it, maybe it's the case that nodejs simply never exposes native stack traces...
Replies: 1 comment 1 reply
-
You don't need to symbolize in production. If llvm-symbolizer is not available, then the stack trace will be dumped as a list of instruction addresses (relative to the binary base). If you log those addresses, you can feed them into the symbolizer offline, as long as you have the same binary available. In fact, the version of the binary in production can be stripped of debug symbols, which makes it a lot smaller. You only need the full version of the binary when you're doing the offline symbolization.
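To make the offline workflow concrete, here's a minimal sketch of that step. It assumes (these details are not from the answer above) that the logged frames appear as one hex, module-relative address per line, that the unstripped build is kept at a hypothetical path `./workerd-with-symbols`, and that llvm-symbolizer is on PATH on the machine doing the symbolization:

```python
#!/usr/bin/env python3
"""Offline symbolization sketch: pipe logged frame addresses into
llvm-symbolizer against the unstripped binary, on a non-production machine."""
import subprocess
import sys

# Hypothetical path to the unstripped build matching the production binary.
BINARY = "./workerd-with-symbols"

def symbolize(addresses):
    """Run llvm-symbolizer once, passing module-relative addresses on stdin."""
    proc = subprocess.run(
        ["llvm-symbolizer", f"--obj={BINARY}"],
        input="\n".join(addresses),
        capture_output=True,
        text=True,
        check=True,
    )
    # llvm-symbolizer prints "function" and "file:line:col" for each address,
    # with a blank line between addresses.
    return proc.stdout

if __name__ == "__main__":
    addrs = [line.strip() for line in sys.stdin if line.strip().startswith("0x")]
    print(symbolize(addrs))
```

Usage would be something like `grep -o '0x[0-9a-f]*' crash.log | python3 symbolize.py`, with the log format adjusted to whatever your deployment actually emits.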