jemalloc probably won't work well on aarch64-linux #91
Comments
Please note that this isn't causing huge problems for me. Yet. But eventually I want to distribute aarch64-linux builds of my Nix package for buck2. So, this is mainly just to catalogue the issue since I suspect any movement towards an actual solution will require a bit of stakeholder input, and because someone else may eventually run into it.
We have now disabled jemalloc everywhere apart from Mac/Linux, since it doesn't play well on other OSes like Illumos #120. We also disable jemalloc if you are doing a build of Buck2 with Buck2, mostly because we haven't set up Buck2 to build jemalloc. Is that enough?
That's close, but an aarch64-linux package for e.g. NixOS (just as an example) won't be able to work cross-system unless we also turn off jemalloc there, too. Would a patch to make It might also be worth exploring if other allocators can boost performance while more gracefully handling these requirements.
Internally we'll probably always use jemalloc, and it's been well tuned, so I doubt there is anything else out there with higher performance. But if you find something, we'd switch. Happy for it to be an optional Cargo feature. Note that it is already gated on a few things. Alternatively, have you tried asking upstream at jemalloc, in case they can have a fallback path for the NixOS example?
I believe Jason has stated multiple times that the page size being (effectively) part of the API isn't going to change because it would require a large rework; see jemalloc/jemalloc#467. That said, apparently jemalloc can support sizes smaller than the baked-in page size (e.g. build for 16k, run on 4k using the It would probably be good to still add a flag for places that don't enable this, and also because packages like
A feature flag to disable jemalloc seems reasonable, and like it would solve all the issues here. Patch welcome.
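For anyone picking this up, here is a minimal sketch of what such a flag could look like. It is not Buck2's actual code: the Cargo feature name `jemalloc` and the use of the `tikv_jemallocator` crate are assumptions made for illustration, and `allocator_name` is a hypothetical helper. With the feature off, the binary falls back to the system allocator, which does not bake in a page size.

```rust
use std::alloc::System;

// Assumption: a Cargo feature named "jemalloc" and an optional dependency on
// the `tikv_jemallocator` crate. Neither is confirmed by this thread.
#[cfg(feature = "jemalloc")]
#[global_allocator]
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

// Fallback: the platform's default allocator, used when the feature is off.
#[cfg(not(feature = "jemalloc"))]
#[global_allocator]
static GLOBAL: System = System;

/// Hypothetical helper reporting which allocator was compiled in.
pub fn allocator_name() -> &'static str {
    if cfg!(feature = "jemalloc") {
        "jemalloc"
    } else {
        "system"
    }
}
```

A packager targeting mixed page sizes would then build without the feature (for example with `cargo build --no-default-features`, if the feature were on by default).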
This was fixed by #693 as the pre-built binaries now use 16k pages. Users who build aarch64 buck2 binaries on their own still need to enable it themselves, though.
Leaving this here while I'm using the laptop, so that I don't forget it. Maybe something can be done, or not. But this will probably come back to bite someone eventually, I suspect.
jemalloc seems to currently be a dependency as Buck2's global allocator. While I understand jemalloc is a big part of what makes Facebook tick, and it's excellent, there is a problem: jemalloc compiles the page size of the host operating system into the library, effectively making it part of its ABI. In other words, if you build jemalloc on a host with page size X, and then run it on an OS with page size Y, and X != Y, then things get bad; your program just crashes.
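To make the mismatch concrete, here is a small sketch of the compatibility rule. It follows the refinement from the comments that a build for a larger page size can apparently run on a smaller one (build for 16k, run on 4k), while the reverse crashes. The function name and signature are hypothetical; this is not a jemalloc API.

```rust
/// Hypothetical check: can an allocator whose page size was fixed at build
/// time (`built_page_size`, in bytes) run on a kernel that uses
/// `runtime_page_size`? Assumes the rule from this thread: the runtime page
/// size must not exceed the baked-in one.
pub fn page_size_compatible(built_page_size: usize, runtime_page_size: usize) -> bool {
    // Page sizes are always powers of two; reject anything else outright.
    built_page_size.is_power_of_two()
        && runtime_page_size.is_power_of_two()
        && runtime_page_size <= built_page_size
}
```

Under this rule a binary built for 16k pages covers both 4k and 16k hosts, whereas one built on a typical 4k build machine fails on a 16k Apple Silicon host.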
Until relatively recently, this hasn't been a problem. Why? Because most systems have collectively decided that 4096-byte pages are good enough (that's wrong, but not important). So almost everything uses that, except for the new fancy Apple Silicon M-series, such as my M2 MBA. These systems exclusively make use of not 4k, but 16k pages. This page size is perfectly allowed by the architecture (4k, 16k, and 64k are all valid on aarch64) and 16k pages are a great choice for many platforms, especially client ones.
So the problem begins to crop up once people start building aarch64-linux binaries for their platforms, e.g. Arch Linux ARM or NixOS, which distribute aarch64 binaries. Until the advent of Apple Silicon, you could reasonably expect everything to use the same page size. But now we have this new, reasonably popular platform using 16k pages. There's a weird thing happening here: most of the systems building packages for users are some weird ARM board (or VM) in a lab churning out binaries 24/7. They just need to run Linux and not catch fire. They aren't very fast, they typically have old CPUs, and they often run custom, hacked Linux kernels that barely work. But most developers or end users? They want good performance and lots of features, with a stable kernel. For ARM platforms, the only options they reasonably have today for supported ARM systems are Raspberry Pis, the Nvidia Jetson series, and now Apple Silicon. And Apple Silicon is, without comparison, the best bang for your buck and the highest performer. So I feel users are more likely to use one platform, and it's becoming more popular, while the systems churning out packages will use another, incompatible one.
This isn't a theoretical concern; Asahi Linux users like myself still (somewhat often) run into broken software. jemalloc isn't the only thing that doesn't support non-4k pages easily; it's just one of the more notorious and easy-to-spot culprits, and it turns otherwise working packages into non-working ones: https://github.com/AsahiLinux/docs/wiki/Broken-Software
Right now, I'm building buck2 manually, so this isn't a concern. But it means my binaries aren't applicable to non-AS users, and vice versa.
So there are a few reasonable avenues of attack here:
libc.
I don't know which one of these is the best option.