Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use wasm start-function linking and __cxa_atexit #25

Closed
sunfishcode opened this issue Dec 3, 2017 · 9 comments
Closed

Use wasm start-function linking and __cxa_atexit #25

sunfishcode opened this issue Dec 3, 2017 · 9 comments
Labels

Comments

@sunfishcode
Copy link
Member

sunfishcode commented Dec 3, 2017

.init_array/.fini_array and .ctors/.dtors have a history of being exploitable. For example, "Abusing .CTORS and .DTORS For FUN and PROFIT", and other examples are easy to find. Modern ELF systems have mitigated it with RELRO which makes these sections read-only. However, WebAssembly doesn't support read-only memory. This may change in the future (or perhaps we could use a separate linear memory space), though at present there are no proposals. If we implement traditional .init_array/.fini_array support now, we'd be opening up an attack vector.

WebAssembly would be less vulnerable than traditional ELF sysystems without RELRO, because wasm's indirect calls can only call into defined function entry points, and type signatures have to match, however theoretical exploits are still possible.

I propose to avoid .init_array and .fini_array, and instead:

  • use wasm start functions for initialization code, and add support for linking them, rather than use .init_array, as suggested here.
  • lower destructor code, such as __attribute__((destructor)), into functions registered with __cxa_atext by initializers. I've implemented this in LLVM here.

To be sure, with current __cxa_atexit implementations, function pointers are still stored in writeable linear memory, so the problem already exists. However, .init_array/.fini_array are dense arrays of pointers, making them easier to hit, and they're more likely to live at a predictable address.

__cxa_atexit is also generally more robust than .fini_array because

  • If cleanups are registered dynamically while global ctors are being run, it properly orders all the cleanups, including the dynamically registered ones, in the reverse order.
  • If the process exits before all the global constructors have run, it doesn't run destructors that haven't been registered yet.

This approach also doesn't preclude implementing .init_array and .fini_array in the future. We could always add support for .fini_array in the tools without breaking the ABI.

The main downside of this approach is that it's different from how other platforms work, and would require more adaptation when porting libc and other low-level tools. However, I believe it's worth the effort.

@dschuff
Copy link
Member

dschuff commented Dec 4, 2017

I'm a little ambivalent about this.
I guess we have 2 separate proposals: to have the linker synthesize a start function, containing calls to each of the start functions of the object files; and to use __cxa_atexit or equivalent to call destructors (instead of a mechanism that mirrors init).

First, destruction because to me that's a simpler argument.
In favor of .fini_array:

  1. Symmetric with .init_array
  2. (not so much in favor of fini_array but against the proposed alternative): As written in the patch it requires the program to link __cxa_atexit (and have it use the C calling convention) even in C or another non-C++ language. Or failing that, the language/runtime author has to roll their own equivalent.

In favor of the alternative:

  1. Shutdown ordering works correctly out of the box
  2. Since wasm does have first-class construction support but not destruction, it makes some sense to expose that asymmetry.

On balance I think it makes sense to go with the alternative for now but I don't have a strong opinion.

Construction is more interesting:
First, the argument in favor of .init_array:

  1. Consistency with ELF, ease of porting system/language runtimes
  2. It's "declarative" in the sense that neither the linker nor other tools need to synthesize or understand any code, nor even to move code around or create code sections (this is one of the advantages .init_array has over the older .init scheme). All the startup and shutdown code is under the control of the system library author. I think this is an advantage that we haven't considered in the discussion so far.

In favor of the alternative:

  1. The security mitigation.
  2. Constructors do not need to be address-taken, reducing the size of the wasm table.

The security issue is an interesting one. It's basically in the general class of "don't have read-write function pointers anywhere it can be helped", and specifically "don't have read-write init function pointers." (and possibly fini function pointers, in the case of .fini_array). In general this is a great idea. I'm a little skeptical about how much protecting the init functions buys us though. The init functions are called before any user code is run, and never again (and if the attacker controls the contents of the binary as in the linked example, then it's already over anyway). So modifying them once the program has started doesn't matter. (We can protect the original startup function that calls them by not exporting it or taking its address at all). There are of course the fini functions; although if those are handled using atexit then pointers to them are stored in some library data structure. So in this particular case I'm not sure I buy the security argument.

Making the constructors not address-taken might be useful though?
To me it kind of comes down to how we feel about making the linker generate code. In abstract principle I think linkers shouldn't be required to have that sort of intelligence, but 1) wasm is a bit different in that regard and we will probably end up having some of that in LLD anyway, and 2) there's even precedent for linker-generated code in ELF where the benefit is big enough (e.g. TLS model reduction).

@dschuff
Copy link
Member

dschuff commented Dec 4, 2017

I guess one other advantage of synthesizing a start function would be that if we do LTO, we could inline the init functions into it (and move it to the front of the code section). That seems like a win. And I guess a smart VM could also take advantage by e.g. not running the slow optimized codegen or something.

@dschuff
Copy link
Member

dschuff commented Dec 4, 2017

And while I still don't really buy the security advantage of protecting the init process itself, keeping the init functions out of the table is probably still good as a means of reducing the amount of code that an attacker could cause to be called in some other context.

Ok, so... maybe I'm sold.

@kripken
Copy link
Member

kripken commented Dec 4, 2017

Any concerns about the reentrancy issues with the start function, that forced us to introduce __post_instantiate for dynamic linking?

@sbc100
Copy link
Member

sbc100 commented Dec 4, 2017

I think we should consider separately whether or not to use the wasm start function. How this function is called can be a different discussion than whether or not this start function is linker synthesized.

@sunfishcode
Copy link
Member Author

I agree with @sbc100; if it turns out that we can't use the wasm start function, we can use the __post_instantiate convention or so instead, and still do the rest of what's proposed here.

See also WebAssembly/design#1160 for another report of this issue.

@sunfishcode
Copy link
Member Author

One issue is that plain start-function linking doesn't have an obvious way to implement init function priorities.

It could be implemented with linker metadata. It would consist of a list of (function index, priority) pairs. This isn't quite as pretty, but it does retain most of the advantages discussed above.

Optionally, it might be nice to continue to use the wasm start function, as a special case for priority 65535, which is the default priority for C++ constructors and __attribute__((constructor)), so it'd be what most user code uses.

@dschuff
Copy link
Member

dschuff commented Dec 7, 2017

Related: The new data-segment initialization in the current threads proposal (see specifically https://github.com/WebAssembly/threads/blob/master/proposals/threads/ConditionalSegmentInitialization.md and the discussion at WebAssembly/threads#62) switches data segment initialization from being done implicitly by the VM to being done explicitly, in the start function. This means we'll have to at least have some initialization function that's called during instantiation (i.e. from the start function), at least for threaded code. Maybe we want to just have the same mechanism for that, as for constructors.

@sunfishcode
Copy link
Member Author

https://reviews.llvm.org/D40759 is now updated with a patch implementing the LLVM side of both the ctor and dtor parts of this proposal, including support for init priorities.

jyknight pushed a commit to jyknight/llvm-monorepo that referenced this issue Dec 14, 2017
Summary:
This change lays the groundwork lowering of @llvm.global_ctors
and @llvm.global_dtors for the wasm object format.  Some parts
of this patch are subset of: https://reviews.llvm.org/D40759

See WebAssembly/tool-conventions#25

Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish

Differential Revision: https://reviews.llvm.org/D41208

llvm-svn=320742
earl pushed a commit to earl/llvm-mirror that referenced this issue Dec 14, 2017
Summary:
This change lays the groundwork lowering of @llvm.global_ctors
and @llvm.global_dtors for the wasm object format.  Some parts
of this patch are subset of: https://reviews.llvm.org/D40759

See WebAssembly/tool-conventions#25

Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish

Differential Revision: https://reviews.llvm.org/D41208

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320742 91177308-0d34-0410-b5e6-96231b3b80d8
jyknight pushed a commit to jyknight/llvm-monorepo that referenced this issue Dec 15, 2017
Summary:
- lowers @llvm.global_dtors by adding @llvm.global_ctors
  functions which register the destructors with `__cxa_atexit`.
- impements @llvm.global_ctors with wasm start functions and linker metadata

See [here](WebAssembly/tool-conventions#25) for more background.

Subscribers: jfb, dschuff, mgorny, jgravelle-google, aheejin, sunfish

Differential Revision: https://reviews.llvm.org/D41211

llvm-svn=320774
earl pushed a commit to earl/llvm-mirror that referenced this issue Dec 15, 2017
Summary:
- lowers @llvm.global_dtors by adding @llvm.global_ctors
  functions which register the destructors with `__cxa_atexit`.
- impements @llvm.global_ctors with wasm start functions and linker metadata

See [here](WebAssembly/tool-conventions#25) for more background.

Subscribers: jfb, dschuff, mgorny, jgravelle-google, aheejin, sunfish

Differential Revision: https://reviews.llvm.org/D41211

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320774 91177308-0d34-0410-b5e6-96231b3b80d8
jyknight pushed a commit to jyknight/llvm-monorepo that referenced this issue Jan 12, 2018
This change create a new synthetic function in the final
output binary which calls the static constructors in sequence.

See: WebAssembly/tool-conventions#25

Differential Revision: https://reviews.llvm.org/D41893

llvm-svn=322388
@sbc100 sbc100 closed this as completed Jan 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants