RFC: Split out OS-specific bits in the runtime to help support freestanding targets. #185

Closed
wants to merge 1 commit into from

Conversation

miselin

@miselin miselin commented Jul 25, 2014

@brson
Contributor

brson commented Jul 25, 2014

I like the general idea of creating a platform abstraction crate for the standard library.

@alexcrichton
Member

When initially extracting libcore, I found it quite helpful to have a mostly-working copy of libcore before drafting an RFC just to see what would actually work and what would have to be left behind. It may be beneficial to have a mostly-working version of what you would like for this as well.

I, too, like the concept of centralizing platform-specific bits. It may be tough to abstract over unix/windows in all situations. If you're abstracting over I/O, for example, the interface is likely just libnative. For things like threads and mutexes, however, it can be useful to have a centralized location for them and they all have basically the same interface.

@miselin
Author

miselin commented Jul 25, 2014

I wonder then if perhaps the more 'correct' solution is to integrate these bits into libnative, rather than creating a new crate for the purpose. Moving things like rustrt::mutex::imp into native::mutex, for example.

@miselin
Author

miselin commented Jul 25, 2014

RE the libcore mention - I can definitely start working on doing this splitting work to see what can be done easily and what requires more sweeping changes.

If we can agree on the correct crate for this work to be done in (os, native, rustrt, etc), I can work on this.

@pczarn

pczarn commented Jul 25, 2014

Will this new crate live in between libcore and liballoc? If so, using liballoc on a new system might require everything OS-specific to be ported.

I imagine it wouldn't because the interface can be incomplete.

@miselin
Author

miselin commented Jul 25, 2014

@pczarn you raise a really good point, though with the current interface, writing a custom alloc::heap or alloc::libc_heap does not seem to be a significant challenge.

Ignoring the concept of custom allocators, which could (maybe?) solve the dependency there, I think moving the raw system allocation glue to os or native would mean things like alloc::boxed, alloc::arc, and alloc::rc do not have to be maintained & duplicated.

@thestinger

It doesn't belong in native. That library is part of the dynamic 1:1/M:N threading runtime, and should not be required to do I/O, concurrency, etc. because it adds lots of overhead and lossy abstractions.

@pczarn

pczarn commented Jul 25, 2014

It might belong in between alloc and rustrt. Several libraries with platform-specific functionality would depend on os (rustrt, native, green, rustuv, std, time). Using Rust in a freestanding environment would require implementing a reasonable number of crates (alloc, libc, os).

@Ericson2314
Contributor

On IRC I was thinking of the benefits of having multiple OS crates. Probably the most basic argument is to avoid adding extra dependencies to things that don't need them. Obviously liballoc need not rely on I/O.

More importantly though I'll take an OS crate over the current situation. libstd facade was a big step in the right direction, but something along these lines is needed to finish it!

@Ericson2314
Contributor

Not to be self-promoting, but my proposal here mentions this issue: http://discuss.rust-lang.org/t/traits-ml-modules/272

@brson
Contributor

brson commented Jul 28, 2014

@miselin Integrating into libnative will not work I think because of the dependencies. libnative is the last thing linked when building a Rust exe - nothing depends on it.

@pczarn

pczarn commented Jul 28, 2014

For reference, this is the most relevant part of the dependency graph. gist / Old source

[Image: dependency graph, hosted on Imgur]

@bharrisau

That is a bit out of date. Pretty sure libcollections also depends on libunicode for example.

@pczarn

pczarn commented Jul 28, 2014

Updated. This includes my proposal:

[Image: updated dependency graph]

@bharrisau

And just to be aware - there is technically some OS specific stuff still hidden in LLVM frame lowering for segmented stack prologues (stack limits). Until something like #131 comes in anyway.

Edit: Wrong rfc

@miselin
Author

miselin commented Jul 29, 2014

The LLVM-specific stuff certainly still exists, but yes - RFC #131 would solve that problem in a much better way.

@pczarn - the problem with the location of os in that diagram is that it depends on alloc, while alloc ends up needing to depend on os in its current state. Unless there's a better way to do the OS-level memory allocation, that dependency is hard to break.

@bharrisau

Sorry - wrong RFC. I was lazy.

@miselin
Author

miselin commented Jul 29, 2014

An initial POC of this is now in my fork of the Rust repo.

@bharrisau

The only allocation is a raw call to malloc in mutex.rs which can be replaced by a direct call to libc::malloc. And thread is using Box, if you get rid of that there is no dependency on liballoc.

@miselin
Author

miselin commented Jul 29, 2014

I have updated the POC branch in my fork. I have done a trivial removal of Box from thread, but this does not currently build.

@bharrisau

You'll probably need to look at forget and drop if you aren't using Box, as the memory will be freed otherwise. Maybe it will be too much work to remove the dependency on liballoc.

Edit: Transmute to raw pointer is the same as forget. But you will need to drop if thread fails (where it says "avoid leaking memory").
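A minimal sketch of the ownership point above, written in current Rust rather than the 2014 dialect. `ThreadArgs`, `into_thread_arg`, and `reclaim` are hypothetical names used only for illustration. Turning a `Box` into a raw pointer runs no destructor (the same effect as `forget`), so the failure path has to rebuild the `Box` and drop it explicitly.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times `ThreadArgs` has been dropped.
pub static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical payload that would be handed to a new thread.
pub struct ThreadArgs {
    pub id: u32,
}

impl Drop for ThreadArgs {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Gives up ownership: the value is not dropped and the memory is not freed.
pub fn into_thread_arg(args: Box<ThreadArgs>) -> *mut ThreadArgs {
    Box::into_raw(args)
}

/// Failure path ("avoid leaking memory"): take ownership back and drop it.
///
/// Safety: `arg` must come from `into_thread_arg` and must not be reused.
pub unsafe fn reclaim(arg: *mut ThreadArgs) -> u32 {
    let args = unsafe { Box::from_raw(arg) };
    let id = args.id;
    drop(args); // destructor runs here, memory is released
    id
}
```

If thread creation succeeds, the new thread is the one that calls `Box::from_raw`; `reclaim` is only for the case where the thread never starts.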

@miselin
Author

miselin commented Jul 29, 2014

I've updated the POC branch with work so far on having alloc depend on os. It's looking pretty bleak: error: cannot provide an extension implementation where both trait and type are not defined in this crate.

I am thinking about updating the RFC to add something like an alloc_os crate for the OS-specific bits that alloc needs, as I think breaking the dependency that os has on alloc is proving to be quite inelegant.

@pczarn

pczarn commented Jul 29, 2014

I'm assuming support for custom allocators in the future. I don't think alloc_os is important.

@miselin
Author

miselin commented Jul 29, 2014

@pczarn what timeline is there for custom allocators? (@alexcrichton, @brson might also have an idea?)

@pczarn

pczarn commented Jul 29, 2014

@miselin, the design won't be finished before 1.0, and not long after that, either... I guess. Actually, a pluggable default allocator is important to break the dependency on OS-specific heap.
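For reference only: this thread predates any stable allocator API, but current Rust does expose a pluggable default allocator through `std::alloc::GlobalAlloc` and `#[global_allocator]`. The sketch below uses that present-day mechanism, not anything proposed in this RFC; `CountingAlloc` and `ALLOCATIONS` are hypothetical names, and `System` stands in for whatever heap a given target would supply.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocation requests seen so far.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical allocator: forwards to the system heap and counts requests.
/// A freestanding target would forward to its own heap instead.
pub struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Everything that allocates (Box, Vec, Rc, Arc, ...) now goes through this.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;
```

The point of the design is that the generic containers never name an OS heap directly; only the one `GlobalAlloc` implementation does.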

@bharrisau

@miselin It looks like the main issue you are having breaking the dependency between os and alloc is the nature of your 'trivial changes'. For example, you should leave all the Box stuff in alloc, leave the extern thread_start in rustrt, and make the unsafe fn create accept a raw pointer to pass to extern thread start, and return a result instead of fail! on thread creation failure. Then all the Box parts can still be handled in rustrt, and os only needs to worry about the OS specific things.
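A rough sketch of that split in current Rust, not the actual POC code. The module names `platform` and `runtime`, the `StartFn` alias, and the use of `std::thread` as a stand-in for the real OS call are all assumptions made for illustration. The OS layer only sees a raw pointer and returns a `Result`; all `Box` handling stays in the runtime layer.

```rust
/// Hypothetical stand-in for the OS-specific crate: it knows nothing about `Box`.
mod platform {
    use std::{io, thread};

    /// Entry point the OS layer calls on the new thread.
    pub type StartFn = extern "C" fn(*mut u8);

    /// Wrapper so the opaque argument can cross the thread boundary.
    struct SendPtr(*mut u8);
    unsafe impl Send for SendPtr {}
    impl SendPtr {
        fn into_inner(self) -> *mut u8 {
            self.0
        }
    }

    /// Starts a thread running `start(arg)` and reports failure as `Err`
    /// instead of panicking. `std::thread` stands in for the real OS call.
    ///
    /// Safety: `arg` must stay valid until `start` has consumed it.
    pub unsafe fn create(start: StartFn, arg: *mut u8) -> io::Result<thread::JoinHandle<()>> {
        let arg = SendPtr(arg);
        thread::Builder::new().spawn(move || start(arg.into_inner()))
    }
}

/// Hypothetical stand-in for the runtime crate: owns all the `Box` handling.
mod runtime {
    use super::platform;
    use std::{io, thread};

    type Task = Box<dyn FnOnce() + Send + 'static>;

    /// Runs on the new thread: take back ownership of the closure and call it.
    extern "C" fn thread_start(arg: *mut u8) {
        let task: Box<Task> = unsafe { Box::from_raw(arg as *mut Task) };
        (*task)();
    }

    pub fn spawn<F>(f: F) -> io::Result<thread::JoinHandle<()>>
    where
        F: FnOnce() + Send + 'static,
    {
        let inner: Task = Box::new(f);
        // Double boxing yields a thin pointer that fits in `*mut u8`.
        let raw = Box::into_raw(Box::new(inner)) as *mut u8;
        match unsafe { platform::create(thread_start, raw) } {
            Ok(handle) => Ok(handle),
            Err(err) => {
                // The thread never started, so reclaim the closure here.
                drop(unsafe { Box::from_raw(raw as *mut Task) });
                Err(err)
            }
        }
    }
}
```

A caller would use `runtime::spawn(|| ...)` and decide for itself what to do with an `Err`, which keeps the failure policy out of the OS-specific code.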

@ryanra

ryanra commented Jul 30, 2014

As @miselin and I were talking about on IRC, there are some potential pitfalls if a freestanding implementation of libos is written in Rust (e.g., if you're writing an OS in Rust). Your implementation of os::X might depend on a standard library (say collections) that itself calls a function stub declared in the libos crate. If this call makes it back to your implementation of os::X, you'll end up with a run-time recursive nightmare.

In general, the problem stems from an inability to specify dependencies within libos and dependencies from libos on a standard library.

My preference would be to break up a potential libos into smaller crates as a first step to statically specifying these fine-grained dependencies within libos elements and between libos and the standard libs.
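The recursion hazard described in this comment can be sketched as follows. Both functions are hypothetical stand-ins: `collections_grow` plays the role of a standard-library routine, and `os_alloc` plays the role of a freestanding `os::X` written in Rust. A thread-local flag is added here only so the cycle is reported as an error instead of overflowing the stack.

```rust
use std::cell::Cell;

thread_local! {
    /// True while `os_alloc` is running on this thread.
    static IN_OS_ALLOC: Cell<bool> = Cell::new(false);
}

/// Stand-in for a standard-library routine that asks the OS layer for memory.
pub fn collections_grow(elems: usize) -> Result<usize, &'static str> {
    os_alloc(elems * 8)
}

/// Stand-in for a freestanding `os::alloc` written in Rust. Its bookkeeping
/// uses a "collection", which calls straight back into `os_alloc`.
pub fn os_alloc(bytes: usize) -> Result<usize, &'static str> {
    // `replace` returns the previous value: true means we are re-entering.
    if IN_OS_ALLOC.with(|flag| flag.replace(true)) {
        return Err("os_alloc re-entered through the standard library");
    }
    let result = collections_grow(1).map(|_| bytes);
    IN_OS_ALLOC.with(|flag| flag.set(false));
    result
}
```

Nothing in the types stops this cycle from compiling, which is the argument for making the dependencies explicit at the crate level.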

@mneumann

As I just ported Rust to DragonFlyBSD, I am all in favor of this idea. Having target_os-specific code scattered across different places is a bad thing IMHO, and you can quickly shoot yourself in the foot when you forget something. Having a single "libos" should make testing for a specific target OS a lot easier. For the library part (not rustc itself), I had to make target-specific changes to liblibc, libnative, librustdoc, librustrt, librustuv, libstd, and libgreen.

The hardest part, though, is porting liblibc. I think we should use a generator script for the system structures and "defines", because if we get a layout wrong (it happened to me twice), it's pretty hard to spot. I envision specifying the defines, types, and structure fields in, let's say, JSON, from which we can generate a simple C test program that checks that our generated structure has the same size as the corresponding system struct, and that all fields have the same size and offset. From the same JSON we'd generate the Rust code, of course.
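A small sketch of the Rust half of such a check, under stated assumptions: `SysRecord` is a made-up struct rather than a real system type, and the `EXPECTED_*` constants stand in for values a generator would emit from the JSON description. The C test program and the generator itself are not shown.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr::addr_of;

/// Made-up struct standing in for a generated binding.
#[repr(C)]
pub struct SysRecord {
    pub tag: u8,
    pub value: u32,
    pub stamp: i64,
}

#[derive(Debug, PartialEq)]
pub struct FieldSpec {
    pub name: &'static str,
    pub offset: usize,
    pub size: usize,
}

// Values a generator would emit from the shared description (assumed here).
pub const EXPECTED_SIZE: usize = 16;
pub const EXPECTED_FIELDS: [FieldSpec; 3] = [
    FieldSpec { name: "tag", offset: 0, size: 1 },
    FieldSpec { name: "value", offset: 4, size: 4 },
    FieldSpec { name: "stamp", offset: 8, size: 8 },
];

/// Builds a spec from a field pointer; the field size comes from its type.
fn field<T>(name: &'static str, base: *const SysRecord, ptr: *const T) -> FieldSpec {
    FieldSpec { name, offset: ptr as usize - base as usize, size: size_of::<T>() }
}

/// Returns one message per mismatch between the binding and the description.
pub fn layout_mismatches() -> Vec<String> {
    let probe = MaybeUninit::<SysRecord>::uninit();
    let base = probe.as_ptr();
    // Only addresses are computed; the uninitialised memory is never read.
    let actual = unsafe {
        [
            field("tag", base, addr_of!((*base).tag)),
            field("value", base, addr_of!((*base).value)),
            field("stamp", base, addr_of!((*base).stamp)),
        ]
    };
    let mut out = Vec::new();
    if size_of::<SysRecord>() != EXPECTED_SIZE {
        out.push(format!("size: got {}, expected {}", size_of::<SysRecord>(), EXPECTED_SIZE));
    }
    for (got, want) in actual.iter().zip(EXPECTED_FIELDS.iter()) {
        if got != want {
            out.push(format!("field {}: got {:?}, expected {:?}", want.name, got, want));
        }
    }
    out
}
```

An empty result means the binding agrees with the description; the generated C program would perform the same comparison against the real system header.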

@miselin
Author

miselin commented Jul 31, 2014

@ryanra one of my other problems with multiple OS crates is that it solves a problem for freestanding purposes that no other target ever has to worry about (because each other target has an established set of system libraries). I acknowledge the recursion problem this can cause, but I think creating N crates isn't the right solution.

@bharrisau yup. I think we can get away with depending on things like (posix_)memalign. My main pushback on that was the idea of having to implement a bunch of these C APIs which would tend to simply call out to the OS-specific allocator anyway. At the same time, while custom allocators don't exist yet, when they do they may well work to make alloc platform-agnostic. Time will tell.

I can certainly keep working on moving things into platform in my local branch.

I did a quick check just now on a couple crates, and...

rustuv is hard to split because it is a binding to libuv, which may not even exist on some targets, just to complicate matters.

native uses OS-specific #[cfg] for:

  • the entire native::io module
  • ignoring SIGPIPE in native::start (which is only needed on UNIX-ish)
  • setting native::OS_DEFAULT_STACK_ESTIMATE

green uses OS-specific #[cfg] for:

  • correct mmap flags for stack mappings
  • stack guard pages (trivially easy to move into platform, really)
  • scheduler RNG seeding (via /dev/urandom on UNIX-ish)
  • context switching (where we have OS-specific bits because Windows context switches save FP registers)

backtrace is C. The fact it exists in the RFC is because I only grep'd for target_os instead of searching for #[cfg] entries - oops :-)

@bharrisau

And green depends on libuv? I think if you had a platform that std (and everything std re-exports from) and native could use as the OS interface, that would be pretty awesome. Porting to a new platform would only require changes to libc and platform and all you would miss out on would be M:N threading.

From my quick look, libstd will be the most painful. All the env variables and command line arguments looked very different between unix and windows.

@bharrisau

It's starting to form into 4 layers

  • Core provides support with almost zero requirements (LLVM support and some failure lang items)
  • If you supply a libc and handle the one weird memalign part, you can get to collections
  • Implementing platform gets you up to std, and maybe native
  • More work gets you everything - green and any other platform specific libraries

@miselin
Author

miselin commented Jul 31, 2014

I completely missed where green pulls in rustuv because the dependency comes via the green_start macro. Well, other than the examples of using green with rustuv::event_loop.

Yes, I agree. That is the goal - modify/create libc and platform, and you're done. New "official" targets would still involve #[cfg] definitions, but these would be mostly isolated to a single crate rather than strewn across the codebase.

@miselin
Author

miselin commented Jul 31, 2014

If native::io becomes a re-export of platform::io rather than a module inside native, then the third layer should be fairly "easy" (even if most of the module is stubbed out). However, that kind of changes native from "The native I/O and threading crate" into "A couple of thin abstractions and a re-export that happens to make this just like the old I/O and threading crate!" - that change mightn't be overly popular.
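A minimal sketch of what such a re-export could look like, in current Rust. The modules here are in-file stand-ins for the `platform` and `native` crates, and `IoError`, `open`, and `Fd` are hypothetical names; the stub shows a target that has not implemented file I/O yet.

```rust
/// Stand-in for the proposed `platform` crate.
pub mod platform {
    pub mod io {
        /// Hypothetical file descriptor type.
        pub type Fd = i32;

        #[derive(Debug, PartialEq)]
        pub enum IoError {
            /// The target has no implementation for this operation.
            Unsupported,
        }

        /// Stubbed out: a freestanding target without a filesystem
        /// can still build the crates above it.
        pub fn open(_path: &str) -> Result<Fd, IoError> {
            Err(IoError::Unsupported)
        }
    }
}

/// Stand-in for `native`: the I/O module is now only a re-export.
pub mod native {
    pub use super::platform::io;
}
```

Existing callers keep writing `native::io::open(...)`, while the only code that changes per target lives under `platform::io`.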

@bharrisau

In some ways, native is only its own thing because green exists. i.e. it is the "Native I/O and threading" so that it contrasts against the "libuv I/O and threading".

Some things will have an 'optimised' solution on Windows or Linux, but might still be doable in standard ANSI C. In that case, you can probably leave the #[cfg] if it is too much work to abstract, and just provide a 'generic' impl for #[cfg(not(unix), not(windows))] (or whatever the case may be) that uses libc.
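A sketch of that pattern in current Rust, where the combined condition is spelled `not(any(unix, windows))`. The module name `platform`, the function `default_stack_estimate`, and every numeric value are placeholders chosen for illustration, not the values used by native.

```rust
/// Stand-in for the proposed `platform` crate: all `#[cfg]` lives in here.
pub mod platform {
    #[cfg(unix)]
    mod imp {
        // Placeholder value for UNIX-like targets.
        pub const STACK_ESTIMATE: usize = 2 * (1 << 20);
    }

    #[cfg(windows)]
    mod imp {
        // Placeholder value for Windows targets.
        pub const STACK_ESTIMATE: usize = 1 << 20;
    }

    // Generic fallback for every other target, so a new port compiles
    // before anyone has written a tuned implementation for it.
    #[cfg(not(any(unix, windows)))]
    mod imp {
        pub const STACK_ESTIMATE: usize = 1 << 20;
    }

    /// The only symbol callers see; no `#[cfg]` is needed at the call site.
    pub fn default_stack_estimate() -> usize {
        imp::STACK_ESTIMATE
    }
}
```

Exactly one `imp` module is compiled for any target, so crates above `platform` can call `default_stack_estimate()` without knowing which one was chosen.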

@miselin
Author

miselin commented Jul 31, 2014

I've updated the RFC with results from a quick scan over the code base.

The POC branch has been updated to reflect the name platform, and the current state of moving things into platform.

EDIT: Relinking POC branch because it is now quite a few comments back.

@Ericson2314
Contributor

@ryanra Sorry, I meant to say that is the best that can be done for a single OS crate. I originally advocated for multiple crates too.

So if green_start is moved elsewhere, one gets green as soon as one gets native? I would very much like that.

Also, if the platform implementations for existing OSes implement their own syscalls (which would be faster too) need anything that's not a wrap of a native library depend on libc?

@errordeveloper

Also, if the platform implementations for existing OSes implement their own syscalls (which would be faster too) need anything that's not a wrap of a native library depend on libc?

I don't think this is either easy or feasible, considering that the way of making a syscall on Linux differs between CPU architectures to start with.

@miselin
Author

miselin commented Aug 1, 2014

I've updated the RFC again.

I'll put the most important part of the latest change here in this thread, which is what I've come up with so far in doing the POC:

The outcome of this POC so far indicates that, with the platform crate, to bring the standard library to a new target the following work must be done.

  • Modify platform to add OS-specific glue (platform::mutex, platform::stack, platform::thread, platform::thread_local_storage, platform::time, platform::libunwind, platform::unwind, and platform::flock).
  • Modify libc to add OS-specific structs and definitions (this is something that could potentially be done with automation of some sort).
  • Implement needed bits in native::io.
  • Implement needed bits in green (green::stack, green::sched, green::context).
  • Implement needed bits in std (touches std::os, std::rt::backtrace, std::rand, std::rtdeps, at a minimum). If the system is not compatible with existing Windows/UNIX checks in std, extra checks for target_os will need to be added to more modules.
  • Implement needed bits in alloc, if needed.
  • Check bindings for libuv (rustuv), libgraphviz, libminiz, etc...

@Ericson2314 I'm not sure there is another place to move green_start! to. It is however a macro, so in theory if you provided the needed platform-specific bits then yes, green could work (you would need to provide your own rtio::EventLoop). You would just have to avoid invoking the green_start! macro, which (I think?) would mean no dependency on rustuv.

EDIT: or you could use green::basic for your event loop, if it suffices for your needs.

@Ericson2314
Contributor

I've been looking over @miselin's proof of concept. My basic opinion is that most code that uses libc, even if it doesn't need to cfg anything, is still making some assumptions about the OS or sort of OS, and therefore ought to be moved to platform.

This is a lot less feasible for std though, so perhaps an initial goal is to make the crates below master besides platform (and alloc, due to the lack of allocators) not use libc. Put another way, the crates used by std shouldn't assume a conventional OS (more on that below).

Also a couple of random points:

  • I think the code moved from rustdoc should go back, as the comments say it's not a "production quality" implementation, just a quick hack to get rustdoc going.
  • The rustrt::args could probably go straight to platform. #7756 mentions the current approach isn't too well liked, so maybe a more drastic change would be appropriate. argc and argv are a rather unix/windowsy thing that doesn't make sense in other contexts, so if a drastic change is taken, I'd vote to move the interface out of rustrt (though the implementation can still live in platform, just as an optional component).

I forked miselin's work to explore these ideas and will post links when ready.

@alexchandel

I'd be in favor of having separate libmutex, libostime, libfile, and libnet crates. I've done these before at different times on embedded platforms, and it would be difficult to implement them all on every platform (e.g. PIC18 vs PIC32MZ with external flash), and certainly undesirable for embedded developers.

And without these, libplatform is pretty much librustrt, no?

@alexchandel

@bharrisau Even though rlibc doesn't replace libc, why can't we still make these language items?

@miselin
Author

miselin commented Aug 2, 2014

@alexchandel with multiple crates, are you wanting to cherry-pick the crates that you do implement in this case? (eg, "I don't care about file I/O, so I won't write libfile")

@Ericson2314 I think being able to remove the existence of #[cfg(unix)] or #[cfg(windows)] from all bar the platform-specific crates is something that is doable with long-term benefits. I'm not sure making this happen as part of the proof-of-concept is necessarily correct (especially since there is still discussion around one crate vs multiple crates, libc crate, etc...).

@Ericson2314
Contributor

@miselin I meant remove libc from the other crates. We are well on our way to limiting usage of libc and #[cfg(<OS>)] in stuff below std!

@Ericson2314
Contributor

Another POC, basically taking librustrt as libplatform (small link prevents it from building): https://github.com/Ericson2314/rust/tree/portability

As I see it, moving the io stuff out of libstd as libfs and libnet would be good to allow use of libstd without those features. Each of those libraries in turn would have to delegate to other crate(s) for their platform specific portions.

So libplatform(s) + libfs + libnet + libalmost_std = libstd, if we want to keep the facade where it is now.

libalmost_std could contain Reader, Writer, and other seemingly IOish things that admit non-io implementations. This would especially be great with HKT, as we can make IoError in those traits a * -> * parameter instead. Those traits may also be important to have outside libstd so that the platform-specific shim(s) can use them.

@miselin
Author

miselin commented Aug 3, 2014

@Ericson2314 I had a quick look, but it seems like rust-checkout/src/librustrt would have to be changed if I wanted a rustrt for my custom target (adding a new directory next to unix/windows, essentially).

One of the things platform offers is the ability to abstract away all of the platform specifics, in a crate that can be trivially implemented outside of the Rust source tree. That is, if I am building the libraries myself anyway, I just need to build a platform that suffices for my system, and then provide that as the basis upon which to build the rest of the standard library. This, instead of modifying an actual local Rust checkout. The actual thing that prompted me to write the RFC was the fact that I did not want to write my own rustrt and provide it in my repository :-)

What are your thoughts on this?

@errordeveloper

I definitely agree with @alexchandel. In most of my recent projects we have embedded systems that do have networking, but have pretty much no use for filesystem.

@Ericson2314
Contributor

@miselin I agree that is a good goal. The problem is that the OS bits of rustc both depend on, and are depended on by, the more portable bits. I'll try next to move the platform-agnostic stuff out of librustrt, but I'm not sure how far I'll get.

rustrt::thread is an especially good example of where the platform specific and platform agnostic code is quite entwined.

@miselin
Author

miselin commented Aug 4, 2014

For what it's worth, my statement before should actually probably be "to abstract away all of the platform specifics, in a crate (or crates) that can be trivially implemented outside of the Rust source tree."

@alexcrichton
Member

The planned state of the runtime, I/O interfaces, and synchronization primitives has shifted significantly since this was opened. With #230 now accepted, it's likely that the number of layers of abstraction between the standard library and the system you're running on is going to be greatly reduced.

As a result, this RFC has since become a little out of date and may need significant re-wording to take into account #230. Additionally, there's quite a bit of discussion here which may need to get re-incorporated into the RFC. These will definitely be part of the driving principles of the rearchitecting of the runtime, however!

Consequently, I'm going to close this for now, in the spirit of having as few in-flight RFCs for modifying the runtime as possible. I suspect that many of the concerns in this RFC will be addressed when #230 is completed, as well!

Thanks again for the RFC, and I'm sure that the progress on #230 will definitely welcome comments throughout!
