RFC: Begin extracting freestanding components of libstd #11968
/** | ||
* Trait for values that can be compared for equality and inequality. | ||
* Trait for values that can be compared for equality and inequality.q |
q?
If we move |
@huonw Yes. |
#[cfg(not(test))] | ||
pub struct TypeId { | ||
priv t: u64, |
I thought we wanted this to be `priv` to allow us to change its format over time?
I suppose it doesn't matter too much if intrinsics are always unstable.
Oh! I had to make it pub to separate it from its Clone and Eq impls.
I could come up with some way to make this private. In the future it's conceivable that Clone and Eq will end up in this crate again.
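For readers following the `TypeId` exchange above, here is a minimal sketch in present-day Rust syntax (not the 2014 syntax of the diff) of why a private field matters: callers can compare and copy the value, but cannot depend on its `u64` representation. `OpaqueId`, `from_raw`, and `same_id` are hypothetical names used only for illustration.

```rust
mod intrinsics_like {
    /// Opaque identifier. The field is private, so its representation
    /// (a `u64` here) can change later without breaking callers.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct OpaqueId {
        raw: u64,
    }

    impl OpaqueId {
        /// Hypothetical constructor standing in for a compiler intrinsic.
        pub fn from_raw(raw: u64) -> OpaqueId {
            OpaqueId { raw }
        }
    }
}

/// Compares two ids using only the public surface of `OpaqueId`.
pub fn same_id(a: intrinsics_like::OpaqueId, b: intrinsics_like::OpaqueId) -> bool {
    a == b
}
```

Here the `Clone` and `Eq` impls live next to the type, so the field can stay private. The concern in the thread is the case where those impls must live in a different crate.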
It's going to get pretty painful when the pointers and containers are defined in a primitive crate, methods for a default allocator are added in another crate and finally more methods requiring failure and other runtime features are added in libstd. |
"It's going to get pretty painful" is not at all a reason to not pursue this. This is an awesome effort, and I would love to pursue this further, it's already looking really promising. I want to think a bit about the name |
HOST_CRATES := syntax rustc rustdoc rustpkg | ||
CRATES := $(TARGET_CRATES) $(HOST_CRATES) | ||
TOOLS := compiletest rustpkg rustdoc rustc | ||
|
||
DEPS_std := native:rustrt |
Could you add a `DEPS_prim :=` line just to be explicit? It may also prevent warning about an uninitialized variable.
Also it'd be cool to see an empty dependency line.
OK.
It's a reason to consider another way of doing it. Spreading out the implementation of each container between 3 or more crates is complex, and will require changes to the user-facing API due to the additional traits. |
Some thoughts
|
Types like |
You're talking about a hypothetical thing which hasn't happened yet. Everything here is just a consolidation which should happen anyway, and this is no more difficult to understand than it was before. I agree we should tread lightly when adding things like one-off traits. If you have concrete suggestions about how to explore this use case, we always welcome them, but unconstructive criticism is not particularly helpful |
str::raw::from_c_str(e); | ||
}); | ||
} | ||
} |
Could you copy these tests to `std::os`?
I see you did, nevermind.
I'm really liking this, awesome work! |
In what way is it a consolidation?
It's not hypothetical. A language feature was even added with allocators as the primary use case. A good API for them has already been worked out and tested. With this design, the containers will end up being defined in this primitive crate, with a default allocator crate providing extension methods like For containers not in the standard library, the crate solution requires splitting them into 3 crates. Third party crates are also unable to reduce the pain to solely a documentation issue by exporting everything in a prelude.
My concrete suggestion is to use compile-time selection rather than exposing many implementation details as traits and making heavy usage of dynamic dispatch. I've already explored different ways of doing this and settled on configuration flags as by far the least bad solution without explicit compiler support. |
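A rough sketch of what compile-time selection with configuration flags can look like, written in present-day Rust rather than the syntax of the time. The type `FixedStack` and the feature name `no_failure` are assumptions made for illustration; they do not come from this pull request.

```rust
/// A tiny fixed-capacity stack that needs no heap allocation.
pub struct FixedStack {
    items: [u32; 4],
    len: usize,
}

impl FixedStack {
    pub fn new() -> FixedStack {
        FixedStack { items: [0; 4], len: 0 }
    }

    /// Always available: reports a full stack through the return value.
    pub fn try_push(&mut self, value: u32) -> Result<(), u32> {
        if self.len == self.items.len() {
            return Err(value);
        }
        self.items[self.len] = value;
        self.len += 1;
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// Only compiled when the hypothetical `no_failure` feature is off,
    /// because it panics when the stack is full.
    #[cfg(not(feature = "no_failure"))]
    pub fn push(&mut self, value: u32) {
        if self.try_push(value).is_err() {
            panic!("FixedStack is full");
        }
    }
}
```

The point of the sketch is that `push` stays an inherent method on the same type in every configuration where it exists, so no extra trait has to be imported to call it.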
This is not very convincing, what data do you have to support this? |
I explained the problem for containers above. This same issue reoccurs for any module that's not entirely freestanding. The For another example, traits like An alternative that I'm even less fond of is using weak symbols or dynamically selected functionality for everything from unwinding vs. failure to the choice of default allocator. It's not pay-for-what-you-use and will necessitate the need for an alternative library without the overhead. |
Regardless, discussion of containers and such are irrelevant to this PR because it doesn't deal with them at all. I think we'll cross that bridge when we get there. Thinking about it, why can't the comparisons trait move to this crate? I would expect that |
It's completely relevant to this pull request. I explained the problems that are going to be caused by choosing to do the split like this, and doing it in incremental steps doesn't avoid them. It's worth discussing the design constraints and existing experiments to determine the best way to do it.
It works, and all of the pointer-like and container-like types defined in modules like The |
Do you have any problems with the specific components moved in this pull request? Do you disagree that the functions moved can be used in all contexts? I am 100% behind every change made in this pull request. I never said I was 100% behind all future movements
I do not believe that this is the correct way to do this. I also do not believe that it should block this. As implemented, If you have concerns about this pull request in specific, please voice them. If you believe that future movement will cause problems, then please suggest constructive comments for how to improve this pull request. If the comments have nothing to do with this pull request, then please save them for later. I share your concerns in having an explosion of libraries, and I do not believe that it is the right way to approach this. Again though, this pull request is something I believe needs to happen. |
Yes, as explained above, I don't think this is the right solution. Moving them from the standard library is enacting a specific solution without having considered the alternatives.
This isn't part of what I'm discussing. I'm not doing a review of whether or not the code will compile, but whether it is the right design.
If there is not a willingness to move all of the useful functionality not depending on the runtime to the crate, then it is a step backwards as it is far inferior to the existing solution.
I think moving this functionality will cause problems. It will create a separate freestanding standard library without most of the usable functionality it should have. I think it will be unlikely that the rest of the usable functionality will be moved due to the sacrifices involved (as covered in my previous comments).
It doesn't do this, as it exposes the intrinsics in a module named intrinsics rather than exposing functionality in modules based on the functionality involved. Many of these are also I think it's important to do this in a way where it can be made transparent to the user-level API documentation in the future. I don't think this split will allow for doing that due to the trait coherence rules. |
@thestinger I don't understand how your arguments apply to this patch; they seem to be about future patches. I have made very few changes to the types here. This is mostly just moving modules around and thinking harder about their interdependencies than we have in the past with everything entangled in std. std still has nearly the exact same interface, and where it doesn't it is in my opinion for the better. I am trying to figure out how to make Rust's libraries maximally useful in any environment, and doing so by extracting the parts of std that can work everywhere. |
@alexcrichton Agree the name |
@brson Perhaps we can resurrect the name |
@kballard To avoid confusion, I think no, it should not be named 'core'. |
@alexcrichton I would like to have some kind of lazy language item so that a crate like prim could include code that depends on the malloc and free lang items while not requiring them to be defined in the local crate. There is probably a lot of code that just needs to declare that it needs the global heap, but has no other runtime needs. If we can push most of std into the crate that only needs unique allocation then we have a highly-compatible library. |
The main runtime features that the library needs are allocation, failure, and possibly local data. |
@thestinger What other strategies might we take to reduce the standard library's platform-dependencies? |
The There are certainly environments without dynamic memory allocation and global variables so I don't think the The difference between abort and unwinding is easier to deal with and I think it could just be dealt with at a compiler level.
I haven't given any examples of methods unable to handle allocation failure. Handling the allocation error case for vectors and hash tables means returning an |
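As an illustration of surfacing the allocation error case in the return type, here is a minimal sketch using `Vec::try_reserve` from today's standard library. That API did not exist when this thread was written, and the helper `try_push` is a hypothetical name rather than anything proposed above.

```rust
use std::collections::TryReserveError;

/// Pushes `value`, reporting allocation failure to the caller
/// instead of aborting the process.
pub fn try_push<T>(vec: &mut Vec<T>, value: T) -> Result<(), TryReserveError> {
    // Reserve room for one more element; this is the only fallible step.
    vec.try_reserve(1)?;
    // Capacity is now guaranteed, so this push does not reallocate.
    vec.push(value);
    Ok(())
}
```

A caller without a failure runtime can match on the `Result` and degrade gracefully; a caller with one can simply unwrap it.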
With a configuration-based solution, features like floating point can be made optional without re-organization of the library. Splitting out a floating point crate would work but is a backwards incompatible change and may even require new traits. In the future, more features will become usable in the freestanding use case (containers) but other features (like floating point, and mutable global state) will also become optional to support more platforms. I don't think the crate splitting solution is a scalable one. |
@thestinger In other words, this issue will have to be resolved before 1.0? |
I don't think it would be a meaningful 1.0 release if it didn't offer basic sequence, map and set containers with a reasonable subset of the methods marked as stable. All I'm trying to push for here is being able to use the standard library while producing binaries in the same size and performance class as C and with a similar set of runtime dependencies. A feature bringing significant size/performance overhead or new runtime dependencies should be something you can opt-out from (like This is not just relevant to the freestanding use cases but is also necessary to make the language an appealing alternative to C for building the libraries behind every major application. It should be possible to write libraries in Rust for use in existing and future projects in many different languages without a significant overhead in code size and dependencies over C and without having to opt-out of the standard library. |
At the very least, all of this work will help to structure libstd so it is
|
In what way does it make the situation better? What is the advantage to having some functionality arbitrarily split into an underlying library? It's only going to be used through the regular standard library because it's not a viable standalone library and won't be for the reasons I've stated above. |
I meant structuring libstd so it can be built with fewer dependencies.
|
It will have the same dependencies before and after these changes. I think it's a clear step away from exposing as many features as possible based on the runtime environment. There will be the runtime-dependent standard library pulling in megabytes of Rust code for the smallest utilities/libraries and the crippled freestanding library without even a vector type. We can do so much better... |
An example of the contortions this would cause: we won't be able to have methods for both (And similarly, if I think this split will directly have other negative effects on the API, due to requirements like that. |
The unwinding vs. abort issue would be easy to solve with a weak symbol. At the same time, unwinding is problematic because you get far faster compile times and significantly smaller binaries when unwinding isn't supported. It also enables more LLVM transformations, as a function that's not |
Where would this leave libraries that want to work in multiple environments? For example, If yes, Rust should provide the appropriate options for interoperability with other such libraries and end user convenience, at which point I don't see why If not, and I would have to split into separate completely different crates, that would go against pretty much everything Rust has done so far with |
@huonw @thestinger I believe I now understand the scenario where new traits arise, e.g. huonw's example where one needs to put (I'm still wrapping my head around the multiple crate issue, which AFAICT arises when a library developer wants to allow their library to be used in many target contexts; a library developer who is content to target a narrow audience, e.g. non-kernel developers, may be able to get away with only one crate. I think.) Anyway, I want to understand the basis for objecting to splitting a method like
Does forcing the introduction of new traits have more problems that I have missed, or does the above cover the cases reasonably well? (Again, at this point I'm just trying to understand the problems introduced by the split-into-many-traits phenomenon alone; I can imagine there are further problems introduced when library developers are forced to generate several crates to ensure their library is usable in a maximal set of contexts.) I think that if we do go down this path, it definitely puts pressure on us to support glob imports properly, rather than potentially deprecating them. The other two objections above I have less of an opinion on; my gut tells me that they seem surmountable, e.g. via better tooling for the first and some macro or some new |
FWIW, I had no idea what "prim" means before reading #11828. Is "libprimitive" too long? |
@pnkfelix: Traits are part of the public API and are a name you need to import to make use of the methods. I think every trait needs to be clearly thought out and should have a clear usage in generic code. It is a backwards compatibility hazard to need extra traits whenever functionality becomes optional or becomes available in the freestanding crate. Eventually, nearly all of the standard library would be defined in the freestanding crate. There's no reason we can't define all the containers there and JSON/XML parsers. If 1.0 declares an interface as stable but it's not in the freestanding crate, then it either needs to be dropped or the standard library replaced. We may decide to make a feature like floating point optional in the future, and while a configuration flag will not change the public API, splitting it out into another crate would be an API break. |
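To make the "extra trait you need to import" concern concrete, here is a small sketch in present-day Rust. Two modules stand in for two crates, which is an assumption of the sketch: modules in one crate may freely add inherent methods to each other's types, whereas across real crates an inherent `impl` must live in the crate that defines the type, which is what forces the extension trait. `Maybe`, `MaybeExt`, and `unwrap_demo` are hypothetical names.

```rust
/// Stands in for the freestanding crate: no allocation, no failure.
mod prim {
    pub enum Maybe<T> {
        Just(T),
        Nothing,
    }

    impl<T> Maybe<T> {
        pub fn is_just(&self) -> bool {
            match *self {
                Maybe::Just(_) => true,
                Maybe::Nothing => false,
            }
        }
    }
}

/// Stands in for the runtime-dependent crate layered on top.
mod full {
    use super::prim::Maybe;

    /// Exists only to attach a failing method to a type defined elsewhere.
    pub trait MaybeExt<T> {
        fn expect(self, msg: &str) -> T;
    }

    impl<T> MaybeExt<T> for Maybe<T> {
        fn expect(self, msg: &str) -> T {
            match self {
                Maybe::Just(value) => value,
                Maybe::Nothing => panic!("{}", msg),
            }
        }
    }
}

pub fn unwrap_demo() -> i32 {
    // Without this import, `.expect(..)` below does not resolve.
    use self::full::MaybeExt;
    self::prim::Maybe::Just(5).expect("value should be present")
}
```

Every user of `expect` has to know about and import `MaybeExt`, even though the trait is never meant to be used as a bound in generic code.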
@thestinger hmm, I would have thought that a trait T2 that extends another trait T would be a perfectly sensible way to represent optional functionality. But I am going to try to understand why this is bad. Is the core of your complaint inherent in your phrasing that the backwards-compatibility hazard arises "whenever functionality becomes optional": are you talking about the scenario where in version 1.0 of the standard library, we have a trait Or is it just that my original thinking about |
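A minimal sketch of the "trait T2 extends trait T" idea in present-day Rust, with hypothetical trait names: `Stack` plays the role of the always-available trait and `GrowableStack` the optional extension that needs an allocator.

```rust
/// Base trait: operations that need no allocation.
pub trait Stack<T> {
    fn peek(&self) -> Option<&T>;
    fn len(&self) -> usize;
}

/// Extension trait: operations that may allocate.
/// Any `GrowableStack` is also a `Stack`.
pub trait GrowableStack<T>: Stack<T> {
    fn push(&mut self, value: T);
}

impl<T> Stack<T> for Vec<T> {
    fn peek(&self) -> Option<&T> {
        self.last()
    }
    fn len(&self) -> usize {
        Vec::len(self)
    }
}

impl<T> GrowableStack<T> for Vec<T> {
    fn push(&mut self, value: T) {
        Vec::push(self, value)
    }
}

/// Generic code can use both layers through the extension bound alone.
pub fn push_twice<S: GrowableStack<u8>>(stack: &mut S) -> usize {
    stack.push(1);
    stack.push(2);
    stack.len()
}
```

This works well when the traits are designed for generic code up front. The hazard raised in the thread is the other direction: a method that shipped on one trait or type cannot later be moved into a new trait without changing what users must import.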
@pnkfelix: Part of my complaint is that moving functionality to the freestanding crate is a backwards compatibility risk because it introduces new traits to add extension methods. The other part is that we may decide to make functionality included in the freestanding crate like floating point intrinsics optional and it will end up not being present in that crate at all. I also strongly dislike having the public API include traits that are not carefully thought out functionality aimed at writing generic code. I've come to view using traits to add extension methods as an anti-pattern because it over-complicates the API and makes me wonder when free functions are actually meant to be used. |
(Haskell has only a few typeclasses in its standard lib (and they're almost all very generic), for comparison.) |
@huonw: Good example! Haskell has |
My 2 cents would be: |
@Tobba: So how do we define the methods on Rust's standard library already rivals most other languages in terms of complexity and code size, while providing a fraction of the features. I think it's clear that increasing the complexity any more is unacceptable, especially when there are other ways to support other profiles. |
@thestinger: My idea was that any unimplemented lang item inside of a library would refer to an undefined symbol and be sorted out by rustc at link time, which means that runtime support can be provided by a separate library which only the executable itself imports. This would let you implement anything requiring allocation or failure inside of libprim without defining any functions that would be provided by the runtime. I don't see how splitting it up like this would add too much complexity if done right. |
All of the containers will need to be defined in this libprim library. Where should the |
@thestinger |
If there's no sensible distinction between the libraries, then there's no reason to do the split. Containers do need to be available without a runtime burden, so the standard library would need to work for freestanding usage anyway if they were defined in it. |
Trying to get my head around all of the above: So splitting into crates based on platform/runtime isn't going to work because an impl can't be extended (optional functions must be added as new traits, and nobody wants a trait explosion). Would it be a potential solution if there was a way to extend an impl when re-exporting the object in std? i.e. prim::Option is a subset of std::Option (both are enums, but the std one has more methods in the impl). I've had a look at using #[cfg] to do a freestanding libstd; one of the big annoyances is that the 'deriving' shortcut doesn't play nice with the cfg annotation (the deriving stuff is where lots of the dependency tangling happens). It would be good if things like derive[Clone] could be made dependent on a cfg. It also looks like the #[cfg] is done at compile time - given how Rust code is handled in rlibs now it would be good to have a more powerful (lazy?) annotation that isn't evaluated till generating object code. At least everyone is agreed that it would be nice to have libstd freestanding. |
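For reference, present-day Rust can express a derive that depends on a configuration flag through the `cfg_attr` attribute, which was not available in this form when the comment was written. A minimal sketch follows; the feature name `freestanding` and the items `Pair` and `duplicate` are hypothetical.

```rust
/// `Clone` is derived unless the hypothetical `freestanding` feature is set.
#[cfg_attr(not(feature = "freestanding"), derive(Clone))]
#[derive(Debug, PartialEq)]
pub struct Pair {
    pub a: u32,
    pub b: u32,
}

/// Only exists in builds where `Pair: Clone` was derived.
#[cfg(not(feature = "freestanding"))]
pub fn duplicate(pair: &Pair) -> Pair {
    pair.clone()
}
```

The flag is still evaluated when the crate is compiled, so this addresses the derive annoyance but not the wish for a lazily evaluated annotation.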
Closing due to inactivity. I think that something along these lines needs to happen, especially for embedded rust usage (in other languages), sometimes pulling in all of |
This branch starts pulling modules out of std into a crate called 'prim'. This crate is intended to have no runtime requirements, no unique allocation or failure, and thus be suitable for use as the foundation of any Rust-based software. It currently includes kinds, intrinsics, mem, cast, ptr, and the swap and replace functions, which appear to be the lowest-level building blocks of std. It should also include most of 'option' but a rustc bug is preventing the move, probably `Drop`, and the comparison traits. `Clone` and some of the operators can't be moved because they have implementations for vectors that require allocation.
Beyond this I expect to add another crate that requires an allocator, adds clone, num, iter, vec, str, etc.
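As a small illustration of why `swap` and `replace` count as lowest-level building blocks, the sketch below uses their present-day forms, `std::mem::swap` and `std::mem::replace` (also available as `core::mem::swap` and `core::mem::replace`). Neither needs allocation or a failure path. The helper names are hypothetical.

```rust
use std::mem;

/// Rotates three values one position to the left using only `swap`.
pub fn rotate_left(a: &mut u32, b: &mut u32, c: &mut u32) {
    mem::swap(a, b); // (a, b, c) is now (b0, a0, c0)
    mem::swap(b, c); // (a, b, c) is now (b0, c0, a0)
}

/// Takes the current value out of a slot, leaving `next` behind.
pub fn take_and_set(slot: &mut Option<u32>, next: u32) -> Option<u32> {
    mem::replace(slot, Some(next))
}
```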
cc #11828