Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid trait for space optimization of enums. #41

Closed
wants to merge 1 commit into from

Conversation

vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Apr 10, 2014

Proposing to add Invalid trait and use it for Option<T> space optimization. This is an alternative to RFC #36.

Add `Invalid` trait and use it for `Option<T>` space optimization, instead of hardcoding "special" types in the compiler.

# Motivation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*T is nullable. &T and ~T are not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internally, they are (you can do NPO on them too).

@sfackler
Copy link
Member

It's not clear to me how this would actually work. How would the Invalid trait be usable by the compiler itself as the type implementing it is being compiled? It seems like it'd either have to compile the crate once, dynamically load it, and then call the set_invalid method for all types that implement it or have some kind of interpreter that'd operate directly on the AST.

@huonw
Copy link
Member

huonw commented Apr 13, 2014

With associated items (and const-expr sizeof) something like

trait Invalid {
    static BITPATTERN: [u8, .. sizeof Self];
}

impl<T> Invalid for ~T {
    static BITPATTERN: [u8, .. 8] = [0, .. 8];
}

// ...

might work, although it seems a little peculiar.

@vadimcn
Copy link
Contributor Author

vadimcn commented Apr 13, 2014

@sfackler, how is it different from compiler using other built-in traits, such as Deref or Drop?

@sfackler
Copy link
Member

The compiler doesn't need to be able to execute an implementation of Drop at compile time. It just inserts calls to it at various points in the compiled program.

@vadimcn
Copy link
Contributor Author

vadimcn commented Apr 13, 2014

Invalid would work the same way. Why would it need to evaluate anything at compile time?

@sfackler
Copy link
Member

struct MyStruct {
    ...
}

impl Invalid for MyStruct {
...
}

static FOO: Option<MyStruct> = None;

@vadimcn
Copy link
Contributor Author

vadimcn commented Apr 13, 2014

Grr, that's annoying. I suppose we'd have to prohibit statics with Invalid, just like we prohibit statics with Drop...

@ghost
Copy link

ghost commented Apr 14, 2014

I think this feature should be implemented using a private, associated compile-time constant of the Self type. This INVALID value of Self type must be a compile-time constant for obvious reasons and it must't become part of the public interface of the type which implements the Invalid trait because... well, for obvious reasons. Therefore we need two new features before this "Invalid trait" feature can be implemented: associated constants and private items for traits (although, I think that items inside traits should be private by default). It would look something like the following (assuming the default is still public and thus the priv keyword is needed):

trait Invalid {
    priv static INVALID: Self;
}

struct S {
    n: int
}

impl Invalid for S {
    priv static INVALID: S = S { n: 123 };
}

And, to the question of what exactly are the semantics of a "private" trait method, an associated constant or an associated type, I think the answer is that it should work like this:

  1. The constant T::INVALID, where T is a type parameter which implements the Invalid trait, should be accessible only from the module where the Invalid trait is defined and its submodules.
  2. The constant S::INVALID, where S is an actual type which implements Invalid, should be accessible only from the module where the trait Invalid is implemented for type S, and its submodules.

Also, I don't think that this Invalid trait feature and the automatic space optimization feature for Option<&T> and Option<~T> are mutually exclusive. We can, and I think we should, have them both.

@pczarn
Copy link

pczarn commented Apr 15, 2014

I think this feature should be implemented with a type such as Difference<T, Value> but it reminded me of Either<A, B> and we don't need either one when we have enums:

enum OptimizedPtr {
    NotNull(uint),
    invalid NotNull(0)
}

An alternative:

enum OptimizedPtr<T: RawPtr> {
    NotNull(T),
    invalid NullPtr = T::null()
}

@nikomatsakis
Copy link
Contributor

This is an intriguing idea. I have to stew on it a bit, but I like it in principle.

@dobkeratops
Copy link

great idea, could this implement a -1 invalid array index type; and in turn operator[] overloaded to take a potentially invalid array index safely returning Some(T) or None

@vadimcn
Copy link
Contributor Author

vadimcn commented Apr 21, 2014

enum OptimizedPtr {
    NotNull(uint),
    invalid NotNull(0)
}

Somehow it doesn't feel right to actually instantiate the "invalid" bit pattern as an instance of T. Especially, if T implements Drop (would destructor be invoked on the invalid value? hmm...)

enum OptimizedPtr<T: RawPtr> {
   NotNull(T),
   invalid NullPtr = T::null()
}

Doesn't this suffer from the same problem as my proposal, in that it requires compile-time evaluation of functions? (or, alternatively, it requires "life-before-main" in order to initialize statics).

@nikomatsakis
Copy link
Contributor

At most recent meeting, we decided that while this idea is promising, the RFC is insufficiently fleshed out and hence we are going to close. Most importantly, it seems like the problems of how to handle static bitpatterns needs to be worked out and specified in detail. If you have a revised proposal that handles statics, feel free to re-open. In general, while the problem this RFC addresses is important, we don't feel it's so urgent that we have to draft a design THIS SECOND, and hence we can take some time to work through the details.

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017
@petrochenkov petrochenkov removed the postponed RFCs that have been postponed and may be revisited at a later time. label Feb 24, 2018
hauntsaninja pushed a commit to openai/tiktoken that referenced this pull request Feb 26, 2023
The improvements to `byte_pair_merge` are:
- Changing the `parts` vector to avoid repetition of data.
  
This vector used to store ranges for which the invariant `parts[i].end
== parts[i + 1].start` holds, which makes the vector twice as big as it
needs to be.
  Keeping this vector small improves CPU-cache efficiency.
- Using `usize::MAX` as a sentinel in lieu of `Optional` for the
computation of the minimum rank.
  
This change removes branching from the loop to compute the minimum rank,
generating assembly that uses conditional moves instead.

Ideally, we could keep the `Optional` and inform it of the sentinel much
like `Optional<NonZeroUsize>`. As far as I could tell, specifying custom
sentinels for `Optional` has an old Rust
[RFC](rust-lang/rfcs#41) that has stalled, so we
don't get to have nice things.
- Minimizing the number of lookups into `ranks` by looking up ranks once
and iteratively updating them after each merge.

This reduces the number of rank lookups from `n*m` to `n + O(m)`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants