Support Rust Stable channel #7

Mythra · 2021-03-12T17:23:55Z

Opening a tracking issue here, as a carry-over from: google/starlark-rust#283 😄 , as it seems there were a couple options discussed (RUSTC_BOOTSTRAP in this crate vs others, potentially maintaining a stable only interface, etc.) I'm just opening this to track that. As I'd like to switch over to this crate at some point rather than maintaining my own fork 😅

ndmitchell · 2021-03-12T18:33:07Z

Very keen to have you switch over @Mythra! Keen to know if using RUSTC_BOOTSTRAP is feasible for you or not. Also happy to chat on video conference if you want, so we can talk more openly about what you're after, the kind of changes you've made and need supporting. While I doubt there's anything that we won't support, if there was, it would be good to know sooner rather than later!

Mythra · 2021-03-12T19:33:59Z

I haven't really messed with RUSTC_BOOTSTRAP before so to be honest I don't know, but I'm definitely willing to give it a shot although it seems intimidating.

As for my code I admit the biggest thing I'm worried about is porting over the error experience. I've put quite a bit into my solution with the old crate, and although it's far from ideal in a technical sense I'd definitely like to keep a similar user experience. I also haven't looked into the new #[starlark_module] proc macro, but I assume it's mostly the same meaning:

I can have an arg that is positional or by name.
I can have an argument only by name
I can accept variable types on an argument.

I'll document the current error strategies I'm doing below just as sort of an overview, but if a call would helpful or ideas that aren't best discussed in text I'd be happy to hop on a call or do anything of that sort.

All starlark errors rendered by starlark itself (e.g. a lexer error, etc.) all I've been doing is taking the span location of the error, and putting a "nice polish" on it:

if labels.is_empty() {
	return unknown_state_starlark_error(
		"CE00",
		"Reserved keyword `break` used in a place it can't be used",
		diag,
	);
}
let loc = chop_off_newline_if_present(&labels.get(0).unwrap().0);

let mut err = eyre!(
	"{}: break used outside of a loop",
	PyraStarlarkErrors::BadLoopControl,
)
.with_section(|| {
	highlight_code(
		&loc,
		"the break keyword is reserved, and can ony be used in a loop",
	)
});

if bad_indent(&loc) {
	err = err.with_suggestion(|| {
		"The indentation is less than the line before, perhaps missing an indent?"
	});
}

err

This example is one of the most complex for this face lift because it actually looks at the string representation of the location, and looks at the lines before to see if maybe someone just didn't indent correctly. Then again most of this is just outputing the error information, and highlighting in a way consistent with the rest of the tool. Nothing too fancy.

For my own errors for my types I've been abusing the fields in starlark_error! to signify not only the error code, the exact type, field, and any extra data that's needed to parse out the error. Effectively my errors from starlark get returned like so:

starlark_err!(err_code, "<type_name>".to_owned(), format!("{}#{}", field_name_in_type, extra_data));

My error handler then takes these, and does the following:

Get a string representation of the starlark code that caused the error (which is just the function being called i've defined, and maybe some whitespace. Nothing else).
Parse the typename, and call a particular function for that type called: TypeName::zero_in_on_argument. Zero in on argument looks up the field name that caused the error which may look something like: options.1 (signifying the options array's first item, is the field that caused the error), and parses out a new range to highlight which will cover just that argument.
Create an error message for that particular error code , which highlights an exact field that threw an error, and parse any extra data that is needed.

This leads me to be able to highlight rather than the whole function, a particular field (even nested):

   0: pyra::starlark::eval::invalid_task_name: Task name conflicts with a built in command, or is empty

  --> /home/cynthia/projects/personal/pyra/.github/pyra.star:8:5
   |
 5 | oneof_task = pyra_oneof(
 6 |   "build",
 7 |   options=[
 8 |     "help"
   |     ^^^^^ This task name is invalid
 9 |   ]
10 | )
   |

Finally the last thing I haven't actually implemented yet, but was just about to look into was storing a display representation of a stack-trace. This is for cases where a piece of configuration may be valid on it's own, but once I've parsed all of the starlark configuration and done validation on it there may be conflicting configurations.

In this case of my tool I have some things called "tasks", that have names that need to be globally unique. So at construction time they may be totally valid, but as soon as I look at the thing in whole I realize there are two tasks with the same name. Which is an error in my case, and I want to be able to show the user "hey here is where the two resources are defined that conflict", as they may be in two very different files, or hidden away through helper functions, or something of the sort.

ndmitchell · 2021-03-12T19:54:38Z

RUSTC_BOOTSTRAP has an intimidating name, to intimidate people. But it's really just set an environment variable and nightly features work on a stable compiler.

For error, I've rewritten it entirely. There are no more error codes. Everything is based on anyhow errors. Everything is defined as an enum using thiserror. We have a custom Diagnostic type (see https://github.com/facebookexperimental/starlark-rust/blob/master/starlark/src/errors/mod.rs#L37-L49) which is basically code location, anyhow error, call stack, and then we wrap it back up into an anyhow. I also rewrote the parser to allow things like break in more places, and then detect it after we have an AST, which gives much better error messages. It may be that our work there mostly overlaps, but I'm certain you'll have added great messages we missed, and adding new ones is definitely a good thing.

Because we moved to anyhow, you no longer need to abuse types to define your own errors. You can define them with full structure, and it's an open universe, so your error types are just as good as the internal ones.

For argument errors, you are doing better than we are. We have UnpackValue which basically says whether a type matches or doesn't match an argument. That lets us do all the validation on each argument, but only at the level of the argument not a custom sub-expression (bad), we don't allow a custom error message (bad) and positionally we report it at the level of the call (bad). But the plan was to change UnpackValue with a real Value parser, which would give us the first two. And the third is probably possible today - although there are challenges around mapping things like kwargs where multiple fields may combine or the nesting may be difficult to disentangle. If your techniques are generic enough to be broadly applicable, and not a performance drain in the non-error case, I'm sure we can figure something out.

You can capture the call stack pretty easily, and we can have Diagnostic which stores multiple call stacks. None of that sounds too hard, and addresses your globally unique thing. There is a perf cost to capturing call stacks. If it's an error path, no big deal, but if you want to capture it on every identifier there may be considerations. But, it's going to be as fast as it can be (we can capture lazy traces which don't resolve or allocate more than just a Vec).

Mythra · 2021-03-12T20:15:42Z

For error, I've rewritten it entirely. There are no more error codes. Everything is based on anyhow errors. Everything is defined as an enum using thiserror. We have a custom Diagnostic type (see https://github.com/facebookexperimental/starlark-rust/blob/master/starlark/src/errors/mod.rs#L37-L49) which is basically code location, anyhow error, call stack, and then we wrap it back up into an anyhow. I also rewrote the parser to allow things like break in more places, and then detect it after we have an AST, which gives much better error messages. It may be that our work there mostly overlaps, but I'm certain you'll have added great messages we missed, and adding new ones is definitely a good thing.

This sounds great but one thing I've noticed is the errors seem to be public to the crate

pub(crate) enum Dubious {

Which makes formatting them to fit the same style as the rest of the errors quite difficult. Would you be open to making them public but with a #[non_exhaustive] to indicate errors/fields on errors can be added without calling a breaking change? I'm also happy to look through and start porting over my error messages to this crate if they're not already reporting similar information (or better).

For argument errors, you are doing better than we are. We have UnpackValue which basically says whether a type matches or doesn't match an argument. That lets us do all the validation on each argument, but only at the level of the argument not a custom sub-expression (bad), we don't allow a custom error message (bad) and positionally we report it at the level of the call (bad). But the plan was to change UnpackValue with a real Value parser, which would give us the first two. And the third is probably possible today - although there are challenges around mapping things like kwargs where multiple fields may combine or the nesting may be difficult to disentangle. If your techniques are generic enough to be broadly applicable, and not a performance drain in the non-error case, I'm sure we can figure something out.

I'd be happy to upstream my work if you'd accept it, but right now admittedly it isn't the prettiest. And although it adds no cost to the non-error case (or should add miniscule overhead - like the cost of one field indicating where it is), it does add significant cost to the error case. Turning a SpanLoc -> String, and then basically "pseudo-parsing" that string to find the field is not exactly cheap, and also not the cleanest code to look at. The actual extraction looks something like this: https://gist.github.com/Mythra/d318980dd62190ff17e6f5d04e146576. Note: This also fails in cases like **kwargs, instead reporting the whole callsite.

But, it's going to be as fast as it can be (we can capture lazy traces which don't resolve or allocate more than just a Vec).

Yeah lazy traces would definitely be helpful here. Admittedly I'm starting from the case of "best UX first", then planning on doing some work before I release the tool to make sure in the case of small, to "large" (10000's of items) it meets my speed requirements.

ndmitchell · 2021-03-12T20:31:43Z

This sounds great but one thing I've noticed is the errors seem to be public to the crate

Yes, deliberately so. The only errors that aren't yet pub(crate) like ValueError are slated to become so in future. The reason is to keep encapsulation. When you start down the road of anyhow, you lose static checks on errors - which has been a massive advantage in terms of flexibility. Once you export the error types, people then have the option to build that on top dynamically checked - which often goes wrong.

Which makes formatting them to fit the same style as the rest of the errors quite difficult

What's your style? Is it better than the existing style? If so, why not make it the only style? Rather than provide choice and customisation, I'd rather just make the library as good as it can be, so everyone wins. Would be great to see a before/after comparison. If we need more structure, Diagnostic would be the place to put it, but I'm keen to make defining errors cheap, since its so common. But also good, since users often have to fight them.

I'd be happy to upstream my work if you'd accept it, but right now admittedly it isn't the prettiest.

As it stands, not really - the pseudo-parsing freaks me out a lot. But the idea and concept, yeah, I can get on board with that, and it would be great to have. Once you are on the error path, as long as its not outrageous, some cost is fine. I'd go for the approach of grab the string, reparse, reorientate with spans on the AST, then find the right subexpression using traversals. We need the better conversion of values to types (value parsing) to get workable substrings though.

Mythra · 2021-03-12T20:41:15Z

What's your style? Is it better than the existing style? If so, why not make it the only style? Rather than provide choice and customisation, I'd rather just make the library as good as it can be, so everyone wins. Would be great to see a before/after comparison. If we need more structure, Diagnostic would be the place to put it, but I'm keen to make defining errors cheap, since its so common. But also good, since users often have to fight them.

That makes sense. My format isn't too different. The big thing is having a unique error code that a user can then lookup using another command to get a more indepth explaination (similar to rustc's explain). But the basic idea is:

${error_code}: ${error title}

  ${code section with highlighting}

As it stands, not really

I don't blame you 😅 . This was very much a "I need a solution", but knowingly not a good solution. I'd be happy to integrate it with the solution you mentioned but admittedly not sure where to start there, but something we can look into.

ndmitchell · 2021-03-12T20:52:47Z

I'm unconvinced that a project the size of Rust Starlark, with the resources available to it, can support error codes with good informative explanations. If you look at projects like Python or Haskell they haven't managed to do it, and they have tried. Errors codes take a lot of time, love and maintenance. And in most cases a good error message is probably better, and for the ones that do require nuanced explanation, a StackOverflow question people can Google for.

For detailed errors on arguments, that's definitely something we can do. If you're interested in helping with it, we should probably video chat, as its easier to brainstorm the approaches that way.

ndmitchell · 2021-03-12T22:10:50Z

Maybe my remarks on the error codes are a bit harsh. If someone writes a web page, listing all the errors, and describing them in useful detail, and commits to keeping it up to date, we could easily have some kind of code prefix added. My hunch is that's too much work, and the benefits are insufficient for the time it would take. But if I'm wrong, and someone writes the web page, we can add the codes. What I don't want to do is add the codes (which has maintenance and development cost) assuming the web page will come later, when my suspicion is that it won't.

Mythra · 2021-03-14T12:08:38Z

I'd be happy to certainly maintain error codes, as well as well written error messages for those codes. I agree error codes can be tough in some places, and should never be used in place of a good solid error. That being said having a distinct code can be helpful to both parties, and is something I've already been doing for my whole codebase (including starlark), and would be happy to contribute back up, and maintain.

Also happy to set up a call some time in the coming weeks to discuss actual implementation of some of these ideas, and get working on them.

ndmitchell · 2021-03-14T13:13:44Z

Great, sounds like a call is the next step on most of these. I'm ndmitchell gmail.com. Email me and let's find a time that works.

ndmitchell · 2021-03-18T08:59:56Z

Issues #10, #11 and #12 were spun out from discussions on the call. We'll leave this ticket to deal with supporting the Rust stable channel, but agreed there is no need yet, so consider it an issue that may become important in future, but isn't required now.

indygreg · 2021-04-14T15:08:06Z

I use starlark-rust 0.3 in PyOxidizer and the lack of stable channel support is meaningful to me because PyOxidizer is at a stability and popularity point where people like Linux distros want to package it downstream and often have policies about the use of stable Rust features. I know of and have abused RUSTC_BOOTSTRAP before. But having it just works on stable avoids a lot of headaches, especially for software that is installed by end users who don't necessarily know Rust.

While I'm here: I'm glad to see this crate being actively developed! I ❤️ Starlark and hope to see it grow in popularity [in the Rust ecosystem].

ndmitchell · 2021-04-14T15:46:51Z

Thanks for your feedback @indygreg. I can see why stable makes things easier for you and distributions. I guess the first thing would be for you to assess whether stable is your only blocker in using the newer version - we changed a lot of the API surface area. While I think it's now more powerful, there might be something you were relying on that we removed, and that would be good to check (I'm sure that we can put it back if there is). It would also be good to know that you are happy with it, and happy to switch.

I think if you port PyOxidizer to it, we'll promise to switch to supporting stable within a week or so of that happening, so it doesn't delay you (or once you are far enough along that you're sure it's the way you want to go). Does that seem reasonable?

indygreg · 2021-04-14T16:00:27Z

Your proposal sounds reasonable!

Let me have a go at porting to 0.4. I have quite a bit of Starlark code and am doing some... questionable things. The last time I updated the crate it took several hours of work. I anticipate the same here given the surface area of the API changes. I'll file new issues for any issues I encounter with 0.4. It may take a few days before I have any meaningful feedback, however.

ndmitchell · 2021-04-14T16:06:49Z

Sounds great. If you have any questions about where things have moved to, feel free to raise a new issue and I can answer them there - and definitely for anything that seems to have gone entirely.

indygreg · 2021-04-21T22:17:29Z

As I started to look at refactoring my code, I found myself wanting to shave some yaks related to the horrible state of the Starlark code in PyOxidizer. So it will likely be a while longer before I have meaningful feedback.

The progress I have made so far is promising though. And it looks like changes in 0.4 introduce proper support for variable access, which means I won't need to shoehorn state into TypeValues, which was very ugly. I'm very eager to port to 0.4. But it will have to wait until some other work occurs on my end. Sorry for the delay!

ndmitchell · 2021-04-22T07:35:08Z

No rush on our side at all!

nemith · 2022-10-31T17:05:12Z

Is this still a goal? It would be really nice to use starlark in a project but not require nightly rust builds and have it work out of the box with stable rust.

As nice as the bootstrap process it, I shouldn't have to modify my project to fit the libraries?

stepancheg · 2022-10-31T17:10:36Z

I think currently we rely heavily on certain rust internals (like compile time type id) which are very hard to avoid. So practically it is not possible to make it stable at the moment.

However, we should reduce the number of unnecessary unstable features we use to make it more clear what we really need and what are small convenient bits.

Because from the top of my head, I cannot tell you exactly where we rely on unstable.

stepancheg · 2023-01-10T19:52:55Z

FYI, current version of starlark-rust works on stable. Need to release it to crates.

nemith · 2023-01-20T02:06:27Z

This does not seem to be the case due to dependencies (also from Facebook)

error[E0554]: `#![feature]` may not be used on the stable release channel
  --> /Users/brandonbennett/.cargo/registry/src/github.com-1ecc6299db9ec823/gazebo-0.6.0/src/lib.rs:10:49
   |
10 | #![cfg_attr(feature = "str_pattern_extensions", feature(pattern))]
   |                                                 ^^^^^^^^^^^^^^^^

error[E0554]: `#![feature]` may not be used on the stable release channel
  --> /Users/brandonbennett/.cargo/registry/src/github.com-1ecc6299db9ec823/gazebo-0.6.0/src/lib.rs:11:49
   |
11 | #![cfg_attr(feature = "str_pattern_extensions", feature(associated_type_bounds))]
   |                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error[E0554]: `#![feature]` may not be used on the stable release channel
  --> /Users/brandonbennett/.cargo/registry/src/github.com-1ecc6299db9ec823/gazebo-0.6.0/src/lib.rs:10:57
   |
10 | #![cfg_attr(feature = "str_pattern_extensions", feature(pattern))]
   |                                                         ^^^^^^^

nemith · 2023-01-20T02:25:40Z

It doesn't even seem to compile on nightly

$ cargo +nightly build
   Compiling starlark v0.8.0
error[E0407]: method `backtrace` is not a member of trait `Error`
   --> /Users/brandonbennett/.cargo/registry/src/github.com-1ecc6299db9ec823/starlark-0.8.0/src/errors/mod.rs:126:5
    |
126 | /     fn backtrace(&self) -> Option<&std::backtrace::Backtrace> {
127 | |         Some(self.message.backtrace())
128 | |     }
    | |_____^ not a member of trait `Error`

error[E0512]: cannot transmute between types of different sizes, or dependently-sized types
  --> /Users/brandonbennett/.cargo/registry/src/github.com-1ecc6299db9ec823/starlark-0.8.0/src/syntax/ast.rs:84:1
   |
84 | assert_eq_size!(AstStmt, [usize; 12]);
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: source type: `Spanned<StmtP<AstNoPayload>>` (704 bits)
   = note: target type: `[usize; 12]` (768 bits)
   = note: this error originates in the macro `assert_eq_size` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0512]: cannot transmute between types of different sizes, or dependently-sized types
  --> /Users/brandonbennett/.cargo/registry/src/github.com-1ecc6299db9ec823/starlark-0.8.0/src/syntax/ast.rs:85:1
   |
85 | assert_eq_size!(AstExpr, [usize; 9]);
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: source type: `Spanned<ExprP<AstNoPayload>>` (448 bits)
   = note: target type: `[usize; 9]` (576 bits)
   = note: this error originates in the macro `assert_eq_size` (in Nightly builds, run with -Z macro-backtrace for more info)

Some errors have detailed explanations: E0407, E0512.
For more information about an error, try `rustc --explain E0407`.
error: could not compile `starlark` due to 3 previous errors

stepancheg · 2023-01-20T02:47:32Z

It is compiles in-repo.

To compile it outside of repo, path to gazebo library also needs to overridden to github master version.

I'm working on removal of gazebo dependencies now (this commit for example c43b8a6).

nemith · 2023-01-20T21:28:59Z

Confirmed that building from main branch does compile.

starlark = { git = "https://github.com/facebookexperimental/starlark-rust" }

So maybe a new release at some point would be all that is needed. Cheers.

ndmitchell · 2024-01-18T17:36:38Z

I'll make a new release shortly.

ndmitchell · 2024-01-18T17:44:23Z

Actually, there was a release last year, so already done.

ndmitchell changed the title ~~Stable Support~~ Support Rust Stable channel Mar 12, 2021

This was referenced Mar 18, 2021

Support error codes #10

Open

Add a ValueParser #11

Open

Given substructure positions for errors #12

Open

indygreg mentioned this issue Sep 9, 2021

lalrpop compile error with Rust 1.56+ #32

Closed

Mythra closed this as completed Jan 14, 2023

Mythra reopened this Jan 20, 2023

ndmitchell closed this as completed Jan 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support Rust Stable channel #7

Support Rust Stable channel #7

Mythra commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

Mythra commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

Mythra commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

Mythra commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

Mythra commented Mar 14, 2021

ndmitchell commented Mar 14, 2021

ndmitchell commented Mar 18, 2021

indygreg commented Apr 14, 2021

ndmitchell commented Apr 14, 2021

indygreg commented Apr 14, 2021

ndmitchell commented Apr 14, 2021

indygreg commented Apr 21, 2021

ndmitchell commented Apr 22, 2021

nemith commented Oct 31, 2022

stepancheg commented Oct 31, 2022 •

edited

Loading

stepancheg commented Jan 10, 2023

nemith commented Jan 20, 2023

nemith commented Jan 20, 2023

stepancheg commented Jan 20, 2023

nemith commented Jan 20, 2023

ndmitchell commented Jan 18, 2024

ndmitchell commented Jan 18, 2024 •

edited

Loading

Support Rust Stable channel #7

Support Rust Stable channel #7

Comments

Mythra commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

Mythra commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

Mythra commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

Mythra commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

ndmitchell commented Mar 12, 2021

Mythra commented Mar 14, 2021

ndmitchell commented Mar 14, 2021

ndmitchell commented Mar 18, 2021

indygreg commented Apr 14, 2021

ndmitchell commented Apr 14, 2021

indygreg commented Apr 14, 2021

ndmitchell commented Apr 14, 2021

indygreg commented Apr 21, 2021

ndmitchell commented Apr 22, 2021

nemith commented Oct 31, 2022

stepancheg commented Oct 31, 2022 • edited Loading

stepancheg commented Jan 10, 2023

nemith commented Jan 20, 2023

nemith commented Jan 20, 2023

stepancheg commented Jan 20, 2023

nemith commented Jan 20, 2023

ndmitchell commented Jan 18, 2024

ndmitchell commented Jan 18, 2024 • edited Loading

stepancheg commented Oct 31, 2022 •

edited

Loading

ndmitchell commented Jan 18, 2024 •

edited

Loading