
Add strict mechanism for opting into stricter subsets of the language #54903

Open
Keno opened this issue Jun 23, 2024 · 19 comments
Labels
design Design of APIs or of the language itself feature Indicates new feature / enhancement requests

Comments

@Keno
Member

Keno commented Jun 23, 2024

We've had a few discussions the past few weeks about a feature tentatively dubbed pragma strict after similar constructs in other languages. However, there wasn't really a cohesive writeup of the intent, so triage asked me to write one up to serve as the basis for discussion and fleshing out. I intend to edit this issue as the idea evolves.

Basic idea

The basic idea of the pragma strict feature is to have an opt-in mechanism for turning julia programs that are semantically valid, but undesirable for other reasons (e.g. using ambiguous syntax that should have arguably been disallowed, but we can't for backwards compatibility reasons) into errors. This would be an opt-in feature for developers who have personal, organizational or regulatory requirements for stricter coding standards. An additional motivation is to provide an additional vehicle for low-friction language evolution. For example, if a specific opt-in turns out to be popular across the majority of packages, a potential julia 2.0 that made the opt-in automatic while technically breaking, would be largely non-breaking in practice.

We are not imagining a single strict mode opt-in here, but rather a finer grained set of options, plus versioned collections of options for particular use cases. See the last section for an initial list of such options.

It is worth emphasizing again that this feature is only intended to disallow undesirable programs that are otherwise semantically valid. It is not intended to cause meaningful semantic differences in programs that are valid both in standard semantics and under the opt-in restrictions (i.e. turning on the restrictions may cause things to error, but if they don't the program should behave the same).

How does the opt-in work?

One of the primary questions in this proposal is how the user expresses the opt-in. There are a few separate semantic options, each with a number of potential syntax options.

  1. Per module opt-in like our existing Experimental.@compiler_options
  2. Per file opt-in (e.g. using a magic comment on the first line) - popular in some other languages
  3. Per project opt-in in Project.toml

After some discussion on triage, a Project.toml-level opt-in seems like the best option. The primary motivation here is to allow opt-ins that need to be done in the parser (e.g. whitespace requirements). We don't currently define the execution ordering of parsing and execution for packages, so a module-toplevel opt-in may be semantically too late (relatedly, it may be ambiguous what happens when the opt-in is placed in the middle of a disallowed parse). An additional concern is that ideally IDE tooling would be able to understand the active set of restrictions without having to look at the code.

Concrete Project.toml syntax options

One convenient option would be reusing Preferences.jl. One might imagine a julia-level preference like:

name = "MyPackage"

[preferences.julia]
strict = ["nomultiassign", "uniqueidentifiers"]

This doesn't fully mesh with the usual preferences semantics, since preferences are ordinarily uniqued per-UUID while they would be private for a particular package, but this might be ok. Alternatively, we could reserve the strict key in each individual package's preference table:

name = "MyPackage"

[preferences.MyPackage]
strict = ["nomultiassign", "nolocalshadow", "noglobalshadow"]

Alternatively, we could have a new top-level strict section:

name = "MyPackage"
[strict]
julia = ["nomultiassign", "nolocalshadow", "noglobalshadow"]
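
To compare the three layouts above side by side, here is a minimal sketch of how tooling might read the option list from a parsed Project.toml. It only uses the TOML standard library. The helper name `strict_options` and the lookup order are assumptions made for illustration; neither Julia nor Pkg currently honors any of these keys.

```julia
using TOML

# Hypothetical helper: return the strict options declared in a parsed
# Project.toml, trying each of the three layouts proposed above in turn.
function strict_options(project::AbstractDict)
    name = get(project, "name", "")
    prefs = get(project, "preferences", Dict{String,Any}())

    # Layouts 1 and 2: [preferences.julia] or [preferences.<PackageName>]
    for table in (get(prefs, "julia", nothing), get(prefs, name, nothing))
        if table isa AbstractDict && haskey(table, "strict")
            return Vector{String}(table["strict"])
        end
    end

    # Layout 3: a top-level [strict] section with a "julia" key
    strict = get(project, "strict", nothing)
    if strict isa AbstractDict && haskey(strict, "julia")
        return Vector{String}(strict["julia"])
    end

    return String[]  # no opt-in declared
end

project = TOML.parse("""
name = "MyPackage"

[strict]
julia = ["nomultiassign", "nolocalshadow", "noglobalshadow"]
""")

println(strict_options(project))
```

Because the lookup only needs the Project.toml, an IDE or linter could determine the active restrictions without parsing any package source, which is the tooling concern raised above.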

Initial idea list for opt-in options

In this section, I'm collecting a list of potential options that might be implemented. However, I am not at this point asking people to brainstorm all the possibilities that could be implemented. I'm also not asking for detailed discussion on what should or should not be included in a particular option. Rather, I wanted to have a place to list all the ideas that have already come up and a place to link any issues that could be addressed by this feature. Full design discussions for individual flags can be had on the PRs to implement them once the overall mechanism is in place.

Individual options

  • nomultiassign

Disallows multiple assignments in the same expression without parentheses. I.e. disallows a = b, c = d, e, = f = (1, 2)

  • nolocalshadow

Disallows shadowing of local variables, e.g. in the following:

function foo()
	for i = 1:10
		for i = 1:10 # Error shadowing local `i`
		end
		all(1:10) do i # Error shadowing local `i`
			iszero(i)
		end
	end
end

  • noglobalshadow

Disallows shadowing of global variables, e.g. in the following:

function foo()
	missing = false # Error local `missing` shadows imported global `missing`
end
  • Some variant of unique assignment

Stefan had proposed introducing a unique assignment operator, e.g. :=, for which there would then be a corresponding opt-in to enforce that all assignments use it.

  • Enforce export versioning

If we implement some variant of export versioning, there could be an opt-in forbidding unversioned exports.
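
As a rough illustration of what the nolocalshadow and noglobalshadow checks might look like, here is a sketch that walks a parsed expression. It is not how lowering would implement the rule: it only understands simple `for` headers, `do`-block arguments and plain `name = value` assignments, and the function names `bound_names`, `shadowed_locals` and `shadowed_globals` are made up for this example.

```julia
# Names introduced by a simple `for` header or by the arguments of a `do` block.
function bound_names(ex::Expr)
    if ex.head === :for && ex.args[1] isa Expr && ex.args[1].head === :(=)
        lhs = ex.args[1].args[1]
        return lhs isa Symbol ? [lhs] : Symbol[]
    elseif ex.head === :do
        params = ex.args[2].args[1]   # argument tuple of the `do` lambda
        return Symbol[p for p in params.args if p isa Symbol]
    end
    return Symbol[]
end

# nolocalshadow sketch: collect names that rebind a local of an enclosing scope.
function shadowed_locals(ex, scope=Symbol[], found=Symbol[])
    ex isa Expr || return found
    names = bound_names(ex)
    append!(found, filter(in(scope), names))
    inner = vcat(scope, names)
    for arg in ex.args
        shadowed_locals(arg, inner, found)
    end
    return found
end

# noglobalshadow sketch: collect plain assignments whose target is already
# defined in `mod`. Intended to be applied to a function definition.
function shadowed_globals(ex, mod::Module=Base, found=Symbol[])
    ex isa Expr || return found
    if ex.head === :(=) && ex.args[1] isa Symbol && isdefined(mod, ex.args[1])
        push!(found, ex.args[1])
    end
    for arg in ex.args
        shadowed_globals(arg, mod, found)
    end
    return found
end

local_example = Meta.parse("""
function foo()
    for i = 1:10
        for i = 1:10
        end
        all(1:10) do i
            iszero(i)
        end
    end
end""")

global_example = Meta.parse("""
function foo()
    missing = false
end""")

println(shadowed_locals(local_example))
println(shadowed_globals(global_example))
```

A real implementation would presumably live in lowering, where scopes are already resolved, rather than pattern-matching on surface syntax as done here.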

Collections

The idea of collections is that users in general don't want to individually decide which opt-ins matter to them, but will likely be following a standard set by their organizations or prescribed by a style guide. To this end, there could be meta opt-ins like "basestyle", which would activate a standard collection of opt-ins. These collections should be versioned and activated based on the min-compat version of Julia. In this way, new opt-ins can be added to a collection, without automatically activating them on a julia version upgrade.
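
One way to picture the versioning rule is the sketch below, which picks the newest revision of a collection that is not newer than the package's minimum compatible Julia version. The table contents, the version numbers and the function name `resolve_collection` are all invented for illustration; only the name "basestyle" comes from the text above. It assumes Julia 1.7 or later for the two-argument `argmax`.

```julia
# Hypothetical data: collection name => list of (Julia version => options).
# Each entry is the revision of the collection introduced with that version.
const COLLECTIONS = Dict(
    "basestyle" => [
        v"1.12" => ["nomultiassign"],
        v"1.13" => ["nomultiassign", "nolocalshadow"],
    ],
)

# Pick the newest revision that is not newer than the package's min-compat
# Julia version, so that upgrading Julia alone never activates new opt-ins.
function resolve_collection(name::AbstractString, min_compat::VersionNumber)
    revisions = filter(r -> first(r) <= min_compat, COLLECTIONS[name])
    isempty(revisions) && return String[]
    return last(argmax(first, revisions))
end

println(resolve_collection("basestyle", v"1.12"))
println(resolve_collection("basestyle", v"1.14"))
```

Under this scheme a package with a 1.12 compat bound keeps the 1.12 revision even when run on a newer Julia, and only picks up added opt-ins once it raises its own compat bound.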

@nsajko nsajko added design Design of APIs or of the language itself feature Indicates new feature / enhancement requests labels Jun 23, 2024
@adienes
Contributor

adienes commented Jun 23, 2024

I would suggest that one of the individual options should be disallowing control flow in non-"statement" position, i.e. #50415

@jariji
Contributor

jariji commented Jun 23, 2024

#51223 is my proposal for := reassignment.

@jariji
Contributor

jariji commented Jun 23, 2024

I like (Stefan's?) idea that whitespace must match operator precedence so you can't write 2 * 3+1, you have to write 2*3+1 or 2*3 + 1 or 2 * 3 + 1.

@StefanKarpinski
Member

StefanKarpinski commented Jun 23, 2024

Another thing we had talked about was having a set of defaults based on a version, which you could add to or subtract from which might look like this:

strict = ["1.12 defaults", "no local shadow", "-explicit imports"]

Would there be strictures besides the ones provided by Julia itself? Not clear on why there is a strict section and a "julia" key in your examples. Wouldn't a single strict entry with a list of values suffice?
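
For concreteness, a list like the one above could be resolved roughly as follows. This is only a sketch: the contents of the "1.12" default set and the name `resolve_strict` are made up, and the rule that entries are applied left to right is an assumption.

```julia
# Hypothetical default sets keyed by Julia version; real contents are undecided.
const DEFAULTS = Dict(
    "1.12" => ["no multi assign", "explicit imports"],
)

# Resolve a list such as ["1.12 defaults", "no local shadow", "-explicit imports"]:
# "<version> defaults" expands to that version's set, a leading "-" removes an
# option, and anything else adds one. Entries are applied left to right.
function resolve_strict(entries::Vector{String})
    active = String[]
    for entry in entries
        if endswith(entry, " defaults")
            version = String(first(split(entry, ' ')))
            union!(active, DEFAULTS[version])
        elseif startswith(entry, "-")
            filter!(!=(entry[2:end]), active)
        else
            entry in active || push!(active, entry)
        end
    end
    return active
end

println(resolve_strict(["1.12 defaults", "no local shadow", "-explicit imports"]))
```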

@nsajko
Contributor

nsajko commented Jun 23, 2024

Some of the configurations should probably disallow accessing non-public names, including:

  1. Names of instance properties (ref propertynames). Wouldn't affect types defined in the same package.
  2. Names in a module (ref names). Wouldn't affect modules in the same package.

IMO this should be opt-out for all packages, but shouldn't affect the REPL.

@jariji
Contributor

jariji commented Jun 23, 2024

What do you think about having these subsets be installable packages so users can contribute their own rules, rather than having an official set of rules?

@davidanthoff
Contributor

One question is whether this needs to be implemented in Julia itself, or whether this belongs more in a linter-like tool. At some level it strikes me that if this is a thing, we most definitely would want to implement support for this in things like the language server. And then the question is: what is gained by having two implementations?

@davidanthoff
Contributor

And another idea for potential use-cases: relative to more statically typed languages, it is really difficult to provide the kind of robust IDE experience from a language server that languages like TypeScript, Rust or C# have. But maybe there is a scenario where one could actually provide the same kind of robust IDE support if one was willing to avoid some of the more dynamic language features that Julia has. Obviously, that would be a terrible default, but I certainly have packages where I don't need many of the dynamic features of Julia and would much like to have an experience that is more statically typed. Not sure whether that is really feasible, but maybe worth exploring, and this strict type feature might be a good way to opt into a mode that gives one a statically typed IDE experience.

@nsajko
Contributor

nsajko commented Jun 25, 2024

@davidanthoff maybe I'm wrong, but I have a feeling you may have this backwards. My feeling is that the only way for the language server to become a clear win for users (currently it's quite annoying with the false positive warnings) is to plug into the Julia implementation quite directly, maybe similarly to Cthulhu.jl. So maybe the language server for Julia should be just a thin wrapper around Julia.

@davidanthoff
Contributor

@nsajko probably best to stick with Keno's suggestion to collect ideas here but not discuss or evaluate them in detail; that would presumably just distract from the topic of this issue. Having said that, if you have ideas and thoughts about the LS, please open an issue over at its repo and we can discuss there.

@LilithHafner
Member

I agree that per-project is the best approach.

I agree that Project.toml is the place to put this opt-in and configuration.

Concrete Project.toml syntax feedback

I think a toplevel strict is the simplest approach to avoid confusion over subtle inconsistencies with Preferences.jl's preference resolution.

name = "MyPackage"
strict = ["noreassignment", "nomisleadingwhitespace"]

Additionally, this strictness is a property tied to the package about as closely as its name and version; it's likely that a project declared without strict rules will fail to parse with them. The closest analog to this feature I know of is Rust editions. In Rust, that field is stored in the [package] table, which serves an analogous role to the toplevel table in our Project.toml files.

Reserving the strict key in each individual package's preferences is a bit breaking. It's also unclear what it means to set the strict preference of any package other than the one named by the Project.toml file. Syntax that enables this seems problematic:

name = "MyPackage"

[preferences.OtherPackage]
strict = ["nomultiassign", "nolocalshadow", "noglobalshadow"]

A toplevel [strict] section seems unnecessarily verbose compared to a toplevel strict key.

@c42f
Member

c42f commented Jul 18, 2024

"provide an additional vehicle for low-friction language evolution"

Yes! We need this for syntax evolution, which is often technically breaking but not actually very breaking at all in practice. There are so many examples of this, some being #36547 and #54915. In #36547 (comment) I show that several bugs would be fixed by this syntax change. But the change itself is nevertheless technically breaking, and it's a really tough call to decide whether to do it.

Lilith has already mentioned Rust Editions. A core part of Rust editions is that they don't bifurcate the ecosystem, because crates using different editions can work together. We should definitely do that to avoid a python 2/3 style debacle.

"For example, if a specific opt-in turns out to be popular across the majority of packages, a potential julia 2.0 that made the opt-in automatic while technically breaking, would be largely non-breaking in practice"

For a lot of minor syntax improvements, I think they'd best be expressed as "use the latest syntax as of Julia version 1.x" rather than as opt-in flags. If we're trying to improve the syntax to change/remove ambiguous or confusing syntactic constructs, we want an incentive for the whole ecosystem to drop the old syntax. For example if we do the change @JeffBezanson mentioned here #54915 (comment) in a Julia edition we do want packages to drop the old syntax as soon as possible.

"Use Julia 1.x syntax edition" is great from this point of view:

  • Users have an incentive to opt in because they get the latest syntax niceties (for example multidimensional array literals)
  • They pay the minor price of updating old syntax (can be automated!)
  • Users of the ecosystem benefit from most packages being on the latest syntax version

So I think "syntax evolution" should not be fine grained, where at all possible - it's a bit different from the other "strict mode" things which are mentioned above, where users might want fine-grained control over opting out of certain language constructs.

@nsajko
Contributor

nsajko commented Jul 29, 2024

xref #43654 #46411 #52014 #55304

@nsajko
Contributor

nsajko commented Aug 1, 2024

xref #50040

@c42f
Member

c42f commented Aug 12, 2024

#48434 seems like another candidate for Julia Editions - not so much "strict mode", but a change to lowering semantics. It's a reasonable candidate because we can detect most semantic changes introduced by changing the try scoping rules and provide help for people porting their code either as a diagnostic or automated refactoring pass. (The tricky part would be changes to semantics in top-level code - we can probably still deal with this using static analysis for most packages but there will always be hard cases.)

@o314
Contributor

o314 commented Aug 13, 2024

The rustc compiler handles things this way. See:
https://rustc-dev-guide.rust-lang.org/implementing_new_features.html
https://rustc-dev-guide.rust-lang.org/feature-gates.html
Interop with the AST is handled too:
https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast_passes/feature_gate/fn.check_crate.html
There is a macro for this (with Rust macro_rules, e.g. a composable, matchable macro).

@sadish-d

regarding reassignment:

As I think some others also mentioned in #51223, I think it makes sense to have x := 1 or local x = 1 for initial variable declaration + assignment / variable definition and x = 1 for any reassignment, rather than the other way around (:= for reassignment).

@MilesCranmer
Member

MilesCranmer commented Jan 5, 2025

What about using a scoped approach like @stable/@unstable in DispatchDoctor.jl? Like:

Base.@strict :nomultiassign :nolocalshadow begin
    #= package code =#
end

It recurses through include and submodules too.

This way you could improve parts of your code at once, rather than needing to update the entire codebase in a single patch.

And perhaps sometimes you just need to disable it just for one single function, so you can turn it off with

Base.@nostrict begin
    my_hacky_function() = #= ... =#
end

which toggles the outermost @strict for the scope.

You can use Preferences.jl to set defaults for the @strict scopes so you can configure options from the top level. This is the same way DispatchDoctor.jl uses Preferences.jl to provide defaults for codegen level and how many Unions to consider types unstable, which has worked really well I think.

This is also the same way clippy works in rust. You can set overall rules for a single project from Cargo.toml:

[lints.clippy]
enum_glob_use = "deny"

but then ALSO set them locally:

#[warn(clippy::all, clippy::pedantic)]
{
    // Block of code
}

@younes-io

Another way to reconcile strict modes, linter rules, and "edition-like" changes is to introduce a graduated enforcement system: a "ladder" of increasingly strict enforcement. This would let developers climb as high up the ladder as they can, without forcing them to jump all at once. For instance:

Lint-Only (Soft) : Violations would merely produce warnings or style hints in IDEs and CI, and it would be useful for early detection and adopting best practices without breaking code.

Strict-Warn (Medium): the compiler (or a compiler-like pass) can still compile the code but issues deprecation-level warnings that are treated seriously in automated checks. This could also be great for codebases that want warnings to become "CI failures", but not show-stoppers in local dev if you're prototyping something quickly.

Strict-Error (Strong): at this level, strictness violations become hard errors: the code simply does not run under your chosen strict/edition setting. Could work well for teams or organizations that want compliance guarantees (e.g. regulatory), or for new library code that aims to adopt "modern Julia".

Crucially, one can imagine each rung on the ladder being toggled in Project.toml. So if you specify strict = ["julia-2025:strict-warn"], you get the medium rung for all the 2025 changes. This keeps the ecosystem from fragmenting: people can choose their rung depending on how ready they are to enforce those rules.

Obviously, this extends the notion of a "Julia edition" to let project owners define not just which strict or edition rules to adopt, but also how strongly to enforce them, all in one place.
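
To make the ladder a bit more tangible, here is a sketch of how an entry such as "julia-2025:strict-warn" might be split into a rule set and an enforcement level, and how a violation might then be reported. Everything here is hypothetical: the `Enforcement` enum, the level spellings "lint-only" and "strict-error", the choice of strict-error as the default when no level is given, and the helper names `parse_rung` and `report`.

```julia
# The three rungs of the ladder described above (hypothetical names).
@enum Enforcement LintOnly StrictWarn StrictError

const LEVELS = Dict(
    "lint-only"    => LintOnly,
    "strict-warn"  => StrictWarn,
    "strict-error" => StrictError,
)

# Split an entry of the form "<ruleset>:<level>" into its parts.
# Assumption: an entry without a level means the strongest rung.
function parse_rung(entry::AbstractString)
    parts = split(entry, ':')
    ruleset = String(parts[1])
    level = length(parts) == 2 ? LEVELS[String(parts[2])] : StrictError
    return (ruleset = ruleset, level = level)
end

# Report a violation according to the chosen rung.
function report(level::Enforcement, msg::AbstractString)
    if level == StrictError
        error(msg)      # hard error: the code does not run
    elseif level == StrictWarn
        @warn msg       # still runs, but CI can treat this as a failure
    else
        @info msg       # lint-only hint
    end
    return nothing
end

rung = parse_rung("julia-2025:strict-warn")
report(rung.level, "local `i` shadows an outer local")
```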
