document some std library evolution policies #18479
Conversation
In wanting to improve the standard library, it's helpful to have a set of principles and guidelines to lean back on that show how to introduce such improvements while at the same time considering existing users of the language.

A key idea behind documentation of this sort is that it should highlight a path forwards in making changes rather than trying to prevent them, and although the current snippet does include some language for what one shouldn't do, it also shows a few options for what one can do.

This is a riff on nim-lang#18468, mainly based on what helps enable the use of Nim productively in environments and teams where the priority and focus is not always on the tool (Nim in this case) but rather on the codebase itself and its use by end users.

We use similar guidance documentation to help coordinate the code of the teams using Nim in https://status-im.github.io/nim-style-guide/ where it is applied not as law, but rather as recommendations for the default approach that should be considered first.
When can breaking changes be made? There will inevitably be scenarios where a non-breaking alternative isn't feasible or is extremely undesirable compared to a breaking change. While I understand that breaking changes are a source of frustration, it's not like they're uncommon in programming languages either. Most (if not all) commonly used languages - from Python to Java - introduce breaking changes in their standard libraries with each language version, excluding patch versions that generally only fix severe bugs in non-breaking ways. (And the point of this paragraph isn't to blindly assert that "oh, these other languages do it, we should do it too", but to say that breaking changes are expected and accepted, at least to a certain degree.)

https://rubyreferences.github.io/rubychanges/2.7.html#standard-library-contents-change
Key to this whole process of maturing Nim is that changes are introduced in such a way that old stuff by and large keeps working - it's not even that complicated to do, if only you adopt the mindset and learn the skills necessary to introduce things in such a way. The "oh but I can't break things any more" kind of comment is more a matter of habit and education than anything else - it's a little uncomfortable at first to change because you need to approach problems differently than you're used to, but really, it's not that bloody hard once you've learned it.
Notable about these examples is that in spite of having massive standard libraries, even compared to Nim, the breaking change lists are fairly small and contained - Java for example even has tooling to identify them, which is easy because the language itself is a lot less complex - as a consequence the rules governing what is a breaking change are easier as well, and you can see the API report detailing what broke. To take a relevant example, can you imagine Java changing the semantics of an existing core function?

These languages were created when the internet was being popularized and getting access to packages was .. tricky, which has contributed to their large libraries - more modern approaches include splitting things up into packages with a package manager, where each package can have its own change and breakage policy as well as "maturity level", and this is what the text promotes as a next step in the evolution of Nim in the case of immature std libs - of course, impossible situations still arise, but the problem is then localized to those packages, and they can be upgraded or not, independently.

If we're going to make these kinds of comparisons, also take a look at C, C++, Rust and Go instead - these more closely relate to Nim as being "systems programming languages" - these all come with fairly strict breakage policies, which has enabled third parties to build up large ecosystems around them, being able to trust that the language by and large will remain usable from one release to the next, without having to reexamine large parts of the codebase and deal with mutually incompatible changes.

The standard library is particular in that upgrading is an all-or-nothing proposition, and the problems are somewhat exacerbated compared to simpler languages due to Nim's semantics (such as the global namespace, loose typing around generics and so on), which in particular are sensitive to silent runtime breakage that is difficult to understand and work around.

That said, the guideline is there to establish a culture around a non-breaking-approach-first mentality that typically accompanies a language used in production. It doesn't say that no breaking changes should ever be made, merely that there exist ways to introduce change that have empathy for current users - for example spreading the upgrade over multiple versions where the final breaking version gets a breaking version number, using the deprecation features, etc. Recent standard library efforts have gone all-out in the other direction however, where small imperfections are used to motivate breaking changes, and the model used to introduce them.

This kind of policy is also here because it establishes a norm around which users of Nim can plan how they use Nim (or not) - as things stand right now, the norm is moving towards a break-things mentality where every user with Nim code constantly must scour the PR flow to point out how things will break and argue the points over and over - at that stage, a guideline serves as a coordination point so that potential users of Nim can take a look at it and decide if the language is for them. I'd encourage those that downvote this and the other PR to write an alternative set of guidelines that they're willing to live up to, so that people like us that want to use Nim can decide if we should continue to do so, and if yes, what safeguards we need to put in place.
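To make the "deprecation features" mentioned above concrete, here is a minimal sketch (all names are hypothetical, not an existing API) of introducing an improved proc next to the old one instead of changing the old one in place; the old proc keeps compiling and working, and its removal can be scheduled for a later, explicitly breaking release:

```nim
# Minimal sketch with made-up names: ship the new API alongside the old one.

type Config = object
  path: string

proc loadConfig*(path: string): Config =
  ## New, improved entry point.
  Config(path: path)

proc parseConfigFile*(path: string): Config {.deprecated: "use loadConfig instead".} =
  ## Old entry point: still works, forwards to the new proc, and makes the
  ## compiler emit a deprecation warning for every caller.
  loadConfig(path)
```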
I just read https://rust-lang.github.io/rfcs/1122-language-semver.html. While the rules for Nim seem to be remarkably similar, and I'll accept these for the sake of moving forward, I have to point out that Rust's way of doing it is in practice quite empirical:
So an empirical approach to the problem via "important packages" would equally be justified.
So when, and how often, would Nim be releasing "major" versions that would possibly contain breaking changes, compared to minor versions?
"breaking change" is simply the wrong criterion to decide whether a change is acceptable, because most changes (bugfixes, features) are a potential breaking change. For e.g., this PR considers adding a function with a new name as non-breaking, but that's not even true, as I've shown in #18468 (comment): simply adding a new API Plenty of other changes that seem innocuous (fixing a bug, adding an inline pragma which makes std/random 10x faster (it does), making a proc generic to generalize an API, adding a new API, changing hashing to avoid major slowdowns, changing std/random to avoid non-random behavior etc) are in fact a breaking change for some use case (whether the use case predates the change or not). Under this policy, a ton of PRs and bug fixes would have to be reverted (see some examples in #18468 (comment) and #18468 (comment)), a ton of changes between 1.0 and 1.2 or 1.2 or 1.4 would violate this policy, not to mention most RFCs. At this point, might as well not issue any new minor point releases, and just start shipping nim 2.0, then 3.0 etc, without intermediate point releases. Simply deprecating APIs and keeping behavior immutable isn't the solution, as shown in #18468 (comment); a change to This is the 1 vs N churn: you either fix the bug in 1 place and make users of 1 package that relies on the old behavior un-happy, or you pass on the problem to all (transitive) clients of the API and require all of them to change their code, making everybody unhappy, and causing balkanization of APIs and a maintenance nightmare. Simply deferring breaking changes to nim 2.0 isn't a practical solution either, because nim 2.0 will also have bugs and waiting for 3.0 to have those bugs fixed isn't practical unless we decide to forgo of minor point releases. A better, more practical criterion for breaking changes is to assess impact, as done in rust (https://rust-lang.github.io/rfcs/1122-language-semver.html); impact can be assessed as follows:
As to when those changes should be made available, I argue in #18486:
Most languages frequently introduce necessary breaking changes; they improve not in spite of but thanks to judicious introduction of breaking changes, whether it's Python, Java, D, Rust, Swift, etc; the product they offer improves release after release and, despite some unavoidable complaints, the majority of people welcome those changes.
In fact, you can argue that introducing new overloads for an already overloaded proc is an "implementation detail" (we already trusted the overload resolution process before to get it right) whereas entirely new procs are "breaking" changes.
Well, the policy would be adhered to for new releases; not a big deal if the policy actually works.
Semver is just broken and empiricism is superior. Also, versions are used for marketing; Nim 2.0 should be a significant release, not just yet another release because of semver.
I'm not sure about that - i.e. the maintenance nightmare is when a base package changes and now all your dependencies must follow suit - this is the coordination problem we're trying to avoid by constraining the cowboy approach - in particular, because most changes can be introduced in an additive manner without said disruption if you merely give it some thought. It is true that the result is not as pretty as it could be were Nim a greenfield project - the way out here is to roll the changes out as opt-in for a few releases, then make a breaking change.

Above all though, this is a problem that is only going to grow worse for every module added to the standard library - that's why the default reaction to a broken API is to move it out of the std lib and evolve it outside - this way, we split up the problem into pieces where each piece moves independently, and none of these problems exist any more. This is a much more powerful way of addressing things because it scales: it's not artificially held up by coordination overhead with the rest of the language and library.

Again, maintaining a codebase of any size on top of a constantly moving target, such as we have in the real world, is a real and concrete maintenance nightmare, while the "balkanization" nightmare remains a theoretical construct. Everything sits on top of the standard library, hence everybody is affected when it changes - that's a fact - the theoretical future users are easy to inspire fear with, but the reality is that people are smart enough to figure things out and move on, incredible as this may sound.
adding overloads is usually fine - except when they're in the wrong module and a match-all generic overload exists
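A small sketch of that failure mode (module and proc names are made up): user code relies on its own catch-all generic, and a later release of a library it imports unqualified adds a concrete overload that silently wins overload resolution.

```nim
# Hypothetical user code: a catch-all generic that handles every type.
proc describe[T](x: T): string = "user generic"

# Suppose a library the user imports (unqualified) later adds:
#   proc describe*(x: int): string = "library int"
# A concrete proc is a better match than a generic one, so the call below
# silently switches from "user generic" to "library int" after the upgrade,
# with no compile-time error or warning to point at the change.
echo describe(42)
```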
er, semver is a communication tool to communicate the result of the empiricism to your users in an asynchronous best-effort manner
Well, surely you can simply use empiricism ("we follow these rules and test releases against an ever growing set of real world code") without following semver. There is no proof that semver significantly improves the "best-effort"-ness that everybody ends up doing, and plenty of successful software projects do not follow semver (yet remain backwards compatible). In fact, semver assumes that bugfixes are much less harmful than new features, and that's just wrong, especially for a compiler. (Compiler bugfixes make the compiler stricter and do break code; a new feature can use new syntax that was previously a syntax error, so by construction you know the new feature doesn't break code.)
I dunno, I just see it as a way for the author of a library to communicate their best knowledge of the changes to the users of the library etc - i.e. "I fixed this bug" vs "I knowingly made a mess for you and you need to work to upgrade" - there's no real value statement about harm or importance in that, merely pragmatism.
Like any policy, one can get hung up on strict interpretations, or focus on the intent of the policy - in the former case, one makes absurd examples, while in the latter, one makes a common-sense judgement. The intent with this policy in particular would be that it's of the latter kind, where it's applied judiciously - for example, if it's applied to …

Likewise, changing the semantics of a function which previously explicitly guaranteed non-nil values such that it suddenly starts handing out nil values is the kind of change where the common-sense answer is obvious.
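As a sketch of why that particular kind of change bites at runtime (names are hypothetical), consider callers written against a documented non-nil guarantee; they have no nil checks by design, so relaxing the guarantee shows up as crashes in their code rather than as compile errors:

```nim
type Node = ref object
  name: string

proc findNode(name: string): Node =
  ## v1 contract: never returns nil; raises on failure instead.
  Node(name: name)

# Caller written against the v1 contract: dereferences without a nil check.
echo findNode("root").name

# If a later version starts returning nil for "not found", every call site
# like the one above becomes a potential nil-dereference crash at runtime,
# and the compiler has nothing to warn about.
```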
The order is …
Moving things around or splitting the stdlib into separate repos isn't magically going to fix problems. Take a look at fusion: 68 commits since it started (in April 2020); in the same timespan, Nim saw 2518 commits, or 37x more; even if you restrict it to the lib sub-directory, that's still 1201 commits vs 68. Monorepos are easier to maintain and grow because there are fewer moving parts and fewer dependencies, it's easier to maintain consistency, and you don't have synchronization issues when a change involves 2 repos. See also the background story around nim-lang/RFCs#310
You might say, let's use a decentralized stdlib, maybe even pkg/os, pkg/sequtils, but that'll only make things worse: fewer maintainers, fewer reviewers, a lower quality bar, less trust when using those packages and, yes, increased chances of dependency conflicts. Even LLVM realized that and transitioned to a monorepo containing both compiler and standard library; why do you think that is?
Indeed it does not, but what arnetheduck needs is a way to pin down dependencies in a more fine-grained manner than "requires Nim version X", where X combines 10 very good bugfixes with 3 runtime-behavior changes that are quite risky.
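For illustration, a sketch of what that finer-grained pinning could look like in a `.nimble` file if more of the standard library were separately versioned packages; `stdjson` is a made-up package name, and today only the whole-compiler pin on the first `requires` line is actually available for stdlib code:

```nim
# mypackage.nimble (sketch; "stdjson" is a hypothetical package)

version     = "0.1.0"
author      = "example"
description = "example package"
license     = "MIT"

# Today: the entire stdlib is pinned implicitly by the compiler version,
# bugfixes and risky behaviour changes bundled together.
requires "nim >= 1.4.0"

# With separately versioned stdlib components, a risky module could be pinned
# or held back independently of compiler bugfix releases.
requires "stdjson >= 1.2.0 & < 2.0.0"
```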
This will always be the case. Even if the stdlib isn't a factor, a new compiler version will combine 5 different bug fixes with 10 different features. Unfortunately, if you need one of those bug fixes then you'll need to swallow the other changes and test that your software still works with them; if you can prove that it does not, then you can ask for a special backported version that fixes the bug that you need fixed.
It can be mitigated though by not making these risky runtime-behavior changes (instead deprecate the offending proc) and/or by having a fixed number of them per release. And the fact that there is no agreement on some of these changes implies that "just use common sense" is too limited and we should have more policies. (And I cannot believe that I wrote this last sentence, as it's against my beliefs...)
I think above all we have lots of software and libraries that we want to write in Nim and would like to focus on that, instead of chasing aesthetic changes. Crucial in that process is having a stable base to build on - that means that the core libraries of the language don't change frivolously - semantically changing core libraries breaks idioms and introduces new failure modes - this applies to the parts of them that are covered by test suites, but also to the informal parts that rest on principles and precedent.

When a module doesn't work out the way it should, it's easy to create a new one that does, assuming that it is truly that important, and it can easily be done without changing the existing one. Many in the Nim community have done so already, creating successful libraries and improvements to things in the standard library, to the point that they get used in spite of the significant convenience subsidy that the standard library offers. This is the power that we should be encouraging and unleashing instead of entrenching an unscalable model further. The amount of work needed to write a new module is directly related to the amount of work you're breaking when you change the existing one - is introducing …?

The baseline for the precedent was set with the 1.0 release - for the sake of argument, let's say it's 1.4 even - we will obviously not be going back from there. The question is what to do next - keep breaking things and thus discouraging people from investing their time in using Nim for production projects, or draw a line and start applying a higher standard to some standard library modules and move the rest to packages where they can be independently improved at an appropriate pace?

Let's face it - some modules have outlived their time in a way that they can't be fixed without breaking them and significantly changing the way they work - the right place for such radical changes is not together with the compiler and the core standard libraries without which the language becomes meaningless.
I'm curious: where did you draw all these conclusions from? There's nothing in this proposal that says that the separate packages have to live in a separate repo, be maintained by other people, or be held to a lower quality standard. Of course, some libraries are so out of place that the only obvious and correct choice is to move them to a separate repo, where they will likely fall out of use since better alternatives exist, but that doesn't necessarily apply to everything.

On the contrary - the pace and difficulty of making a Nim release speak to the opposite - instead of being able to release a highly critical bugfix to components like the JSON parser or HTTP server, the community must wait for months while the maintainers of the distribution meticulously go through all completely unrelated changes, coordinate with all packages that broke as a result of changes in those unrelated packages, fix all new bugs introduced by new features and crusades, and then release.
That's because I keep bringing them up, I think. cough :-/ |
Ergo the need for a smaller standard library and separately versioned components thereof - this is a win-win for everyone: upgrades to the compiler are easier, upgrades to applications are easier, and all these releases can be made in a more timely fashion instead of one thing blocking the other.
Merged my PR instead (based on yours). |