Developing Nim's stdlib and a Nim distribution #173

Closed
Araq opened this issue Oct 24, 2019 · 62 comments
Comments

@Araq
Member

Araq commented Oct 24, 2019

Design guidelines for Nim's stdlib

We plan to create a "Nim distribution" which consists of the Nim compiler, Nimble and a selected/curated set of Nimble packages. The idea is to have the best of both worlds:

A Nim with a stdlib that can be maintained effectively by the Nim core developers and yet also something that has "batteries included". Nevertheless there still is a stdlib and it remains part of Nim's core.

Can the Nim compiler itself depend on a curated Nimble package? In the future yes, in the beginning, it shouldn't. We have to be conservative with Nim's core. This brings us to our first requirement:

(1) What the compiler itself needs must be part of the stdlib.

This is probably a temporary requirement until the "Nim distribution" has been implemented and tested successfully for a couple of months.

The second requirement should be uncontroversial:

(2) Vocabulary types must be part of the stdlib.

These are types most packages need to agree on for better interoperability, for example Option[T]. This rule also covers the existing collections like Table, CountTable etc. "Sorted" containers based on a tree-like data structure are still missing and should be added.

Time handling, especially the Time type, is also covered by this rule.

(3) Existing, battle-tested modules stay

Reason: There is no benefit in moving them around just to fulfill some design fashion as in "Nim's core MUST BE SMALL". If you don't like an existing module, don't import it. If a compilation target (e.g. JS) cannot support a module, document this limitation.

This covers modules like os, osproc, strscans, strutils, strformat, etc.

And finally:

(4) New stdlib modules do not start as stdlib modules

Nim distribution

I imagine the "Nim distribution" to work like this: We have the usual Nim tarball with a dist/ directory that contains the set of selected packages we agreed on.

Adding a package

Every package in there must be voted into the distribution. The majority decides about whether to include the package or not.

A package must be at version 1 or later in order to be considered for inclusion. Ideally we can use the master branch of the github repository for inclusion.

After the decision to add it was made, a review process should start. The review should be done by the distribution maintainers.

The review process should focus on:

  • Quality and amount of tests.
  • Quality and quantity of the available documentation.
  • Overall quality. (For example, do the procs have names that make sense?)
  • Security implications.

It should not focus on:

  • Compliance with our style guideline.
  • The number of spaces used for indentation.

Removing a package

Ideally packages are not removed. It's a package others depend on. We should fork the package to ensure it stays online. It's acceptable if the development on a package has stopped. If the community decides that the package A has been superseded by a different package B the distribution can start to include B and deprecate A.

Keeping the packages up to date

CIs ensure the tests are green all the time. The distribution itself will be version controlled and the packages are tied to a specific git commit that has been reviewed. There is a tension here between "use what is known to be stable" and "use the latest" and probably we should support both, default is "stable", and "latest" not only means "latest" but also "unsupported and not reviewed".

Packages can be updated individually via some command like koch update xyz.
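
The "stable" versus "latest" distinction could be modelled roughly as below. This is only a sketch under stated assumptions: the manifest entry shape, the `resolve_ref` helper, the channel names as strings, and the example URL and commit hash are all hypothetical and not part of any existing Nim or koch tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedPackage:
    """One entry of a hypothetical distribution manifest."""
    name: str
    url: str
    reviewed_commit: str  # the commit the distribution maintainers reviewed


def resolve_ref(pkg: PinnedPackage, channel: str = "stable") -> str:
    """Return the git ref to check out for the given channel.

    "stable" uses the reviewed commit; "latest" follows the tip of the
    default branch and is therefore unreviewed and unsupported.
    """
    if channel == "stable":
        return pkg.reviewed_commit
    if channel == "latest":
        return "HEAD"
    raise ValueError(f"unknown channel: {channel!r}")


# Example entry; the URL and commit hash are made up for illustration.
example = PinnedPackage(
    name="xyz",
    url="https://example.org/xyz.git",
    reviewed_commit="0123abc",
)
print(resolve_ref(example))            # the reviewed commit (default channel)
print(resolve_ref(example, "latest"))  # tip of the default branch
```

Making "stable" the default argument mirrors the suggestion that users must opt in to unreviewed code explicitly.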

Benefits

  • Newcomers do not have to spend time figuring out which packages are production ready.
  • Users behind a firewall that cannot easily use Nimble packages can use the packages we ship with the distribution.
  • We hope the process encourages package developers to focus on package quality and documentation.
  • Some protection against the current problem that a dependency might disappear under your feet.

Disadvantages

  • It's a lot of work to maintain. Yes, and ideally we try it out and try to measure its success after a couple of months.
  • Not including package X might make X's creators unhappy. However since there is a voting process involved, I hope the discussions stay civilized and reasonably objective.
  • Old cruft will accumulate. Sure, but that's better than having the "cruft" removed below your feet too quickly.
@rayman22201

I want to elaborate on why I think point 4 is very important

(4) New stdlib modules do not start as stdlib modules

The process for contributing to the standard library as an outside contributor is currently too difficult for some areas.

A high quality stdlib is important. The stdlib is relied on by a lot of people. A tough code review process is not a bad thing in principle, but in practice, the current situation has scared off some valuable contributors, and hindered Nim's development.

There are issues when there is no clear concept of an "owner" of a stdlib module who can properly moderate and make executive decisions.

Such an "owner" needs to have a reasonable domain knowledge, familiarity with the code base, and time to stay up to date with all code review comment threads.

Without a clear leader, making a PR can turn into a yak shaving exercise, trying to convince several strong personalities who may never agree on fundamental issues. This leaves the PR to languish and turns off potential contributors.

The other possibility is that, due to lack of manpower, a PR sits with no progress (not even a comment) for a long period of time. Even a "close, won't accept" would be welcome here. Any feedback at all.

These situations are not the same as Issues or Bug reports. These are people that have spent their valuable time writing code for the project. They deserve timely feedback.

This isn't theoretical. These scenarios have actually happened several times here in Nim land.
Nobody wants their PR to sit in limbo for weeks.

If each module is its own repo, the repo owner can make final decisions quickly, making the process to contribute much easier.

The Nim core team is small. They don't have the manpower or domain expertise to do this for a huge standard library with many modules.

This proposal allows the core team to focus on what they are good at, and spread some responsibility to the community. The core team can focus on being trusted curators, and leave the domain specific expertise to the domain experts.

This is exactly how Linux distros work. It has its own challenges, but it's a fairly successful model IMO.

The community has already done a good job of filling in gaps in the standard library, creating better alternatives to some stdlib modules. This proposal leverages what is already happening in the ecosystem, and allows everyone to benefit from it in a more formal way.

@dom96
Contributor

dom96 commented Oct 24, 2019

Packages can be updated individually via some command like koch update xyz.

My god, please no.

A package must be at version 1 or later in order to be considered for inclusion.

How many existing packages fit this definition? What you'll end up doing with this proposal is encouraging people to artificially call their packages v1.0.0 to get them included in "the Nim distribution".

I propose alternative solutions to the problems that you say this proposal solves:

Newcomers do not have to spend time figuring out which packages are production ready.

This is what tags in Nimble are for, designate a tag and mark packages which are "production ready" with it. Then just give them a link: https://nimble.directory/search?query=nim-production-approved

Users behind a firewall that cannot easily use Nimble packages can use the packages we ship with the distribution.

Maybe there is more to this, but if users are behind a firewall then they are more often than not offered a proxy, which Nimble supports by the way. I don't understand this problem at all. Maybe the problem is actually different? Are there companies which do not allow installation of packages and/or need pre-approval for each new component that is installed?

Some protection against the current problem that a dependency might disappear under your feet.

This is something that should be resolved with a proper, official and centralised, package website where users can publish their packages (i.e. upload them). You should pour resources into that instead of creating this distribution.

@rayman22201

Are there companies which do not allow installation of packages and/or need pre-approval for each new component that is installed?

Yes there are. Many companies in the financial sector work this way.

@rayman22201

rayman22201 commented Oct 24, 2019

This is what tags in Nimble are for, designate a tag and mark packages which are "production ready" with it. Then just give them a link: https://nimble.directory/search?query=nim-production-approved

That is not mutually exclusive to this proposal. This proposal has a huge extra benefit:

In theory, that set of packages and their dependencies have all been tested together, so that you know they won't interfere with each other.

It's the same way Linux distros work. You trust that if you apt-get some package that is part of Ubuntu stable it is going to be compatible with other software you have installed on your machine.

Correct me if I'm wrong about my assumptions here.

@dom96
Contributor

dom96 commented Oct 24, 2019

In theory, that set of packages and their dependencies have all been tested together, so that you know they won't interfere with each other.

In what way could they possibly interfere with each other? What is the proposal for testing that they are compatible?

The only advantage this will have is perhaps compatibility with the Nim version that the packages are bundled with. Honestly though, we don't even have good test coverage for the stdlib, it should really be a priority to improve that instead of creating a distribution and supposedly testing it.

@dom96
Contributor

dom96 commented Oct 24, 2019

Yes there are. Many companies in the financial sector work this way.

In that case I doubt a distribution where packages are voted on will solve this problem for them. There will always be packages that they wish to use for which they will need to gain approval.

Also, I assume that all of the packages in the distribution will need to be audited. Doing so for all packages that the community deems appropriate to include in the distribution will be a much larger burden than doing so for a select few packages that the financial institution requires for their software.

@Araq
Member Author

Araq commented Oct 24, 2019

Here is what the distribution would avoid: You have modules A and B, A depends on C version 1, B depends on C version 2 (incompatible with version 1).

A list of Nimble packages doesn't achieve the same.
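
To make the A/B/C scenario concrete, here is a small sketch of the kind of check a curated distribution could run before admitting packages. It assumes each package declares the major version of every dependency it needs; the `find_conflicts` function and the input shape are hypothetical, not an existing Nimble feature.

```python
from collections import defaultdict


def find_conflicts(requirements):
    """Find dependencies required at more than one major version.

    requirements maps a package name to a dict of
    {dependency name: required major version}.
    Returns {dependency: {major version: [packages requiring it]}}
    for every dependency with incompatible requirements.
    """
    wanted = defaultdict(lambda: defaultdict(list))
    for pkg, deps in requirements.items():
        for dep, major in deps.items():
            wanted[dep][major].append(pkg)
    return {
        dep: {major: sorted(pkgs) for major, pkgs in by_major.items()}
        for dep, by_major in wanted.items()
        if len(by_major) > 1
    }


# The scenario from the comment above: A needs C v1, B needs C v2.
candidate = {"A": {"C": 1}, "B": {"C": 2}}
conflicts = find_conflicts(candidate)
print(conflicts)
```

A distribution that only admits package sets for which this returns an empty dict ships exactly one version of each dependency.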

@dom96
Contributor

dom96 commented Oct 24, 2019

So you'll remove any packages that have this incompatibility? That could make what's included in the distribution quite volatile.

The real solution to this problem is to implement support for it in the compiler. C v1 should be considered a separate package by the compiler to C v2 somehow.

@Araq
Member Author

Araq commented Oct 24, 2019

So you'll remove any packages that have this incompatibility?

No, they won't be added in the first place, only one version of C can make it into the distribution.

The real solution to this problem is to implement support for it in the compiler. C v1 should be considered a separate package by the compiler to C v2 somehow.

I don't agree, it's not the compiler's job to enable a solution that is at best a momentary unfortunate situation and at worst caused by incompetence.

@rayman22201

In that case I doubt a distribution where packages are voted on will solve this problem for them. There will always be packages that they wish to use for which they will need to gain approval.

Also, I assume that all of the packages in the distribution will need to be audited. Doing so for all packages that the community deems appropriate to include in the distribution will be a much larger burden than doing so for a select few packages that the financial institution requires for their software.

All this is true.
The company obviously must choose and audit the packages they require either way, but if the mechanisms to support such a "sealed" distribution are already in place, it would be a very attractive feature...
Then they just have to pick the packages they want, instead of building the entire infrastructure. NPM has a whole business model based on this idea, with self hosted registries.

@dom96
Contributor

dom96 commented Oct 24, 2019

So you'll remove any packages that have this incompatibility?

No, they won't be added in the first place, only one version of C can make it into the distribution.

Yes, but the problem is that dependencies change as packages are updated. You will find that either removing a package altogether or keeping them frozen in an old state is the only way to keep compatibility. That is what I meant by "volatility".

@dom96
Contributor

dom96 commented Oct 24, 2019

The company obviously must choose and audit the packages they require either way, but if the mechanisms to support such a "sealed" distribution are already in place, it would be a very attractive feature...
Then they just have to pick the packages they want, instead of building the entire infrastructure. NPM has a whole business model based on this idea, with self hosted registries.

Of course, but Araq isn't proposing an NPM-style model. He's proposing an official distribution which the community chooses. My point is that no financial institution will be happy with this because each will have different requirements.

We should work towards an NPM-style solution, with an ability to allow these institutions to create self-hosted registries. This is the sustainable way forward and may actually help Nim be financially sustainable (of course, NPM is on a whole different scale... but on the other hand they are hugely profitable AFAIK)

@rayman22201

Of course, but Araq isn't proposing an NPM-style model. He's proposing an official distribution which the community chooses.

Again, these are not mutually exclusive.

  • There is the infrastructure to create distributions.
  • The official community distribution.
  • The ability to create custom distributions for your organization.

The community distribution is simply an "example" so to speak, that has certain high standards for compatibility. Again, this is exactly what Linux distros do. It is not a radical idea.

@Araq
Member Author

Araq commented Oct 24, 2019

My point is that no financial institution will be happy with this because each will have different requirements.

How so? Previously they accepted nim-1.0.0.tar.xz as a trusted thing programmers are allowed to use, afterwards they do the same for nim-distribution.tar.xz which fulfills the same standards that nim-1.0.0.tar.xz did. The point is that it's easier to get permission for 1 software package as opposed to 10 different dependencies.

@dom96
Contributor

dom96 commented Oct 24, 2019

Again, these are not mutually exclusive.

Yes, but they each take time, and Nim as a whole has limited resources. We should be spending those resources wisely, on a solution that works long-term and for more use cases.

@dom96
Contributor

dom96 commented Oct 24, 2019

How so?

Each financial institution will have a different set of packages that they want.

Previously they accepted nim-1.0.0.tar.xz as a trusted thing programmers are allowed to use, afterwards they do the same for nim-distribution.tar.xz which fulfills the same standards that nim-1.0.0.tar.xz did. The point is that it's easier to get permission for 1 software package as opposed to 10 different dependencies.

So as long as we put some packages together and call it a distribution the financial institution will just happily trust us? I find that hard to believe.

@rayman22201

Yes, but they each take time, and Nim as a whole has limited resources. We should be spending those resources wisely, on a solution that works long-term and for more use cases.

I agree, but we disagree on where those resources should be spent.
I think this will provide the most good long term.

See my initial post about contributing to the stdlib today.

@Araq
Member Author

Araq commented Oct 25, 2019

We should be spending those resources wisely, on a solution that works long-term and for more use cases.

Everything you suggested so far is more expensive than my proposal.

@dom96
Contributor

dom96 commented Oct 25, 2019

Everything you suggested so far is more expensive than my proposal.

My suggestions also solve more problems and are effectively inevitable. You might as well put resources into them because you will have to do so eventually anyway.

I think this will provide the most good long term.

This is where we disagree. I strongly don't think this will provide much benefit to our users, even in the short-term, much less in the long-term.

@Araq
Member Author

Araq commented Oct 26, 2019

No, they solve different problems and fail to see that there are valid reasons behind the quite common stance, "I won't use X, it's a dependency".

@andreaferretti

We should work towards an NPM-style solution, with an ability to allow these institutions to create self-hosted registries.

It is already super easy to create a self-hosted registry. Steps:

  1. Mirror the git repositories for packages you trust on a local git server (most big institutions already have a git server anyway)
  2. Copy packages.json, edit it to only keep trusted repositories and change the URLs to point at the local git server. Host this file somewhere, possibly on the same git server as above
  3. Use Nimble configuration to point to your local packages.json

    [PackageList]
    name = "CustomPackages"
    url = "http://mydomain.org/packages.json"
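
Step 2 of the list above (trimming packages.json and repointing the URLs) could be scripted roughly as follows. This is a sketch, not an official tool: it assumes each entry is a JSON object with at least `name` and `url` fields, and that the local mirror exposes each repository at `<mirror base>/<name>.git`. The function name, the sample entries and the mirror URL are hypothetical.

```python
import json


def build_local_package_list(packages, trusted, mirror_base):
    """Keep only trusted entries and point their URLs at a local mirror.

    packages:    list of dicts loaded from the upstream packages.json
    trusted:     set of package names that have been approved
    mirror_base: base URL of the local git server (assumed layout:
                 <mirror_base>/<name>.git)
    """
    base = mirror_base.rstrip("/")
    local = []
    for entry in packages:
        if entry.get("name") not in trusted:
            continue
        rewritten = dict(entry)  # copy so the upstream data is not mutated
        rewritten["url"] = f"{base}/{entry['name']}.git"
        local.append(rewritten)
    return local


# Made-up sample data standing in for the upstream package list.
upstream = [
    {"name": "foo", "url": "https://github.com/someone/foo", "method": "git"},
    {"name": "bar", "url": "https://github.com/someone/bar", "method": "git"},
]
local = build_local_package_list(
    upstream, {"foo"}, "http://git.mydomain.org/mirror/"
)
print(json.dumps(local, indent=2))
```

The resulting list would then be written to the packages.json file that the `[PackageList]` entry points at.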

@dom96
Contributor

dom96 commented Oct 28, 2019

I agree completely with your assessment @andreaferretti. But the big factor here is "big institutions", these will have enough scale to maintain something like this. But if I'm a small startup I don't want to mess with this, I just want a hosted solution that works automagically for me.

Indeed, we should reuse this functionality for our custom NPM-like solution, or at least evaluate it to see its limitations.

@dom96
Contributor

dom96 commented Oct 28, 2019

No, they solve different problems and fail to see that there are valid reasons behind the quite common stance, "I won't use X, it's a dependency".

Just because something is part of the "Nim distribution" doesn't change the fact that the package is still a dependency...

@Araq
Member Author

Araq commented Oct 28, 2019

But it does change that! Just like today's (rather silly IMO) stdlib's htmlparser is not an extra dependency, the same would be true for everything in the Nim distribution. You don't need to review it for security problems, because somebody else did, it won't be removed all of a sudden, it's available out of the box with Nim...

@kidandcat

kidandcat commented Oct 28, 2019

I'm with @dom96 in the sense that the resources we have are limited, and we should focus on smaller steps.

For example, I would propose to keep important packages under the nim-lang org on GitHub and give maintainers collaboration access. This way, if a maintainer disappears, another person can take over maintaining the package without needing to fork it and leave the old one behind.

If you don't want to make those packages look "officially supported", just create another org, nim-community or similar, but you should have control over the packages, because one thing I see a lot is two packages for the same functionality, one with 100 stars but a last commit 3 years old, and another updated yesterday but with just 15 stars.

But I would solve those problems @Araq has listed one by one instead of trying a big solution like this distribution.

@mratsim
Collaborator

mratsim commented Oct 28, 2019

Many people don't want to do "nimble install foo" even if that foo is https://github.com/nim-lang/zip/. Putting popular packages under "nim-lang" wouldn't solve that.

They either have restrictions on their machines or firewalls (i.e. financial institutions and believe me, there might be tug of war between devs and either sysadmins or IT security), or they want the "genuine" default experience.

The solution is that once one package becomes popular, people propose to be co-maintainers. Possibly we could have say "nim-webdev", "nim-science", "nim-games" organizations if a domain becomes quite big but that should be happening organically. What the Nim community might do is maybe allow subforums in the Nim forum so that people have a privileged place to discuss that.

@Araq
Member Author

Araq commented Oct 28, 2019

But I would solve those problems @Araq has listed one by one instead of trying a big solution like this distribution.

I do not consider it a "big solution", I consider it the "smallest solution that could possibly work". The only thing that bothers me is that it adds maintenance costs. However, every other solution also adds maintenance costs of some sort, it's inevitable. And also hopefully people chime in to mitigate this cost.

@c-blake

c-blake commented Oct 28, 2019

A little wrinkle that hasn't been included in this discussion is package-level documentation. One of my packages of interest, https://github.com/c-blake/cligen in its simplest usage style requires "definitely some" because it's a rare approach with a couple rare features, but "not very much" documentation because usage "scales down" so very far.

For a user already committed to learning some things this is not an issue. For "fly by" potential users, I've hoped that I could keep their attention for at least 4..6 paragraphs and 1 code snippet at the top of the README.md. This probably seems like I am arguing against the proposal which I am not, really. It's just a "property" of the proposal. The usual Nim distro has an integrated documentation system. Maybe some thought about how this interacts with that is warranted.

I am fine with cligen being included in some dist/ sub-directory, and am generally in favor of this idea. In general, I personally avoid dependencies enough/have enough sympathy with others doing the same that I just dumped a bunch of general utility code with a Unix/Linux bias into cligen/ that may make sense to lift out. Another issue that may impact other packages.

Another reason unmentioned so far (but perhaps vaguely related to @mratsim's genuine default experience) for avoiding dependencies is that "System" package managers like rpm, apt, portage, etc. and "Language" package managers like nimble, pip, etc. usually do not know about each other or each other's files. So, you get these incoherent set-ups/installs where everything is in someone's home dir or it's in some system dir, but the files are "orphaned" relative to the system package manager, etc. Sometimes things like the site-packages (pioneered by Emacs) let you have a hybrid/half-and-half incoherency. E.g., pip and system package managers both use site-packages. So, pip list will see all the packages a system package manager put into some Python site-packages, yet pip install will not register its files with the system package manager. So then, tools that tell you what package owns some file break. Let's just say there is some rationality to "avoiding this whole mess" or sympathizing with those who do. Fewer dependencies is one way to make the mess at least smaller while "let a thousand packages bloom" makes the problem worse.

{ If someone has some other ideas to make that mess smaller, it may be a good topic for a related but distinct RFC and/or nim/nimble issue. My best idea is some kind of "generator" for the usually <12 system package managers from language package descriptions and engaging with the people who maintain system package repos. }

@dom96
Contributor

dom96 commented Oct 28, 2019

Many people don't want to do "nimble install foo" even if that foo is https://github.com/nim-lang/zip/. Putting popular packages under "nim-lang" wouldn't solve that.

You make it sound like 80%+ of us don't want to install packages via Nimble, which I seriously doubt is the case. I don't doubt that there are people working for financial institutions who have these restrictions, but I want to hear from them, and I would ask you to stop exaggerating how many of these people exist.

They either have restrictions on their machines or firewalls (i.e. financial institutions and believe me, there might be tug of war between devs and either sysadmins or IT security), or they want the "genuine" default experience.

As I mentioned previously, whatever restrictions the users have with regards to firewalls can be worked around by proxies. If IT needs to sign off on every single package then I don't see a reason why they wouldn't need to sign off on every single package inside a Nim distribution.

A "genuine" default experience is nice, it works well for anaconda for example. But I seriously doubt we've got anywhere near enough mature packages that could be bundled up into a useful distribution.

@c-blake

c-blake commented Oct 28, 2019

Just a vote, not a survey, but I hate having to install packages via nimble (or pip or any language package manager). If someone wrote a nimble2x where x included ebuild then I would use that to manage a private package repo, install out of that to system directories and hope someday the ebuild(s) could get into a portage tree maintained by others.

This would allow, among other things, packages not written in Nim to depend upon, say, a command line utility written in Nim, or programs written in Nim to depend upon things not written in Nim, such as certain versions of C libraries that could be auto-installed as dependencies via the system package manager request to install some Nim program.

@mratsim
Collaborator

mratsim commented Oct 29, 2019

@dom96 One example: https://irclogs.nim-lang.org/30-03-2018.html#08:08:48

I mean, I don't like ripping code from other people or setting other people's code as a dependancy

Also I did work in financial institutions (4 years) on the ops side and basically the discussion with developers was "don't do that" or "disclaimer: all damages to the company due to this unapproved code will be supported by your department budget". And this was for an unapproved unzip library on AS400.

What a financial institution is looking into before choosing a package is insurance in the form of a support package. They want to know that should they use Nim for a core part, they can rely on a fixed subset of Nim + packages + locked versions that work together with the possibility to ring the provider any time for emergency fixes that may cost hundreds of thousands or a huge loss of reputation. They will gladly pay an annual fee for such insurance.

Obviously, given Nim's size, they will probably start with an internal team, but as the reliance on Nim grows, that team will be moved to the core business / value addition of the company (the financial institution's proprietary algorithms) and the financial institution will look into offloading dependency support to an external provider for various reasons:

  • it's easier to pay external services than dealing with hiring and training
  • it looks better on the balance sheet and it's much easier for a company to justify spending for a service than justify spending for an employee (hence the mind-boggling number of consultants that are basically full-time employees).
  • you can ask for legal compensation from an external provider in case of a contract/service issue, you can't really do that from employees
  • a single point of contact that manages everything (and not 5 different package authors)

Now I agree that a tool to "generate your own Nim distro" would be great, but I think it's best to start with a proof-of-concept "Nim important packages" distro, get it out, and see how people like it. This would make it very easy for people to try Nim in ix.io or in their own Docker with as little friction as possible, and to write non-trivial useful programs that need more than the standard library (say npegs or SDL2 or bigint or crypto or Arraymancer).

Then we can write the "generate your own distro" layer for the case where someone wants a "Nim distro for finance" or "Nim distro for science" and wants to contract a company to maintain it.

Alternatively, instead of being by domain, those distributions could be by security properties, say code that has been audited, code that only uses safe features of Nim, no warranty code, similar to Ada Spark.


Now, that said, assuming we only had one person with the choice of working on either nimble or the distribution the priority should be in making nimble better over the distribution aspect because I agree that the vast majority of users will happily use a package manager.

For scalable usage, nimble needs in my opinion:


However, from a time and resource perspective, it's not an either/or choice between Nimble and the distribution.

@Araq works on Nim full-time, and probably can provide a distribution in less time than you or anyone who wants to tackle lock files as a project on the side of a full-time job. Furthermore while we could also ask @Araq to work on Nimble instead, I would argue that he should be the last one working on it as he doesn't use any packages, everything is in the standard library and so he doesn't have to deal daily with nimble limitations.

@pmetras

pmetras commented Oct 29, 2019

The reason I support Araq's proposal is that I prefer to have a small subset of high quality libraries instead of thousands of low quality packages, and to concentrate support efforts on the libraries in the distribution. When Nimble improves with features to identify valuable packages, when the community is larger, the initial goal of distributions disappears. Look at Debian: there are hundreds of thousands of DEB packages on the Web, but only 59,000 are included in the latest Buster release. When you stick with packages from the distribution, you have the insurance that they will work together without trouble. This type of insurance is important for companies and Nim beginners.

@alehander92

@pmetras but that's not how third party ecosystems work: popular ecosystems lead to people working on their own packages or even having a choice between many quality packages for the same thing, it's a bit like free market vs a big state imo (i know this metaphor is overused): it's hard to expect a very minimal team that already has a huge amount of work on the language to somehow maintain a huge library suite as well.

I think the idea of a distro is good in principle, but look at Go, C++, Python, Ruby, Java etc: how often do you see distributions except for niche cases like python science? my point is that having an active ecosystem is much more critical than having distributions, and that it seems it's not a problem for many of those much more popular languages/ecosystems (correct me if i am wrong)

@alehander92

@pmetras sorry, i now realized you argue about something similar, and distributions mostly in the beginning, i agree with that in a way, but i still want to point out that making the ecosystem bigger is more important: not sure how a distribution applies tho, maybe it helps

@dom96
Contributor

dom96 commented Oct 29, 2019

Honestly, our package ecosystem is so immature that you won't be able to create a stable distribution that's useful to anyone.

Never mind creating a distribution that's useful for a financial institution!

> Furthermore, while we could also ask @Araq to work on Nimble instead, I would argue that he should be the last one working on it, as he doesn't use any packages: everything is in the standard library, so he doesn't have to deal daily with Nimble's limitations.

@mratsim I disagree, @Araq has worked on Nimble and should work more on it. The creator of Nim avoiding such a major aspect of the language doesn't do our users any favours.

@disruptek

It's a long read, so I decided to write it only once; this is how I'm planning on doing a distribution generator:
https://github.com/disruptek/nimph

It should fix my personal nimble pain points with more of an embrace and extend attitude so that a rising tide there will lift all boats. I know it won't meet a lot of the concerns here, but I did try to meet some. Worst-case scenario, I'll offer another failed research experiment. 😉

@Araq
Member Author

Araq commented Oct 30, 2019

> Honestly, our package ecosystem is so immature that you won't be able to create a stable distribution that's useful to anyone.

What?! We have nim-regex that's better than our stdlib packages, better packages to do Pegs, better packages to do serialization, a couple of useful UI libraries, ORMs, ...

> The creator of Nim avoiding such a major aspect of the language doesn't do our users any favours.

IMHO a better Nimble cannot solve the inherent fragility of a distributed system. But I've said it before, the points I raised are not solved by a perfect package manager.

@kidandcat

kidandcat commented Oct 30, 2019

You cannot argue that a decentralized solution is bad when no language exists that can be used without third-party dependencies. It is simple: if you develop the language plus its package dependencies yourselves, you are, how many, 2 developers? 5-10 if you get a lot of help?

If you count the people who actually contribute to any Nim package, you have more than 100 people. I would bet every resource we have on making the Nim community stronger. I just see this distribution as a specific feature for financial companies, not for the future of the language (and that scares me).

@Araq
Member Author

Araq commented Oct 30, 2019

> You cannot argue that a decentralized solution is bad when no language exists that can be used without third-party dependencies.

Arguable. Python with its batteries included surely is/was useful without third-party deps. Plenty of people use C++ or C without external dependencies; of course, it depends on the application domain.

> If you count the people who actually contribute to any Nim package, you have more than 100 people. I would bet every resource we have on making the Nim community stronger. I just see this distribution as a specific feature for financial companies, not for the future of the language (and that scares me).

Fair enough I guess.

@pmetras

pmetras commented Oct 30, 2019

I think there are multiple understandings of what "distribution" means. I'll try to summarize my own understanding:

  • Nim code grouped by a thematic interest.
  • A group of related libraries that work together (no conflicts).
  • These libraries are stable and mature.
  • They work well with a related Nim compiler version.
  • They are jointly supported by a group of people, probably including their authors.
  • They are recognized as such, so efforts can be concentrated on this code rather than spread across competing variants.

For instance, if we have a distribution about data science and machine learning, I expect to find libraries for dataframes, machine learning and statistical algorithms, and graphing. For a data structures and algorithms distribution, I expect to have classical container data structures (trees, hashmaps, etc.) and algorithms (sorts, hashes, etc.). Another one about languages could have parser and lexer libraries. One can imagine medical, educational or scientific distributions.

I don't care whether it's included under the Nim umbrella or not, whether it is centralized or distributed, in a container or not. I don't need to create a personal distribution, and I don't care whether it is based on Nimble or not. What I want, to ease development for beginners and attract some types of companies or governments, is that when Nim compiler v1.3 is published, I can easily get the data science v1.3 and stdlib v1.3 distributions, for instance. There is no barrier to becoming efficient in my domain of interest immediately. I don't need to spend time finding packages with Nimble, debugging them or writing their documentation...

@FedericoCeratto
Member

FedericoCeratto commented Oct 31, 2019

> Are there companies which do not allow installation of packages and/or need pre-approval for each new component that is installed?

Yes, many companies including "FAANGs" prohibit pip/nimble/npm/tarball installs but provide blanket approval for Linux/BSD distributions, for many reasons:

  • Security: many installers do not check for vetting or signatures, provide no backports of security fixes, and do not perform system-wide security updates. OS installers do all of that.
  • Reliable and reproducible deployments: those tools do not guarantee that all deployed systems use the same versions of libraries and applications company-wide. OSes do.
  • Legal: some Linux distributions have a legal vetting process.
  • Legal 2: various companies (including Canonical) provide legal indemnification against copyright breaches, and others provide insurance against intrusion and data loss, but only for popular distributions.
  • Userbase: popular distributions are also reviewed and vetted (and rebuilt) by various large companies, both for internal and external use.

Edit: A summary around supply chain attacks and how to mitigate them: https://drewdevault.com/2022/05/12/Supply-chain-when-will-we-learn.html and previously https://arxiv.org/pdf/2005.09535.pdf

@FedericoCeratto
Member

FedericoCeratto commented Oct 31, 2019

Here is my suggestion: create periodic "snapshot" lists of compiler version + library names + library versions that are trusted, tested and known to work together, and sign them. The snapshot itself is just a list of package names and versions. Such a list is then used:

  • to create a "batteries included" compiler tarball/zip (without dropping the existing "lean" releases)
  • to provide recommendations on which lib / version to use in Nimble
  • to (finally) package Nim libraries in Linux/OSX/BSD distributions. Perhaps also Windows.
  • to store a copy of the libraries in a repository (e.g. nimble.directory) in case of deletion or hijacking, and to help users who cannot access GitHub
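
A minimal sketch of what such a signed snapshot list could look like, written in Python for illustration only. The function names, the JSON layout, the package versions and the signing key are all assumptions, and HMAC-SHA256 merely stands in for whatever real signature scheme would be chosen.

```python
import hashlib
import hmac
import json


def build_snapshot(compiler_version, packages):
    """Encode compiler version + package versions as canonical JSON bytes."""
    snapshot = {
        "compiler": compiler_version,
        "packages": [
            {"name": name, "version": version}
            for name, version in sorted(packages.items())
        ],
    }
    # Sorted keys and fixed separators keep the bytes stable, so the same
    # snapshot always yields the same signature.
    return json.dumps(snapshot, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_snapshot(payload, key):
    # Assumption: HMAC-SHA256 is a placeholder for a real public-key signature.
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_snapshot(payload, signature, key):
    return hmac.compare_digest(sign_snapshot(payload, key), signature)


# Hypothetical snapshot contents; the package versions are placeholders.
payload = build_snapshot("1.0.2", {"regex": "0.13.0", "cligen": "0.9.40"})
key = b"demo-key-not-for-real-use"
signature = sign_snapshot(payload, key)
```

Any change to the list (a different version, an added package) changes the payload and makes verification fail, which is the property the tarball, the Nimble recommendations and the mirror repository would all rely on.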

@c-blake

c-blake commented Oct 31, 2019

One other wrinkle is that "work together" here can be a bit more than a single-bit value. I put only requires "nim >= 0.19.2" in my cligen.nimble because the basic functionality works for such versions. However, there is some other functionality which only works for "nim >= 0.20.0". I got pushback from someone, @genotrance, I think, for having too strict Nim version requirements given how ancient many Linux distro Nim versions are. I think Debian long-term support is still back at 0.17 or something. Something to consider, anyway.

@genotrance

I'm actually proud that nimterop (which depends on cligen) supports and is tested with 0.19.6, 0.20.2, 1.0.2 and devel x Win, Lin and OSX. This provides users with a package that they can rely on for an extended period of time without being forced to upgrade.

Even with this test matrix, nimterop only supports Nim for the last year's worth of releases - 0.19.0 came out in Sep 2018. That doesn't seem like much when you go beyond hobby projects.

@pmetras

pmetras commented Oct 31, 2019

@c-blake Using git tags or hashes can pin the exact code to use for a given distribution version. They are more reliable than version ranges in Nimble.
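
To illustrate the difference, here is a small Python sketch under stated assumptions: the release history and commit hashes are made up, and the range check is simplified to a single ">= minimum" rule rather than Nimble's actual requirement syntax.

```python
def parse_version(text):
    """Turn '0.9.1' into (0, 9, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def matches_range(version, minimum):
    # Simplified range: only a lower bound, as in requires "pkg >= x.y.z".
    return parse_version(version) >= parse_version(minimum)


# Hypothetical release history: tag -> commit hash (hashes are invented).
releases = {
    "0.9.0": "a1b2c3d",
    "0.9.1": "e4f5a6b",
    "1.0.0": "c7d8e9f",
}

# A version range admits every release at or above the minimum...
range_candidates = sorted(
    (v for v in releases if matches_range(v, "0.9.1")), key=parse_version
)

# ...whereas a commit hash identifies exactly one state of the code.
pinned = [v for v, commit in releases.items() if commit == "e4f5a6b"]
```

With the range, the code that gets installed can change whenever a new release appears; with the hash, it cannot.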

@c-blake

c-blake commented Oct 31, 2019

I think my point may have been misunderstood. I was just following up on @FedericoCeratto's "tested, known to work together". Of course, I or any other package author could (probably) go to a (variable amount of) effort with whens everywhere to try to make sure every single thing works all the way back to 0.19.2 (or whatever Nim version). Neither I nor, I think, most package maintainers are likely to do that "100% of the time", though, and I don't think that's so unreasonable. A lot of the time it's just easier to say "if demanding user X wants obscure feature Y, then demanding user X needs a newer Nim". So, the sense of the relation is not a "1 bit" works/doesn't work together, but a vector/matrix of compatibility over features/code areas & Nim versions. I'm mostly just repeating myself, though. So, I'm not sure if I have clarified successfully or not.
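
A rough Python sketch of that vector/matrix idea, for illustration only. The feature names are hypothetical; the two minimum versions are the ones mentioned above for cligen.

```python
def parse_version(text):
    """Turn '0.19.2' into (0, 19, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical feature table for one package: feature -> minimum Nim version.
MIN_NIM = {
    "basic functionality": "0.19.2",
    "newer functionality": "0.20.0",
}


def supported_features(nim_version, min_nim=MIN_NIM):
    """List the features that work with the given Nim version."""
    current = parse_version(nim_version)
    return sorted(
        feature
        for feature, minimum in min_nim.items()
        if current >= parse_version(minimum)
    )


def compatibility_matrix(nim_versions, min_nim=MIN_NIM):
    """Map each Nim version to a per-feature works/doesn't-work flag."""
    return {
        version: {
            feature: parse_version(version) >= parse_version(minimum)
            for feature, minimum in min_nim.items()
        }
        for version in nim_versions
    }


matrix = compatibility_matrix(["0.17.2", "0.19.6", "1.0.2"])
```

Collapsing a row of this matrix into one "works together" bit requires choosing which features count, which is exactly the judgment call described above.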

@FedericoCeratto
Member

@c-blake If you are suggesting implementing a feature matrix between Nim and each package, this is out of scope for this issue: in the end we need a boolean decision [distribute|not distribute]. Also, implementing and running tests for all M^N combinations is exceedingly onerous.
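
To give a feel for that growth, here is a short Python sketch. It reads M^N as M candidate versions for each of N packages, which is one possible interpretation; the package names and versions are placeholders.

```python
from itertools import product


def count_combinations(versions_per_package, package_count):
    """Number of version combinations: M ** N."""
    return versions_per_package ** package_count


def enumerate_combinations(candidates):
    """Yield every choice of one version per package as a dict."""
    names = sorted(candidates)
    for combo in product(*(candidates[name] for name in names)):
        yield dict(zip(names, combo))


# Hypothetical candidates: 3 packages with 2 versions each -> 2 ** 3 = 8.
candidates = {
    "pkg_a": ["1.0", "1.1"],
    "pkg_b": ["2.0", "2.1"],
    "pkg_c": ["0.1", "0.2"],
}
all_combos = list(enumerate_combinations(candidates))

# 20 packages with only 3 candidate versions each is already in the billions.
large = count_combinations(3, 20)
```

Even before multiplying by compiler versions and operating systems, exhaustive testing stops being practical at a modest package count.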

@c-blake

c-blake commented Oct 31, 2019

Well, it could be in scope to use weaker language about promises (and this issue is to discuss scope -- I think you may be jumping the gun on concluding anything about it).

@disruptek

So we've had a cooling-off period on this RFC during which time I believe I've addressed most of the blockers on @mratsim's list. The thinking on stdlib evolution has, uh, evolved a bit since this RFC was written, but I'm curious to hear if the opinions on distribution have changed, as I'm just about ready to implement something.

@Araq
Member Author

Araq commented Dec 23, 2019

> So we've had a cooling-off period on this RFC during which time I believe I've addressed most of the blockers on @mratsim's list.

Er ... what? Any links please?

@disruptek

https://github.com/disruptek/nimph

Nimph doesn't add anything to the tasks situation, which really doesn't feel like it needs a language-specific solution. The other goals are easily achieved with Nimph's support for hierarchies and git-native architecture, as well as the compiler's support for --clearNimblePath and --nimblePath.

Bare (and shallow) git repositories seem well-suited to use as the fundamental building block of distributions. They are trivially validated, contain a wealth of useful metadata, and are easily consumed by other tools.

Distributions are already here and work with both compiler and package manager, today.

@FedericoCeratto
Member

> Bare (and shallow) git repositories seem well-suited to use as the fundamental building block of distributions

Pulling whole repositories does not seem optimal. I just created a proposal for source packages in #179

@Araq
Member Author

Araq commented Dec 24, 2019

> Distributions are already here and work with both compiler and package manager, today.

Good, but my RFC is about one official distribution (!), not about having the tools to build distributions.

@Araq
Member Author

Araq commented Apr 22, 2020

As a much trimmed-down version of this idea, we are trying https://github.com/nim-lang/fusion to see where it gets us. Anything beyond that seems to be beyond our current resources, hence I am closing this RFC.

@Araq closed this as completed Apr 22, 2020