Automatically figure out dependencies for the project #165

f-f · 2019-03-27T07:32:23Z

Currently when one adds a package to the packages.dhall the expectation is that spago will also pick it up as a dependency...but no, build dependencies need to be specified in the spago.dhall, as the packages.dhall only contains the pool of available packages.

I think this is not good UX and the user should not need to specify the dependencies in spago.dhall, so the idea here would be to figure out the package names of the dependencies we're using from the import statements in the source files.
We can do this because by definition of package set a module name can refer to at most one package.

How to do this: there was an experiment for this in psc-package from @kritzcreek here, calling the compiler to parse the files, get out the import statements, and figuring out the packages from there.

If we're able to do this reliably then we could:

remove the dependencies key in the spago.dhall
remove the separation between "dependencies" of your project and "packages in the package set" - every package in the package set would be importable right away
unify the two configuration files since there wouldn't be much confusion between the two sets of "dependencies"

The text was updated successfully, but these errors were encountered:

kritzcreek · 2019-03-27T16:11:46Z

In order for this to work without having to download all the packages in the package set your project is using, you'd also need to add metadata to a package-set file which contains the mapping from PackageName -> [ModuleName].

That metadata would need to be rebuilt whenever you merge a PR into the packageset.

f-f · 2019-03-28T09:24:31Z

@kritzcreek good points. Some ideas and questions:

downloading all the packages is not that crazy of an idea once we have global caching (tracked in Local Global Cache of Packages #133). I'm more worried about parsing time for all source files (at least with the current parser)
Moreover, building the metadata on package-sets CI is fine, but since you can override/add packages in your project we'd still need to do it locally too
if the compiler is package-aware, do we still need this metadata? Because we would know which package to refer to from the parse, and we wouldn't need to get this information from elsewhere

kritzcreek · 2019-03-28T10:26:05Z

Moreover, building the metadata on package-sets CI is fine, but since you can override/add packages in your project we'd still need to do it locally too

Your override/addition would need to change the package -> module mapping accordingly, or spago would need to make sure all additions and overrides are present and regenerate the mapping information for those locally. I think ideally the metadata generator lives in a different binary than spago though, so we can avoid the large dependency footprint.

in the compiler is package-aware, do we still need this metadata? Because we would know which package to refer to from the parse, and we wouldn't need to get this information from elsewhere

That would only be true if we required that every import statement mentions the package it's importing from. I think it's very unlikely that we can require that as it would break every PS file in existence. That's going to need a fair amount of thought still (it might be possible to require it if we can provide an automatic migration tool)

f-f · 2019-05-01T19:04:55Z

I think ideally the metadata generator lives in a different binary than spago though, so we can avoid the large dependency footprint

@kritzcreek I just realized that purs itself could implement this - spago depends on it already so this would avoid the operational complications of shipping another executable.
Would this be a welcome addition to the compiler?

f-f · 2019-05-12T12:22:37Z

Update: purescript-nix generates the aforementioned metadata file by implementing a small parser for imports. Since the imports syntax is not going to change dramatically anytime soon, maybe this is a viable approach..

f-f · 2019-05-12T12:27:21Z

There were questions on Slack from @reactormonk:

How would auto-completion work with this though?

With the current idea purs ide would suggest you stuff from the packages you already depend on.
To add another package you’d have to add one import from that package and then restart purs ide (this is basically the same as it is right now, where you have to instead add the package to spago.dhall and then restart)

Does the LSP support different relevantness of autocomplete? Otherwise you could complete all of spago 😄

The problem would be that you’d have to include the whole of the package set in the purs ide call, which would mean downloading the whole package set (which as I said above is not a crazy idea, but it starts to be borderline)

Could be preprocessed

If spago (or the compiler) would generate a metadata file from the package set with the mapping “module → package” (and/or “definition → modules”) then we’d know which dependencies we have from the files.
However, in purs ide we’d need to figure this out “dynamically” (because you might want names from a new dependency), and I think this would require passing this metadata file to purs ide? (so it could figure out things about packages that are not passed as globs, so packages that it didn't parse)
An alternative would be to have purs ide preprocess all of the package set, which would basically be equivalent to passing the whole package set as globs to purs ide though? (cc @kritzcreek)

hdgarrood · 2019-05-27T01:38:27Z

@jmackie has proposed a new purs subcommand which would parse and analyse the modules you feed it, and tell you about how they depend on each other: purescript/purescript#3635. If I've understood correctly, adding this subcommand would mean that spago wouldn't have any need to depend on the compiler as a library to be able to do this properly.

f-f · 2019-05-27T12:22:49Z

@hdgarrood I would say so, yes!

E.g. this is what you get for the Main module of the sample project generated by spago init:

  "Main": {
    "path": "src/Main.purs",
    "depends": [
      "Prelude",
      "Control.Applicative",
      "Control.Apply",
      "Data.Functor",
      "Data.Function",
      "Control.Category",
      "Control.Semigroupoid",
      "Data.Boolean",
      "Data.Ord",
      "Data.Eq",
      "Data.HeytingAlgebra",
      "Data.Symbol",
      "Data.Unit",
      "Data.Show",
      "Record.Unsafe",
      "Type.Data.RowList",
      "Type.Data.Row",
      "Data.Void",
      "Data.Ord.Unsafe",
      "Data.Ordering",
      "Data.Semigroup",
      "Data.Ring",
      "Data.Semiring",
      "Control.Bind",
      "Control.Monad",
      "Data.BooleanAlgebra",
      "Data.Bounded",
      "Data.CommutativeRing",
      "Data.DivisionRing",
      "Data.EuclideanRing",
      "Data.Field",
      "Data.Monoid",
      "Data.NaturalTransformation",
      "Effect",
      "Effect.Console"
    ]
  }

Then to get the package names one has to lookup for all the other names in the purs graph result, and do the match "path → glob → package name" (which should be pretty straightforward)

However I'm still about concerned about the UX implications of all of this (see my previous comment), but maybe it will be fine (e.g. it would work so that to add a dependency one would have to import a module from that package and restart purs ide)

joneshf · 2019-06-22T04:21:02Z

An alternate to removing dependencies and parsing .purs files could be to add more information to the manifests, keep the dependencies, but use them strictly. I.e. if you only list console as a dependency all you have access to are Effect.Console and Effect.Class.Console. The extra information would be listing the modules each package exposes and where they are located on disk. Given those two changes, the information about which modules/files are necessary for spago build and friends could be computed extraordinarily quickly. Generating the list of modules should be something that could happen as a migration once and going forward make a requirement for future packages.

Some benefits to this approach:

There's no need to wait on (or depend on) upstream releasing an import parser. It's not clear when that's going to happen.
It's statically known where every dependency module lives. There's no need to glob anything. Generating the list of sources boils down to concatenating lists of file paths.
It makes it possible to actually have "internal" modules (don't list them in the module list). Right now, every .purs file that exists in the src directory is always available to be imported and used, even if your package has invariants that the "internal" module not be used in any way.
Auditing dependencies becomes loads easier. The current state of affairs makes it much harder than necessary to know which packages are actually being used and which are along for the ride. By explicitly stating dependencies, and having the mapping statically known, auditing dependencies can happen with tooling that only needs to understand the manifests.

I'm not pushing for this approach, but wanted to throw it out there in case it was appealing.

f-f · 2019-06-23T20:59:09Z

@joneshf thanks for bringing this up! I did consider this kind of approach before (as this is what you do with Cabal too), and I think that:

explicitly listing the modules in the package manifests is kind of orthogonal to this issue (which is about removing the need for the user to declare the packages they depend on basically twice: once in the sources and once in the spago.dhall)
explicitly listing the modules is not very good UX anyways IMO, as that information is already in the source tree (this implies that I also disagree on the "internal modules" point, I think they should always be exposed and people free to use them)
I recall your experiment of having Dhall download PureScript sources as Text, and I think the idea is absolutely brilliant (because you can just use Dhall to get checksums, etc) but I wouldn't really want to manually list my source files in the manifest, so the UX there is tricky 😄
re waiting for upstream: I'd be fine just with the "graph command" from Add graph command for graphing module dependencies purescript#3635 (which I think is better than depending on a parser), and I'm also fine waiting - after all the current situation is not optimal but it's also not super bad so we can wait for a bit
re auditing dependencies: if we manage to get rid of the dependencies list it will also get easier because the dependencies will dynamically come from source so it cannot get out of sync with reality (i.e. you don't risk anymore leaving in a dependency you don't use)

joneshf · 2019-06-23T23:11:44Z

No worries. As mentioned, I'm not trying to push these changes on you.

removing the need for the user to declare the packages they depend on basically twice: once in the sources and once in the spago.dhall

Oh, that's the two places? I thought we were talking about packages.dhall and spago.dhall. Sorry for misunderstanding.

re auditing dependencies: if we manage to get rid of the dependencies list it will also get easier because the dependencies will dynamically come from source so it cannot get out of sync with reality (i.e. you don't risk anymore leaving in a dependency you don't use)

Doesn't this get harder? You have to parse every module in the source tree, then download and parse every package in the package set to find which package contains the modules, then you can audit the dependencies.

f-f · 2019-06-24T19:01:55Z

Doesn't this get harder? You have to parse every module in the source tree, then download and parse every package in the package set to find which package contains the modules, then you can audit the dependencies.

Oh that's right, that would add an additional step because you still need to compute somewhere the map module name → package name for all the packages in the set. We could ship a cache with the package set (we already ship some of this stuff to make the global cache work) to skip most of the work, but the implementation should still be there because people can have overriden and/or local packages.

Benjmhart · 2019-11-22T03:55:16Z

after looking at this issue, it appears to be blocked waiting for the graph command, subsequently #415 is blocked as well

f-f · 2021-04-07T14:01:18Z

Now that we shipped a build-time check for "packages you should have in your project" (in #730), and given that we'll soon change the config format to be able to support version ranges, I don't think this is viable anymore - I'll close.

f-f added the feature label Mar 27, 2019

f-f mentioned this issue Mar 27, 2019

Handling of the packages config key is surprising #158

Closed

f-f added the UX label Mar 27, 2019

f-f added the blocked label May 5, 2019

f-f added this to the 1.0 milestone May 6, 2019

f-f mentioned this issue May 12, 2019

Consider splitting up the compiler into multiple packages purescript/purescript#3633

Closed

This was referenced May 27, 2019

Add graph command for graphing module dependencies purescript/purescript#3635

Closed

Add spago publish #204

Closed

f-f removed this from the 1.0 milestone Jul 19, 2019

f-f mentioned this issue Aug 21, 2019

spago test errors when there are no tests #383

Closed

f-f mentioned this issue Sep 21, 2019

Feature request: Decline build if modules from indirect dependencies are imported #415

Closed

elliotdavies mentioned this issue Oct 23, 2019

Show warning if package is added to packages.dhall's additions, but not listed as dependency in spago.dhall #462

Closed

f-f closed this as completed Apr 7, 2021

f-f mentioned this issue Apr 15, 2021

Make it easier to fix "direct dependencies" error message #776

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Automatically figure out dependencies for the project #165

Automatically figure out dependencies for the project #165

f-f commented Mar 27, 2019 •

edited

Loading

kritzcreek commented Mar 27, 2019

f-f commented Mar 28, 2019 •

edited

Loading

kritzcreek commented Mar 28, 2019

f-f commented May 1, 2019

f-f commented May 12, 2019

f-f commented May 12, 2019 •

edited

Loading

hdgarrood commented May 27, 2019

f-f commented May 27, 2019

joneshf commented Jun 22, 2019

f-f commented Jun 23, 2019 •

edited

Loading

joneshf commented Jun 23, 2019

f-f commented Jun 24, 2019

Benjmhart commented Nov 22, 2019

f-f commented Apr 7, 2021

Automatically figure out dependencies for the project #165

Automatically figure out dependencies for the project #165

Comments

f-f commented Mar 27, 2019 • edited Loading

kritzcreek commented Mar 27, 2019

f-f commented Mar 28, 2019 • edited Loading

kritzcreek commented Mar 28, 2019

f-f commented May 1, 2019

f-f commented May 12, 2019

f-f commented May 12, 2019 • edited Loading

hdgarrood commented May 27, 2019

f-f commented May 27, 2019

joneshf commented Jun 22, 2019

f-f commented Jun 23, 2019 • edited Loading

joneshf commented Jun 23, 2019

f-f commented Jun 24, 2019

Benjmhart commented Nov 22, 2019

f-f commented Apr 7, 2021

f-f commented Mar 27, 2019 •

edited

Loading

f-f commented Mar 28, 2019 •

edited

Loading

f-f commented May 12, 2019 •

edited

Loading

f-f commented Jun 23, 2019 •

edited

Loading