Resolving package vs. local import ambiguities #2819

Closed
dom96 opened this issue May 27, 2015 · 25 comments

Comments

@dom96
Contributor

dom96 commented May 27, 2015

The import for io-oculus is import ovr, so this will already cause problems. I guess it is even more dangerous with something like heap: Doesn't this mean that users cannot have their own heap.nim in their sources when the heap package is installed? Even if local sources have precedence over nimble packages this feels very dangerous.

(nim-lang/packages#173 (comment))

The scenario described above is problematic as the user has no way to distinguish between a package-level module and a local-level module.

In that scenario import heap will be ambiguous. The compiler won't know whether we want to import the "heap" package's heap module or a local heap module.

After some discussion with @Araq, he suggested that we can tell the compiler that we want the local heap module by writing import "./heap". But how do we tell it that we want to import the package's heap module?

Any ideas welcome.

@dom96 dom96 added the RFC label May 27, 2015
@flaviut
Contributor

flaviut commented May 27, 2015

import com.mycompany.projectname.heap

😉

@Varriount
Contributor

What do other languages do in these scenarios? I know that in Python this is mitigated by namespaces (e.g., twisted.utils.heap), as well as by a modifiable list of paths to search.

@josephwecker

In Ruby:

require 'some_lib/some_module'  # namespace == (more or less) directory hierarchy in lib search path
require '/opt/blah/another'  # ignore search path- look here exactly
require './../includes/yet-another.rb'  # also ignore search path - exact file relative to **working directory**
require_relative './and_another'  # relative to current file

@ozra
Contributor

ozra commented May 27, 2015

I posted a draft that's been lying around a few days at #2820, since it's related to modules / importing.

iojs solves it in a similar fashion to Ruby. A package is a directory. The primary module can be imported by the package name alone, in which case an "index" module is loaded (I'm sparing some details): require "the_amazing_module" will look for "mod_dir/the_amazing_module/index.js". The behaviour of the "root" module is up to the module author. For instance, it could import all module files available in the package and export them collectively. It could also just raise an error saying "import the specific parts you need", or whatever. If one wants a specific sub-part, one uses require "the_module/a_specific_thing". Any level of directory nesting is of course possible. Modules are generally sourced from a sub-dir in the project: every project stores its own deps to reduce unnecessary troubleshooting with PATH-based, centralized solutions. The package manager (as mentioned in #2820) resolves deps to the correct versions and downloads them if necessary. Referencing modules in the "self dir" is also done filesystem-style: require "./idents", which is reasonable. The same goes for exact and relative paths.

I like the idea of having all modules a project depends on under the project dir. This reduces dependency resolution problems. Storage space is cheap. Fetched modules go into "./nim_mods/" or something like that, preferably configurable through "myapp_name.nim.cfg". import heap looks in "./nim_mods/heap/heap.nim", import heap/foo looks in "./nim_mods/heap/foo.nim", import ./heap looks in "./heap.nim". Something like that.

@bluenote10
Contributor

A possible pragmatic solution would be to introduce the following rule for nimble packages: The code must not reside in the top level directory.

This would essentially force imports of the form import <subdir>/heap, where <subdir> could be anything from a domain name, a company name, a GitHub name, or a project name. It is a sort of "light" version of namespaces or Java's package hierarchy. As with other approaches, conflicts can never be fully avoided (just like two domain-less developers in Java might end up with the same fictitious domain), but it would already reduce the conflict probability significantly.

Another advantage of such a rule is that it avoids having to manually check whether a package developer uses a reasonable directory structure. For instance, what if the heap module also contains a utils.nim, which is not meant to be seen externally and is only used by the heap implementation? Obviously, that file should not be in the search path. If code is allowed to reside in the top-level directory, we would have to manually check whether a package properly moves all its supplementary code into subdirectories. As soon as all code is located in some subdir, the problem is much less severe. Clearly, the "not visible externally" property is still not fulfilled, but I guess that is impossible without introducing some rule/pattern for excluding certain filenames/paths.

@josephwecker

I think it would be fine for nimble to assume any convention it wants (convention over configuration, FTW), but I would be highly disappointed if I were ever required to do import <subdir>/heap unless it was to disambiguate- same goes for java-like namespaces. I hate writing "plumbing" code and one of the biggest appeals that Nim has for us is its conspicuous lack of such plumbing.

@dom96
Contributor Author

dom96 commented May 27, 2015

Perhaps absolute import paths should behave the same way as they do in Ruby. We could then introduce a $pkgs "path variable" which would expand to /home/user/.nimble/pkgs/ or whatever the --nimblePath is set to.

The disambiguation would then work by import "$pkgs/heap"... but damn, that wouldn't work. It would expand to /home/user/.nimble/pkgs/heap.nim which is not correct.


I can't come up with anything that doesn't require the introduction of a special syntax to disambiguate it. I agree with @josephwecker that requiring the redundant pkgname in imports is not the "Nim way".

@bluenote10
Contributor

I agree that import heap looks much better than import bluenote/heap, but it is still far away from the com.something.cant.remember.silliness (not to mention the time wasted for navigating such a directory structure).

However, in order to scale to a large number of packages, there probably must be some mechanism to ensure a clean search path. Just try to install 20+ random nimble packages and see what ends up on the search path. You can probably import anything. Example: I put some of my tests in tests/alltests.nim, but I make a mistake and import tests/alltest, and suddenly Jester is making jokes at http://127.0.0.1:5454/foo (I have accidentally imported one of Jester's tests). IMHO the problem is not just to disambiguate between meaningful imports like heap but also to avoid all the issues of a polluted search path in the first place.

@ozra
Contributor

ozra commented May 27, 2015

I think an iojs'ish way would fit Nim well. There, any stdlibs are importable without pathing them. I would find it reasonable, as already mentioned, that project dir is prioritized.

Let's say we have heap in stdlib/voodoo/, in project-dir and in a package named heap:

import heap  # imports "./heap.nim"
import ./heap  # explicitly imports "./heap.nim" - (could be unneeded..)
import voodoo/heap   # disambiguate - use "{std-lib-path}/voodoo/heap.nim"
import heap/heap   # disambiguate - use "{pkg-path}/heap/heap.nim"

In contrast, if the 'heap' module exists in only one of those places, 'import heap' maps to that one without qualification:

import heap   # iff proj-local exists: imports "./heap.nim"
import heap   # iff no proj-local exists: "{std-lib-path}/voodoo/heap.nim"
import heap   # iff only pkg exists: "{pkg-path}/heap/heap.nim"

If the package 'heap' has a module named 'investigator', we must write:

import heap/investigator   # import specific mod from heap pkg, not main/default/index mod

The above examples assume that the 'main module' of a package has the same name as the package; other alternatives are of course "index.nim", "main.nim", etc. I like the repeated name better. But it could be an option in the package description file: { defaultModule: "fooYaa.nim", "rootDir": "src"}. This makes it simple to keep a clean project dir while still publishing the module so that it's accessible without a crufty import tediousMod/src/fooYaa; it's just import tediousMod.

Search prio: 1. projectdir. 2. stdlib. 3. packages.

And as I've already mentioned, I'd prefer the lib-path to be a sub-dir in the project-dir - ie local copies of all dependencies, versioned to the projects needs without affecting other projects.

Reason? This is just a current example: when I got started hacking on the compiler the other day, I was dumbfounded ("why the fuck doesn't it work?") until I realized that the dev fork I was running nim in used the global nim libs and prioritized them. So the compiler is dependent on source both in its own dir and in the libs dir. I edit both, but one of them is not used; instead a file from the home dir is prioritized. So the compiler didn't work. It was rather unexpected to me to have to specify the working dir as the lib path explicitly. Granted, std-libs are a bit special, but all in all, having "all" dependencies project-local solves a lot of problems. And I'll say it again: storage is cheap ;-)
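
The search priority proposed in this comment (project dir, then stdlib, then packages, with a package's default module named after the package) can be modelled roughly as follows. This is only a sketch in Python of the proposal as worded above, not anything Nim or Nimble implements: the function name `resolve`, the `files` set standing in for the filesystem, and the `stdlib_dirs` parameter are all made up for illustration, and the `{std-lib-path}` / `{pkg-path}` placeholders are copied from the examples.

```python
# Placeholder roots copied from the examples in the comment above.
STD = "{std-lib-path}"
PKG = "{pkg-path}"


def resolve(name, files, stdlib_dirs=("voodoo",)):
    """Return the first existing file for `import name`, or None.

    `files` is a set of path strings standing in for the filesystem
    (an assumption of this sketch); `stdlib_dirs` lists hypothetical
    stdlib sub-directories that are searched for unqualified names.
    """
    parts = name.split("/")
    explicit_local = parts[0] == "."
    if explicit_local:
        parts = parts[1:]
    rel = "/".join(parts) + ".nim"

    candidates = ["./" + rel]                    # 1. project dir
    if not explicit_local:
        candidates.append(STD + "/" + rel)       # 2. stdlib
        if len(parts) == 1:
            # Unqualified stdlib modules are importable without a path.
            candidates += [STD + "/" + d + "/" + rel for d in stdlib_dirs]
            # 3. packages: a bare package name implies its same-named module.
            candidates.append(PKG + "/" + parts[0] + "/" + rel)
        else:
            candidates.append(PKG + "/" + rel)   # 3. packages, explicit module

    for candidate in candidates:
        if candidate in files:
            return candidate
    return None


everywhere = {
    "./heap.nim",
    STD + "/voodoo/heap.nim",
    PKG + "/heap/heap.nim",
    PKG + "/heap/investigator.nim",
}
print(resolve("heap", everywhere))         # ./heap.nim
print(resolve("voodoo/heap", everywhere))  # {std-lib-path}/voodoo/heap.nim
print(resolve("heap/heap", everywhere))    # {pkg-path}/heap/heap.nim
```

With only the package installed, `resolve("heap", {PKG + "/heap/heap.nim"})` falls through to the package's default module, which matches the "iff only pkg exists" case above.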

@refi64
Contributor

refi64 commented May 27, 2015

Well, Python has relative imports. If I'm in foo/foo.py and want to import foo/bar.py, I can do:

from . import bar

Maybe Nim could do something similar:

import .bar

would import the bar module contained within the current package.

@josephwecker

How about this (for illustration; obviously the code can be more optimized. I'm also writing it out verbosely because part of the confusion stems from oversimplifying search paths / packages vs modules vs paths, etc.)

Usually import search starts at the following top level directories in this order:

  1. $NIM_PATH pseudopackage (equivalent to the toplevel of Nim's repository)
  2. current directory of calling code
  3. nimblePath

1 and 2 could be implemented as basically pseudo-packages. If the import statement has a 'local' prefix (./ or ../), 1 and 3 above go away.

Given import x.y.z, mod = "x/y/z" and modParts = @[x, y, z]; it's also implied that x, y, and z = (module_name|moduleName)

Within each of those 3 top-level directories, search for, in order:

  1. ./mod.nim
  2. ./lib/mod.nim
  3. ./lib/(pure|impure|packages)/mod.nim
  4. ./modParts[0]/mod.nim
  5. ./modParts[0]/lib/mod.nim
  6. ./modParts[0]/lib/modParts[1..^1].nim

In this scenario, one could say:

import hashes              # will resolve to $NIM/lib/pure/hashes.nim
import ./hashes            # will likely resolve relative to current file ./hashes.nim, or,
                           # failing that, ./hashes/hashes.nim
import heap                # resolves to local heap.nim since at the moment it's not part of
                           # the standard library...
import superHashes/hashes  # will likely resolve to a nimble package called superHashes,
                           # and inside look for hashes.nim then hashes/hashes.nim
# ...

This seems to follow the principle of least surprise: things will do pretty much what you'd expect. It allows the standard library, nimble packages, and the local project all to be treated uniformly, and allows for the most likely and most useful subdirectory hierarchies.
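
To make the lookup order above concrete, here is a small Python sketch that expands an import into the ordered list of candidate files: three roots, six patterns per root, with only the current directory kept when the import starts with ./ or ../. It is a model of the proposal as written, not of the actual compiler. The names `candidates` and `resolve`, the `files` set standing in for the filesystem, and the example roots /nim, /proj and /pkgs are assumptions, and slash-separated import paths are used as in the examples.

```python
import posixpath


def candidates(import_path, nim_path, current_dir, nimble_path):
    """List candidate files for an import, in the proposed search order."""
    parts = import_path.split("/")

    # A leading "./" or "../" marks a local import: fold those segments
    # into the root and drop the $NIM_PATH and nimblePath roots.
    prefix = []
    while parts and parts[0] in (".", ".."):
        prefix.append(parts.pop(0))
    if not parts:
        raise ValueError("import path names no module: " + import_path)
    if prefix:
        roots = [posixpath.normpath(posixpath.join(current_dir, *prefix))]
    else:
        roots = [nim_path, current_dir, nimble_path]

    mod = "/".join(parts)        # "x/y/z"
    head = parts[0]              # modParts[0]
    rest = "/".join(parts[1:])   # modParts[1..^1]

    patterns = [
        f"{mod}.nim",                  # 1. ./mod.nim
        f"lib/{mod}.nim",              # 2. ./lib/mod.nim
        f"lib/pure/{mod}.nim",         # 3. ./lib/(pure|impure|packages)/mod.nim
        f"lib/impure/{mod}.nim",
        f"lib/packages/{mod}.nim",
        f"{head}/{mod}.nim",           # 4. ./modParts[0]/mod.nim
        f"{head}/lib/{mod}.nim",       # 5. ./modParts[0]/lib/mod.nim
    ]
    if rest:
        patterns.append(f"{head}/lib/{rest}.nim")  # 6. only if more than one part

    return [posixpath.join(root, p) for root in roots for p in patterns]


def resolve(import_path, files, nim_path="/nim", current_dir="/proj",
            nimble_path="/pkgs"):
    """Return the first candidate present in `files`, or None."""
    for path in candidates(import_path, nim_path, current_dir, nimble_path):
        if path in files:
            return path
    return None


files = {"/nim/lib/pure/hashes.nim", "/proj/hashes.nim", "/proj/heap.nim",
         "/pkgs/superHashes/hashes.nim"}
print(resolve("hashes", files))    # /nim/lib/pure/hashes.nim
print(resolve("./hashes", files))  # /proj/hashes.nim
```

Because the roots are tried in order, an unqualified `hashes` finds the standard library copy first, while `./hashes` only ever looks next to the calling code.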

@dom96
Contributor Author

dom96 commented May 27, 2015

Example: I put some of my tests in tests/alltests.nim but I make a mistake and import tests/alltest, and suddenly Jester is making jokes at http://127.0.0.1:5454/foo (I have accidentally imported one of jesters tests).

@bluenote10 That's not good, and it's a bug in jester's nimble file. It should not install the tests directory at all. That is something that Nimble should enforce on packages.


@ozra Your solution will not work unless all Nimble packages are namespaced, which currently isn't the case. You don't import Jester via import jester/jester and I personally don't think you should need to.

@josephwecker

BTW- my comment glossed over nimble's package versioning. Aside from that though it does allow for nimble in the future (if desired) to optionally install packages local to a project and just have everything work as usual. It's off-topic but it has been extremely useful to me in just about every language I've used- js, ruby (rails especially), c, erlang, ...

@dom96
Contributor Author

dom96 commented May 27, 2015

@josephwecker In what way is that useful?

@ozra
Contributor

ozra commented May 27, 2015

@dom96 Yes, all packages would be self-contained dirs, so one could even circumvent the package manager when developing and git clone from a repo into one's project's 'dep_mods' dir. And of course one writes only 'import jester'; the default module is implied when not "specifying a module in the package".

Disclaimer - I might have misunderstood Nimble completely, but here goes:

It's up to the author how to manage one's project dir, but one is able to both use it easily locally and publish it as is, for instance by saying that the source dir is the root export dir. Importing it as my-project will look in this package, my-project, then in the published root dir, and then for the file my-project.nim (unless configured differently in the .nimble file). This way the dev dir and the package are the same thing. So no matter if you get it from GitHub (because you develop on it) or via nimble, it's usable all the same.

my-project/
    .git/
    my-project.nimble   # "exportRoot = src/"
    docs/
    ext-src-deps/
        my-super-cool-other-mod # -> symlink to other actual local git-repo-dir
    tests/
        # blah blah...
    src/
        my-utils/
            some-funky-util-used-by-my-project.nim
        my-project.nim
    nim_mods/
        jester/
            jester.nim   # - or it can be in src/, dist/, whatever...
        some_cool_mod_i_depend_on/
             # the files here, similarly to my-project - or just nim files
             # in the root - whatever.. proj.nimble confs it
        my-super-cool-other-mod   # -> symlink to above ext-src-deps link..

And a pseudo example of the main mod..

import some_cool_mod_i_depend_on, jester, strutils
import my_super_cool_other_mod/my_super_gc
import my_super_cool_other_mod/my_super_allocator

doMyProjectDefaultStuff()

And some app using my mod:

someApp/
    nim_mods/
        my-project/   (... the above example proj dir; gotten via Nimble)
    myAppMainInTheRootFtw.nim
    # the obvious other files...

import myProject, strutils, unicode

doStuffWithMyProjectAndStuff()

The reason I personally symlink twice, as above, is that I can reuse generalized make scripts that loop over all ext-src-deps dirs and run make in them. This would of course be "koch" for a Nim project.

Here, when someApp uses the other modules, it is the "authoritative" versioner. The same modules may exist several levels down in sub-modules, so this is where version unification must come in for compilation. We can't use both jester@0.42 and jester@0.47 ...

Well, just an example. I'd be happy to learn a more efficient, modularized, way to develop.

@josephwecker

EDIT: (Responses about nimble local packages moved here where they belong).

@rbehrends
Contributor

Any practical solution to this problem must at least address the following issues:

  1. Local modules overriding other modules; if you want the feature of being able to parameterize imports, the proper solution is to have modules that can be parameterized (by other modules, types, procedures, constants, etc.). E.g. something like import somemodule[someothermodule].
  2. Polluting the global namespace with a package's internal modules that should not be visible to the outside world in the first place. Having a main module foo.nim and putting submodules in foo/sub1.nim, foo/sub2.nim is a hack.
  3. Disambiguating between two packages (or a package and the stdlib) having identically named modules.

@rbehrends
Contributor

Eiffel organizes its modules in a hierarchy of clusters. Each cluster comprises a set of modules and can also import other clusters (subclusters), hide purely internal modules or rename modules imported from subclusters. You can think of clusters as modules of modules.

Each visible module name must be unique within a cluster. If there are conflicts because a cluster imports multiple subclusters that all define the same module, those conflicts must be resolved by either renaming the module (it will still be visible as having its original name within the subcluster) or hiding it. Local modules take priority over ones imported from a cluster.

This mechanism actually allows Eiffel to get away with a single flat namespace for modules; the downside is that each cluster needs a definition file to specify its characteristics (though you can have sane defaults, e.g. the standard library being imported by default and having a cluster automatically export certain modules).

Caveat: I doubt that this can easily be adapted for Nim (or at least not without some changes); I'm just throwing this out to show that there are alternatives to search path hacking or very.long.qualified.module.names.

@Araq
Member

Araq commented May 31, 2016

In Nim devel, @rbehrends's (1) is no longer possible, and (3) is resolved by the following rule:

Import paths are local to the actual nim file that's importing and only if not found the --path is consulted.

IMO this fixes most problems, but we need to revive this discussion.
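
As a rough illustration of the rule stated in this comment (look next to the importing file first, and consult --path only if nothing is found there), the lookup can be modelled like this. It is a Python sketch of the rule as worded, not the compiler's actual code: the function name `resolve_import`, the `files` set standing in for the filesystem, and the example paths are all hypothetical.

```python
import posixpath


def resolve_import(module, importing_file, search_paths, files):
    """Resolve `import module` as seen from `importing_file`.

    The directory of the importing file is tried first; the entries of
    `search_paths` (standing in for --path) are only a fallback.
    `files` is a set of path strings used in place of the filesystem.
    """
    here = posixpath.dirname(importing_file)
    local = posixpath.normpath(posixpath.join(here, module + ".nim"))
    if local in files:
        return local
    for root in search_paths:
        candidate = posixpath.normpath(posixpath.join(root, module + ".nim"))
        if candidate in files:
            return candidate
    return None


files = {"/app/src/heap.nim", "/pkgs/heap/heap.nim", "/pkgs/jester/jester.nim"}
paths = ["/pkgs/heap", "/pkgs/jester"]

# A sibling heap.nim wins over the installed package.
print(resolve_import("heap", "/app/src/main.nim", paths, files))
# No sibling jester.nim, so the search path is consulted.
print(resolve_import("jester", "/app/src/main.nim", paths, files))
```

In this model a module sitting next to the importer always shadows a same-named module on the search path, so the rule settles which one a plain import means but does not by itself offer a way to name the shadowed one.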

@dom96 dom96 added the Feature label Sep 23, 2016
@dom96
Contributor Author

dom96 commented Mar 18, 2017

Import paths are local to the actual nim file that's importing and only if not found the --path is consulted.

Based on my tests this doesn't seem to be the case. In addition project_name.nims doesn't work for setting the --path.

Here is the file in question: https://github.com/dom96/nim-in-action-code/blob/master/Chapter7/Tweeter/tests/database_test.nim.cfg. I still need to specify both paths to support compilation from Tweeter/tests and Tweeter/.

@Araq
Member

Araq commented Apr 26, 2017

Based on my tests this doesn't seem to be the case. In addition project_name.nims doesn't work for setting the --path.

This has been fixed.

@dom96
Contributor Author

dom96 commented Sep 3, 2017

Can we revive this discussion again? I just ran into this case:

  • nimblepkg/options

Inside nimblepkg/nimscriptexecutor I want to import both options and stdlib/options. I can't do both (as far as I can see).

@dom96 dom96 added the Severe label Jan 21, 2018
@Araq Araq closed this as completed in cddc389 Feb 12, 2018
@dom96
Contributor Author

dom96 commented Feb 12, 2018

I thought the plan was to just move the stdlib to a std?

@Araq
Member

Araq commented Feb 12, 2018

That would either break too much code or not solve the import sha1 debacle.

@dom96
Contributor Author

dom96 commented Feb 13, 2018

I thought we agreed on that already somewhere... but oh well. I don't care enough to complain.

But please explain how this new std/ syntax works in the changelog.


9 participants