-
-
Notifications
You must be signed in to change notification settings - Fork 512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve virtualenv support & egg-link resolution #562
Conversation
I have tried to use this with emacs and it turns out that storing sys_path per Script (or Evaluator) instances is not very useful as long as Script objects are not persistent. Otherwise one might as well just calculate the sys_path to store it in their IDE of choice and send it with the regular completion/goto-definition requests. Another alternative would be to make Script objects stateful, enable changing source piecewise (e.g. |
a64a55f
to
c0ea437
Compare
I have encountered #539 which uses the same "live interpreter" approach and see it is turned down because of non-cross-version-safe imports happening in jedi. Ping me if you need this venv path detection scheme removed. |
Wow! That's quite a lot! Thanks for all the work that went into this! I really like the general direction your pushing this. Please give me some time to review it. |
if sys_path is None: | ||
venv = os.getenv('VIRTUAL_ENV') | ||
if venv: | ||
sys_path = list(get_venv_path(venv)) + sys.path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if VIRTUAL_ENV is set, why is it not enough to just take the current sys.path
? I have never really understood this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to have jedi installed somewhere once per system (or per user) and to have it look into various virtual environments without installing jedi in all of these environments. In Emacs one of the easiest ways to accomplish this (without pulling virtualenv names through all the APIs) is to launch jedi process with this extra environment variable whenever I change the venv I'm working on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if you change the virtualenv variable, wouldn't the sys.path
also change automatically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. AFAIR, sys.path
is generated once at the startup and the virtualenv magic really relies on looking up candidate directories relative to the python interpreter location. Here's an article that probably can explain it better than I can (especially, within the scope of a comment): http://mikeboers.com/blog/2014/05/23/where-does-the-sys-path-start
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I've read it, very good article! I think I finally understand why site.py
exists and what it does.
If you start a new process with Jedi in an activated virtualenv, wouldn't we see the correct sys.path
? Are you talking about a changing just the VIRTUAL_ENV
variable without restarting the process?
In general, have you read the idea of #385, to make virtualenv support easier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave this some thought at the time and I couldn't find a way around the client being able to supply the requested sys.path
to Jedi in public API. Now, to my liking, full sys_path
is too long for a proper interface parameter and most of the time it will refer to one or more virtualenvs, so probably specifying virtualenv paths will look better. For example, we could add venvs=[path1, path2]
to use those venvs and extra_sys_path=[path1, path2]
just in case someone wants, say, to additionally use some source code checkouts without installing them into the said venvs.
If this API addition goes through, I'm not sure if it is worth keeping the VIRTUAL_ENV
logic:
it boils down to having both VIRTUAL_ENV
's and jedi's sys.path
in one interpreter and this is better accomplished the other way around. I mean, that logic starts with jedi on sys.path
and then tries to add venv path afterwards, but as long as jedi has no external dependencies, we can just start up PATH/TO/VENV/bin/python PATH/TO/JEDI/portable.py
with portable.py
adding its location to sys.path
and then running jedi as usual. That way it will have more binary compatibility with code in the venv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most of the time it will refer to one or more virtualenvs, so probably specifying virtualenv paths will look better.
Why would it refer to multiple virtualenvs? That's not something that happens in Python itself?
If this API addition goes through, I'm not sure if it is worth keeping the VIRTUAL_ENV logic: it boils down to having both VIRTUAL_ENV's and jedi's sys.path in one interpreter and this is better accomplished the other way around. I mean, that logic starts with jedi on sys.path and then tries to add venv path afterwards,
Hmm I know what you mean. I personally think that adding sys_path
as an API is really valuable, because some people really want to play with it. Most of the time this is not the case and we should focus on what happens when it is None
. This still includes virtual envs. However, I agree totally with the portable.py
idea. I think Jedi should be started with just the virtualenv as its path. This is what I'm talking about when I talk about Jedi's future async server/client structure. You name a certain virtualenv and Jedi starts a new process that includes that virtualenv. This way, we don't have segfaults and all other crazy stuff...
What do you think? This would however mean that we don't include this pull request, we would rather just add the sys_path logic. And BTW this is not to say that your work is bad. I'm really thankful for it. It taught me a lot! :)
~ Dave
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would it refer to multiple virtualenvs? That's not something that happens in Python itself?
Yes, that was a mistake. Some time ago I tried conserving precious SSD space using a virtualenv for project dependencies and another virtualenv for common tools, like ipython, and pudb, and profilers, and whatnot. But then I realized that with conda you don't need that as it hard-links packages where possible.
What do you think? This would however mean that we don't include this pull request, we would rather just add the sys_path logic. And BTW this is not to say that your work is bad. I'm really thankful for it. It taught me a lot! :)
I am not completely against ditching this PR altogether, although I do think that this PR offers better support of the current Jedi workflow. It also doesn't seem to me that the new, arguably more correct, workflow involving portable.py
and dropping all that sys_path
magic is coming soon. But if it is and there's no time window during which this PR can be useful then feel free to close it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also doesn't seem to me that the new, arguably more correct, workflow involving portable.py and dropping all that sys_path magic is coming soon.
Unfortunately and probably true :( Can you quote yourself from earlier how this PR improves virtualenvs? I still don't see it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you quote yourself from earlier how this PR improves virtualenvs?
- fetching
sys_path
from a live interpreter in the specified virtualenv, if possible - if online fetching fails, offline detector now additionally follows
.pth
links (cannot seem to find theaddsitedir
invocation in the old code) egg-link
files are consulted in all new path directories, not just thesite-packages
one
Ok. I don't see the advantage from fetching it from a different interpreter (ecept for pth
stuff). However, I feel that pth
is something we shouldn't really execute like this, because it can contain arbitrary Python code and I'm not sure if I want to execute that (addsitedir
does this).
What are "new path directories"? I'm strongly for splitting up this PR, because it tries to change too many things at once. We also needs tests. It would be way less messy, then.
I don't see how this solves #443. Can you explain?
Good question, I think it's ok as it is. The overhead shouldn't be that crazy - let's at least try. If the overhead is too big, we could work with caches that are deleted over time and take
Well, good question. Virtualenv stuff hasn't been really tested anyways. I guess we should start testing it. Those tests might be a bit "slow", but as long as there's an easy switch (py.test -k or --ignore), that's not a big issue. You could setup a virtualenv with code or just fake it (with a few symlinks). Whatever you prefer.
As written above, Jedi supports time caches. However, I stopped caching stuff without actually profiling it first. So if you think that this really takes quite a lot of time, try to profile it first. Get a good use case and then we can actually work on improving the speed there. My intuition is that actually checking all of that is not as time consuming as you might think it is. The big parts are probably still parsing and evaluation of the tree, but I might be wrong.
I don't see a reason for that. |
When using live interpreter, the interpreter's startup procedure will resolve .pth links as it happens for the regular Python scripts. For offline detection, this section is an approximation of what happens during startup.
I'm not sure I understand jedi/travis testing quite well, is it OK if I ask you to address the testing infrastructure (UPD: and I will do the tests themselves) ? re:drop PYTHONSTARTUP ok, will do re:caching The problem with caching this stuff is that the cache will never know when I install or remove packages into the environment, so there has to be an API to reset it. I was looking at this from emacs-jedi perspective and couldn't find a better solution than just calculating |
What do you mean by that? I'm certainly willing to help, I just don't know what you mean :)
That would certainly be an option! I think at the same time it would also be nice to have something lightweight, pure Python. |
"""Get sys.path for specified virtual environment.""" | ||
try: | ||
sys_path = _get_venv_path_online(venv) | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this. Exceptions should not be catched like that. Only catch what you really know can go wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind this catch-all expression is that whatever prevents this sys.path detection heuristic from working should not prevent normal jedi functioning and only report the error as a warning (which is done one line below that) as "An indication that something unexpected happened... The software is still working as expected."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem is: Whenever you do something like this, it's really hard to tell what the issue is from the outside. That's why I never do this.
I've made vim-conda which modifies I would much prefer to pass in a particular |
@cjrh I agree with you: It would be nice if we had a |
@davidhalter i'm sorry for being rather inactive, my schedule is a bit tight. I was under impression that you rather wanted something client-server like with a portable jedi server to be run by a python in the target environment (portable in a sense that it doesn't need to be installed in that environment). |
@immerrr No problem, I don't expect you to be active, thanks for the PR anyway :) Hmm you're probably right. I wasn't actually reading the old conversation anymore. I just realized that this solution was an improvment, but has still certain short comings. I was waiting for the questions to be answered. There are some things in this PR that are really nice (if the virtualenv stuff gets merged or not, like egg-link resolution and |
- add sys_path= kwarg to Script & Evaluator constructors - store sys_path for each evaluator instance - add get_venv_path(venv) function to determine virtualenv's path: - by default, dump path from live python interpreter - use old path extension variant for fallback - in old variant, use addsitedir to load .pth extension files - look for egg-link files in all directories in path
e26583b
to
96b87b5
Compare
hey @davidhalter is there anything else you want me to do here before merging? I have stopped env-vars leaking into subprocesses by using |
Ok, I'm probably going to close this, because the offline sys path detection after #636 is good enough for me. More accurate detection of paths that involves running a subprocess will invariably pick up some incompatible binary libraries which might crash the interpreter. |
Ok. Thanks still. I think some of this code might even be useful when we're trying to improve virtualenv stuff, but for now I agree that this is an issue. |
AFAIU, it is generally considered ok to mix jedi's own sys.path and virtualenv paths, but recently I have run into a conflict between setuptools/distribute found in:
$HOME/.local/lib/pythonX.X/site-packages
)This conflict caused problems in resolving something as simple as
setuptools.setup
and made me rethink module resolution towards "purity".This PR adds the following:
Several discussion questions here:
PYTHONSTARTUP
env var?site.USER_SITE
from lookups, too?