Improve virtualenv support & egg-link resolution #562
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -10,7 +10,6 @@
 from jedi._compatibility import builtins as _builtins, unicode
 from jedi import debug
 from jedi.cache import underscore_memoization, memoize_method
-from jedi.evaluate.sys_path import get_sys_path
 from jedi.parser.tree import Param, Base, Operator, zero_position_modifier
 from jedi.evaluate.helpers import FakeName
 from . import fake
@@ -309,15 +308,12 @@ def parent(self, value):
         pass  # Just ignore this, FakeName tries to overwrite the parent attribute.


-def dotted_from_fs_path(fs_path, sys_path=None):
+def dotted_from_fs_path(fs_path, sys_path):
     """
     Changes `/usr/lib/python3.4/email/utils.py` to `email.utils`. I.e.
     compares the path with sys.path and then returns the dotted_path. If the
     path is not in the sys.path, just returns None.
     """
-    if sys_path is None:
-        sys_path = get_sys_path()
-
     if os.path.basename(fs_path).startswith('__init__.'):
         # We are calculating the path. __init__ files are not interesting.
         fs_path = os.path.dirname(fs_path)
@@ -341,13 +337,13 @@ def dotted_from_fs_path(fs_path, sys_path=None):
     return _path_re.sub('', fs_path[len(path):].lstrip(os.path.sep)).replace(os.path.sep, '.')


-def load_module(path=None, name=None):
+def load_module(evaluator, path=None, name=None):
+    sys_path = evaluator.sys_path
Review comment: I'm actually worried now that this line might cause some issues. It should probably be a I just realized this, but even if it causes issues, there's not a big implication for all normal use cases, so don't care about it too much. It's much more that I should remember this.
     if path is not None:
-        dotted_path = dotted_from_fs_path(path)
+        dotted_path = dotted_from_fs_path(path, sys_path=sys_path)
     else:
         dotted_path = name

-    sys_path = get_sys_path()
     if dotted_path is None:
         p, _, dotted_path = path.partition(os.path.sep)
         sys_path.insert(0, p)
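One possible reading of the review comment on `sys_path = evaluator.sys_path` (an assumption, because the suggested fix is missing from the comment as captured here): the name is bound to the evaluator's own list, so the later `sys_path.insert(0, p)` would modify that shared list in place. The sketch below illustrates the difference between aliasing and copying with a hypothetical `FakeEvaluator` stand-in; it is not Jedi's actual class.

```python
class FakeEvaluator:
    """Hypothetical stand-in for an object exposing a ``sys_path`` list."""

    def __init__(self, sys_path):
        self.sys_path = sys_path


def insert_aliased(evaluator, entry):
    sys_path = evaluator.sys_path        # same list object as the evaluator's
    sys_path.insert(0, entry)            # also changes evaluator.sys_path
    return sys_path


def insert_copied(evaluator, entry):
    sys_path = list(evaluator.sys_path)  # shallow copy; evaluator is untouched
    sys_path.insert(0, entry)
    return sys_path


aliased = FakeEvaluator(['/usr/lib/python3'])
insert_aliased(aliased, '/tmp/project')

copied = FakeEvaluator(['/usr/lib/python3'])
insert_copied(copied, '/tmp/project')
```

After running this, `aliased.sys_path` has gained the extra entry while `copied.sys_path` has not, which matches the `list(evaluator.sys_path)  # copy` pattern used elsewhere in this diff.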
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,9 @@
 import glob
 import os
 import sys
+from subprocess import check_output
+from ast import literal_eval
+from site import addsitedir

 from jedi._compatibility import exec_function, unicode
 from jedi.parser import tree
@@ -11,24 +14,66 @@
 from jedi import cache


-def get_sys_path():
-    def check_virtual_env(sys_path):
-        """ Add virtualenv's site-packages to the `sys.path`."""
-        venv = os.getenv('VIRTUAL_ENV')
-        if not venv:
-            return
-        venv = os.path.abspath(venv)
-        p = _get_venv_sitepackages(venv)
-        if p not in sys_path:
-            sys_path.insert(0, p)
-
-        # Add all egg-links from the virtualenv.
+def get_venv_path(venv):
+    """Get sys.path for specified virtual environment."""
+    try:
+        sys_path = _get_venv_path_online(venv)
+    except Exception as e:
Review comment: I don't like this. Exceptions should not be caught like that. Only catch what you really know can go wrong.
Reply: The idea behind this catch-all expression is that whatever prevents this sys.path detection heuristic from working should not prevent normal jedi functioning and only report the error as a warning (which is done one line below that) as "An indication that something unexpected happened... The software is still working as expected."
Reply: The problem is: whenever you do something like this, it's really hard to tell what the issue is from the outside. That's why I never do this.
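The reviewer's suggestion to catch only known failure modes could look roughly like the following. This is a sketch under assumptions, not the PR's code: `query_sys_path` and `sys_path_or_fallback` are hypothetical names, and the exception list is one guess at what can go wrong when another interpreter is asked for its `sys.path`.

```python
import subprocess
import sys
from ast import literal_eval


def query_sys_path(python_path):
    """Ask the interpreter at ``python_path`` for its sys.path (hypothetical helper)."""
    command = [python_path, '-E', '-c', 'import sys; print(sys.path)']
    return literal_eval(subprocess.check_output(command).decode())


def sys_path_or_fallback(python_path, fallback):
    """Return the queried path, or a copy of ``fallback`` for anticipated failures."""
    try:
        return query_sys_path(python_path)
    except (OSError, subprocess.CalledProcessError, ValueError, SyntaxError) as e:
        # OSError: the binary is missing or cannot be executed.
        # CalledProcessError: the interpreter exited with a non-zero status.
        # ValueError / SyntaxError: the output was not a valid Python literal.
        print("Error when getting venv path: %r" % e, file=sys.stderr)
        return list(fallback)
```

Anything outside the listed exceptions would still propagate, so an unexpected bug stays visible from the outside, while the anticipated failures degrade to the fallback path with a warning.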
+        debug.warning("Error when getting venv path: %s" % e)
+        sys_path = _get_venv_path_offline(venv)
+    with common.ignored(ValueError):
+        sys_path.remove('')
+    return _get_sys_path_with_egglinks(sys_path)
+
+
+def _get_sys_path_with_egglinks(sys_path):
+    """Find all paths including those referenced by egg-links.
+
+    Egg-link-referenced directories are inserted into path immediately after
+    the directory on which their links were found. Such directories are not
+    taken into consideration by normal import mechanism, but they are traversed
+    when doing pkg_resources.require.
+    """
+    result = []
+    for p in sys_path:
+        result.append(p)
         for egg_link in glob.glob(os.path.join(p, '*.egg-link')):
             with open(egg_link) as fd:
-                sys_path.insert(0, fd.readline().rstrip())
+                for line in fd:
+                    line = line.strip()
+                    if line:
+                        result.append(os.path.join(p, line))
+                        # pkg_resources package only interprets the first
+                        # non-empty line in egg-link files.
+                        break
+    return result
+
+
+def _get_venv_path_offline(venv):
+    """Get sys.path for venv without starting up the interpreter."""
+    venv = os.path.abspath(venv)
+    sitedir = _get_venv_sitepackages(venv)
+    sys.path, old_sys_path = [], sys.path
+    try:
+        addsitedir(sitedir)
+        return sys.path
+    finally:
+        sys.path = old_sys_path
+
+
+def _get_venv_path_online(venv):
+    """Get sys.path for venv by running its python interpreter."""
+    venv = os.path.abspath(os.path.expanduser(venv))
+    for python_binary in ('python', 'python3', 'python.exe',
+                          'python3.exe'):
+        python_path = os.path.join(venv, 'bin', python_binary)
+        if os.path.isfile(python_path):
+            break
+    else:
+        raise RuntimeError("Cannot find python executable in venv: %s" % venv)
+    command = [python_path, '-E', '-c', 'import sys; print(sys.path)']
+    return literal_eval(check_output(command))

-    check_virtual_env(sys.path)
-    return [p for p in sys.path if p != ""]


 def _get_venv_sitepackages(venv):
@@ -109,14 +154,16 @@ def _paths_from_list_modifications(module_path, trailer1, trailer2):
     name = trailer1.children[1].value
     if name not in ['insert', 'append']:
         return []

     arg = trailer2.children[1]
     if name == 'insert' and len(arg.children) in (3, 4):  # Possible trailing comma.
         arg = arg.children[2]
     return _execute_code(module_path, arg.get_code())


 def _check_module(evaluator, module):
+    """
+    Detect sys.path modifications within module.
+    """
     def get_sys_path_powers(names):
         for name in names:
             power = name.parent.parent
@@ -128,10 +175,12 @@ def get_sys_path_powers(names):
                     if isinstance(n, tree.Name) and n.value == 'path':
                         yield name, power

-    sys_path = list(get_sys_path())  # copy
+    sys_path = list(evaluator.sys_path)  # copy
     try:
         possible_names = module.used_names['path']
     except KeyError:
+        # module.used_names is MergedNamesDict whose getitem never throws
+        # keyerror, this is superfluous.
         pass
     else:
         for name, power in get_sys_path_powers(possible_names):
@@ -148,7 +197,7 @@ def sys_path_with_modifications(evaluator, module):
     if module.path is None:
         # Support for modules without a path is bad, therefore return the
         # normal path.
-        return list(get_sys_path())
+        return list(evaluator.sys_path)

     curdir = os.path.abspath(os.curdir)
     with common.ignored(OSError):

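The egg-link handling in `_get_sys_path_with_egglinks` relies on only the first non-empty line of an `.egg-link` file being significant, as the code comment in the diff states. The following self-contained sketch shows that rule on a temporary directory; `first_egg_link_target` and the `checkout` directory are hypothetical names used only for illustration.

```python
import glob
import os
import tempfile


def first_egg_link_target(egg_link_path):
    """Return the first non-empty line of an egg-link file, or None."""
    with open(egg_link_path) as fd:
        for line in fd:
            line = line.strip()
            if line:
                return line
    return None


with tempfile.TemporaryDirectory() as site_packages:
    checkout = os.path.join(site_packages, 'checkout')  # hypothetical project dir
    with open(os.path.join(site_packages, 'demo.egg-link'), 'w') as fd:
        # Leading blank line, then the target, then a second line that is ignored.
        fd.write("\n%s\n.\n" % checkout)
    targets = [first_egg_link_target(p)
               for p in glob.glob(os.path.join(site_packages, '*.egg-link'))]
    resolved = targets == [checkout]
```

Unlike the old `fd.readline().rstrip()` call, this skips leading blank lines before taking the target.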

So if VIRTUAL_ENV is set, why is it not enough to just take the current `sys.path`? I have never really understood this.

The idea is to have jedi installed somewhere once per system (or per user) and to have it look into various virtual environments without installing jedi in all of these environments. In Emacs one of the easiest ways to accomplish this (without pulling virtualenv names through all the APIs) is to launch the jedi process with this extra environment variable whenever I change the venv I'm working on.

But if you change the virtualenv variable, wouldn't the `sys.path` also change automatically?

I don't think so. AFAIR, `sys.path` is generated once at the startup and the virtualenv magic really relies on looking up candidate directories relative to the python interpreter location. Here's an article that probably can explain it better than I can (especially, within the scope of a comment): http://mikeboers.com/blog/2014/05/23/where-does-the-sys-path-start

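The point that `sys.path` is computed at startup can be checked directly: changing `VIRTUAL_ENV` inside a running process leaves `sys.path` as it was. A minimal sketch, with a made-up venv location that does not need to exist:

```python
import os
import sys

previous = os.environ.get('VIRTUAL_ENV')
path_before = list(sys.path)

# Hypothetical venv location; nothing needs to exist at this path.
os.environ['VIRTUAL_ENV'] = '/path/to/some/venv'
path_after = list(sys.path)

# Restore the environment so the sketch leaves the process as it found it.
if previous is None:
    del os.environ['VIRTUAL_ENV']
else:
    os.environ['VIRTUAL_ENV'] = previous
```

Here `path_before` and `path_after` are equal, which is why the PR either queries the venv's own interpreter or rebuilds the path from its site-packages directory.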
Hmm I've read it, very good article! I think I finally understand why `site.py` exists and what it does.

If you start a new process with Jedi in an activated virtualenv, wouldn't we see the correct `sys.path`? Are you talking about changing just the `VIRTUAL_ENV` variable without restarting the process?

In general, have you read the idea of #385, to make virtualenv support easier?

I gave this some thought at the time and I couldn't find a way around the client being able to supply the requested `sys.path` to Jedi in the public API. Now, to my liking, a full `sys_path` is too long for a proper interface parameter and most of the time it will refer to one or more virtualenvs, so probably specifying virtualenv paths will look better. For example, we could add `venvs=[path1, path2]` to use those venvs and `extra_sys_path=[path1, path2]` just in case someone wants, say, to additionally use some source code checkouts without installing them into the said venvs.

If this API addition goes through, I'm not sure if it is worth keeping the `VIRTUAL_ENV` logic: it boils down to having both `VIRTUAL_ENV`'s and jedi's `sys.path` in one interpreter and this is better accomplished the other way around. I mean, that logic starts with jedi on `sys.path` and then tries to add the venv path afterwards, but as long as jedi has no external dependencies, we can just start up `PATH/TO/VENV/bin/python PATH/TO/JEDI/portable.py` with `portable.py` adding its location to `sys.path` and then running jedi as usual. That way it will have more binary compatibility with code in the venv.

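The `portable.py` idea is only described in words in this thread, so the following is a guess at its core step rather than an existing file: a launcher run by the venv's interpreter that puts its own directory on `sys.path` before importing jedi. The function name `add_script_dir_to_path` is hypothetical.

```python
import os
import sys


def add_script_dir_to_path(script_file, path=None):
    """Prepend the directory containing ``script_file`` to ``path``.

    ``path`` defaults to the live ``sys.path``; passing a list makes the
    function easy to try out without touching the interpreter's state.
    """
    path = sys.path if path is None else path
    here = os.path.dirname(os.path.abspath(script_file))
    if here not in path:
        path.insert(0, here)
    return here


# In a launcher script this would be called as add_script_dir_to_path(__file__),
# after which `import jedi` would resolve to the checkout next to the script
# while everything else comes from the venv's own interpreter.
```

With this arrangement the venv's interpreter builds `sys.path` in the normal way, and jedi is the only addition.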
Why would it refer to multiple virtualenvs? That's not something that happens in Python itself?

Hmm I know what you mean. I personally think that adding `sys_path` as an API is really valuable, because some people really want to play with it. Most of the time this is not the case and we should focus on what happens when it is `None`. This still includes virtual envs. However, I agree totally with the `portable.py` idea. I think Jedi should be started with just the virtualenv as its path. This is what I'm talking about when I talk about Jedi's future async server/client structure. You name a certain virtualenv and Jedi starts a new process that includes that virtualenv. This way, we don't have segfaults and all other crazy stuff...

What do you think? This would however mean that we don't include this pull request, we would rather just add the sys_path logic. And BTW this is not to say that your work is bad. I'm really thankful for it. It taught me a lot! :)

~ Dave

Yes, that was a mistake. Some time ago I tried conserving precious SSD space using a virtualenv for project dependencies and another virtualenv for common tools, like ipython, and pudb, and profilers, and whatnot. But then I realized that with conda you don't need that as it hard-links packages where possible.

I am not completely against ditching this PR altogether, although I do think that this PR offers better support of the current Jedi workflow. It also doesn't seem to me that the new, arguably more correct, workflow involving `portable.py` and dropping all that `sys_path` magic is coming soon. But if it is and there's no time window during which this PR can be useful then feel free to close it.

Unfortunately and probably true :( Can you quote yourself from earlier how this PR improves virtualenvs? I still don't see it :)

- `sys_path` from a live interpreter in the specified virtualenv, if possible
- `.pth` links (cannot seem to find the `addsitedir` invocation in the old code)
- `egg-link` files are consulted in all new path directories, not just the `site-packages` one

Ok. I don't see the advantage of fetching it from a different interpreter (except for `pth` stuff). However, I feel that `pth` is something we shouldn't really execute like this, because it can contain arbitrary Python code and I'm not sure if I want to execute that (`addsitedir` does this).

What are "new path directories"? I'm strongly for splitting up this PR, because it tries to change too many things at once. We also need tests. It would be way less messy, then.
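One way to address the concern about executable `.pth` content would be to read the files by hand and ignore the lines that `site` would execute. The sketch below assumes the usual `.pth` conventions as I understand them (blank lines and `#` comments are skipped, lines starting with `import` followed by a space or tab are executed by `site`, and other lines name directories relative to the site directory); `pth_directories` is a hypothetical helper, not part of Jedi or the standard library.

```python
import os
import tempfile


def pth_directories(sitedir):
    """Collect directories listed in ``*.pth`` files without executing them."""
    found = []
    for name in sorted(os.listdir(sitedir)):
        if not name.endswith('.pth'):
            continue
        with open(os.path.join(sitedir, name)) as fd:
            for line in fd:
                line = line.strip()
                if not line or line.startswith('#'):
                    continue
                if line.startswith(('import ', 'import\t')):
                    continue  # executable line: deliberately skipped
                candidate = os.path.join(sitedir, line)
                if os.path.isdir(candidate) and candidate not in found:
                    found.append(candidate)
    return found


with tempfile.TemporaryDirectory() as sitedir:
    os.mkdir(os.path.join(sitedir, 'extra'))
    with open(os.path.join(sitedir, 'demo.pth'), 'w') as fd:
        fd.write("# comment\nextra\nimport os; os.environ['X'] = '1'\nmissing\n")
    demo_result = [os.path.basename(p) for p in pth_directories(sitedir)]
```

The trade-off is that any path set up by an `import` line is missed, so this would be a conservative approximation of what `addsitedir` produces.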