Added dependency resolving for %pip cells #1662

Closed
wants to merge 45 commits into from

nfx
Collaborator

@nfx nfx commented May 7, 2024

This PR adds a downloader and resolver for PyPI packages and wheels that are installable via a `pip` subprocess.

Closes #1642
Closes #1640

@nfx nfx requested review from a team and andrascsillag-db May 7, 2024 22:17
@nfx nfx temporarily deployed to account-admin May 7, 2024 22:17 — with GitHub Actions Inactive
@nfx nfx requested review from JCZuurmond and removed request for andrascsillag-db May 7, 2024 22:17

github-actions bot commented May 7, 2024

✅ 166/166 passed, 25 skipped, 2h7m11s total

Running from acceptance #3121

Member

@JCZuurmond JCZuurmond left a comment

Generally LGTM, some clarification questions and some nits

src/databricks/labs/ucx/source_code/jobs.py
# TODO: https://github.com/databrickslabs/ucx/issues/1642
return []
# TODO: this is very basic code, we need to improve it
Member

I.e., rewriting it as a regex? Or rewriting it to use AST parsing?

Collaborator Author

As in, handling pip CLI flags.

return []
if splits[1] != 'install':
return []
# TODO: we need to support different formats of the library name and etc
Member

I.e., library definitions that include versioning, e.g. `pandas==0.18.0`.
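
The two TODOs above (pip CLI flags and versioned library names) can be illustrated with a minimal sketch. This is not the code from this PR: `parse_pip_install` and `requirement_name` are hypothetical helpers, and the set of value-taking flags is a small, incomplete subset of what pip accepts.

```python
import re
import shlex


def parse_pip_install(cell: str) -> list[str]:
    """Return the requirement strings from a `%pip install ...` line (hypothetical helper)."""
    splits = shlex.split(cell)
    if len(splits) < 3 or splits[0] != "%pip" or splits[1] != "install":
        return []
    # Flags that consume the next token as their value (assumed, incomplete subset).
    flags_with_value = {"-r", "--requirement", "-i", "--index-url", "-c", "--constraint", "-t", "--target"}
    requirements = []
    skip_next = False
    for token in splits[2:]:
        if skip_next:
            skip_next = False
            continue
        if token in flags_with_value:
            skip_next = True
            continue
        if token.startswith("-"):
            continue  # boolean flags and `--flag=value` forms are ignored
        requirements.append(token)
    return requirements


def requirement_name(requirement: str) -> str:
    """Strip version specifiers and extras, e.g. `pandas==0.18.0` becomes `pandas`."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
    return match.group(0) if match else requirement
```

Splitting with `shlex` rather than `str.split` keeps quoted specifiers such as `'requests>=2.0'` in one token. A production version would more likely delegate requirement parsing to a dedicated library than to a regular expression.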

if not dist_info:
problem = DependencyProblem('library-install-failed', f'Failed to install {name}')
return MaybeDependency(None, [problem])
container = SitePackageContainer(self._file_loader, dist_info)
Member

Could you explain the intention of the container and wrapping loader classes?

@@ -81,6 +129,7 @@ def __init__(self, packages: list[SitePackage]):
self._packages[top_level] = package

def __getitem__(self, item: str) -> SitePackage | None:
item = item.replace("-", "_")
Member

Why replace the dashes with underscores?
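
The thread does not record an answer to this question. One plausible reason, stated here as an assumption, is that PyPI distribution names often contain dashes while importable top-level module names cannot, so a lookup keyed by top-level name has to map one form onto the other. The sketch below uses the PEP 503 name normalization; `to_import_key` is a hypothetical helper, not part of the PR.

```python
import re


def normalize_distribution_name(name: str) -> str:
    """PEP 503 normalization: lowercase, and collapse runs of `-`, `_` and `.` into one dash."""
    return re.sub(r"[-_.]+", "-", name).lower()


def to_import_key(name: str) -> str:
    """Hypothetical lookup key: the normalized name with dashes turned into underscores."""
    return normalize_distribution_name(name).replace("-", "_")
```

Note that a distribution name and its import name can differ in ways no normalization recovers, so a mapping like this is only a heuristic.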

@pritishpai pritishpai self-assigned this May 8, 2024

def __init__(
self,
file_loader: FileLoader,
Member

Why does the pip resolver need a file loader?

return PipResolver(self._file_loader, resolver)

def resolve_library(self, path_lookup: PathLookup, name: str) -> MaybeDependency:
path_lookup.append_path(self._temporary_virtual_environment)
Member

Should `resolve_library` append the path? Or should that be done outside, i.e. first resolve the library, then append the path only if the library can actually be resolved?
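
The alternative ordering raised in this comment could look roughly like the sketch below. `PathLookupSketch`, `resolve_then_append` and the `installer` callable are hypothetical stand-ins, not the `PathLookup` class from the PR. The sketch only appends the path after a successful install, and it skips paths that are already registered.

```python
from pathlib import Path
from typing import Callable


class PathLookupSketch:
    """Hypothetical stand-in for a path lookup holding library roots."""

    def __init__(self) -> None:
        self.library_roots: list[Path] = []

    def append_path(self, path: Path) -> None:
        # Only register the path when it is not already there.
        if path not in self.library_roots:
            self.library_roots.append(path)


def resolve_then_append(
    lookup: PathLookupSketch,
    venv: Path,
    name: str,
    installer: Callable[[str, Path], bool],
) -> bool:
    """Install first; touch the lookup only when the install succeeded."""
    if not installer(name, venv):
        return False
    lookup.append_path(venv)
    return True
```

With this ordering a failed install leaves the lookup untouched, which avoids exposing an empty or partially populated directory to later resolution steps.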

problem = DependencyProblem('library-install-failed', f'Failed to install {name}')
return MaybeDependency(None, [problem])
container = SitePackageContainer(self._file_loader, dist_info)
dependency = Dependency(WrappingLoader(container), Path(name))
Member

Should this be just `Path(name)` or an absolute path?

@JCZuurmond JCZuurmond force-pushed the feat/pip-download branch from 103eb09 to abf77da Compare May 13, 2024 10:34
from tests.unit import locate_site_packages


def test_pip_resolver_resolve_library(mock_path_lookup):
Member

These tests require internet access, so maybe they should be integration tests.
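
One way to keep a unit test offline is to inject the subprocess runner, leaving the network-dependent variant for the integration suite. The sketch below is an assumption about how that could be structured: `install_library` is a hypothetical wrapper, not the resolver from the PR.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable


def install_library(
    name: str,
    target: Path,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Hypothetical wrapper: install `name` into `target` through a pip subprocess."""
    command = [sys.executable, "-m", "pip", "install", name, "-t", str(target)]
    result = run(command, capture_output=True, check=False)
    return result.returncode == 0


def make_fake_run(returncode: int, calls: list):
    """Build a fake runner that records the command instead of touching the network."""

    def fake_run(command, **kwargs):
        calls.append(command)
        return subprocess.CompletedProcess(command, returncode)

    return fake_run
```

A unit test can then call `install_library("pandas", tmp_path, run=make_fake_run(0, calls))` and assert on the recorded command, while a separate integration test exercises the default `subprocess.run`.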

assert maybe.dependency == dependency


@pytest.mark.fail("Fails because pytest has a try-except ImportError")
Member

@JCZuurmond JCZuurmond May 13, 2024

@ericvergnaud: This is a lovely exception that we do not handle and that causes problems:

__all__ = ["__version__", "version_tuple"]

try:
    from ._version import version as __version__
    from ._version import version_tuple
except ImportError:  # pragma: no cover
    # broken installation, we don't even try
    # unknown only works because we do poor mans version compare
    __version__ = "unknown"
    version_tuple = (0, 0, "unknown")

JCZuurmond added a commit that referenced this pull request May 14, 2024
Copy from [PR](#1662)
JCZuurmond added a commit that referenced this pull request May 14, 2024
Copy from [PR](#1662)
JCZuurmond added a commit that referenced this pull request May 14, 2024
Partial copy from [PR](#1662)
@JCZuurmond JCZuurmond added the pr/do-not-merge this pull request is not ready to merge label May 14, 2024
JCZuurmond added a commit that referenced this pull request May 15, 2024
Copy from [PR](#1662)
JCZuurmond added a commit that referenced this pull request May 15, 2024
Partial copy from [PR](#1662)
@nfx nfx marked this pull request as ready for review May 15, 2024 09:11
@ericvergnaud
Contributor

@JCZuurmond I guess this is already merged?

@JCZuurmond
Member

Yes. There are still some code snippets in there that have not been merged yet (such as appending a file to the path only when it is not already there), but the PR has been broken down into:

@JCZuurmond JCZuurmond closed this May 16, 2024
@nfx nfx deleted the feat/pip-download branch July 19, 2024 16:12
Development

Successfully merging this pull request may close these issues.

Workflow linter to analyse PyPI library
Workflow linter to analyse python_wheel_task
4 participants