
Workspaces and monorepo support (add sync --all-packages) #6935

Closed · carderne opened this issue Sep 2, 2024 · 56 comments
Labels: question (Asking for clarification or support)

@carderne commented Sep 2, 2024

I've put a decent amount of effort into trying to figure out a workable "monorepo" solution with pip-tools/Rye/etc. and now uv. What I mean by a monorepo:

  1. 2+ packages with interdependencies.
  2. The ability to lock dependencies across packages (where not needed, split into multiple workspaces). More sophisticated multi-version handling would be great but out of scope.
  3. Multiple entrypoints. Packages are peers and there is no "root" package.
  4. Probably want to distribute the packages in a Dockerfile or similar.

I'm packaging a few thoughts into this issue as I think they're all related, but happy to split things out if any portions of this are more likely to be worked on than others.

Should uv support this?

I think yes. Pants/Bazel/etc. are a big step up in complexity and lose a lot of nice UX. uv is shaping up as the de facto Python tool, and I think this is a common pattern for medium-sized teams that are trying to move past multirepo but don't want more sophisticated tooling. If you (uv maintainers) are unconvinced (but convince-able), I'm happy to spend more time making the case!

Issues

1. Multiple packages with a single lockfile

Unfortunately, uv v0.4.0 seems to be a step back for this. It's no longer possible to uv sync the whole workspace (related: #6874), and the root project being "virtual" is not really supported. The docs make it clear that uv workspaces aren't (currently) meant for this, but I think that's a mistake. Having separate uv packages isn't a great solution, as you lose the global version locks (which make housekeeping 10x easier), and you have multiple venvs, multiple pyright/pytest installs/configs, etc.

For clarity, I'm talking about the structure below. I think adding a tool.uv.virtual: bool flag (like Rye has) would be a great step. In that case the root is not a package and can't be built.

.
├── pyproject.toml                 # virtual
├── uv.lock
└── packages
    ├── myserver
    │   ├── pyproject.toml         # depends on mylib
    │   └── myserver
    │       └── __init__.py
    └── mylib
        ├── pyproject.toml
        └── mylib
            └── __init__.py
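
A sketch of what that root pyproject.toml could look like (the tool.uv.virtual flag is hypothetical, borrowed from Rye's semantics):

[project]
name = "monorepo-root"
version = "0"

[tool.uv]
virtual = true  # hypothetical flag: the root is not a package and can't be built

[tool.uv.workspace]
members = ["packages/mylib", "packages/myserver"]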

2. Distributing in Dockerfiles etc

This is, I think, orthogonal to the issue above. (And much less important, as it's possible to work around it with plugins.) Currently, there's no good way to get an efficient (cacheable) Docker build in a uv workspace. You'd like to do something like the Dockerfile below, but you can't (related: #6867).

FROM python:3.12.5-slim-bookworm
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv

WORKDIR /app
COPY uv.lock pyproject.toml /app/

# NB: doesn't work as the server package isn't there!
RUN uv sync --locked --no-install-project --package=server

COPY packages /app/packages
RUN uv sync --locked --package=server
ENV PATH="/app/.venv/bin:$PATH"

If that gets resolved, there's another issue, but this is very likely to be outside the scope of uv. Just sharing it for context.

  • Either you have to copy the entire packages/ directory into every Dockerfile (regardless of what they actually need), forcing tons of unnecessary rebuilds.
  • OR you have custom COPY lines in each Dockerfile, which is a mess to maintain with more than a couple of packages, and has to be constantly updated to match the dependency graph.

My own solution has been to build wheels that include any workspace dependencies, so you can just do this:

# uv is nice enough to resolve transitive dependencies of server
uv export --format=requirements-txt --package=server > reqs.txt

Then in Dockerfile:

COPY reqs.txt reqs.txt
RUN uv pip install -r reqs.txt
# add --no-index to prevent internet access to ensure only the
# hash-locked versions in reqs.txt are downloaded
RUN uv pip install server.whl --no-deps --no-index

I've written a tiny Hatch plugin here that injects all the required workspace code into the wheel. This won't work for many use-cases (local dev hot reload) but is one way around the problem of COPYing the entire workspace into the Dockerfile. I don't think there's any solution that solves both together, and at least this way permits efficient Docker builds and simple Dockerfiles. (Note: since uv v0.4.0 the plugin seems to break uv's editable builds; I haven't yet looked into why.)

@Afoucaul commented Sep 2, 2024

To expand on the Docker image, this is what I would want to do:

FROM python:3.12.5-slim-bookworm AS python-builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv

# Create a venv at a well-known location so it can be COPY'd later
RUN uv venv /opt/python
# Tell uv to use that venv
ENV UV_PYTHON=/opt/python

WORKDIR /app
COPY uv.lock pyproject.toml /app/
# No need to COPY pyproject.toml of libs - they're all well-specified in uv.lock anyway

# Install the app without the workspace members - i.e. only 3rd-party dependencies
RUN uv sync --locked --no-install-workspace --package=server

COPY packages /app/packages
# Install 1st party dependencies, but only those that are needed
# Also pass the fictional `--no-editable` flag to actually bundle them into the venv
RUN uv sync --locked --no-editable --package=server


FROM python:3.12.5-slim-bookworm AS runtime

# Copy the venv that has all 3rd party and 1st party dependencies, ready for use
COPY --from=python-builder /opt/python /opt/python
ENV PATH="/opt/python/bin:$PATH"

I can't do that because:

  1. uv sync --locked --no-install-workspace --package=server complains because server isn't there (nor are its dependencies, anyway)
  • it seems that uv.lock already has all the information needed to resolve this: it contains workspace members, so uv can know of server, and of its dependencies, without all pyproject.toml files needing to be there
  2. There's no such flag as --no-editable - uv will install workspace members as editable packages, so COPYing the venv in the final stage won't work because the packages pointed at won't be there
  • this would allow building a complete venv that can be shipped, with all and only the dependencies it needs
  3. uv sync doesn't support targeting a venv (although that's under discussion from what I've gathered)

@charliermarsh (Member)

(1) is easy to resolve, would that help?

@carderne (Author) commented Sep 2, 2024

(1) Yes, that would be great!
(I'll start working on a patch but I suspect I'll still be noodling by the time you merge yours.)

For (2), I suspect the only generally useful solution would be to encode the package-specific dependency tree in uv.lock (like pnpm-lock.yaml) rather than calculating it on the fly. That might make it harder to dovetail with PEP 751, but from what I understand you're planning to support pylock as an output format that uv won't use internally, so maybe not important.

@charliermarsh (Member)

For (2), we're thinking of perhaps a dedicated command like uv bundle that would handle a lot of the defaults that you want for this kind of workflow. But otherwise a --no-editable or similar seems reasonable to me.

@charliermarsh (Member)

Let's track (2) in #5792.

charliermarsh self-assigned this Sep 2, 2024
@charliermarsh (Member)

I think adding a tool.uv.virtual: bool flag (like Rye has) would be a great step. In that case the root is not a package and can't be built.

How is this different than tool.uv.package = false?

@charliermarsh (Member)

I think that does what you're describing?

charliermarsh added the question label Sep 2, 2024
charliermarsh reopened this Sep 2, 2024
@charliermarsh (Member)

#6943 adds support for --frozen --package.

@carderne (Author) commented Sep 3, 2024

Sorry you're moving too quickly for me!

About (1)

You're right that package = false does what is needed. It allows a very minimal root pyproject.toml that looks like the one below. The only downside is that in order for uv sync to sync the entire workspace, you need to add each package to project.dependencies, to tool.uv.sources, and to tool.uv.workspace.members. I should have been more explicit in my first message that what I think is needed here is uv sync --the-entire-workspace. (This is the default behaviour in Rye and was the default in uv < 0.4.0.)

Alternatively, a more explicit flag in the config like tool.uv.workspace.this-project-is-virtual-so-sync-all-members-by-default: bool.

[project]
name = "monorepo-root"
version = "0"
requires-python = "==3.12"
dependencies = ["mylib", "myserver"]

[tool.uv]
dev-dependencies = []
package = false

[tool.uv.sources]
mylib = { workspace = true }
myserver = { workspace = true }

[tool.uv.workspace]
members = ["packages/mylib", "packages/myserver"]

On (2) the Docker stuff

I don't really understand how #6943 helps, but it seems sensible anyway. I see three obvious ways (not uv-specific) of getting stuff into a Docker image:

  1. Export a package-specific requirements.txt, install those, then COPY in all needed packages.
  2. Same for requirements.txt. Then create a site-packages and COPY that in. I assume this is what the --non-editable flag is about in #5792 (Add --non-editable to allow syncing with non-editable workspace members).
  3. Same for requirements.txt. Then create sdists/wheels from the packages (the plugin I mentioned).

All of these require a little pre-Docker script to generate the requirements.txt, which isn't ideal but fine. Assuming I've understood (2) above correctly, I'll move any further comments to that issue.

@charliermarsh (Member)

For (2), I thought you wanted to do this:

FROM python:3.12.5-slim-bookworm
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv

WORKDIR /app
COPY uv.lock pyproject.toml /app/

# NB: doesn't work as the server package isn't there!
RUN uv sync --locked --no-install-project --package=server

COPY packages /app/packages
RUN uv sync --locked --package=server
ENV PATH="/app/.venv/bin:$PATH"

This now works as expected if you use --frozen rather than --locked.
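
i.e., keeping everything else the same:

RUN uv sync --frozen --no-install-project --package=server

COPY packages /app/packages
RUN uv sync --frozen --package=server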

@b-phi commented Sep 3, 2024

This is also causing some issues for me with 0.4.0+. Locally sync works fine

> uv sync
Resolved 341 packages in 76ms
Audited 307 packages in 3ms

But when adding --frozen, which we use in CI, uv ignores the workspace members

> uv sync --frozen
Uninstalled 97 packages in 7.57s
...
Audited 210 packages in 0.25ms

The different dependency resolution behavior depending on whether I pass --frozen is unexpected.

@charliermarsh (Member)

Does your root pyproject.toml have a [project] section?

@b-phi commented Sep 3, 2024

No, just a "virtual" workspace, effectively this.

[tool.uv]
dev-dependencies = [
    "...",
]

[tool.uv.workspace]
members = ['libs/*', 'sandbox']

@charliermarsh (Member)

I can look into why you're seeing differences (it sounds like a bug!). I'd suggest migrating to a virtual project though, i.e., adding a [project] table (but not a build-system) to your root pyproject.toml. We redesigned those in v0.4.0 and the version above is now considered legacy.
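
A minimal sketch of what that migration looks like, assuming the layout above (name and version are placeholders):

[project]
name = "workspace-root"
version = "0.1.0"
requires-python = ">=3.12"
# note: no [build-system] table, so the root stays a virtual (non-built) project

[tool.uv]
dev-dependencies = [
    "...",
]

[tool.uv.workspace]
members = ['libs/*', 'sandbox']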

@b-phi commented Sep 3, 2024

Adding the [project] section as suggested now shows consistent behavior with or without --frozen. I was able to get back to the desired sync behavior by adding the workspace members to the project dependencies and a [tool.uv.sources] section enumerating the workspace members. More verbose, but more consistent. Thanks for the help!
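
Concretely, the root pyproject.toml then ends up looking something like this (member names are placeholders, mirroring carderne's example above):

[project]
name = "workspace-root"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = ["mylib", "sandbox"]

[tool.uv.sources]
mylib = { workspace = true }
sandbox = { workspace = true }

[tool.uv.workspace]
members = ['libs/*', 'sandbox']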

@charliermarsh (Member)

Great! Still gonna see if I can track down and fix that bug :)

@carderne (Author) commented Sep 3, 2024

What @b-phi is talking about is exactly what I mentioned in (1) of my comment up above. Basically you have to add each workspace member in three places. It would be great if that could be made unnecessary (in one of the ways I suggested, or some other way).

On (2) the Dockerfiles, the command you added helps, but it still doesn't work if there are dependencies between packages and you haven't yet copied in the files. There's an MRE here. It fails when trying to run the --no-install-project sync because packages/server wants packages/greeter but it's not there. Currently the only way around this (afaict) is to pre-export a requirements.txt and use that.

@charliermarsh (Member)

I'm confused on (2). We have --no-install-workspace that does exactly this, right?
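
i.e., the two-step pattern from earlier becomes (a sketch; package name assumed):

COPY uv.lock pyproject.toml /app/
# installs only third-party deps; no member pyproject.toml files needed yet
RUN uv sync --frozen --no-install-workspace --package=server

COPY packages /app/packages
RUN uv sync --frozen --package=server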

@carderne (Author) commented Sep 3, 2024

Oh of course, sorry. So (2) I think is resolved. The remaining stuff about getting the right files into the Dockerfile is not really uv's problem. (Although it could be helped by things like --non-editable.)

The main point of this issue is (1) but I'm very happy to wait for you to figure out an approach that you're happy with. But I think it would be great to resolve.

@charliermarsh (Member)

👍 Part of what I'm hearing here too is that we need more + better documentation for this stuff.

@carderne (Author) commented Sep 3, 2024

Yeah I don’t blame you, it’s moving really fast.

EDIT: adding this here to make it clear to any future travellers why this issue is still open.
The question is whether the sync command could have an --all-packages flag added (or some similar name).

carderne changed the title from "Workspaces and monorepo support" to "Workspaces and monorepo support (add sync --all-packages)" Sep 4, 2024
@Afoucaul commented Sep 4, 2024

👍 Part of what I'm hearing here too is that we need more + better documentation for this stuff.

I'm probably biased, but it seems to me that a monorepo with possibly interdependent libs, and independently buildable apps (most of the time built into Docker images), is a common pattern - at least it's what workspaces promote.
With that in mind, it would indeed be great to have documentation about how Astral intends us to use uv to manage such a repo and such builds. So far, it feels like I'm hacking my way to a satisfying set-up, although uv maintainers obviously have a "right way" in mind.

That said, I must say I'm having an amazing experience with uv (and ruff, and Astral in general), and I'll advocate for using it in all the projects I maintain!

@gwdekker

https://github.com/DavidVujic/python-polylith-example-uv is another example which I think supports this or similar use cases. @DavidVujic

@DavidVujic commented Sep 16, 2024

https://github.com/DavidVujic/python-polylith-example-uv is another example which I think supports this or similar use cases.

Thanks for the mention!

Yes, if I have understood the things talked about in this issue correctly I think that Polylith in combination with uv might be helpful. It's an architecture for monorepos originating from the Clojure community. There's tooling support, and I'm the maintainer of the Python tooling. It works well with uv and here's the docs if you want to know more.

@JuanoD commented Sep 19, 2024

I made https://github.com/JuanoD/uv-mono as an example repo. Feel free to correct me if something is wrong.

@JasperHG90

@JasperHG90 that link is 404 for me.

Sorry, I was ill these past days 🦠. It's fixed now! Thanks for the heads-up.

@nickmuoh

Hey @JasperHG90 @JuanoD @gwdekker @carderne, I checked out your examples, which are extremely useful as I have been trying to figure out how to set up my team's monorepo for data science and engineering workflows. One thing I have been struggling with is how to still use uv workspaces' amazing single virtualenv and lockfile with packages or apps that have conflicting package versions. So for example, a shared packageA using pandas==2.2.0 while an app needs pandas<2.0.0.

Any ideas on how to handle such a situation, or if you have, how did you handle it?

@zanieb (Member) commented Sep 25, 2024

@nickmuoh we're considering a solution to that in #6981

@idlsoft (Contributor) commented Sep 25, 2024

@nickmuoh we're considering a solution to that in #6981

This is for optional dependencies only though.
There is also #4574, which would make it possible to explicitly require a private uv.lock and .venv for a given workspace member.

@gwdekker commented Sep 26, 2024

@nickmuoh
Disclaimer: I'm talking about the uv-with-Polylith setup, not the uv workspaces setup.

It depends on what you want the solution to look like. If you want a separate venv for this app, I am not sure how you would do that. If you are OK with having one global lock and venv and using pandas==2.0.0 for development but not for deploying your app: in Polylith you have one pyproject file at the root level and a separate one for each project. So for the project you can add your lower-bound version of pandas, and you can still deploy your app while working on supporting pandas 2.
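
For illustration, a sketch of a project-level pyproject.toml in such a Polylith layout (names and bounds are assumptions):

# project-level pyproject.toml, used only when building/deploying this app;
# the root-level pyproject.toml used for development can require pandas 2.x
[project]
name = "my-app"
version = "0.1.0"
dependencies = ["pandas<2.0.0"]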

@lsmurray

Not sure if this is related. If unrelated I can open a new issue.

I'd like to publish standalone packages from a monorepo that use an (unpublished) shared library. This comment in the hatch repo echoes a similar problem.

I have the following package structure where client depends on shared.

packages/
├── client/
└── shared/

I would like to publish a standalone wheel for client that includes the source code for shared and updates the dependencies of client to also include dependencies from shared.
Ideally I can send users the client wheel and they can install it with no issues.

Right now this doesn't work because the client wheel generated by uv build --package client --wheel doesn't include the source code for shared or the dependencies for shared.

Couple of questions

  1. Is this possible with uv right now?
  2. If not, is this potentially within scope for uv?

I see that una attempts to support this but I would be more confident in a solution built directly into uv. Sadly, una doesn't seem to work right now.

Proposed solution

I feel like what I want is a --include-workspace-deps flag on uv build

uv build --package client --wheel --include-workspace-deps

@idlsoft (Contributor) commented Oct 22, 2024

I have the following package structure where client depends on shared.

packages/
├── client/
└── shared/
  1. Is this possible with uv right now?
  2. If not, is this potentially within scope for uv?

Assuming shared is being published at some point, you can do something like

dependencies = [
   "shared ~= 1.0",
]

[tool.uv.sources]
shared = { path = "../shared", editable = true }
# shared = { workspace = true }  # for workspace

@mmerickel

I'd replace include with vendor in your terminology. Something like that feels out of scope to me, but I'm not the decider there. You can currently build 2 wheels and then postprocess them to vendor one into the other (a rough sketch follows the list):

  • extract the wheels
  • copy the shared python package into the client wheel
  • remove the shared package from the client's dependencies metadata
  • update the client's metadata to declare that it provides the shared package
  • repackage the client wheel
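
A rough shell sketch of those steps (version numbers are placeholders, and the wheel CLI from the wheel package is assumed):

uv build --package client --wheel
uv build --package shared --wheel

# unpack both wheels (requires `pip install wheel`)
wheel unpack dist/client-0.1.0-py3-none-any.whl -d unpacked/
wheel unpack dist/shared-0.1.0-py3-none-any.whl -d unpacked/

# copy the shared package into the client wheel's tree
cp -r unpacked/shared-0.1.0/shared unpacked/client-0.1.0/

# edit unpacked/client-0.1.0/client-0.1.0.dist-info/METADATA by hand:
# drop `Requires-Dist: shared` and add shared's own Requires-Dist entries

# repackage the client wheel
wheel pack unpacked/client-0.1.0 -d dist/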

This feels to me like a pretty unique situation you're asking for, though - one that is outside any standard Python project workspace and very specific to distributing the packages from uv into a wheel.

You could probably define a separate package + pyproject.toml already that symlinks in the code from both packages and just have uv build that when needed.

I'd suggest opening a separate ticket for your feature request considering workspace support is already available and this ticket probably doesn't have a super clear scope anymore.

@valkenburg-prevue-ch commented Oct 23, 2024

Hi all, I think that thanks to --no-editable, uv works very nicely with Dockerfiles / Containerfiles. Based on all the responses here, I now use this Containerfile, which lives in a package and which I build from the workspace root:

FROM python:3.12.5-slim-bookworm AS python-builder

ARG PACKAGE_PATH="packages/my_package"

COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv

RUN apt-get update && apt-get install tini


COPY uv.lock pyproject.toml /app/
WORKDIR /app

COPY . .

RUN --mount=type=bind,source=uv.lock,target=uv.lock \
    --mount=type=bind,source=pyproject.toml,target=pyproject.toml \
    cd $PACKAGE_PATH && uv sync --frozen --no-editable --no-dev


FROM python:3.12.5-slim-bookworm AS runtime

RUN useradd --uid 1001 appuser

USER appuser

# Copy the venv that has all 3rd party and 1st party dependencies, ready for use
COPY --from=python-builder --chown=appuser:appuser  /app/.venv /app/.venv
COPY --from=python-builder /usr/bin/tini /usr/bin/
ENV PATH="/app/.venv/bin:$PATH"

ENTRYPOINT [ "/usr/bin/tini", "--", "/app/.venv/bin/python", "-m", "packages.my_package.some_script"]

The build command is run from the workspace root:

sudo docker buildx build -t my-package:latest -f ./packages/my_package/Containerfile .

Of course, the COPY . . line can become a cache-busting point, so I ended up with a funny-looking .dockerignore file (also in the workspace root):

# contents of .dockerignore
.venv
*/.venv
*/*/.venv
*/*/*/.venv
tests
*/tests
*/*/tests
*/*/*/tests
Containerfile
*/Containerfile
*/*/Containerfile
*/*/*/Containerfile
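
For what it's worth, .dockerignore supports ** globs (matching any number of directories, including zero), so an equivalent shorter version would be:

**/.venv
**/tests
**/Containerfile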

It looks to me like this correctly builds a .venv with only the necessary packages and with the workspace packages installed in the .venv.

Can someone tell me if I missed something here?

@charliermarsh (Member)

uv sync --all-packages exists in the next release.
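
That is:

# from the workspace root: installs every workspace member (and its deps) into the shared .venv
uv sync --all-packages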

@carderne (Author) commented Nov 4, 2024

Exciting stuff, thank you!

@ydennisy commented Nov 6, 2024

@charliermarsh that is great news - thank you!

Would there perhaps be any best-practice guides added to the documentation at the same time? I am not sure about everyone else on this thread, but although I can "get it to work", I am not clear on the correct way of handling a monorepo with services and packages and a Docker build.

Thanks!

@Tremeschin commented Nov 10, 2024

Should --all-packages be the default of uv sync, or could it be an opt-in config in the root pyproject's [tool.uv]?

I kind of expect all projects in the workspace members to be installed when syncing, as I've already explicitly listed them as subordinates (as in Rye's behavior, iirc). It could be a breaking change, so the latter option seems more reasonable.

Edit: it also allows for installing, but not listing, private packages matched by the members glob pattern 😄

Just a thought though, as I'll always use --all-packages flag from now on in local dev environment, it works great!

@eruvanos

Should --all-packages be the default of uv sync, or could it be an opt-in config in the root pyproject's [tool.uv]?

I kind of expect all projects in the workspace members to be installed when syncing, as I've already explicitly listed them as subordinates (as in Rye's behavior, iirc). It could be a breaking change, so the latter option seems more reasonable.

Just a thought though, as I'll always use --all-packages flag from now on in local dev environment, it works great!

I would assume that all packages are synced, as long as I am in the root dir 🤔

@Tremeschin

@eruvanos I'm always in the root dir, but they still didn't seem to be synced for me 🙂. I had to list all the [tool.uv.sources] entries in dev-dependencies for their deps and [project.scripts] to be installed in the root venv.

Maybe I was doing something wrong, but --all-packages installs everyone and everything properly now.

@mmerickel commented Nov 14, 2024

I was also surprised that --all-packages is not the default when initially setting up my workspace. At least in my case, where I have a workspace-only project with tool.uv.managed = false. However, after adding a couple of the top-level packages in the workspace as dependencies, it works OK by default. I think it'd make sense, however, to support a tool.uv.sync_all_packages = true that would affect the default behavior.
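
i.e. something like (hypothetical; not an actual uv setting):

[tool.uv]
sync_all_packages = true  # hypothetical: make plain `uv sync` behave like `uv sync --all-packages`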

@bdols commented Nov 20, 2024

Just wanted to report that I have publishing working on a monorepo where all projects are defined in a workspace, building with "uv build --all". I had some long-standing projects with a setup.py that imported setuptools_scm to determine the version dynamically, but when building with uv these were unable to find the .git in the project root, even with "root" set in the tool.setuptools_scm section. I opted to forgo setup.py entirely and use pyproject.toml for these.

@chrisfougner

I made https://github.com/JuanoD/uv-mono as an example repo. Feel free to correct me if something is wrong.

Just wanted to say thank you. This was hugely helpful in figuring out how to set up a "monorepo". It would be great to add something like this to the workspace documentation, since the existing documentation doesn't mention at all what the pyproject.toml files should look like for the workspace members, nor what the top-level pyproject.toml should look like if you don't want a top-level project.

@charliermarsh (Member)

I'm going to close this out as we now have --all-packages and similar settings. Folks are welcome to open separate issues for any follow-ups that I've missed here.
