Monorepo support #233
Hello!
You can add a custom metadata hook that would modify dependencies and store it at the root for use by all packages.
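As a rough illustration of the "modify dependencies" part, here is a minimal sketch of the logic such a hook could run. It assumes the hook receives the project metadata as a plain dict (as a Hatchling metadata hook's `update(metadata)` method does); `SHARED_DEPENDENCIES` and `add_shared_dependencies` are hypothetical names made up for this example, not part of Hatch.

```python
from typing import Any

# Hypothetical shared requirement list kept once at the repository root.
SHARED_DEPENDENCIES = ["pytest>=7"]


def add_shared_dependencies(metadata: dict[str, Any], shared: list[str]) -> dict[str, Any]:
    """Append shared requirements to metadata["dependencies"], skipping exact duplicates."""
    existing = list(metadata.get("dependencies", []))
    for requirement in shared:
        if requirement not in existing:
            existing.append(requirement)
    metadata["dependencies"] = existing
    return metadata


# Inside a custom metadata hook, the update method could simply delegate:
#     def update(self, metadata):
#         add_shared_dependencies(metadata, SHARED_DEPENDENCIES)
```

Each package would then point at the same root-level hook file, so the shared list lives in one place.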
You can switch to project mode (or
Hatch always has project awareness, so you can then use the
Hey, thanks for your quick response here. It seems I fail to grasp the project-concept fully, but what I learned now is: I can create multiple packages using hatch. Let's say I am in a root folder
this gives me the following folder structure:
then I configure the monorepo, like so:
afterwards I can run commands from the root-folder like so:
So far so good. But the requirement I am explicitly looking for (and I don't really know how to achieve yet) is the following: Sitting in the Root folder as cwd ( But now when running Just a background: This requirement comes from IDE usage, where opening the monorepo as a root folder will fail to discover/execute tests in the sub-projects unless everything is in one virtual environment. |
I have a similar use case to the one described here. I think it's worth describing, as a potential use case to reason about, even though the situation is less-than-ideal for a few reasons. There is this project which has a source tree with multiple packages (in this specific case the packages are rooted in a single namespace package, but it might be irrelevant). Something like this:
That's how this works currently (with custom scripts):
We want to adopt PEP 517, and Hatch is interesting because of its extensibility. But in its current shape I don't think that Hatch can be extended to discover projects in the tree. I'm also not sure that I can convince Hatch(ling) that:
Overall this pattern is not what I would recommend for a new project, but it still supports some real-world workflows. I'm not sure it would make sense for Hatch to support it out-of-the-box. Still, is it a legitimate use case for the plugin interface? |
Hi
Thanks to the great modularity of Hatch, I was able to come up with a custom builder called hatch-aws. You can use it to build AWS Lambdas. I plan to add a publish option in the future. I hope somebody else finds it useful too. |
I also am interested in this. More precisely, I'm looking for something like Cargo workspaces that let me:
I realize this may technically be possible with Hatch (based on #233 (comment)) but it seems like custom stuff and a good chunk of knowledge is required to make it work. It would be really nice to see a self contained writeup or plugin. |
Definite +1, it would be really nice to be able to do this in Python. As well as Cargo workspaces, other points of reference are https://docs.npmjs.com/cli/v7/using-npm/workspaces / https://classic.yarnpkg.com/lang/en/docs/workspaces/ (plus other tools in JS like lerna and turbo), which I imagine Rust took inspiration from. |
We are also looking for something similar in Airflow. I learned about Hatch and its future monorepo support from the Talk Python To Me podcast (https://talkpython.fm/episodes/show/408/hatch-a-modern-python-workflow). In Airflow we have a monorepo with effectively ~90 packages, and we would likely be glad to help in developing this and would definitely be the "test drive" for it. For many years we have run our own custom version of something that is discussed here:
We are eyeing apache/airflow#33909 as a way to organize our repo a bit better: rather than generating those packages dynamically on the fly, we want to restructure our repo to actually have all those packages as individual "standalone" packages in the repo, and not just sub-packages of the main "airflow" package that we then split out using our own tooling and scripts following our own conventions. That got us through the last 3 years while standard Python packaging was not yet capable of handling our case, but seeing what Hatch is aiming at, I am personally excited that maybe we will be able to get rid of some of the ugly parts of our tooling (which I mostly wrote and iterated on personally through the years). Can we help somehow? Is there some concerted effort to make this happen? We would love not only to "use" it but also maybe help in developing it (we have quite a number of talented engineers as maintainers and I am sure we can help). We have the Airflow Summit coming in September and some "resting" planned after that, plus some "cleanup" to be done in Airflow to make it possible, but maybe there is a chance we could form a task force to make it happen afterwards (I personally will be much more "free" to help starting in the 2nd half of October). |
The initial examples in this ticket seem to say that they're ok with hatch creating a separate virtualenv for each project. I'm interested in a workspace with all projects installed into the same virtualenv - this would be much cleaner for local development and is more similar to how a yarn/npm workspace works. Then I can pin the dependencies for the entire workspace, my IDE can use that virtualenv, pytest can import everything, etc. |
@mmerickel: But this would cause massive dependency bleeding, i.e. you would not be able to easily keep track of which dependencies are required by which subproject, right? In this case it's just a root-level pyproject.toml with a proper (pytest) configuration to let the tools know where to look. |
It's not bleeding to install the workspace dependencies into one virtualenv. It's necessary to let them work together, which is what you generally want in a workspace where you have a "project" created from a bunch of python packages that you want to develop together. It does make it problematic sometimes to remember to put the right dependencies on a package, but that's the nature of how python dependencies work. It's not worth losing the ability to install all projects in the workspace into a single virtualenv for all other purposes. At the very least I want the ability to define a virtualenv that includes "package A" and if it depends on "package-b" which is defined in the workspace, hatch should be able to find it and install it in editable mode instead of trying to find it on PyPI. That would enable me to define several virtualenvs with different combinations of packages from the workspace which is nice. |
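To make that last request concrete, here is a small sketch of the lookup being asked for: given a package's requirements and a map of packages that live in the workspace, decide which ones should come from a local path (editable) and which from the index. This is not how Hatch does it; `split_requirements`, the workspace map, and the paths are all hypothetical, and the name parsing is deliberately simplistic (it only reads the leading project name).

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as PEP 503 describes (case, '-', '_', '.')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def split_requirements(requirements, workspace):
    """Return (editable_paths, index_requirements).

    `workspace` maps project names to local source directories (hypothetical layout).
    """
    local = {normalize(name): path for name, path in workspace.items()}
    editable, remote = [], []
    for requirement in requirements:
        # Read only the leading project name; extras, specifiers and markers are ignored.
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
        name = normalize(match.group(1)) if match else ""
        if name in local:
            editable.append(local[name])  # found in the workspace: install from source
        else:
            remote.append(requirement)  # not in the workspace: leave it for the index
    return editable, remote
```

For example, with `{"package_b": "packages/package-b"}` as the workspace, the requirement `package-b>=1.0` resolves to the local directory while `requests` is left for PyPI.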
No, not really. It is exactly what @mmerickel explained. The whole idea is that you can easily come up with the "common" set of dependencies that are the result of merging those and running them together, while being able to have "editable" sources in all the installed packages. Generally you will have to figure out what is the "best" set of those dependencies (in case of Airflow from the example above #233 (comment) - we just combine all dependencies from all the packages that we have (airflow + 80+ provider packages) and let pip figure the best set of dependencies with Maybe a real-life example from Airflow would be helpful here: To deal with the problem of having effectively 80+ packages we had to implement some awful hacks. So far what we do is we have a very complex In order to allow local development and single For local development those provider packages are put in the "providers" "namespace" package and in our source repo we keep them all together in the same "source" structure as airflow. Our monorepo currently looks like this:
How it works for our local development: when we want to develop on any of the providers and airflow at the same time, we do:
So when we are installing it locally for development, we are effectively installing airflow + all provider sources + dependencies that we specify via extras. We also have to implement an "INSTALL_PROVIDERS_FROM_SOURCES" env variable hack to avoid the main package pulling some of the providers from This is all super-hacky and complex. For example, in order to build a provider package, we need to effectively copy the code of the provider to a new source tree, generate pyproject.toml there for this provider and build the package from there. We have it all automated and it has worked nicely for years, but I would love to convert all those providers to be regular packages (even if we keep them in the monorepo). We cannot do that (I believe):
That would not work, because a) pyproject.toml is declarative and we cannot do dynamic calculations of what is defined in dependent pyproject.toml (probably we could actually generate pyproject.toml with pre-commit so this is not a big issue But maybe I am wrong and this is entirely normal and supported? If I am right, then I believe we need something like this:
Then each project would have a completely separate subfolder and be a "regular" Python package that I could just install independently for editable work like this:
While to install airflow as "editable" I need to do this:
And what I am looking for is a "standard" way to say: Maybe:
Where I end up with a virtualenv containing airflow + all chosen subfolders in editable mode + all dependencies of both airflow and all of the selected subprojects installed. |
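Until there is a standard way to say this, the desired end state can be approximated by hand with plain pip, which accepts several `-e` options in one invocation. Below is a minimal sketch that only builds such a command; the helper name, the subfolder paths, and the extra name are hypothetical and not taken from the Airflow repo.

```python
import sys


def editable_install_command(root, subprojects, extras=()):
    """Build a pip command installing `root` and each subproject in editable mode."""
    # Extras are attached to the root project only, e.g. ".[devel]".
    root_spec = f"{root}[{','.join(extras)}]" if extras else root
    command = [sys.executable, "-m", "pip", "install", "-e", root_spec]
    for path in subprojects:
        # One "-e" per chosen subfolder; pip resolves all dependencies together.
        command += ["-e", path]
    return command


# Hypothetical usage (the paths are made up for illustration):
#     import subprocess
#     subprocess.run(editable_install_command(".", ["./providers/foo"]), check=True)
```

Because everything is passed to a single pip run, the resolver sees the requirements of the root project and of every selected subproject at once, which is the "one virtualenv with a common dependency set" behaviour described above.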
All of this feedback is incredibly useful and will directly assist me in creating the workspaces feature! |
Happy to help if you are open to it :). |
👋 I am new to Hatch and am learning about the features and possibilities with the tool. Since this is a conversation about Monorepos, I hope it is okay to share some of the work I am doing with a Monorepo-specific architecture called Polylith here. Previously, Polylith has been a Poetry feature only (built as a Poetry plugin). Yesterday I released a new tool called In short, it is about sharing code between projects (the artifacts to build and deploy) in a really simple way and with a "single project/single repo" developer experience. Just now, I recorded a quick intro to the tool, with a live demo and the Polylith Monorepo support the tooling is adding, using Hatch features that I have learned: https://youtu.be/K__3Uah3by0 |
@DavidVujic -> It does look promising, it's a bit difficult to wrap your head around bases/components (especially that the |
@potiuk I understand that it is a new term! It borrows ideas from LEGO: a base is just like a LEGO base plate, where you can add building blocks/bricks ( |
Is there any plan for when this workspace feature will be implemented? |
I have created a sandbox where I have tried out a monorepo approach to publish multiple packages from one git repo: |
I discovered hatch through Rye. And I'm new enough to both that I'm not quite sure what best practices are, but my understanding is that Rye is aiming to offer the monorepo support a lot of people want, while using hatch for the backend, is it not? This comment isn't me saying, "Just use Rye!" But perhaps it is inspiring or helpful to see how it's handling some things? |
I am working on workspaces currently, that is my top priority. This would cover monorepo support. |
Hej! I am so excited to see that you are developing monorepo support :) I just thought I would point out a use case that I am looking into using Hatch for: libraries I would like to have project_a and project_b both use the common library such that
Is this a use case that would work in your current vision of the monorepo support? What is your expected time frame for completing this feature? |
By the fall. And yes that will be supported (that is essentially the structure of this repository with Hatch/Hatchling, which will switch also). |
Thank you for the feedback, Ofek! |
I'm looking at the hatch project feature now. Can I define three top level hatch projects in a monorepo, and have projectA and projectB both depend on projectUtils? Thanks for a great tool! |
Hi @ofek, thanks for working on this feature! Do you perhaps have an updated ETA? |
Hello,
I appreciate the introduction of hatch and what it offers. But of course we are all looking for a build-tool which can do everything.
So I ask, since it is very common with microservice-based architectures: is the usage of monorepos somehow a concern in the design of Hatch?
I mean instead of a structure like this:
having something like this:
with the requirement of
pytest
and similar which are shared among all sub-projects and are automatically installed into each project's dev dependencies.