
Add PythonVirtualEnv helper and CMakePythonDeps generator #11601

Open
wants to merge 1 commit into base: develop

Conversation

samuel-emrys
Contributor

  • Add a PythonVirtualEnv helper to facilitate the creation and
    management of python virtual environments. This provides some much
    needed functionality to enable python dependencies to be managed from
    within conan, and makes it substantially easier to interrogate virtual
    environments.
  • Add a CMakePythonDeps generator to generate CMake targets for each
    executable entry point in a given python environment. This allows
    CMake to find and use executables installed within a python virtual
    environment.

Closes #8626.

This implementation is an exemplar of what python dependency management in conan might look like. It's not perfect, so I'm seeking feedback on the implementation and any improvements that would make it more suitable for inclusion in the conan codebase. This is based largely on the legwork done by @thorntonryan in #8626. I'll leave writing the documentation until this PR is in a more mature state.

To demonstrate the basic functionality of these features: they would allow for the creation of a python-virtualenv/system recipe:

import json
import os

from conans import ConanFile, tools
from conan.tools.python import PythonVirtualEnv

class PythonVirtualEnvironment(ConanFile):
    name = "python-virtualenv"
    version = "system"
    settings = "os_build", "arch_build", "os"
    options = {"requirements": "ANY"}
    default_options = {"requirements": "[]"}
    # python venvs are not relocatable, so we will not have binaries for this on artifactory. Just build it on first use
    build_policy = "missing"

    def package(self):
        # Create the virtualenv in the package method because virtualenv's are not relocatable.
        venv = PythonVirtualEnv(self)
        venv.create(folder=os.path.join(self.package_folder))

        requirements = json.loads(str(self.options.get_safe("requirements", "[]")))
        if requirements:
            self.run(tools.args_to_string([
                venv.pip, "install", *(requirement for requirement in requirements),
            ]))

        for requirement in requirements:
            package = requirement.split("==")[0]
            # Ensure that there's an entry point for everything we've just installed.
            venv.setup_entry_points(
                str(package), os.path.join(self.package_folder, "bin")
            )

    def package_info(self):
        self.user_info.python_requirements = self.options.get_safe("requirements", "[]")
        self.user_info.python_envdir = self.package_folder

As you can see, this uses self.user_info.{python_requirements,python_envdir} variables to communicate these dependencies to the CMakePythonDeps generator. If this is integrated into conan, I assume we'd want to use something other than user_info to store this metadata?

I've also opted to allow the specification of the requirements that should be installed via a JSON string as an option to the package. This was probably the best I could hope for with a custom python_requires package and custom generator, but perhaps this should be revisited for inclusion in the conan codebase - would it be worth having a more native way of specifying python requirements? With the JSON string, it strikes me that the same set of dependencies would generate a different package ID if they were specified in a different order.
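
One way a consumer could avoid the ordering problem is to canonicalize the list before serializing it, so that the same set of requirements always yields the same option string and therefore the same package ID. A minimal sketch (the requirement pins are just examples):

requirements = [
    "sphinx==5.0.1",
    "sphinx-book-theme==0.3.2",
]
# sorted() gives a canonical ordering, so any permutation of the same set
# serializes identically and the package ID stays stable
self.options["python-virtualenv"].requirements = json.dumps(sorted(requirements))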

But this would allow a downstream C++ project to specify its own python requirements and generate CMake targets appropriately. An example of the relevant parts of this downstream project conanfile might look like:

import json

from conans import ConanFile
from conan.tools.cmake import CMake
from conan.tools.python import CMakePythonDeps  # import path assumed, matching the PythonVirtualEnv module above

class MyPkg(ConanFile):
    name = "mpkg"
    version = "0.1.0"

    exports_sources = "CMakeLists.txt", "src/*", "include/*", "requirements.txt"

    def requirements(self):
        # Could also be a tool_requires?
        self.requires("python-virtualenv/system")
        self.options["python-virtualenv"].requirements = json.dumps([
            "sphinx==5.0.1",
            "sphinx-book-theme==0.3.2",
        ])
        # Or read from a requirements.txt in the project root
        # with pathlib.Path("requirements.txt").open() as requirements_txt:
        #    self.options["python-virtualenv"].requirements = json.dumps([
        #        str(requirement) for requirement in pkg_resources.parse_requirements(requirements_txt)
        #    ])

    def generate(self):
        py = CMakePythonDeps(self)
        py.generate()

    def build(self):
        cmake = CMake(self)
        cmake.configure()
        cmake.build()

This would allow the following syntax in the relevant CMakeLists.txt (probably for documentation):

find_package(sphinx REQUIRED)

# Sphinx configuration
set(SPHINX_SOURCE ${CMAKE_CURRENT_SOURCE_DIR})
set(SPHINX_BUILD ${CMAKE_CURRENT_BINARY_DIR}/sphinx)
set(SPHINX_INDEX_FILE ${SPHINX_BUILD}/index.html)

# Only regenerate Sphinx when:
# - Doxygen has rerun
# - Our doc files have been updated
# - The Sphinx config has been updated
add_custom_command(
  OUTPUT ${SPHINX_INDEX_FILE}
  DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/index.rst
  COMMAND sphinx::sphinx-build -b html ${SPHINX_SOURCE} ${SPHINX_BUILD}
  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
  COMMENT "Generating documentation with Sphinx")

add_custom_target(Sphinx ALL DEPENDS ${SPHINX_INDEX_FILE})

# Add an install target to install the docs
include(GNUInstallDirs)
install(DIRECTORY ${SPHINX_BUILD}/ DESTINATION ${CMAKE_INSTALL_DOCDIR})

I've got working examples of both of these packages available for further exploration.

Any feedback or suggestions for improvements to the way this is being approached would be appreciated.

Changelog: Feature: Add PythonVirtualEnv helper class to create and manage python virtual environments, and CMakePythonDeps generator to facilitate the generation of CMake targets for executables of the packages installed within these virtual environments.

Docs: https://github.com/conan-io/docs/pull/XXXX

  • Refer to the issue that supports this Pull Request.
  • If the issue has missing info, explain the purpose/use case/pain/need that covers this Pull Request.
  • I've read the Contributing guide.
  • I've followed the PEP8 style guides for Python code.
  • I've opened another PR in the Conan docs repo to the develop branch, documenting this one.

Note: By default this PR will skip the slower tests and will use a limited set of python versions. Check here how to increase the testing level by writing some tags in the current PR body text.

@samuel-emrys samuel-emrys marked this pull request as draft July 10, 2022 14:26
@samuel-emrys samuel-emrys marked this pull request as ready for review July 10, 2022 14:28
@memsharded memsharded added this to the 2.0.0-beta3 milestone Jul 12, 2022
@thorntonryan
Contributor

FYI @puetzk

@thorntonryan
Contributor

thorntonryan commented Jul 14, 2022

@samuel-emrys, Very cool!

I'm still working my way through this refactoring and trying to understand how it compares to our original implementation. I'll need to do some diffing to see what (if anything) changed in the internals.

Looks like mostly whitespace / formatting / linting? changes to pyvenv.py

It may not have been apparent, but one thing our approach would let us do is capture and package user-defined pip-installable things. For example, we have one such python script that processes an innosetup template file using gitignore-style pattern matching and conan's import syntax to generate a supporting file. We could use our helper to turn a foo.py into a foo.exe located inside the Conan cache.

For example, something like this:

from conans import ConanFile, tools
import shutil
import os

class FooConan(ConanFile):
    name = "python-foo"
    version = "1.0.1"
    settings = { "os_build": ['Windows'] }
    build_requires = [
        'pyvenv/0.3.1'
    ]
    exports = [ 'setup.py', 'foo.py']

    # python venvs are not relocatable, so we will not have binaries for this on artifactory. Just build it on first use
    build_policy = "missing"

    def package(self):
        from pyvenv import venv
        venv = venv(self)
        venv.create(folder=self.package_folder)

        self.run('"{python}" -mpip install "{build}"'.format(python=venv.python, build=self.build_folder))

        scripts_folder = os.path.join(self.package_folder, "Scripts")
        bin_folder = os.path.join(self.package_folder, "bin")

        venv.setup_entry_points("foo", bin_folder)

Does your approach retain that capability?

I think the answer is yes: swapping out requiring pyvenv/0.3.1 for importing PythonVirtualEnv and adjusting names accordingly.

@samuel-emrys
Contributor Author

samuel-emrys commented Jul 14, 2022

Looks like mostly whitespace / formatting / linting? changes to pyvenv.py

There aren't a huge number of changes to pyvenv.py. The biggest functional change was the addition of the env_folder parameter to the PythonVirtualEnv constructor to allow it to attach to an existing virtualenv, rather than solely creating one. This was necessary to enumerate the existing entry points from within the CMakePythonDeps generator.
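
For context, the attach path looks roughly like this from the generator's side (a sketch; python_envdir is the user_info variable set by the recipe above, and conanfile is the consuming recipe):

# attach to the already-created virtualenv instead of creating a new one
envdir = str(conanfile.deps_user_info["python-virtualenv"].python_envdir)
venv = PythonVirtualEnv(conanfile, env_folder=envdir)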

It may not have been apparent, but one thing our approach would let us do is capture and package user-defined pip-installable things. For example, we have one such python script that processes an innosetup template file using gitignore-style pattern matching and conan's import syntax to generate a supporting file. We could use our helper to turn a foo.py into a foo.exe located inside the Conan cache.

Yep - it retains all of the functionality that you provided with conan-pyvenv, so if all it really needs is the ability to hook into the virtualenv's python interpreter/pip, and the ability to setup the entry points then this will be compatible. Thanks for contributing the work :)

@thorntonryan
Contributor

thorntonryan commented Jul 14, 2022

If I'm following, one conceptual difference is that your approach would allow us to continually install new dependencies into a shared system-wide venv. Is that correct?

That's certainly an interesting application I'm not sure we considered. Great idea!

But I'm not sure I'm following how consumers downstream of python-virtualenv/system get their dependencies installed.

The naming python-virtualenv/system suggests a single common / shared venv.

But the only installs happen here in the packaging step:

class PythonVirtualEnvironment(ConanFile):
    name = "python-virtualenv"
    version = "system"
    settings = "os_build", "arch_build", "os"
    options = {"requirements": "ANY"}
    default_options = {"requirements": "[]"}
    # python venvs are not relocatable, so we will not have binaries for this on artifactory. Just build it on first use
    build_policy = "missing"

!    def package(self):
        # Create the virtualenv in the package method because virtualenv's are not relocatable.
        venv = PythonVirtualEnv(self)
        venv.create(folder=os.path.join(self.package_folder))

        requirements = json.loads(str(self.options.get_safe("requirements", "[]")))
        if requirements:
!            self.run(tools.args_to_string([
!                venv.pip, "install", *(requirement for requirement in requirements),
            ]))

        for requirement in requirements:
            package = requirement.split("==")[0]
            # Ensure that there's an entry point for everything we've just installed.
            venv.setup_entry_points(
                str(package), os.path.join(self.package_folder, "bin")
            )

Does this mean that by setting different options.requirements on python-virtualenv:

self.options["python-virtualenv"].requirements = json.dumps([
    "sphinx==5.0.1",
    "sphinx-book-theme==0.3.2",
])

That we actually get a new/unique package binary for the python-virtualenv/system recipe?

@thorntonryan
Contributor

As you can see, this uses self.user_info.{python_requirements,python_envdir} variables to communicate these dependencies to the CMakePythonDeps generator. If this is integrated into conan, I assume we'd want to use something other than user_info to store this metadata?

I'd agree that I think this is probably the biggest workflow question:

  • How much information gets communicated to the CMakePythonDeps generator?
  • How exactly do these details get communicated to the CMakePythonDeps generator?
    • python_requirements or other?

@thorntonryan
Contributor

There aren't a huge amount of changes to pyvenv.py. The biggest functional change was the addition of the env_folder parameter to the PythonVirtualEnv constructor to allow it to attach to an existing virtualenv, rather than solely creating it. This was necessary to enumerate the existing entry points from within the CMakePythonDeps generator.

Oh, cool. I think I'm liking what you've done.

Thinking out loud. I wonder if there would be benefit of introducing a new package_type to describe these types of python packages.

Then the CMakePythonDeps generator could iterate over the package requires, tool requires, test requires, etc. looking for package_type=venv requirements. Locate the venv's package folder. And ask the venv what's installed in order to list the packages / entry points installed there. Or if we'd like to support filtering, maybe the package needs an "entry_points" property.

Maybe something like:

for package_type, package_name, dep_cpp_info in self.deps_build_info.dependencies:
    if package_type == "venv":
        python_envdir = dep_cpp_info.rootpath
        ...

This could be an additional/alternative way to smuggle in the python envdir location.

@samuel-emrys
Contributor Author

Does this mean that by setting different options.requirements on python-virtualenv:

self.options["python-virtualenv"].requirements = json.dumps([
    "sphinx==5.0.1",
    "sphinx-book-theme==0.3.2",
])

That we actually get a new/unique package binary for the python-virtualenv/system recipe?

Yes, the uniqueness of the options.requirements string will influence the package_id calculation of the recipe, generating a new virtual environment for a new set of packages. The one caveat to this is that it's contingent on the uniqueness of the string. It's obviously possible to provide the same list of packages in a different order - this would also lead to the creation of a new virtual environment. This is less than ideal, but as mentioned here, it's avoidable as long as the user takes some precautions - i.e., autogenerating their requirements list (with a tool like pip-compile) and making sure that they are always listed alphabetically. To provide a quick illustration of this because I know it's hard to visualise without examples:

requirements.in:

sphinx
sphinx_rtd_theme

then execute:

$ pip install pip-tools
$ pip-compile requirements.in

yields the following requirements.txt:

#
# This file is autogenerated by pip-compile with python 3.10
# To update, run:
#
#    pip-compile requirements.in
#
alabaster==0.7.12
    # via sphinx
babel==2.10.3
    # via sphinx
certifi==2022.6.15
    # via requests
charset-normalizer==2.1.0
    # via requests
docutils==0.17.1
    # via
    #   sphinx
    #   sphinx-rtd-theme
idna==3.3
    # via requests
imagesize==1.4.1
    # via sphinx
jinja2==3.1.2
    # via sphinx
markupsafe==2.1.1
    # via jinja2
packaging==21.3
    # via sphinx
pygments==2.12.0
    # via sphinx
pyparsing==3.0.9
    # via packaging
pytz==2022.1
    # via babel
requests==2.28.1
    # via sphinx
snowballstemmer==2.2.0
    # via sphinx
sphinx==5.0.2
    # via
    #   -r requirements.in
    #   sphinx-rtd-theme
sphinx-rtd-theme==1.0.0
    # via -r requirements.in
sphinxcontrib-applehelp==1.0.2
    # via sphinx
sphinxcontrib-devhelp==1.0.2
    # via sphinx
sphinxcontrib-htmlhelp==2.0.0
    # via sphinx
sphinxcontrib-jsmath==1.0.1
    # via sphinx
sphinxcontrib-qthelp==1.0.3
    # via sphinx
sphinxcontrib-serializinghtml==1.1.5
    # via sphinx
urllib3==1.26.10
    # via requests

I'm proposing that it's good practice to read from this autogenerated file.

The impact of this is that, effectively, each project (on a users system) gets its own single python virtual environment to facilitate builds. It's not strictly constrained to a single project - if another project also uses exactly the same dependency tree, then it could be re-used, but I would imagine that that's relatively unlikely to be the case.

I think it might be worth enumerating the reason I went for this level of abstraction as well, just to detail some of my thoughts on the technical/architectural challenges in invoking a different package manager from conan. I'll follow this up with another reply when I get some time.

How much information gets communicated to the CMakePythonDeps generator?

  • Do we generate targets for the requirements.in (https://github.com/samuel-emrys/sphinx-consumer/blob/develop/requirements.in) or the full requirements.txt (https://github.com/samuel-emrys/sphinx-consumer/blob/develop/requirements.txt) of every dependency that was installed?

Currently, a target is generated for every console_script entry point - that does mean you get a few that you might not have been concerned with, but my thought was that it's better to have the option than not be able to use it. For example, in the sphinx-consumer recipe, it generates the following:

$ tree build/generators 
build/generators
├── babel-config.cmake
├── charset-normalizer-config.cmake
├── cmakedeps_macros.cmake
├── CMakePresets.json
├── conan_toolchain.cmake
├── pygments-config.cmake
├── python-virtualenv-config.cmake
├── python-virtualenv-config-version.cmake
├── python-virtualenv-release-x86_64-data.cmake
├── python-virtualenv-Target-release.cmake
├── python-virtualenvTargets.cmake
└── sphinx-config.cmake

So you can see, it generates a -config.cmake file for both babel and pygments, both of which were upstream dependencies of sphinx. Of course, a filter is an additional feature to this and there's no reason it couldn't be added if there's utility in it.
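
For reference, this enumeration can be done with the standard library, which is presumably all a filter would need to hook into. A sketch (it must run under the venv's interpreter so the venv's packages are visible; importlib.metadata requires Python 3.8+):

from importlib.metadata import distribution

# every console script declared by the "sphinx" distribution,
# e.g. sphinx-build -> sphinx.cmd.build:main
scripts = [ep for ep in distribution("sphinx").entry_points
           if ep.group == "console_scripts"]
for ep in scripts:
    print(ep.name, ep.value)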

@thorntonryan
Contributor

thorntonryan commented Jul 15, 2022

So you can see, it generates a -config.cmake file for both babel and pygments, both of which were upstream dependencies of sphinx. Of course, a filter is an additional feature to this and there's no reason it couldn't be added if there's utility in it.

That's maybe good to highlight.

Supposing we had two different Conan venv packages, A and B, each of which happens to have a shared pip dependency on babel.

If a consumer used CMakePythonDeps, which package would babel-config.cmake point to?

I wonder if those generated configs should be disambiguated somehow.

@thorntonryan
Contributor

The impact of this is that, effectively, each project (on a users system) gets its own single python virtual environment to facilitate builds.

I can see that operating principle at play in the design. So that makes sense.

As an alternative approach, my team has tended towards creating very narrowly defined venv packages that do one thing and one thing only. E.g. a venv for sphinx. A venv for an rss feed generator. A venv for an innosetup snippet generator, etc.

We don't typically have one venv to rule them all, which means each project (and our projects often share similar patterns) can leverage the shared utilities.

But I do think it means our pattern could struggle a little unless we had more ways to control which entry points / find package files are created for each venv package.

It's not listed in the public-facing conan-pyvenv history, but we actually started out with a model where each app defined its own project-based venv, and that venv was installed into the project's build folder; that way it had a venv capable of running whatever python script the project needed. But this was later modified to prefer turning each python script into a sharable venv-based package that could be re-used more readily between projects.

@thorntonryan
Contributor

thorntonryan commented Jul 15, 2022

Thinking out loud, I wonder if instead of producing a per-requirement find package file, we could produce a per-package file that then provided all the entry point targets associated with that venv?

I think what this would mean, is that instead of producing sphinx-config.cmake and babel-config.cmake, you'd instead produce a single python-virtualenv-config.cmake, which defined those targets, letting the CMake code look more like:

-find_package(sphinx REQUIRED)
+find_package(python-virtualenv REQUIRED)

add_custom_command(
  OUTPUT ${SPHINX_INDEX_FILE}
  DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/index.rst
-  COMMAND sphinx::sphinx-build -b html ${SPHINX_SOURCE} ${SPHINX_BUILD}
+  COMMAND python-virtualenv::sphinx-build -b html ${SPHINX_SOURCE} ${SPHINX_BUILD}
  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
  COMMENT "Generating documentation with Sphinx")

I think that might give us a way to disambiguate between the same entry points being defined in multiple venv packages.

Also thinking out loud, but I wonder if that means we could then just take advantage of the existing CMakeDeps machinery, without introducing a new generator?

The python-virtualenv package would have to describe these files using the cmake_build_modules property of the current CMakeDeps generator.

But those generated cmake files (leveraging the same code that's generating the current sphinx-config.cmake under proposal), would then automagically be included when calling find_package(python-virtualenv) to get your python targets into CMake.

I also think this approach would move the generation of content found within sphinx-config.cmake into package creation time ... which also means we wouldn't then have to smuggle those details into a custom generator via self.user_info.python_requirements and self.user_info.python_envdir ... but instead, you'd use your helpers to generate a sidecar python-entry-points.cmake or something that's then included via cmake_build_modules.
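
For reference, wiring in such a sidecar file would look roughly like this in the venv package's package_info() (a sketch; python-entry-points.cmake is the hypothetical generated file, and cmake_build_modules is the property CMakeDeps already honours):

import os

def package_info(self):
    module = os.path.join("lib", "cmake", "python-entry-points.cmake")
    # CMakeDeps includes declared build modules automatically when the
    # consumer calls find_package(python-virtualenv)
    self.cpp_info.set_property("cmake_build_modules", [module])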

@thorntonryan
Contributor

thorntonryan commented Jul 15, 2022

To be fair, I don't wish to let my team's use case, or any of my proposals take priority over your approach. Your efforts to land this work should be commended and I certainly don't want to detract at all from the great work you've done or hold it up further :)

Just trying to offer some alternative use cases now that a concrete proposal has been formalized.

If my concerns / use case need to take a back seat and come in later, that's fine too.

Thanks to your hard work and efforts, I'm excited to see this proposal gaining more traction :)

Thank you @samuel-emrys! And keep up the good work!

@thorntonryan
Contributor

Scratch my suggestions, @puetzk just spelled out another obvious approach we could take.

For my use case, I think we can just decide to keep doing what we're doing. And use setup_entry_points and put it in the package's bin folder.

Conan adds the bindirs from tool_requires to PATH, so we can just use find_program instead of find_package.

That's good enough to handle our usage and is probably the direction we'll take; including PythonVirtualEnv in Conan lets us achieve that.

Sorry for thinking out loud, feel free to hide any/all of the above comments as "off topic" if you think appropriate :)

@samuel-emrys
Contributor Author

To be fair, I don't wish to let my team's use case, or any of my proposals take priority over your approach. Your efforts to land this work should be commended and I certainly don't want to detract at all from the great work you've done or hold it up further :)

Just trying to offer some alternative use cases now that a concrete proposal has been formalized.

No stress at all - exploring practical use cases is the only way we're going to find the best general solution :)

Supposing we had two different Conan venv packages, A and B, each of which happens to have a shared pip dependency on babel.

If a consumer used CMakePythonDeps, which package would babel-config.cmake point to?

It's a good question, and one that I don't immediately have an answer to. I guess a follow up question is, how deep do we expect the dependency tree of python packages to be? How much of this is within the scope of what conan should care about? Where do we draw that line? Do we want to turn conan into a native python package manager? Would these packages be stored on CCI? It could certainly achieve it, but I suppose I'm not sure if it's worth the duplication of effort.

My intuition is that for C/C++ projects, python packages will be utilised almost exclusively during the build phase, and rarely shipped with python code. Would like to seek wider feedback on this though.

With the recipe that I've proposed, with one venv per project, and specifying it as a tool_requires, I don't think transitive dependency conflicts will be an issue. Since tool_requires aren't propagated to downstream packages, you wouldn't see the conflict that you're talking about. Naturally, this is the approach that the python community and ecosystem actually propose - one virtualenv per project, to manage all of the project's dependencies.

As an alternative approach, my team has tended towards creating very narrowly defined venv packages that do one thing and one thing only. E.g. a venv for sphinx. A venv for an rss feed generator. A venv for an innosetup snippet generator, etc.

Yes, I had seen that you had approached it this way. This is still possible with the proposed approach, but the concern that I have here is that whilst in theory it sounds good to aim for "one package per recipe", in actuality this isn't what's happening. To do this properly, each python package should have its own native conan recipe (I'm using the term native here to disambiguate from one installed using pip, into a virtualenv). Without going to this level of effort, you really don't get the behaviour you would expect. To illustrate this, let's suppose you do have a sphinx recipe that installs into a PythonVirtualEnv. This might look like:

import os

from conans import ConanFile, tools
from conan.tools.python import PythonVirtualEnv

class SphinxConan(ConanFile):
    name = "sphinx"
    # python venvs are not relocatable, so we will not have binaries for this on artifactory. Just build it on first use
    build_policy = "missing"
    requirements = []

    def package(self):
        # Create the virtualenv in the package method because virtualenv's are not relocatable.
        venv = PythonVirtualEnv(self)
        venv.create(folder=os.path.join(self.package_folder))

        self.requirements = [
            f"sphinx=={self.version}"
        ]
        self.run(tools.args_to_string([
            venv.pip, "install", *(requirement for requirement in self.requirements),
        ]))
        for requirement in self.requirements:
            package = requirement.split("==")[0]
            # Ensure that there's an entry point for everything we've just installed.
            venv.setup_entry_points(
                str(package), os.path.join(self.package_folder, "bin")
            )

    def package_info(self):
        self.user_info.python_requirements = self.requirements
        self.user_info.python_envdir = self.package_folder

However, you get more installed in the PythonVirtualEnv than just sphinx. Just by specifying sphinx, you also get:

alabaster==0.7.12
babel==2.10.3
certifi==2022.6.15
charset-normalizer==2.1.0
docutils==0.18.1
idna==3.3
imagesize==1.4.1
jinja2==3.1.2
markupsafe==2.1.1
packaging==21.3
pygments==2.12.0
pyparsing==3.0.9
pytz==2022.1
requests==2.28.1
snowballstemmer==2.2.0
sphinx==5.0.2
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.0
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
urllib3==1.26.10

Now, let's take this one step further - what do you do if you need another extension to use with sphinx? A different theme, for example? With the approach you're proposing, the rule would be to create a new package for sphinx_rtd_theme, for example. However, how do you make these accessible to each other? With different python interpreters, the execution of sphinx-build would have no visibility of your sphinx_rtd_theme package present in another virtualenv, in another area of the conan cache. Naturally, you might suggest that we can just add it to the requirements of the sphinx recipe:

        self.requirements = [
            f"sphinx=={self.version}",
            "sphinx_rtd_theme"
        ]

And this would work, but now we've broken the encapsulation that we were striving for with this approach, and you end up with some miniature version of my proposal to do this at the project level, where you're picking and choosing which packages to install with which others. Having them all in one virtualenv allows them all to interact with each other as required, and seems to be the status quo for project package management in the python community. I agree that one-recipe-per-package would be better philosophically, but they would have to be natively managed by conan, and I'm not sure the benefits outweigh the cost in pursuing that approach.

Thinking out loud, I wonder if instead of producing a per-requirement find package file, we could produce a per-package file that then provided all the entry point targets associated with that venv?

I think what this would mean, is that instead of producing sphinx-config.cmake and babel-config.cmake, you'd instead produce a single python-virtualenv-config.cmake, which defined those targets, letting the CMake code look more like:

-find_package(sphinx REQUIRED)
+find_package(python-virtualenv REQUIRED)

add_custom_command(
  OUTPUT ${SPHINX_INDEX_FILE}
  DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/index.rst
-  COMMAND sphinx::sphinx-build -b html ${SPHINX_SOURCE} ${SPHINX_BUILD}
+  COMMAND python-virtualenv::sphinx-build -b html ${SPHINX_SOURCE} ${SPHINX_BUILD}
  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
  COMMENT "Generating documentation with Sphinx")

I think that might give us a way to disambiguate between the same entry points being defined in multiple venv packages.

Yes, I had seen that this was the approach you took in your generator as well. My thought was that having python package targets rather than a conan package target made more sense from a philosophical perspective with what the CMakeToolchain and CMakeDeps generators are trying to do with conan 2.0. That is, maintain as much separation between conan and cmake as possible. I know there's been an effort to ensure that targets produced by recipes are consistent with their upstream dependencies, and whilst this isn't exactly a 1:1 mapping, a target named pyvenv::sphinx-build is far more coupled with the conan recipe than sphinx::sphinx-build is. I don't know if there are any approaches to generate cmake targets for python executables in other package managers, but I would imagine/hope that they would opt for something more related to the python package than their specific package manager. This seems to be more consistent with the conan 2.0 vision, but it certainly could just be a personal preference. Keen to seek wider feedback here as well.

The python-virtualenv package would have to describe these files using the cmake_build_modules property of the current CMakeDeps generator.

But those generated cmake files (leveraging the same code that's generating the current sphinx-config.cmake under proposal), would then automagically be included when calling find_package(python-virtualenv) to get your python targets into CMake.

I also think this approach would move the generation of content found within sphinx-config.cmake into package creation time ... which also means we wouldn't then have to smuggle those details into a custom generator via self.user_info.python_requirements and self.user_info.python_envdir ... but instead, you'd use your helpers to generate a sidecar python-entry-points.cmake or something that's then included via cmake_build_modules.

This was the approach that I explored initially, but it occurred to me that if each recipe is creating its own target files, it's essentially doing the job of a generator, and violates the DRY principle to some degree, as many recipes would have largely the same logic to generate their target files, so I tried to move the generation logic over to an actual generator.

However, I agree that using self.user_info to propagate the information is less than ideal when included in the conan codebase (though it probably is ideal in the case of a custom generator). I like your idea of using a different package_type to help identify these kinds of recipes - perhaps that means that the CMakeDeps generator could interpret the requires(), tool_requires() and test_requires() specifications of these packages differently? If that's the case, we might be able to do something like the following to specify the PythonVirtualEnv requirements:

def build_requirements(self):
    self.tool_requires("sphinx/4.4.0")
    self.tool_requires("sphinx_rtd_theme/1.0.0")

The idea here being that these could be parsed into something that looks like the following in the CMakeDeps generator and the python-virtualenv recipe:

tool_requirements = [
    "sphinx==4.4.0",
    "sphinx_rtd_theme==1.0.0",
]

The challenge with this is, how does a python-virtualenv (or equivalent) recipe retrieve this information? Is there a way to make the virtualenv creation part a conan utility rather than a recipe? I'm not sure I can see a good way to do this, but asking the question for brighter minds than mine to ponder. Another challenge here is that these {tool_,test_}requires() statements assume that each of these is its own recipe, which would not be the case here.

Another alternative might be to adopt a pip_requires() statement, and/or to add a new *_info variable, such as self.pip_info.{requirements,python_env}. The latter would be the closest to the current implementation. Aside from the pip_info variable, what I can't see my way through is how these solutions might pass these requirements on to a python-virtualenv type recipe, which makes me think that perhaps it belongs as a feature of the conan codebase (though, again, I'm not sure how this would work either, because it would still need to function somewhat like a recipe).

Anyway, this was a bit of a brain dump. I've probably been unclear about parts, so ask away if there's anything that's unintelligible.

@thorntonryan
Contributor

And this would work, but now we've broken the encapsulation that we were striving for with this approach, and you end up with some miniature version of my proposal to do this at the project level, where you're picking and choosing which packages to install with which others.

I should've put one recipe per package in "air quotes", because you're right. Things like supporting packages (e.g. rtd theme) get thrown into our "sphinx" package. Yet the user has the ability to pick/choose which entry points are exposed, thus limiting which supporting packages are installed as findable programs via CMake.

In our sphinx example, the only entry points that get created are:

sphinx-apidoc.exe
sphinx-build.exe
sphinx-autogen.exe
sphinx-quickstart.exe

Which, from the end user's point of view, are simply "sphinx".

So the encapsulation we were striving for was not quite the full 1-1 conan package to pip package encapsulation, but instead to treat them more as "logical" packages, each of which attempts to solve a single goal.

Another example we have is a helper python script that creates a snippet for a WinSparkle appcast feed.

For this package, we require:

install_requires=['cryptography==3.4.7','jinja2==3.0.1','pytz==2021.1','pefile==2021.5.24','semver==2.13.0'],

But in the end, the only thing that's actually installed to the bin folder is:

create_appcast_item.exe
create_appcast_item-script.py

So here, we don't actually want/need any of the entry points from supporting packages on PATH; we just want our script to be called from the right venv launcher.
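
For context, an entry point like that is declared through setuptools' console_scripts mechanism, and pip generates the .exe launcher from it. A sketch with illustrative names (not the team's actual code):

from setuptools import setup

setup(
    name="python-foo-appcast",
    py_modules=["create_appcast_item"],
    install_requires=["cryptography==3.4.7", "jinja2==3.0.1"],
    entry_points={
        "console_scripts": [
            # pip generates create_appcast_item.exe wrapping this function
            "create_appcast_item=create_appcast_item:main",
        ],
    },
)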

@thorntonryan
Contributor

thorntonryan commented Jul 18, 2022

Perhaps one suggestion then, supposing I had such a script, and I've defined my pip requirements appropriately, should the CMakePythonDeps generator also include a path to the supporting venv?

Assuming I haven't turned my script into a pip installable package (somehow), this would let me call my script from the "right" python environment, which has the appropriate libraries/packages installed.

IIRC, when we were taking a similar approach and focused primarily on generating project-wide venvs, we avoided putting it in "bin"; that way you wouldn't have two pythons in PATH.

@samuel-emrys
Contributor Author

Perhaps one suggestion then, supposing I had such a script, and I've defined my pip requirements appropriately, should the CMakePythonDeps generator also include a path to the supporting venv?

Assuming I haven't turned my script into a pip installable package (somehow), this would let me call my script from the "right" python environment, which has the appropriate libraries/packages installed.

Just a quick reply while I have a minute - the CMakeDeps generator already supports this. It will create a python-virtualenv_PACKAGE_FOLDER_RELEASE variable, which is the same as the directory the virtual environment is installed to. Hm, actually perhaps this isn't what you're after. Could you propose an interface that would satisfy your requirement here? My understanding is that you're trying to deconflict perhaps two sphinx installations, which were installed by different packages in the upstream dependency tree?

@thorntonryan
Contributor

Just a quick reply while I have a minute - the CMakeDeps generator already supports this. It will create a python-virtualenv_PACKAGE_FOLDER_RELEASE variable, which is the same as the directory the virtual environment is installed to.

Ah, perfect! An oversight on my part.

I think that would cover the use case. That would let consuming CMakeLists.txt set up custom commands that called into that python environment, e.g. "${virtualenv_PACKAGE_FOLDER_RELEASE}/Scripts/python.exe myscript.py".

@thorntonryan
Contributor

As you can see, this uses self.user_info.{python_requirements,python_envdir} variables to communicate these dependencies to the CMakePythonDeps generator. If this is integrated into conan, I assume we'd want to use something other than user_info to store this metadata?

Another thought. Conan already has a conandata.yml and conan.tools.files.update_conandata().

The examples seem to show them being used to store requirement details.

If generators had access to conanfile.conan_data, then you wouldn't have to introduce your own JSON serialization of requirements. You could just add an entry in the YML file and iterate over them:

for req in self.conan_data["python_requirements"]:
    ...
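
For illustration, the recipe side could populate that entry with update_conandata(), typically from export() (a sketch; the python_requirements key is assumed, not an existing convention):

from conan.tools.files import update_conandata

def export(self):
    update_conandata(self, {"python_requirements": [
        "sphinx==5.0.1",
        "sphinx-book-theme==0.3.2",
    ]})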

I still think first class support is likely a better overall design. But I had just happened across conandata.yml and figured I'd share the thought.

@jellespijker
Contributor

jellespijker commented Aug 23, 2022

Nice work 👍

We also created a generator for our specific use case: https://github.com/Ultimaker/conan-config/blob/master/generators/VirtualPythonEnv.py

A bit less robust and more tailored to our needs. I might look into switching to this generator if/when it's merged.

We use it in our Cura repository https://github.com/Ultimaker/Cura/blob/main/conanfile.py

And our Uranium repository https://github.com/Ultimaker/Uranium/blob/main/conanfile.py

@lasote lasote self-assigned this Sep 5, 2022
@lasote
Contributor

lasote commented Sep 5, 2022

Hello! We have decided that this deserves a closer look. I will start by trying to understand and test this PR. We would like to take only the PythonVirtualEnv part initially, so we will need to split this PR, but I will give more feedback on the path we want to follow once I take a look.
Thanks!

@lasote lasote modified the milestones: 2.0.0-beta3, 2.0.0-beta4 Sep 7, 2022
@lasote
Contributor

lasote commented Sep 7, 2022

Hi, we decided to start with this for Beta 4. We will start by opening a simplified proposal for a PythonVirtualEnv and then iterating on it with you.

About the CMakePythonDeps, we think that maybe what is really missing is modeling the executables that a package has, no matter if coming from a Python virtual env package or a tool require, so probably that should be solved by adding a new property exes to the self.cpp_info and making the generators use that information, in this case, the CMakeDeps could create the targets for them. I'll open a new issue for this.
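
In recipe terms, that would presumably end up looking something like this (a sketch; the exes property is hypothetical, exactly as floated above, and the executable names are illustrative):

def package_info(self):
    # hypothetical: declare the executables this package provides so that
    # CMakeDeps could emit sphinx::sphinx-build style targets for them
    self.cpp_info.set_property("exes", ["sphinx-build", "sphinx-apidoc"])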

@samuel-emrys
Contributor Author

samuel-emrys commented Sep 7, 2022

About the CMakePythonDeps, we think that maybe what is really missing is modeling the executables that a package has, no matter if coming from a Python virtual env package or a tool require, so probably that should be solved by adding a new property exes to the self.cpp_info and making the generators use that information, in this case, the CMakeDeps could create the targets for them

Yeah, that's a great idea. I can see that this would have benefits for a number of recipes - for example protoc in the protobuf recipe, xsltproc in the libxslt recipe, or even python in the cpython recipe. Good to see this progressing!

@samuel-emrys
Contributor Author

Hello! We have decided that this deserves a closer look. I will start by trying to understand and test this PR. We would like to take only the PythonVirtualEnv part initially, so we will need to split this PR, but I will give more feedback on the path we want to follow once I take a look. Thanks!

Is there anything I can do to help with this @lasote?

@memsharded memsharded modified the milestones: 2.0.0-beta5, 2.0.0-beta6 Nov 11, 2022
@czoido czoido modified the milestones: 2.0.0-beta6, 2.0.0-beta7 Dec 2, 2022
Member

@memsharded memsharded left a comment


I am checking this PR to see what would be the best way to move this forward.
I think this would be the best roadmap:

  • This feature goes directly to 2.0 (develop2 branch)
  • Let's start with pure Python virtualenv management, leaving the CMakePythonDeps integration for a later stage.
  • We need to start with some tests that clearly summarize the expected usage patterns, UX, and behaviors. They can be quite basic, like not really pip-installing things, just start with very basic usage, but tested.
  • Let's try to start with the very minimum; if it is possible to avoid the which lookup, for example, better to avoid it.
  • Conan 2.0 makes an effort to be more user-configuration driven, and less "smart" trying to figure out the user intentions. It is totally fine, even very recommended, to drive the core implementation (and probably simplify it) with confs like tools.env.pythonenv:python=/my/path/to/my/python. (The which example above is a typical indicator that Conan is trying to guess something, instead of getting the configuration explicitly from user conf.)

We keep pushing hard to get Conan 2.0 out, and migrate recipes in ConanCenter to be 2.0 ready, so we still don't have enough resources to put dev time on this, but we think that with this roadmap it will be easier to move things little by little.

@samuel-emrys
Contributor Author

samuel-emrys commented Dec 13, 2022

  • Conan 2.0 makes an effort to be more user-configuration driven, and less "smart" trying to figure out the user intentions. It is totally fine, even very recommended, to drive the core implementation (and probably simplify it) with confs like tools.env.pythonenv:python=/my/path/to/my/python

I just want to clarify your idea here. I've normally seen conf configured through profiles, but profile-only configuration of interpreter locations would mean that using the cpython recipe to create a python executable would not be possible (this is a challenge that I've seen with using packaged compilers - fortunately, there seems to be some capacity to propagate these paths through using environment variables, but it's not clear to me at present if these will overwrite whatever is set in the profile, for example CC, CXX or FC).

To use cpython as an example, would you envision cpython would have a conf_info entry to make itself available to consumers?

def package_info(self):
    self.conf_info.define("tools.env.pythonenv:python", os.path.join(self.package_folder, "bin", "python"))

How would this interact with a profile that also sets tools.env.pythonenv:python?

Additionally, is this something that the PythonVirtualEnv should always read from, i.e.

venv = PythonVirtualEnv(self) # uses tools.env.pythonenv:python always

or

venv = PythonVirtualEnv(self, interpreter=self.conf.get("tools.env.pythonenv:python")) # not the default, but allows for other usage, such as:
venv = PythonVirtualEnv(self, interpreter=os.path.join(self.dependencies["cpython"].package_folder, "bin", "python")) # or, the cpython recipe could export a CPYTHON_INTERPRETER, or something to make discovery easier
       

I guess these aren't mutually exclusive; we could always have tools.env.pythonenv:python be the default value for an optional keyword arg to the constructor so that it could accept an override.
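
i.e., something like this in the helper's constructor (a sketch; the conf name is the one proposed above):

import sys

def __init__(self, conanfile, interpreter=None):
    self._conanfile = conanfile
    # an explicit argument wins; otherwise fall back to the profile conf,
    # and finally to the interpreter running conan itself
    self.interpreter = (interpreter
                        or conanfile.conf.get("tools.env.pythonenv:python")
                        or sys.executable)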

@thorntonryan
Contributor

FWIW, the current implementation for PythonVirtualEnv supports passing another interpreter already via constructor arguments:
https://github.com/conan-io/conan/pull/11601/files#diff-977ebace008712e4861389d940b2f3f2a0fbce81358c5d9f6e1348062154d01fR91-R95

By default it finds the Python that conan is using (or the backing python if Conan is running in a venv), but if you wish to supply your own Python, you just feed it into the constructor.
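
For reference, both the venv check and the "backing python" lookup are standard-library one-liners (a sketch, not necessarily the exact code in the PR):

import sys

in_venv = sys.prefix != sys.base_prefix  # True when conan itself runs inside a venv
# the interpreter backing that venv; _base_executable exists on CPython 3.7+
base_python = getattr(sys, "_base_executable", sys.executable)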

@memsharded
Member

To use cpython as an example, would you envision cpython would have a conf_info entry to make itself available to consumers?

Yes, I think this would be valid, but I need to understand better how this relation works, what are the usage cases for a cpython package, etc. I need some time to think about it.

How would this interact with a profile that also sets tools.env.pythonenv:python?

In general the idea is that the profile things have higher priority. If I want the system to create Python3.8 virtual envs, then it should probably follow my definition. It is still very possible that cpython packages are used for other purposes (like C++-python integration), but the pythonenv is used for running pip installable things, and then another version different than the cpython ones is necessary.

Also I think that having the argument in the constructor shouldn't be the main or only way to define the information, but rather the profile one. Something similar to CMakeToolchain and other generators, they can default to something, then it can be changed from profiles, that is more expected than having to change it modifying recipes in the constructor.

A relevant question is how we expect users to use it, mainly from a cpython package or from a system Python? I would say in more cases using the system one, not the cpython Conan package. But definitely seems both use cases make sense, so it seems they should be equally supported.

@memsharded
Member

I am thinking that:

def package_info(self):
    self.conf_info.define("tools.env.pythonenv:python", os.path.join(self.package_folder, "bin", "python"))

Will definitely work nicely if the cpython is a tool_require, while allowing easy customization from the user side in their profiles, where they could point to another interpreter.
It will not work if cpython is a regular require, but it seems the right approach for this use case is to use a tool_require.

@krabbstek

Hello!

What are the plans with this direction? Will it be released soon? Soon(TM)? I am guessing that it will only be released for conan 2 without a backport to conan 1?

We are interested in using something similar to what have been discussed here (I am not quite sure how yet, I have just started looking into this) but since we are using conan 1 and will for some time, we would need some solution and I am wondering if I can expect a conan 2 solution that we can backport ourselves in the near future, or if we need to develop our own solution.

My 2 cents about the discussions here (apologies if I have missed something, it's a long thread after all :)):

It seems strange to me to dynamically generate a python package that contains the virtual environment as was proposed initially; that will not become a package-unique venv but rather something that multiple packages can use. Something similar to the PR quoted below is much more intuitive in my mind, and probably something we would do were we to implement this ourselves.

FWIW, the current implementation for PythonVirtualEnv supports passing another interpreter already via constructor arguments: https://github.com/conan-io/conan/pull/11601/files#diff-977ebace008712e4861389d940b2f3f2a0fbce81358c5d9f6e1348062154d01fR91-R95

By default it finds the Python that conan is using (or the backing python if Conan is running in a venv), but if you wish to supply your own Python, you just feed it into the constructor.

I am interested in your ideas on how to configure the requirements, and whether there is a use-case for transitive python package requirements between conan packages, or if we should expect the consumer to set up and manage all the python dependencies?

I have created a small hack where I generate a python venv; this is quite similar to @thorntonryan 's latest PR, although my code is a bit uglier :). I am a bit unsure about how to use the venv, however. The approach I have gone with in my hack is to add the venv to PATH in pyvenv-config.cmake if included via find_package(), but maybe this is bad practice? This only works for cmake as well; what is your suggestion for making sure that the venv is the one used when calling python in a system call in a general case (in self.run, Make, CMake, Bazel etc.)? The best way I have seen so far is with an activate() context manager, as mentioned in the PR above.

# conanfile.py

import os
import subprocess
import venv
from glob import glob
from pathlib import Path
from textwrap import dedent

from conan import ConanFile
from conan.tools.cmake import CMake, CMakeToolchain, cmake_layout


class ConanRecipe(ConanFile):
    settings = "build_type"

    def layout(self):
        cmake_layout(self)

    def generate(self):
        vg = PyVenvGenerator(self)
        vg.requirements.append("pyyaml")
        vg.generate_cmake_file = True
        vg.generate()
        tc = CMakeToolchain(self)
        tc.generate()

    def build(self):
        cmake = CMake(self)
        cmake.configure()
        cmake.build()

class PyVenvGenerator:
    def __init__(self, conanfile: ConanFile):
        self._conanfile = conanfile
        self.generate_cmake_file = False
        self.requirements = []

    @property
    def pyvenv_path(self):
        return self._conanfile.generators_path / ".pyvenv"

    @property
    def scripts_path(self):
        return self.pyvenv_path / "Scripts"

    @property
    def python(self):
        python = "python.exe"  # TODO: don't hardcode to Windows executable
        return self.scripts_path / python

    def generate(self):
        self._conanfile.output.info("Creating virtual Python environment")
        venv.create(self.pyvenv_path, with_pip=True, clear=True)
        if self.requirements:
            self._conanfile.output.info("Installing requirements into virtual Python environment")
            all_requirements = " ".join(f'"{requirement}"' for requirement in self.requirements)
            subprocess.check_call(f"{self.python} -m pip install {all_requirements}")

        if self.generate_cmake_file:
            venv_scripts_path = str(self.scripts_path).replace("\\", "/")
            contents = dedent(f"""
                set(ENV{{PATH}} "{venv_scripts_path}{os.pathsep}$ENV{{PATH}}")
            """)
            executables = glob(str(self.scripts_path / "*.exe"))  # TODO: Make OS agnostic without including (de)activate scripts
            executables = {Path(exe).stem: exe.replace("\\", "/") for exe in executables}
            for exe_name, exe_path in executables.items():
                contents += dedent(f"""
                    add_executable(pyvenv::{exe_name} IMPORTED)
                    set_target_properties(pyvenv::{exe_name} PROPERTIES IMPORTED_LOCATION "{exe_path}")
                """)
            with open(self._conanfile.generators_path / "pyvenv-config.cmake", "w") as cmake_file:
                cmake_file.write(contents)
# CMakeLists.txt

cmake_minimum_required(VERSION 3.25)
project(pyvenv-generator-test)

find_package(pyvenv REQUIRED)

execute_process(COMMAND python -c "import sys; print(sys.executable)")  # pyvenv::python doesn't seem to work here

add_custom_command(
    OUTPUT hello.txt
    COMMAND pyvenv::python -c "import sys; print(sys.executable); open('hello.txt', 'w').write('hello')"
)
add_custom_target(hello ALL
    DEPENDS hello.txt
)

@samuel-emrys
Contributor Author

What are the plans with this direction? Will it be released soon? Soon(TM)? I am guessing that it will only be released for conan 2 without a backport to conan 1?

I'm working out some of the kinks in my own packages. When I mature things to a point where I've factored out some of the things requested here (inclusion of which, conf-defined interpreter), I'll start to put a pull request together. Regardless, I have a working python-virtualenv package and CMakePythonDeps generator that will do what you need them to do. They work with both conan 1.x and 2.x at the moment.

Usage is demonstrated in the READMEs, and also in test_package and test_v1_package for conan 2.x and 1.x compatibility respectively.

Specifically, this is probably the most relevant example (modified to hide test_package particulars):

import json

from conan import ConanFile
from conan.tools.cmake import CMake, CMakeToolchain, cmake_layout

class PythonVirtualenvTestConan(ConanFile):
    settings = "os", "compiler", "build_type", "arch"
    python_requires = "cmake-python-deps/0.3.0@mtolympus/stable"

    def configure(self):
        self.options["python-virtualenv"].requirements = json.dumps([
            "sphinx",
            "sphinx-rtd-theme",
        ])

    def build_requirements(self):
        self.tool_requires("python-virtualenv/system@mtolympus/stable")

    def generate(self):
        tc = CMakeToolchain(self)
        tc.generate()

        py = self.python_requires["cmake-python-deps"].module.CMakePythonDeps(self)
        py.generate()

    def layout(self):
        cmake_layout(self)

    def build(self):
        cmake = CMake(self)
        cmake.configure()
        cmake.build()

# CMakeLists.txt

# CMakePythonDeps generates a target for the python package name
find_package(sphinx REQUIRED)

# Sphinx configuration
set(SPHINX_SOURCE ${CMAKE_CURRENT_SOURCE_DIR})
set(SPHINX_BUILD ${CMAKE_CURRENT_BINARY_DIR}/sphinx)
set(SPHINX_INDEX_FILE ${SPHINX_BUILD}/index.html)

# CMakePythonDeps generates a component for each binary exported by the python package
# In this case, it will invoke sphinx-build
add_custom_command(
  OUTPUT ${SPHINX_INDEX_FILE}
  DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/index.rst
  COMMAND sphinx::sphinx-build -b html ${SPHINX_SOURCE} ${SPHINX_BUILD}
  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
  COMMENT "Generating documentation with Sphinx")

add_custom_target(Sphinx ALL DEPENDS ${SPHINX_INDEX_FILE})

# Add an install target to install the docs
include(GNUInstallDirs)
install(DIRECTORY ${SPHINX_BUILD}/ DESTINATION ${CMAKE_INSTALL_DOCDIR})

If this is too high an abstraction for you, you can also use the underlying venv helper directly, which derived from (but has diverged from) @thorntonryan and his colleagues' solution.

@CLAassistant

CLAassistant commented May 18, 2023

CLA assistant check
All committers have signed the CLA.

Successfully merging this pull request may close these issues:

[feature] Packageable pip installable Python venv based utilities