[feature] Provide a hook for transparently overriding source downloads #11656

Closed
1 task done
lkantola opened this issue Jul 15, 2022 · 6 comments

@lkantola

Please provide hooks for conans.tools.get, conans.tools.download, and corresponding Conan 2 functions, so that the download handling and caching can be overridden by custom logic.

The inputs to the custom download hook would be the URL and checksum of the source archive. The hook would either return a path to a local read-only file or write the content to a given target path. Checksum verification and unzipping should still be handled by Conan. For error handling, the hook could raise an exception to abort the operation, or return a special value to fall back to the default download handling.
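
A minimal sketch of what such a hook contract could look like, purely for illustration. None of these names exist in Conan: `DownloadHook`, `DownloadAborted`, `fetch`, and the `default_download` callable are all hypothetical. The hook receives the URL and SHA256 and returns a local path, returns `None` to request the default download, or raises to abort. Checksum verification stays on the caller's side.

```python
import hashlib
from pathlib import Path
from typing import Callable, Optional

# Hypothetical hook signature (not a Conan API): given a URL and its SHA256,
# return a local file path, or None to fall back to the default download.
DownloadHook = Callable[[str, str], Optional[Path]]


class DownloadAborted(Exception):
    """Hypothetical error used to abort the whole operation."""


def sha256_of(path):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch(url, sha256, hook, default_download):
    """Resolve a source archive through the hook, else through default_download."""
    path = hook(url, sha256) if hook else None
    if path is None:
        # The hook declined: use the regular download path.
        path = default_download(url)
    # Verification is done by the caller, never by the hook.
    if sha256_of(path) != sha256:
        raise DownloadAborted(f"checksum mismatch for {url}")
    return path
```

Keeping verification outside the hook means a misbehaving hook cannot silently substitute a different archive, which is what preserves reproducibility when the recipe pins a SHA256.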

Motivation is to allow offline builds in isolated environments using a "custom local repository" for the source code archives. One requirement is that the recipes and conandata.yml are not modified and can point to the original online resources. This should not affect the reproducibility when using the recommended SHA256 checksum.

Below is an example use case of my current setup:

  • Recipes are located in a local git repository. Most of the recipes have been manually downloaded from Conan Center Index and contain no or only minor modifications. Specifically, all recipes use conans.tools.get with URL and SHA256 specified in conandata.yml.
  • Source code has been downloaded manually into a local folder from the URLs in the conandata.yml.
  • A custom download hook has been implemented by monkey patching the conans.tools.get inside a pre_source hook. The custom download hook finds a local source file path based on the given sha256sum.
  • Packages in the local .conan cache are built from source with conan create using the recipes in the local git repository, and the download hook unzips source files directly from the local download folder, so there is no network activity.

Future ideas: (not needed for my current use case)

  • Provide similar hooks for scm / git clone.
  • Provide hooks for replacing parts of the conan server API to allow more offline use-cases without conan_server.

The feature request is somewhat related to: #6876, #6944, conan-io/docs#3505, and other issues about offline builds and source code caching.

@lkantola
Author

For reference, below is the monkey-patching pre_source hook from my example use case in the first comment. It requires a checksum file generated with sha256sum in $MY_SOURCES/files.sha256. Having an official download hook API would make things less hacky and allow reducing the code to the _get_repo_path function only.

The patched conans.tools.get function only supports parameters for my specific use case. It is also more efficient than the built-in cache, because it does not copy the local file before unzip, and does not remove it after.

import os
import sys

import conans.tools
from conans.client.tools.files import check_sha256, unzip
from conans.errors import ConanException
from conans.util.fallbacks import default_output


def _get_repo_path(sha256):
    assert sha256
    repo_dir = os.path.abspath(os.environ["MY_SOURCES"])
    checksum_file = os.path.join(repo_dir, "files.sha256")
    with open(checksum_file, "rt") as checksums:
        for line in checksums:
            cs, path = line.split(None, 1)
            if cs == sha256:
                return os.path.join(repo_dir, path.strip())


def _patched_get(
    url,
    md5="",
    sha1="",
    sha256="",
    destination=".",
    filename="",
    keep_permissions=False,
    pattern=None,
    output=None,
    strip_root=False,
):
    """Patched version of conans.tools.get."""
    myout = default_output(output, "source_replace._patched_get")
    assert not filename, "The filename argument is not used."
    filename = _get_repo_path(sha256)
    if not filename:
        raise ConanException(f"Could not find {url} with checksum {sha256}")
    myout.info(f"[source_replace]: {url} -> {filename}")
    check_sha256(filename, sha256)

    # Copy-paste from conans.tools.get()
    unzip(
        filename,
        destination=destination,
        keep_permissions=keep_permissions,
        pattern=pattern,
        output=output,
        strip_root=strip_root,
    )


def _patch_tools_get():
    if conans.tools.get is not _patched_get:
        conans.tools.get = _patched_get
        return True


def pre_source(output, conanfile, conanfile_path, **kwargs):
    output.info("source_replace hook called")
    _patch_tools_get()
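
The hook above expects a `files.sha256` index in the `sha256sum` output format (`<digest>  <path>`). Where `sha256sum` is not available, the same file could be produced with a short script. The sketch below is an assumption about how one might do it, and the function name `write_checksum_file` is made up for illustration.

```python
import hashlib
import os


def write_checksum_file(repo_dir, name="files.sha256"):
    """Write '<sha256>  <relative path>' lines for every file under repo_dir."""
    target = os.path.abspath(os.path.join(repo_dir, name))
    lines = []
    for root, _dirs, files in os.walk(repo_dir):
        for fname in sorted(files):
            full = os.path.join(root, fname)
            if os.path.abspath(full) == target:
                continue  # do not index the checksum file itself
            digest = hashlib.sha256()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    digest.update(chunk)
            rel = os.path.relpath(full, repo_dir)
            lines.append(f"{digest.hexdigest()}  {rel}\n")
    with open(target, "w") as out:
        out.writelines(lines)
    return len(lines)
```

Paths are written relative to `repo_dir`, which matches how `_get_repo_path` joins the stored path onto the `MY_SOURCES` directory.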

@memsharded
Member

This idea is interesting.
In the past we did a proof of concept for a cache of downloaded sources, keeping them in a generic repository on a server. This proof of concept would be built-in and do something very similar to what you are doing, but with server storage as well, to provide reproducibility later in time on any machine.

Maybe it is better to reconsider that approach and provide this as a "plugin" (the concept of a hook in Conan is a bit different; this is what we call a plugin), to allow greater flexibility for users.

The only thing is that this idea should wait until 2.X. We are now focused on getting 2.0 out, and new ideas and features like this will need to wait until 2.0 is out of beta and stabilized. Thanks for the suggestion!

@memsharded memsharded reopened this Jul 15, 2022
@memsharded memsharded added this to the 2.X milestone Jul 15, 2022
@goodtune

I have a slightly different use case to mention, and would consider it a powerful addition for corporate use.

Our company policy disallows retrieving source code directly from external sites. However, we do have an Artifactory instance with a VCS repository, which allows us to retrieve tags or branches from hosted Git services such as GitHub.

As an example, the opentelemetry-cpp recipe from conan-io/conan-center-index has the following conandata.yml:

sources:
  "1.4.1":
    url: "https://github.com/open-telemetry/opentelemetry-cpp/archive/v1.4.1.tar.gz"
    sha256: "301b1ab74a664723560f46c29f228360aff1e2d63e930b963755ea077ae67524"

patches:
  "1.4.1":
    - patch_file: "patches/1.4.0-0001-fix-cmake.patch"
      base_path: "source_subfolder"

We aren't allowed to use an HTTP proxy like Squid, but I could rewrite the URL to fetch the archive - for example "https://github.com/open-telemetry/opentelemetry-cpp/archive/v1.4.1.tar.gz" would become "https://artifactory.corp.local/artifactory/api/vcs/downloadTag/github/open-telemetry/opentelemetry-cpp/v1.4.1?ext=tar.gz". This would benefit subsequent users on our network as Artifactory can cache it forever.
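
As a rough sketch of that rewrite, assuming the source URL is always a GitHub tag archive and that the Artifactory base URL and `github` repository key match the example above (both are site-specific), the mapping could be expressed as a single function. The function name and regular expression are illustrative only and are not part of Conan or Artifactory.

```python
import re

# Assumption: base URL and the "github" repository key are taken from the
# example in this comment; a real deployment would configure its own.
ARTIFACTORY_VCS = (
    "https://artifactory.corp.local/artifactory/api/vcs/downloadTag/github"
)

_GITHUB_ARCHIVE = re.compile(
    r"^https://github\.com/(?P<org>[^/]+)/(?P<repo>[^/]+)/archive/"
    r"(?:refs/tags/)?(?P<tag>.+?)\.(?P<ext>tar\.gz|zip)$"
)


def rewrite_github_archive_url(url):
    """Map a GitHub tag archive URL to the Artifactory VCS URL.

    URLs that do not look like a GitHub archive are returned unchanged.
    """
    m = _GITHUB_ARCHIVE.match(url)
    if m is None:
        return url
    return f"{ARTIFACTORY_VCS}/{m['org']}/{m['repo']}/{m['tag']}?ext={m['ext']}"


print(rewrite_github_archive_url(
    "https://github.com/open-telemetry/opentelemetry-cpp/archive/v1.4.1.tar.gz"
))
```

Because the recipe's SHA256 is still checked after the download, a rewrite like this only changes where the bytes come from, not which bytes are accepted.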

The HTTP headers from this download (important ones selectively shown below) show that we've maintained the sha256 value.

HTTP/1.1 200 OK
Content-Type: application/x-gzip
X-Checksum-Sha256: 301b1ab74a664723560f46c29f228360aff1e2d63e930b963755ea077ae67524
X-Artifactory-Filename: opentelemetry-cpp-v1.4.1.tar.gz

I've just started my search for related tickets; if this has been discussed elsewhere, I can add my comments there.

@maldag

maldag commented Nov 18, 2022

We are looking for this kind of possibility also. Is there a timeline to get 2.0 out and stabilized?

@memsharded
Member

Our intent is to get 2.0 out before EOY, but it is a bit tight, and it also depends on the migration efforts in ConanCenter. But definitely soon. The best way to contribute atm is to test the released beta.5 and give feedback.

@memsharded
Member

We have implemented this feature as "backup sources" in #13461, to be released in the upcoming 2.0.3 (hidden initially, for team and ConanCenter testing), but it will be made available to users too.

Closing this as solved, thanks very much for the feedback!

4 participants