conda-build traverses the whole index greedily, undoing the lazy-loading optimizations #4961

Closed
jaimergp opened this issue Aug 9, 2023 · 1 comment
Labels
locked [bot] locked due to inactivity

Comments

@jaimergp
Contributor

jaimergp commented Aug 9, 2023

In #4431 I was investigating some differences in timings between CONDA_SOLVER=libmamba conda-build and conda mambabuild, which should be similar. However, conda-build spends a few minutes before the solver kicks in, while mambabuild does not.

It turns out those extra minutes are spent creating the build index, which is the aggregation of all the source channels (e.g. defaults, conda-forge) and their platforms (noarch is collapsed into e.g. linux-64 🤷), plus the local channels (the CONDA_BLD_PATH cache and/or the chosen output folder).
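To make the shape of that aggregation concrete, here is a rough sketch of the sources that end up merged into one index. The helper name, its parameters, and the local path are made up for illustration; this is not conda-build's API, and the ordering of the result is arbitrary.

```python
from itertools import product


def build_index_sources(channels, subdir, local_channels):
    """Hypothetical helper: list every channel/platform source merged into one index."""
    platforms = (subdir, "noarch")
    # Each remote channel contributes its platform-specific subdir plus noarch.
    remote = [f"{channel}/{platform}" for channel, platform in product(channels, platforms)]
    # Local channels (e.g. the CONDA_BLD_PATH cache, the output folder) are added too.
    local = [f"{path}/{platform}" for path, platform in product(local_channels, platforms)]
    return local + remote


sources = build_index_sources(
    channels=["defaults", "conda-forge"],
    subdir="linux-64",
    local_channels=["/tmp/conda-bld"],  # placeholder path, not a real default
)
for source in sources:
    print(source)
```

Every one of these sources carries its own repodata, so the number of records to instantiate grows with each channel added to the build.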

The slow part is not the repodata fetching, but creating the million+ PackageRecord objects greedily, instead of letting conda create them lazily as needed (the lazy loading introduced in conda/conda#12050). This actually happens in conda, in two places, but conda-build is the sole consumer of those endpoints AFAIK.

  • In conda's index.py, where an identity dict[PackageRecord, PackageRecord] is built by aggregating the records from every SubdirData.iter_records() call.
  • In exports.py, where that map is processed again just to convert the keys into Dist objects.
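The cost difference between the two approaches can be sketched with stand-in classes. Everything below is a simplified assumption: Record and FakeSubdir are hypothetical stand-ins for PackageRecord and SubdirData, the query method is invented for the example, and a plain tuple plays the role of a Dist key. It only illustrates the pattern described above, not conda's actual code.

```python
from collections import Counter
from dataclasses import dataclass

stats = Counter()  # counts how many record objects get instantiated


@dataclass(frozen=True)
class Record:
    """Hypothetical, heavily simplified stand-in for PackageRecord."""
    name: str
    version: str

    def __post_init__(self):
        stats["created"] += 1


class FakeSubdir:
    """Hypothetical stand-in for SubdirData: keeps raw entries, builds records on demand."""

    def __init__(self, raw):
        self._raw = raw  # plain dicts are cheap to keep around

    def iter_records(self):
        # Materializes a record for every entry in this subdir.
        for entry in self._raw:
            yield Record(entry["name"], entry["version"])

    def query(self, name):
        # Materializes records only for entries matching the requested name.
        for entry in self._raw:
            if entry["name"] == name:
                yield Record(entry["name"], entry["version"])


subdirs = [
    FakeSubdir([{"name": f"pkg{i}", "version": "1.0"} for i in range(10_000)]),
    FakeSubdir([{"name": f"noarch{i}", "version": "1.0"} for i in range(10_000)]),
]

# Greedy: the first pass builds the identity map from every subdir...
index = {}
for subdir in subdirs:
    index.update((rec, rec) for rec in subdir.iter_records())
# ...and a second pass walks the whole map again just to change the keys.
rekeyed = {(rec.name, rec.version): rec for rec in index.values()}
greedy_created = stats["created"]

# Lazy: only the records that are actually asked for get created.
stats.clear()
wanted = [rec for subdir in subdirs for rec in subdir.query("pkg42")]
lazy_created = stats["created"]

print(f"greedy: {greedy_created} records, lazy: {lazy_created} records")
```

In this toy setup the greedy path instantiates all 20,000 records and then iterates over them a second time, while the lazy query creates a single one. The real index is roughly two orders of magnitude larger, which is where the minutes go.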

It feels like we could do this better, as @dholth was saying in #4431 (comment). The interface with the Solver could also be better: conda-build currently overwrites Solver._index with this greedily built object, and conda-libmamba-solver won't even use it 😬

@jaimergp
Contributor Author

Superseded by #5154

@github-project-automation github-project-automation bot moved this from 🆕 New to 🏁 Done in 🧭 Planning Apr 11, 2024
@github-actions github-actions bot added the locked [bot] locked due to inactivity label Oct 9, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 9, 2024