In #4431 I was investigating some differences in timings between CONDA_SOLVER=libmamba conda-build and conda mambabuild, which should be similar. However, conda-build spends a few minutes before the solver kicks in, while mambabuild does not.
I found that those extra minutes are spent creating the build index, which is the aggregation of all the source channels (e.g. defaults, conda-forge) and their platforms (noarch is collapsed into e.g. linux-64 🤷), plus the local channels (the CONDA_BLD_PATH cache and/or the chosen output folder).
The slow part is not the repodata fetching, but creating the million+ PackageRecord objects greedily, instead of letting conda do it lazily as needed (introduced in conda/conda#12050). This actually happens in conda, in two places, but conda-build is the sole consumer of those endpoints AFAIK.
1. In index.py, in the exports module, where an identity dict[PackageRecord, PackageRecord] is built by aggregating all the SubdirData.iter_records() instances.
2. In exports.py, where that map is processed again just to convert the keys into Dist objects.
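To illustrate the greedy vs. lazy difference, here is a minimal sketch. It does not use conda's code: `Record`, `make_record`, `greedy_index` and `LazyIndex` are hypothetical stand-ins for `PackageRecord` and the index-building logic, and the synthetic `raw` entries only mimic repodata in shape. The point is just to count how many record objects get constructed in each approach.

```python
from dataclasses import dataclass

# Hypothetical stand-in for conda's PackageRecord; NOT the real class.
@dataclass(frozen=True)
class Record:
    name: str
    version: str

# Mutable counter so we can see how many Record objects were built.
stats = {"built": 0}

def make_record(raw):
    """Turn one raw repodata-like dict into a Record, counting the work."""
    stats["built"] += 1
    return Record(raw["name"], raw["version"])

def greedy_index(raw_entries):
    """Build every record up front as an identity map (record -> record)."""
    return {rec: rec for rec in map(make_record, raw_entries)}

class LazyIndex:
    """Group raw entries by name; build records only for names that are queried."""

    def __init__(self, raw_entries):
        self._raw_by_name = {}
        for raw in raw_entries:
            self._raw_by_name.setdefault(raw["name"], []).append(raw)
        self._cache = {}

    def query(self, name):
        if name not in self._cache:
            raws = self._raw_by_name.get(name, [])
            self._cache[name] = [make_record(r) for r in raws]
        return self._cache[name]

# Synthetic data: 100,000 entries spread over 1,000 package names.
raw = [{"name": f"pkg{i % 1000}", "version": f"1.{i}"} for i in range(100_000)]

stats["built"] = 0
greedy = greedy_index(raw)
greedy_count = stats["built"]   # every entry became an object

stats["built"] = 0
lazy = LazyIndex(raw)
wanted = lazy.query("pkg7")     # only the records for this one name are built
lazy_count = stats["built"]

print(f"greedy built {greedy_count} records, lazy built {lazy_count}")
```

In this toy setup the greedy path constructs one object per entry regardless of what the solver ends up asking for, while the lazy path only pays for the names actually queried. Whether the real index can be restructured this way depends on what conda-build needs from it, which this sketch does not model.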
It feels like we could do this better, as @dholth was saying in #4431 (comment). The interface with the Solver could also be better; it currently overwrites Solver._index with this greedily built object, and conda-libmamba-solver won't even use that 😬