Speed up git destination with mostly excluded files #191
When migrating into a git destination with excluded files, copybara calls `git add` on all of these excluded files individually. For cases where the destination repo is large but most of the files are excluded, this drastically slows down the import process and is by far the slowest part of the import, taking minutes when the rest of the import takes seconds.

This change looks for directories where all files are excluded (i.e. we plan to call `git add` with every file in a directory), and instead calls `git add` with just the directory. It also does this recursively, i.e. on entire directory trees of this form. With this change, this part of the code is no longer the bottleneck in this type of migration.

There is potentially some overlap here with #112. However, if I understand correctly, that PR would only work for CHANGE_REQUEST workflows, while this implementation will speed up all workflows with git destinations. I also don't see any reason why these two changes would be mutually exclusive.
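
To illustrate the directory-collapsing idea described above, here is a minimal Python sketch. It is not the code in this PR (copybara is written in Java); the function name `collapse_to_directories` and its two parameters are hypothetical, and it assumes paths are given as POSIX-style strings relative to the repo root. A directory is collapsed only when every file beneath it, at any depth, is in the set to add.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def collapse_to_directories(all_files, files_to_add):
    """Return a sorted, smaller list of paths that covers exactly files_to_add.

    Hypothetical helper: all_files is every file in the destination tree,
    files_to_add is the subset we would otherwise add one by one.
    """
    to_add = set(files_to_add)

    # For every ancestor directory, count all files beneath it and how many
    # of those are in the set to add.
    total = defaultdict(int)
    selected = defaultdict(int)
    for name in all_files:
        for parent in PurePosixPath(name).parents:
            total[parent] += 1
            if name in to_add:
                selected[parent] += 1

    result = set()
    for name in to_add:
        best = PurePosixPath(name)
        # parents runs from the nearest directory up to the root ("."), so
        # walk upward while the directory is fully covered.
        for parent in PurePosixPath(name).parents:
            count = total.get(parent, 0)
            if count > 0 and count == selected.get(parent, 0):
                best = parent
            else:
                break
        result.add(str(best))
    return sorted(result)


paths = collapse_to_directories(
    ["vendor/a/x.c", "vendor/a/y.c", "vendor/b/z.c",
     "src/main.c", "src/util.c", "README"],
    ["vendor/a/x.c", "vendor/a/y.c", "vendor/b/z.c", "src/util.c"],
)
print(paths)  # ['src/util.c', 'vendor']
```

In this example the three files under `vendor` collapse into a single path, while `src/util.c` stays as an individual file because `src/main.c` is not in the set to add.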