Rejig the git interactions code to improve performance and reliability #120

Open
blake-sc opened this issue Mar 29, 2022 · 0 comments
At the moment Bilara uses git commands (like git add, git commit) in a subprocess.

Both performance and reliability could be improved by using a library like pygit2 to directly manipulate the repository instead of relying on the "user-friendly" behavior of basic git commands. Why? Command-line utilities are intended to give feedback to a human user if something untoward happens, which makes them awkward to drive reliably from code.

One important thing is to ensure that concurrency is handled gracefully, something that Bilara still occasionally fails at, with the result that an administrator has to intervene. This requires that the code be stress tested by having multiple threads/processes modify the repository simultaneously in an attempt to create race conditions. The code that updates files should be able to withstand a stress test of multiple threads updating dozens of files simultaneously.
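A stress test of the kind described above could be sketched roughly as follows. This is only an illustration using the standard library: `stress_test` and the `update` callback are hypothetical names, not part of Bilara, and `update` stands in for whatever function writes a file and commits it. It covers threads only; a multi-process variant would need `multiprocessing` instead.

```python
import threading


def stress_test(update, n_threads=10, n_updates=100):
    """Call update(worker_id, i) from n_threads threads at once.

    Returns a list of (worker_id, i, exception) for every call that raised,
    so an empty list means every update went through.
    """
    errors = []
    errors_lock = threading.Lock()
    start = threading.Barrier(n_threads)

    def worker(worker_id):
        start.wait()  # release all threads together to maximise contention
        for i in range(n_updates):
            try:
                update(worker_id, i)
            except Exception as exc:  # record the failure and keep hammering
                with errors_lock:
                    errors.append((worker_id, i, exc))

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

An empty result only shows that no update raised. The test should also check afterwards that the repository is in the expected state, for example that every update is present in the history.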

Managing concurrency correctly is critical because Bilara uses some multi-threading. For example, when the GitHub repo is updated, that triggers a webhook that updates the repository. A multiprocessing-friendly locking strategy must be used to ensure that files are updated without interference. Git does have its own index lock, but git commands fail rather than wait for the index lock to be released, so this has to be accounted for by retrying.
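One possible shape for the locking and retrying described above, as a minimal sketch rather than a proposal for the final code. It assumes a Unix host (it uses `fcntl.flock`), and `repo_lock`, `IndexLocked` and `retry_on_index_lock` are hypothetical names. How a held index lock is actually detected (a non-zero exit status from git, or an error from a library) is left to the caller, who would translate it into `IndexLocked`.

```python
import fcntl
import os
import time
from contextlib import contextmanager


@contextmanager
def repo_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path, waiting until it is free.

    Each call opens its own file descriptor, so the lock excludes other
    processes as well as other threads in the same process.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks instead of failing
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


class IndexLocked(Exception):
    """Hypothetical error meaning git's own index lock was held."""


def retry_on_index_lock(operation, attempts=5, delay=0.05):
    """Run operation(), retrying with exponential backoff on IndexLocked."""
    for attempt in range(attempts):
        try:
            return operation()
        except IndexLocked:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(delay * (2 ** attempt))
```

A caller would wrap each update in `with repo_lock(path):` and run the git step through `retry_on_index_lock`, so that Bilara's own writers queue up on the file lock and only contention with outside git processes needs retrying.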

In preliminary stress testing (around 10 threads bombarding the git repo with thousands of updates), pygit2 has performed much better with concurrency: it was much faster (probably around 100x), and it was easier to write code that doesn't result in unrecoverable situations.
