Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make GTO faster #163

Merged
merged 5 commits into from
Jun 2, 2022
Merged

Make GTO faster #163

merged 5 commits into from
Jun 2, 2022

Conversation

aguschin
Copy link
Contributor

@aguschin aguschin commented May 31, 2022

Addressing findings in #156
Checked this again, it seems and got faster by 50% only. Will see what else can be done, in the future PRs.

@aguschin aguschin self-assigned this May 31, 2022
@aguschin aguschin requested a review from amritghimire June 1, 2022 20:27
}

def check_ref(self, ref: str):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amritghimire, this is not ideal, but if you create GitRegistry, you can use this function body as an example on how to cache state. Looking for better solutions.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could work. A similar approach for get_stages will also be great. We do checkout to a commit for each get_stages call. But if get_stages use the head when creating GitRegistry, having ref there as well would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you call get_stages each time? What stages user sees in MR should depend on config HEAD only. So it doesn't matter what is there in all other commits.

Copy link
Contributor Author

@aguschin aguschin Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. you should only call $ gto stages when HEAD of default branch gets updated.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to parse everything for each commit. Because:

  1. We parse incrementally and not all at once
  2. Some commits might be removed, in this case we don't want to reparse, we want to recalc.

In perfect world we wouldn't need to create GitRegistry and manage it. Things will be cached for us on your side, not sure how hard is this to do though.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The easiest workaround is probably providing git_registry param, i.e. if not None then use it instead of recreating each time.

@aguschin aguschin merged commit 7ac36b7 into main Jun 2, 2022
@aguschin aguschin deleted the feature/make-gto-faster branch June 2, 2022 11:44
@aguschin aguschin mentioned this pull request Jul 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants