[PGA] fix stars handling, reduce memory #80

Merged: 1 commit, Sep 24, 2018


Contributor

@smola smola commented Sep 19, 2018

pga-create: fix logic for deleted repositories, reduce mem

  • When reading repositories, skip every deleted repository. This
    means there are no more duplicated project entries.

  • Remove the ID deduplication logic from writeWatchers. This
    also reduces memory usage.

  • Use uint32 instead of int to hold project IDs and stars in memory.

  • Use gzip for the stars file.

@smola smola requested review from campoy and vmarkovtsev September 19, 2018 09:25
commaPos2 = strings.LastIndex(line[:commaPos2], ",")
commaPos1 := strings.LastIndex(line[:commaPos2], ",")
deletedFlag := line[commaPos1+1 : commaPos2]
if deletedFlag != "0" {
Collaborator

How about the number of stars? If a repo is deleted, does it lose all the stars and the new one starts with 0?

Maybe we should take the maximum from the old and the new stars here.

Contributor Author

Yes, deleted repositories do not keep their stars if re-created. I think using the actual star count is the least surprising option, since it accurately reflects the current status. This is particularly important when the re-created repository is no longer the main one, but a fork.

Signed-off-by: Santiago M. Mola <santi@mola.io>
@smola smola changed the title from "[WIP][PGA] fix stars handling, reduce memory" to "[PGA] fix stars handling, reduce memory" Sep 21, 2018
@smola smola merged commit 07e9e13 into src-d:master Sep 24, 2018
@smola smola deleted the fix-watchers branch September 24, 2018 13:43
@@ -70,71 +70,78 @@ func reduceWatchers(stream io.Reader) map[int]int {
if err != nil {
Contributor

Instead of Atoi, you should use strconv.ParseUint(line[:commaPos], 10, 32) to detect any overflow.

3 participants