API not returning all repos for given project's GitHub org #715

Closed
jmertic opened this issue Oct 5, 2024 · 7 comments

jmertic commented Oct 5, 2024

See examples in https://lfenergy.landscape2.io/api/projects/all.json: only one repo is returned per project, but when a project org is specified, all of the org's repos should be returned.

Collaborator

tegioz commented Oct 7, 2024

Hi @jmertic 👋

That endpoint returns all repositories defined in the landscape data file: the primary repository, as defined in the repo_url field, plus any additional ones listed in the additional_repos field.

Please note that landscape v2 intentionally neither collects nor processes all repositories in a given GitHub organization, so only repositories explicitly listed in the data file are returned by this endpoint.
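
To make the behavior described above concrete, here is a minimal Python sketch of how the repositories listed for a single data file item could be gathered. It assumes an item is a plain dict with a `repo_url` field and an `additional_repos` list whose entries each carry their own `repo_url`. The function name `listed_repos` and the sample item are hypothetical, and this is not landscape v2's actual implementation.

```python
from typing import Any


def listed_repos(item: dict[str, Any]) -> list[str]:
    """Return the repository URLs explicitly listed for one landscape item.

    Only `repo_url` and `additional_repos` are consulted; nothing is
    discovered from the GitHub organization itself.
    """
    repos: list[str] = []

    # Primary repository, if the item defines one.
    primary = item.get("repo_url")
    if primary:
        repos.append(primary)

    # Additional repositories (assumed shape: a list of {"repo_url": ...}).
    for extra in item.get("additional_repos") or []:
        url = extra.get("repo_url")
        if url and url not in repos:
            repos.append(url)

    return repos


# Hypothetical item, for illustration only.
example_item = {
    "name": "Example Project",
    "repo_url": "https://github.com/example-org/core",
    "additional_repos": [
        {"repo_url": "https://github.com/example-org/cli"},
    ],
}
```

With this shape, any repository in the organization that is not named in either field is simply never seen.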

Author

jmertic commented Oct 7, 2024

Got it. Could you maybe also return project_org, so we know in those cases that all repos in the org are included?

Collaborator

tegioz commented Oct 8, 2024

The project_org field does not actually map to any functionality in landscape v2 at the moment. It doesn't exist in our types (and it's not documented), so we aren't even processing it.

I think this problem could also be solved by the annotations proposal I just shared with you. You could add an annotation to signal this status to your application in the way that fits you better (you could use the same field, or something completely different).

Author

jmertic commented Oct 8, 2024

That is a good potential workaround, but it does require manual maintenance on a landscape to see if a project adds new repos under its existing GH org.
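
One way that manual check could be reduced, sketched here under assumptions, is a small script that compares the repositories an organization currently has against the ones listed for the item and reports what is missing. The helper names, the example URLs, and the idea of passing in a pre-fetched list of org repository URLs are all hypothetical; fetching that list (for example through the GitHub REST API) is deliberately left out.

```python
from typing import Any, Iterable


def normalize(url: str) -> str:
    """Normalize a repository URL so trivially different spellings compare equal."""
    return url.strip().rstrip("/").lower().removesuffix(".git")


def unlisted_repos(org_repo_urls: Iterable[str], item: dict[str, Any]) -> list[str]:
    """Return org repositories that the landscape item does not list.

    `item` is assumed to be a dict with `repo_url` and an optional
    `additional_repos` list of {"repo_url": ...} entries.
    """
    listed: set[str] = set()
    if item.get("repo_url"):
        listed.add(normalize(item["repo_url"]))
    for extra in item.get("additional_repos") or []:
        if extra.get("repo_url"):
            listed.add(normalize(extra["repo_url"]))

    # Keep the original spelling of each URL in the report.
    return sorted(u for u in org_repo_urls if normalize(u) not in listed)


# Hypothetical data, for illustration only.
org_repos = [
    "https://github.com/example-org/core",
    "https://github.com/example-org/docs",
    "https://github.com/example-org/cli/",
]
item = {
    "repo_url": "https://github.com/example-org/core",
    "additional_repos": [{"repo_url": "https://github.com/Example-Org/cli.git"}],
}
```

A report like this only flags candidates; deciding which of them belong in the data file would still be a human call.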

Collaborator

tegioz commented Oct 9, 2024

But it'd be the same as returning the project_org field, only it'd be a different field in a different location.

I would recommend against populating the landscape.yml file in an automated way to add all of an organization's repositories to landscape entries. We have intentionally not added this feature to the landscapes generator, mainly for sustainability and reliability reasons.

Some organizations are huge, with hundreds and hundreds of repositories, and collecting data for all of them may lead to reaching GitHub rate limits. The landscape build process can also get considerably slower, as we cannot send requests to GitHub too fast or we'll hit the secondary rate limits.

In most cases, only one repository (or a few of them) may be relevant for a particular landscape item, so processing an entire organization would be overkill. In other situations, there are multiple landscape items whose repositories are hosted in the same GitHub organization, so we'd end up misleadingly displaying the same stats for each of them. I think it's a feature that, for convenience reasons, can be misused too easily.

IMHO the landscape is probably not the best place to explore all repositories in a GitHub organization, as the GitHub UI handles that much better 😉

Please note that if any of the landscapes we host exhausts our tokens' rate limits for this reason, we may need to pause its build temporarily until the number of repositories is reduced. The GitHub tokens are reused across landscapes, and leaving tokens with no requests available could affect other landscapes' builds.

Will close this one in favor of #716; we'll try to have the annotations ready as soon as possible 🙂

tegioz closed this as completed Oct 9, 2024

Author

jmertic commented Oct 9, 2024

I can understand where you are coming from on this for sure. The challenge comes from inconsistencies in how stats such as stars are reported relative to landscape v1. For example, to get an accurate view of a project's activity, we would almost need to bring in all the other repos in the org.

Let me chew on an approach for the projects I work with here. Appreciate all your help.

Collaborator

tegioz commented Oct 9, 2024

Sure 👍 No worries, happy to help!

jmertic added a commit to jmertic/lfx-tac-actions that referenced this issue Nov 14, 2024
Blocked by cncf/landscape2#716 and cncf/landscape2#715

---------

Signed-off-by: John Mertic <jmertic@linuxfoundation.org>