Is there a way to get which branch a commit belongs to via GraphQL? #24501
-
Hello, I’m trying to learn the GitHub v4 GraphQL API and saw this Stack Overflow answer that uses this query to get the branches of a repository:
The above query gets the branches that belong to a repository. Instead of that, is there a way to get the branch that a commit belongs to? Looking at the GitHub API v4 documentation, it doesn’t list Thanks! P.S. The
Replies: 13 comments
-
👋 @penyuan, thanks for reaching out and asking this question! Given some commit in a repository, there’s no way in the GraphQL API to list all of the branches that contain that commit. However, one approach that you can take is fetching a list of branches (as the query you provided does) and making another request that lists the commits for one or more of those branches:
You may need to leverage pagination on the relevant fields to get all of the desired results. Once you have these results, you’ll need to write a program that checks if the commit is in the list of commits for each branch. I give credit to one of my colleagues who explained this approach using the REST API in this StackOverflow reply: “List of branches a commit appears on” (stackoverflow.com).
I hope this helps!
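A minimal Python sketch of the approach described in this reply, under stated assumptions. The query uses the `refs`, `target`, and `history` fields of the GraphQL API; the helper names `histories_from_response` and `branches_containing` are hypothetical, and the sample response is made up for illustration. Sending the request and paginating are deliberately left out.

```python
# Query listing branches plus the first page of each branch's commit history.
BRANCH_HISTORY_QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    refs(refPrefix: "refs/heads/", first: 50) {
      nodes {
        name
        target {
          ... on Commit {
            history(first: 100) {
              pageInfo { hasNextPage endCursor }
              nodes { oid }
            }
          }
        }
      }
    }
  }
}
"""


def histories_from_response(data):
    """Flatten the `data` part of a response into {branch name: [oid, ...]}."""
    histories = {}
    for ref in data["repository"]["refs"]["nodes"]:
        history = ref["target"].get("history", {})
        histories[ref["name"]] = [node["oid"] for node in history.get("nodes", [])]
    return histories


def branches_containing(commit_oid, histories):
    """Return the sorted names of branches whose history lists commit_oid."""
    return sorted(name for name, oids in histories.items() if commit_oid in oids)


# Made-up response data (not from a real repository).
SAMPLE_DATA = {"repository": {"refs": {"nodes": [
    {"name": "main", "target": {"history": {"nodes": [{"oid": "bbb"}, {"oid": "aaa"}]}}},
    {"name": "feature", "target": {"history": {"nodes": [{"oid": "ccc"}, {"oid": "aaa"}]}}},
]}}}

histories = histories_from_response(SAMPLE_DATA)
print(branches_containing("aaa", histories))  # ['feature', 'main']
print(branches_containing("ccc", histories))  # ['feature']
```

The check is only as complete as the histories passed in, so every page of every branch needs to be collected first.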
-
Thank you @francisfuzz that’s super helpful! Sorry I’m very new to GraphQL, but I tried looking into pagination so that I can send multiple queries to eventually retrieve all commits on each branch, and using the following I can get the end cursors for a query:
As you can see I’m using the I hope this makes sense. Am I using EDIT: I think this thread articulates my question better than me. I.e. how do you do nested pagination? |
-
penyuan:
@penyuan - Thanks for writing back and posting that follow-up question! I think I understand what you mean around doing “nested” pagination, where you’re able to paginate through a nested field’s resources. However, running a second query would only allow you to specify a single cursor for So, while you could iterate on each of those branches’ commit histories and their respective next pages, you’ll still need to make a request for each set of commit histories for each of those branches. I’m sorry I didn’t explain this more thoroughly in my initial reply, but I hope this context helps!
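The "one request per page, per branch" pattern described here could look roughly like the following Python sketch. It assumes each page is shaped like the `history` connection (`nodes` plus `pageInfo` with `hasNextPage` and `endCursor`); the function names and the in-memory `FAKE_PAGES` data are hypothetical stand-ins for real API calls.

```python
def fetch_all_oids(fetch_page):
    """Follow endCursor until hasNextPage is false.

    fetch_page(cursor) must return one page shaped like a `history` connection.
    """
    oids = []
    cursor = None  # None requests the first page
    while True:
        page = fetch_page(cursor)
        oids.extend(node["oid"] for node in page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return oids
        cursor = page["pageInfo"]["endCursor"]


def fetch_all_histories(branch_names, fetch_page_for_branch):
    """Paginate each branch separately, since each history has its own cursor."""
    return {
        name: fetch_all_oids(lambda cursor, n=name: fetch_page_for_branch(n, cursor))
        for name in branch_names
    }


# Made-up pages keyed by branch, then by cursor; stands in for real requests.
FAKE_PAGES = {
    "main": {
        None: {"nodes": [{"oid": "a3"}, {"oid": "a2"}],
               "pageInfo": {"hasNextPage": True, "endCursor": "c1"}},
        "c1": {"nodes": [{"oid": "a1"}],
               "pageInfo": {"hasNextPage": False, "endCursor": None}},
    },
    "dev": {
        None: {"nodes": [{"oid": "b1"}, {"oid": "a1"}],
               "pageInfo": {"hasNextPage": False, "endCursor": None}},
    },
}

calls = []  # records every simulated request


def fake_fetch(branch, cursor):
    calls.append((branch, cursor))
    return FAKE_PAGES[branch][cursor]


all_histories = fetch_all_histories(["main", "dev"], fake_fetch)
print(all_histories)
print(len(calls), "requests made")
```

In real use, `fake_fetch` would be replaced by a function that sends the query with the branch name and `after` cursor as variables.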
-
francisfuzz:
Ah yes, you perfectly explained what I was trying to describe. Thank you for the crazy-quick response!
Understood. I guess I will make my queries in the following order then:
Sorry, one more question to wrap up this topic: From what I can tell based on this thread, each commit will belong to exactly one branch. Is this correct? If so, then I don’t need to worry about checking if the same commit appears under multiple branches, which would save time and effort. Does this also mean all the commits belonging to each branch would collectively represent every commit to the repository, nothing more, nothing less? Another edit: Oh wait, I just realized that sometimes branches (i.e.
-
penyuan:
In the context of the query that’s run, a commit that shows up in the branch’s history should only show up once for each respective branch. However, it’s possible for that same commit to be present in other branches.
Example
Let’s say that you created a repository on GitHub.com, initializing it with a README. At this point in time, the repository will have one commit at some SHA value (we’ll call it The moment you create a new branch (without committing anything) from the repository’s default branch, your repository will have two branches both of which point to I created an example repository illustrating this point. The default branch is Running the following query will showcase the same commit in both branches’ commit history:
Here’s the data returned: Result Set
If one of your primary concerns is time and effort (and perhaps performance), it may be worth exploring the approach of cloning the repository to a machine and leveraging the
penyuan:
In general, “dangling” commits wouldn’t be a part of any branch. One reason a “dangling” commit exists is that a change was committed to Git history and was later force-pushed over, so a newer commit now exists in its place. Our systems periodically run garbage collection to remove these “dangling” commits; if you come across one, its presence isn’t guaranteed for any period of time. I’m not aware of a method of querying those commits via our API (happy to have others reading this chime in if you do know 😉). However, if you do stumble upon one by whatever means, you can make a request to our I hope this helps!
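To make the idea of a “dangling” commit concrete, here is a small Python model, offered as a sketch rather than a description of how GitHub stores anything. Commits are entries in a hypothetical `parents` dictionary, branches are names pointing at a commit, and a commit counts as dangling when no branch head can reach it. All identifiers and data are made up.

```python
def reachable_from(heads, parents):
    """Return every commit reachable from the given heads by following parents."""
    seen = set()
    stack = list(heads)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(parents.get(oid, []))
    return seen


def dangling_commits(branches, parents):
    """Commits that exist in the store but are not reachable from any branch."""
    return set(parents) - reachable_from(branches.values(), parents)


# c2 was pushed to main, then c2_new was force-pushed over it.
parents = {"c1": [], "c2": ["c1"], "c2_new": ["c1"]}

before_force_push = {"main": "c2"}
after_force_push = {"main": "c2_new"}

# c2_new exists in the made-up store from the start, so it shows up as
# unreachable until main is moved to it.
print(dangling_commits(before_force_push, parents))
print(dangling_commits(after_force_push, parents))  # c2 is no longer reachable
```

Because queries that walk branch histories only ever follow reachable commits, a commit like `c2` in this model would never appear in their results.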
-
francisfuzz:
Woah! Hold on. 🤔 Sorry this actually confuses me more. Here’s what I mean: I’ve been looking at the octocat/Hello-World example (BTW, is this an official GitHub example repository???) since this repository only has a grand total of five commits. Here is my query to retrieve this repository’s branches and the commits for each branch:
Here is the query’s response: API response
The first observation is that there are three commits shared by all branches. Yet if I follow the I chose this screenshot because it leads to my second observation. The commit 7629413 - despite it claiming to be part of the Commit 7629413 is the “dangling” commit I’m talking about. My guess is that it was part of a named branch, but that branch was deleted once its sole commit 7629413 was merged into
Put another way, what is the logic behind the GitHub network graph and how do I re-create it? I also want to make sure that, using the query I showed above with your help, I can exhaustively find all of the commits in a repository. Whew, sorry for my highly verbose posts, but thank you for your patience. You’ve helped me learn a great deal!! 🙏 😂
-
penyuan:
@penyuan - Thanks for following up! Sorry for any confusion here––I’ll do my best to clarify, though you’re welcome to follow up with any new questions or observations. Consider this a space for everyone to learn! 👍
penyuan:
Putting myself in your shoes, I see what you mean when you say “dangling” commit. After thinking a bit more about this, I think we used the same term to describe two different things (and that’s okay!). Here’s what I mean by “dangling” commit––I describe it as a commit that does not belong to any branch in the earlier referenced example repository. Here’s an example: … and here’s the commit that was force pushed over in its place:
Coming back to your case, However, the earlier example of
While I can’t speak to the specifics of how that feature is implemented (its source is closed and a part of GitHub’s product), I think that the unnamed blue branch is either:
To determine which case this falls under, I took this approach:
Thus, this unnamed blue branch is actually
Great question! Those three commits are in each of the query’s result sets because each of those commits is part of the branch’s commit history. What’s rendered in the commit view UI is meant to be an indicator of which branches this commit is reachable from. Granted, this indicator isn’t a documented feature and is subject to change at any time. Checking the commit history using Git or using our API is the best way to determine where a commit has been.
A commit can belong to many branches. I think that this section on Git branching in the official Git documentation does a better job of explaining this than I can:
Building from that context, when you create some branch
I briefly touched on this in an earlier answer––I’m not able to share how that Network graph is implemented. However, I think it might be worth checking out the |
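As an illustration of the point made in this reply that a commit can belong to many branches, the following Python sketch treats a branch as nothing more than a name pointing at a commit. The data is made up, the history is kept linear for simplicity, and the names `parents`, `branches`, and `history` are hypothetical.

```python
# Each commit maps to its single parent (None for the first commit).
parents = {"c1": None, "c2": "c1"}

# A branch is just a pointer to a commit.
branches = {"main": "c2"}

# Creating a branch adds a second pointer to the same commit; nothing is copied.
branches["feature"] = branches["main"]

# Committing on the new branch adds a commit and moves only that pointer.
parents["c3"] = "c2"
branches["feature"] = "c3"


def history(head):
    """Walk parent links from a branch head back to the first commit."""
    oids = []
    while head is not None:
        oids.append(head)
        head = parents[head]
    return oids


membership = {
    oid: sorted(name for name, head in branches.items() if oid in history(head))
    for oid in parents
}
print(membership)
# {'c1': ['feature', 'main'], 'c2': ['feature', 'main'], 'c3': ['feature']}
```

Commits `c1` and `c2` show up in both histories, which matches what the branch queries in this thread return for shared commits.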
-
Amazing. Thank you @francisfuzz for your patience! ❤️ It’s finally all coming together for me now.
Understood. That makes sense now. Glad that the kind of commits I was thinking about won’t be deleted!
Understood. For the latter case (not from a fork), would a reasonable way to identify it be simply to see if it is one of multiple parents of a later commit (implying merged branches)? This way, I won’t have to rely on whether the commit is associated with any
Thanks! This really made it click for me. I clearly have much more to learn about Git!
Hehe, one could hope that one day GitHub will be fully open sourced like most other platforms have already done (wink wink). 😅 Maybe I’ll submit this through the feedback form you mentioned. Anyways, based on the wealth of knowledge I’ve learned in this thread, all commits in a repository will fall under at least one branch that my query finds (because commits form the diverging histories that branches represent). Is this correct? Put another way, all
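The merge-parent check asked about earlier in this reply could be sketched in Python as follows. This is only one possible reading of the idea: it assumes the parent lists have already been collected (for example from the `parents` connection on `Commit`) and relies on the Git convention that a merge commit’s first parent is the branch that was merged into. The function name and the sample graph are hypothetical.

```python
def merged_branch_tips(parents):
    """Map each commit that was merged in to the merge commit that absorbed it.

    A commit qualifies when it is a second-or-later parent of a commit with
    more than one parent.
    """
    tips = {}
    for oid, parent_oids in parents.items():
        if len(parent_oids) > 1:
            for merged_in in parent_oids[1:]:
                tips[merged_in] = oid
    return tips


# Made-up graph: x was committed on a side branch and merged by commit m.
sample_parents = {
    "a": [],
    "b": ["a"],
    "x": ["a"],
    "m": ["b", "x"],  # first parent b is the mainline, x is the merged tip
}

print(merged_branch_tips(sample_parents))  # {'x': 'm'}
```

This identifies where a side branch ended, not what it was called; the branch name is not recorded in the parent links, which is why a deleted branch can appear unnamed.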
-
penyuan:
I’m not sure that I understand the question––I’m wondering if you could share more context around what you’re looking to accomplish given some commit? Are you looking to determine if that commit was introduced in a previously existing branch, or something else? On a cursory search, if you’re ultimately looking to determine which branch a Git commit came from there’s an excellent conversation about this topic in this StackOverflow thread that might be of interest. Considering the unnamed blue branch is a branch that once existed but was later deleted after its commits were merged onto
penyuan:
That’s correct: all commits in a repository hosted on GitHub.com should be accessible via the documented REST API endpoints or GraphQL API fields.
penyuan:
Assuming that it’s a repository hosted on GitHub.com, that’s correct. I make that distinction here because it’s possible that you or other colleagues may have a copy of the repository on your own machines and have created branches and commits on those branches as a part of your work. Those branches and commits, until they’re pushed to GitHub.com, are not reachable. I don’t know if that’s over-explaining things 😅 but think it’s worth mentioning just in case! |
-
francisfuzz:
@penyuan @francisfuzz thank you so much for this discussion 💫 As an aside, I have my own curiosity now: is there an estimate of when Probably should be asked in a separate topic, but is there a way, after a forced push, to “recover” an “
-
maxdevjs:
👋 @maxdevjs, hello there & thanks for asking about this! If a “dangling” (also known as “orphaned”) commit exists, we don’t guarantee any specific timeline for how long it will remain. My colleague @lee-dohm shared some more context about this in another topic:
If you have any follow-up questions about this, I encourage you to open a new topic about this in the |
-
Good evening house, I have gained a lot from this platform and hope to achieve more through it.
-
francisfuzz:
Actually, that does basically answer my question. Very informative!
Really appreciate the thoroughness, and yes I was operating with the assumption of just fetching what’s represented on GitHub and not what’s locally on contributors’ computers. Once again thank you for your patience with me! I think my original question has been comprehensively answered. I don’t know how to mark all your responses as “the solution” so I will mark your first response for now. 😅 I’m sure I’ll have more questions as I continue to learn more about GitHub, so I’ll post them in other threads. Many thanks!! 🤗 🙏