Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in repo.active_branch if branch is in DETACHED HEAD state #633

Open
cgi1 opened this issue May 29, 2017 · 16 comments
Open

Error in repo.active_branch if branch is in DETACHED HEAD state #633

cgi1 opened this issue May 29, 2017 · 16 comments

Comments

@cgi1
Copy link

cgi1 commented May 29, 2017

Error in repo.active_branch if branch is in DETACHED HEAD state:

{TypeError}HEAD is a detached symbolic reference as it points to '0fa1f2a7d9dd9fb63fa775d2d97ad72f56f1d819'

The branch is None in def _get_ref_info(cls, repo, ref_path): because of

# its a commit
        if repo.re_hexsha_only.match(tokens[0]):
            return (tokens[0], None)

def active_branch(self) could return HEAD-{commit-hash} instead like:

    try:
        active_branch = repo.active_branch
    except:
        active_branch = 'DETACHED_' + repo.head.object.hexsha
@cgi1
Copy link
Author

cgi1 commented May 30, 2017

Looks similar to #629

@Byron
Copy link
Member

Byron commented Jun 10, 2017

I am not sure it would be right to return a commit instead. The function is called active_branch, and I would assume people expect it to return a branch, never a commit. If HEAD is detached, there is no branch to return.
What do you think?

@cgi1
Copy link
Author

cgi1 commented Jun 13, 2017

Hm good point, I don´t have a pefect answer.. To return no branch seems to be ok.

@Byron
Copy link
Member

Byron commented Jun 17, 2017

I think returning no branch would break existing code which expects an exception. On the other hand, the exception is not documented, so in a way it's running under the radar anyway.

Maybe it could be adjusted to return None if there is no active branch instead, which would make the calling code a little less cumbersome and safer, as it could start to expect that.

Would you like to contribute a respective PR?

@peterbe
Copy link
Contributor

peterbe commented Aug 16, 2019

I stumbled on this too. With submodules.
My solution wasn't to try to rely on a try: except: ... but I made it always check out at least some branch first before it started switching to the branch I ultimately wanted to. I.e. always start with repo.heads[config["master_branch"]].checkout()

akinnane added a commit to alphagov/gds-pre-commit that referenced this issue Mar 31, 2020
When concourse checks out a repo it leaves it in a detached head
state. This was causing errors as python git has slightly undefined
behaviour in this case.

```
File "detect_checker.py", line 125, in _load_repo
    self.parent_branch = self.repo.active_branch
File "/home/a/venv/local/lib/python3.7/site-packages/git/repo/base.py", line 703, in active_branch
    return self.head.reference
File "/home/a/venv/local/lib/python3.7/site-packages/git/refs/symbolic.py", line 272, in _get_reference
    raise TypeError("%s is a detached symbolic reference as it points to %r" % (self, sha))
TypeError: HEAD is a detached symbolic reference as it points to '6e589de2864ee310a46a343ae9d656405e6b8c84'
```

gitpython-developers/GitPython#633

This PR changes it use the hexsha instead which works correctly with
detached heads and normal branches.

Co-authored-by: detnon <sam.detnon@digital.cabinet-office.gov.uk>
Co-authored-by: Alex Kinnane <17098249+akinnane@users.noreply.github.com>
akinnane added a commit to alphagov/gds-pre-commit that referenced this issue Mar 31, 2020
When concourse checks out a repo it leaves it in a detached head
state. This was causing errors as python git has slightly undefined
behaviour in this case.

```
File "detect_checker.py", line 125, in _load_repo
    self.parent_branch = self.repo.active_branch
File "/home/a/venv/local/lib/python3.7/site-packages/git/repo/base.py", line 703, in active_branch
    return self.head.reference
File "/home/a/venv/local/lib/python3.7/site-packages/git/refs/symbolic.py", line 272, in _get_reference
    raise TypeError("%s is a detached symbolic reference as it points to %r" % (self, sha))
TypeError: HEAD is a detached symbolic reference as it points to '6e589de2864ee310a46a343ae9d656405e6b8c84'
```

gitpython-developers/GitPython#633

This PR changes it use the hexsha instead which works correctly with
detached heads and normal branches.

Co-authored-by: detnon <sam.detnon@digital.cabinet-office.gov.uk>
Co-authored-by: Alex Kinnane <17098249+akinnane@users.noreply.github.com>
@stas00
Copy link

stas00 commented Oct 16, 2020

3 years later and this is still a problem :(

stas00 added a commit to stas00/transformers that referenced this issue Oct 16, 2020
It looks like gitpython is broken when one tries to use it from `git checkout hash`, see: gitpython-developers/GitPython#633

While debugging/bisecting we need to be able to step through different commits and for the code to still work. Currently it fails with:
```
ERR:   File "./examples/seq2seq/distillation.py", line 504, in <module>
ERR:     distill_main(args)
ERR:   File "./examples/seq2seq/distillation.py", line 494, in distill_main
ERR:     model = create_module(args)
ERR:   File "./examples/seq2seq/distillation.py", line 411, in create_module
ERR:     model = module_cls(args)
ERR:   File "/mnt/nvme1/code/huggingface/transformers-multigpu/examples/seq2seq/finetune.py", line 58, in __init__
ERR:     save_git_info(self.hparams.output_dir)
ERR:   File "/mnt/nvme1/code/huggingface/transformers-multigpu/examples/seq2seq/utils.py", line 355, in save_git_info
ERR:     repo_infos = get_git_info()
ERR:   File "/mnt/nvme1/code/huggingface/transformers-multigpu/examples/seq2seq/utils.py", line 374, in get_git_info
ERR:     "repo_branch": str(repo.active_branch),
ERR:   File "/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/git/repo/base.py", line 705, in active_branch
ERR:     return self.head.reference
ERR:   File "/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/git/refs/symbolic.py", line 272, in _get_reference
ERR:     raise TypeError("%s is a detached symbolic reference as it points to %r" % (self, sha))
ERR: TypeError: HEAD is a detached symbolic reference as it points to 'fb94b8f1e16eb21d166174f52e3e49e669ef0ac4'
```
This PR provides a workaround.

@sshleifer
@Byron
Copy link
Member

Byron commented Oct 16, 2020

I think returning no branch would break existing code which expects an exception. On the other hand, the exception is not documented, so in a way it's running under the radar anyway.

Maybe it could be adjusted to return None if there is no active branch instead, which would make the calling code a little less cumbersome and safer, as it could start to expect that.

Would you like to contribute a respective PR?

I believe this comment still holds.

@stas00
Copy link

stas00 commented Oct 16, 2020

Thank you for this follow up, @Byron

My suggestion for the best solution would be this: In order to keep backward compatibility (exception that is) adding a new argument that allows for what OP suggested:

git.Repo(search_parent_directories=True, active_branch_flexible=False)

and then:

if active_branch_flexible:
    try:
        active_branch = repo.active_branch
    except:
        active_branch = 'DETACHED_' + repo.head.object.hexsha

I haven't looked at the code, so this is totally a pseudocode based on what I saw in this issue.

Of course, if that suggestion resonates, you'd know better what to call it, active_branch_flexible is just a quick idea ;)

Thank you.

@Byron
Copy link
Member

Byron commented Oct 17, 2020

I think at this point it's clear that the methods behaviour is surprising, so having alternatives seems valuable.

Thanks @stas00 for the suggestion, which certainly works as to remove the exception that keeps tripping people up.
Making the returned branch a string as proposed above could be just as dangerous though, as the type it will return has never been a string. Also I would hope to keep things more local and avoid adding another, very specialized configuration flag to the repository.

My proposition is to add a new method or property along the lines of active_branch_or_none with documentation explaining under which circumstances None would be expected.

@stas00
Copy link

stas00 commented Oct 17, 2020

Surely at the very least active_branch_or_none will remove the element of surprise.

Thank you for your thoughtful commentary, @Byron.

@Byron
Copy link
Member

Byron commented Oct 17, 2020

Indeed, ideally there is a way to not enforce any branching, but I don't believe that exists since there either is a branch, or there is none. With a method name that makes this clear there should be no surprise, but I invite you to experiment with styles and names and submit a PR with what works best for you.

It really would be great if you could be the one to put an end to this issue :), so imagine me cheering you on from the sidelines :).

@stas00
Copy link

stas00 commented Oct 17, 2020

Thank you for this invitation, but I will pass at the moment. I have never used gitpython and it was in someone else's code totally unrelated to what I was doing, so I had to remove that obstacle to proceed. That's how I ended up here.

Should I start using the library and get to learn how it works, I'd be happy to submit a PR.

@Eoin-McMahon
Copy link

3.5 years later and this is still a problem :(

@Byron
Copy link
Member

Byron commented Feb 26, 2021

This comment might help.

@adnanmuttaleb
Copy link

I read the last comment, but it was 4 years ago, any new update on this?

@Byron
Copy link
Member

Byron commented Jan 31, 2025

Soon this issue is 8 years old!

With typing it would make sense to return None if there is no active branch.
That should be an easy fix and PRs are welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

6 participants