Error while running gitstats with linux kernel repository and old git versions #20

Closed
athanrous opened this issue May 23, 2013 · 6 comments


@athanrous

Original title: Error while running gitstats with opensuse kernel

I've cloned the following repository:

git://gitorious.org/opensuse/kernel.git

Running ./gitstats /path/to/git/repo /output/folder fails with this error:

"[2.11899] >> git shortlog -s "v3.0.78" "^v3.9.1"
[1.76242] >> git shortlog -s "v3.4.45" "^v3.0.78"
[1.95104] >> git shortlog -s "v3.8.13" "^v3.4.45"
[0.57330] >> git shortlog -s "v3.9.2" "^v3.8.13"
[0.69536] >> git shortlog -s "v3.10-rc1" "^v3.9.2"
[1.95851] >> git shortlog -s "v3.2.45" "^v3.10-rc1"
[6.77937] >> git shortlog -s "rpm-3.0.74-0.6.8" "^v3.2.45"
[0.03402] >> git shortlog -s "v3.0.79" "^rpm-3.0.74-0.6.8"
[1.77202] >> git shortlog -s "v3.4.46" "^v3.0.79"
[2.23483] >> git shortlog -s "v3.9.3" "^v3.4.46"
[0.71021] >> git shortlog -s "v3.10-rc2" "^v3.9.3"
[12.44785] >> git rev-list --pretty=format:"%at %ai %aN <%aE>" HEAD | grep -v ^commit
Traceback (most recent call last):
  File "./gitstats", line 1430, in <module>
    g.run(sys.argv[1:])
  File "./gitstats", line 1407, in run
    data.collect(gitpath)
  File "./gitstats", line 312, in collect
    author, mail = parts[4].split('<', 1)
IndexError: list index out of range
"
FYI
a) I repeated the process more than 3 times and the error still appears.
b) With other repositories I had no issue or any error while running ./gitstats.
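
The traceback points at `parts[4]`, which suggests that at least one line of the git rev-list output had fewer than five fields. A minimal sketch of a more defensive parse follows. It is not the gitstats code: `parse_author_line` is a hypothetical helper, and splitting on single spaces with `split(' ', 4)` is an assumption about how `parts` is built.

```python
def parse_author_line(line):
    """Parse one '%at %ai %aN <%aE>' record; return None if it is malformed.

    Hypothetical helper, assuming the fields are separated by single spaces.
    """
    parts = line.split(' ', 4)
    # A well-formed record has: epoch, date, time, timezone, "Name <mail>".
    if len(parts) < 5 or '<' not in parts[4]:
        return None
    try:
        stamp = int(parts[0])
    except ValueError:
        return None
    author, mail = parts[4].split('<', 1)
    return stamp, parts[3], author.rstrip(), mail.rstrip('>')
```

Returning `None` lets the caller skip or log the offending line instead of aborting the whole run.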

@hoxu
Owner

hoxu commented Jun 1, 2013

The problem is some unexpected output from the git rev-list command, but I can't figure out what it is just by cloning the repository.

Can you run this command in your clone and tell me what it outputs:

$ git rev-list --pretty=format:"%at %ai %aN <%aE>" HEAD | grep -v ^commit |egrep -v '^[0-9]{10} [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [+-][0-9]{4} '

For me it outputs only this line, and it shouldn't cause the problem:

1 1970-01-01 01:00:01 +0100 Ursula Braun <braunu@de.ibm.com>

Also, please show me the output of git show-ref HEAD in your opensuse kernel clone.
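
For anyone who prefers to run the same check from Python, the pipeline above can be sketched roughly as follows. The regular expression is copied from the egrep command; `unexpected_lines` is a hypothetical name, and the function assumes the rev-list output has already been split into text lines.

```python
import re

# Same pattern as the egrep filter: 10-digit epoch, date, time, timezone.
EXPECTED = re.compile(
    r'^[0-9]{10} [0-9]{4}-[0-9]{2}-[0-9]{2} '
    r'[0-9]{2}:[0-9]{2}:[0-9]{2} [+-][0-9]{4} '
)

def unexpected_lines(lines):
    """Return lines that are neither 'commit' headers nor well-formed records."""
    return [
        line for line in lines
        if not line.startswith('commit') and not EXPECTED.match(line)
    ]
```

Like the shell version, this also flags the 1970 record, because its timestamp has fewer than ten digits.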

@ffagerho

The same occurs with the mainline kernel from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

The reason seems to be that some of the output lines from git rev-list are not in the expected format:

$ git rev-list --pretty=format:"%at %ai %aN <%aE>" HEAD | grep -v ^commit |egrep -v '^[0-9]{10} [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [+-][0-9]{4} '
1 1970-01-01 01:00:01 +0100 Ursula Braun <braunu@de.ibm.com>
  %aN <%aE>
  [...]

That last line ("%aN <%aE>") is repeated 98 times. It seems that for some commits, e.g. febf7ea4bedcd36fba0843db726bba28d22bf89a, git rev-list doesn't display any author information when using --pretty=format, although with --pretty=raw at least some of the information is there:

commit febf7ea4bedcd36fba0843db726bba28d22bf89a
tree 0a0d398e0637fba8292d8b139afbd41b102bb9c4
parent 00213b17cec87d2cd4df75bcc79aea7a91d8532d
author  <sam@mars.ravnborg.org> 1136284526 +0100
committer  <sam@mars.ravnborg.org> 1136284526 +0100

    gitignore: ignore more generated files

    Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
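
The raw header above has an empty author name, but the e-mail address and timestamp are intact. As an illustration only, a parser for that `author` line could look like the sketch below. `parse_raw_author` and the regular expression are assumptions made for this example and are not taken from gitstats.

```python
import re

# "author <name> <mail> <epoch> <timezone>"; the name part may be empty.
AUTHOR_RE = re.compile(r'^author (.*) <([^>]*)> ([0-9]+) ([+-][0-9]{4})$')

def parse_raw_author(line):
    """Parse an 'author' header from --pretty=raw output; None if no match."""
    match = AUTHOR_RE.match(line)
    if match is None:
        return None
    name, mail, stamp, timezone = match.groups()
    return name.strip(), mail, int(stamp), timezone
```

For the commit shown here this yields an empty name together with a usable address and timestamp.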

@hoxu
Owner

hoxu commented Aug 26, 2013

git 1.8.3.rc2 here, and this is the result:

$ git rev-list --pretty=format:"%at %ai %aN <%aE>" febf7ea4bedcd36fba0843db726bba28d22bf89a |head -n2
commit febf7ea4bedcd36fba0843db726bba28d22bf89a
1136284526 2006-01-03 11:35:26 +0100 Sam Ravnborg <sam@mars.ravnborg.org>
$ git log -1 febf7ea4bedcd36fba0843db726bba28d22bf89a
commit febf7ea4bedcd36fba0843db726bba28d22bf89a
Author:  <sam@mars.ravnborg.org>
Date:   Tue Jan 3 11:35:26 2006 +0100

    gitignore: ignore more generated files

    Signed-off-by: Sam Ravnborg <sam@ravnborg.org>

With git version 1.7.10.4:

$ git rev-list --pretty=format:"%at %ai %aN <%aE>" febf7ea4bedcd36fba0843db726bba28d22bf89a |head -n2
commit febf7ea4bedcd36fba0843db726bba28d22bf89a
  %aN <%aE>

So this might be either a git bug fixed in some revision, or just a difference in behavior. What git version are you running?

@ffagerho

I was running git version 1.7.10.4 (which is the current version in Debian stable/wheezy).

The latest dev version of git works correctly, as apparently does everything from at least 1.8.3.rc2 onward. So, as you say, it's an issue with an old version of git.

I think this issue could be closed since updating to a more recent git version fixes the problem.
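
If a script wanted to warn about an old git before running, a version check could be sketched as below. The thread only establishes that 1.7.10.4 misbehaves and 1.8.3.rc2 works; the release that changed the behavior is not known, so the `KNOWN_GOOD` threshold is an assumption, and both function names are hypothetical.

```python
# Assumption: the oldest version reported to work in this thread.
KNOWN_GOOD = (1, 8, 3)

def parse_git_version(version_output):
    """Turn 'git version 1.8.3.rc2' into (1, 8, 3).

    Parsing stops at the first non-numeric part, so 'rc2' is dropped.
    """
    text = version_output.strip()
    prefix = 'git version '
    if text.startswith(prefix):
        text = text[len(prefix):]
    numbers = []
    for part in text.split('.'):
        if not part.isdigit():
            break
        numbers.append(int(part))
    return tuple(numbers)

def is_known_good(version_output):
    """True if the version is at least the one reported to work here."""
    return parse_git_version(version_output) >= KNOWN_GOOD
```

The input would be the text printed by `git --version`. Versions between the two reported ones may or may not be affected, so a False result means "not confirmed" rather than "broken".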

@hoxu
Owner

hoxu commented Aug 26, 2013

Indeed. I was just thinking about whether this could warrant a workaround in gitstats, but I guess downstream distro releases that can't upgrade git can't upgrade gitstats either, so there's no point.

Thanks for the details.

@hoxu hoxu closed this as completed Aug 26, 2013
@amadio

amadio commented Mar 24, 2016

We have hit this problem in Gentoo as well. The error can be reproduced by running gitstats on https://github.com/gentoo-science/sci. The corresponding Gentoo bug is here: https://bugs.gentoo.org/show_bug.cgi?id=575946

The problem is that the git rev-list output contains invalid UTF-8 characters in authors' names.

I will create a pull request with a possible solution to this problem for @hoxu's review.
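
One common way to tolerate such bytes in Python 3 is to decode with a replacement error handler. The sketch below only illustrates that idea; it is not necessarily what the pull request does, and `decode_git_output` is a hypothetical helper name.

```python
def decode_git_output(raw):
    """Decode bytes from a git command, tolerating invalid UTF-8.

    Invalid byte sequences become U+FFFD instead of raising
    UnicodeDecodeError.
    """
    return raw.decode('utf-8', errors='replace')
```

The trade-off is that the affected names are altered in the statistics, which may still be preferable to aborting the whole run.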


4 participants