Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

diff-so-fancy slurps entire git log -p output? very slow #140

Closed
bronson opened this issue Mar 30, 2016 · 25 comments
Closed

diff-so-fancy slurps entire git log -p output? very slow #140

bronson opened this issue Mar 30, 2016 · 25 comments

Comments

@bronson
Copy link
Contributor

bronson commented Mar 30, 2016

Using Feb 7th's diff-so-fancy, git log -p and git log --stat would display immediately.

Using today's diff-so-fancy, git log --stat takes many seconds to complete and git log -p hangs for a few minutes and then bombs out.

A guess: it looks like some change in the past 1.5 months made it so diff-so-fancy is now trying to slurp the entire output before displaying the results.

My repo has 4200 commits, so tiny. This issue appears reproducible in any moderately-sized project. For example, git log --stat on rails/rails shows a blank screen for 10 seconds before showing the log. Revert to Feb 7th and it displays immediately.

@stevemao
Copy link
Member

I don't think you should use diff-so-fancy for log pager.

As per readme:

git config --global pager.diff "diff-so-fancy | less --tabs=4 -RFX"
git config --global pager.show "diff-so-fancy | less --tabs=4 -RFX"

@scottchiefbaker
Copy link
Contributor

I agree with @stevemao. d-s-f is designed to process small diffs (commits) in to human readable form, not to parse hundreds of thousands of lines.

My guess for the slowness is that we remove all the leading + and - from the diff before we process any actual content. It'd be pretty hard to work around that for this one use case.

@bronson
Copy link
Contributor Author

bronson commented Mar 30, 2016

Ah, the recommended config changed on Feb 12: 5f7422a

Updated my config to match and all is fine. Thanks.

@bronson bronson closed this as completed Mar 30, 2016
@bronson
Copy link
Contributor Author

bronson commented Mar 30, 2016

That said.... it's interesting that Feb 7th's d-s-f could handle being the core.pager and today's can't.

bronson added a commit to bronson/dotfiles that referenced this issue Mar 30, 2016
This reverts commit a6b5801.

Too goddamn slow: so-fancy/diff-so-fancy#140
@blueyed
Copy link
Contributor

blueyed commented Mar 31, 2016

I agree that it would be better if it still could be used with log.

It would be helpful to find the commit that caused this (using git-bisect) and then improve upon that.
Even if the removal/processing of all +/- would cause this, this could still be done in a pipeline then probably?!

@bronson
Please consider re-opening this issue to keep it on the radar.. I came here because of the same issue.

blueyed added a commit to blueyed/dotfiles that referenced this issue Mar 31, 2016
@scottchiefbaker
Copy link
Contributor

I think @bronson found it at a6b5801. That's the commit where we switched most of the backend from sed to Perl so we could do fancier things with the output display. We can investigate the slowness, but I can't make any guarantees as this use case is a little outside the scope of the project.

@blueyed
Copy link
Contributor

blueyed commented Mar 31, 2016

It should still be possible to not read the whole input, but process it in chunks (e.g. 8kb).

@scottchiefbaker
Copy link
Contributor

If I use a 60MB git log -p output I see the following:

  • 1 second to load the whole input in to memory
  • 7 seconds to clean up the input before we process anything

The rest of the time is spent looping through each line one by one... I imagine the problem is the second one (depending on how much input we're talking). As a work around you can completely disable the line pre-processing by commenting out line 22

https://github.com/so-fancy/diff-so-fancy/blob/master/lib/diff-so-fancy.pl#L22

That will get you output at the 1 second mark.

@blueyed
Copy link
Contributor

blueyed commented Mar 31, 2016

@scottchiefbaker
Thanks for investigating.

Looks like the clean_up_input should be moved inside of the loop?!

@bronson
Copy link
Contributor Author

bronson commented Mar 31, 2016

@blueyed agreed, the problem appears to be that this script slurps the entire file before processing. Moving the <> into the loop seems like it would bring the old behavior back.

while(my $line = <>) {
    ...
}

It seems like this script goes out of its way to slurp though... not sure if there's a reason for that.

@scottchiefbaker
Copy link
Contributor

This actually ended up not being too insane to solve, and it will give us other performance improvements. Can you test the branch I just posted here:

https://github.com/scottchiefbaker/diff-so-fancy/tree/pipeline

If I use my 60MB input now, I get output on the screen immediately. I moved back to processing the input line by line instead of buffering the whole input. This is a little more complex code wise, but it should perform a lot better.

@scottchiefbaker
Copy link
Contributor

Also, most importantly, we still pass all tests.

@bronson
Copy link
Contributor Author

bronson commented Mar 31, 2016

@scottchiefbaker looks great!

It appears nobody uses $i anymore... seems like you can drop those lines too?

And, if anyone needs a line number in the future, maybe they could use $..

@scottchiefbaker
Copy link
Contributor

True... $i is used to check if it's the first line only. I was going to use it other places, but ended up not needing it. It's fine for now.

I'll reopen this since we have a fix and wait for feedback on whether it does what you want or not.

@bronson
Copy link
Contributor Author

bronson commented Mar 31, 2016

@scottchiefbaker just tried it out, everything appears to work perfect. Even git log -p in rails/rails.

@scottchiefbaker
Copy link
Contributor

How big is rails/rails?

@bronson
Copy link
Contributor Author

bronson commented Mar 31, 2016

Dunno, doesn't matter! :) Now d-s-f blocks before getting very far. Excellent.

Over 57,000 commits, gotta be hundreds of MB if one lets it run to completion.

@scottchiefbaker
Copy link
Contributor

@stevemao @paulirish what do you guys think? Should we merge this even though it's outside the scope of regular d-s-f?

@stevemao
Copy link
Member

stevemao commented Apr 1, 2016

It's not out of the scope. being smart on processing the lines will give performance boost on git diff too.

@blueyed
Copy link
Contributor

blueyed commented Apr 1, 2016

Thanks, @scottchiefbaker!
Please create a PR for it.

@scottchiefbaker
Copy link
Contributor

This is merged so I'm closing this issue now.

@bronson
Copy link
Contributor Author

bronson commented Apr 4, 2016

Love it, great job!

@Blaisorblade
Copy link

Blaisorblade commented Aug 2, 2017

Any chance this problem slipped in again? git log -p takes lots (on commercialhaskell/stack) on git log -p when enabled as default pager. Lots seems to be 10-20s (timed it to ~15s) rather than instantaneous, so it took me a few minutes of debugging (and looking at this issue) to confirm I just had to wait for output to appear.

EDIT: in case somebody wonders again if this is supported, the help text disagrees:

$ diff-so-fancy
Diff-so-fancy: https://github.com/so-fancy/diff-so-fancy
Version      : 1.1.1

Usage:

git diff --color | diff-so-fancy # Use d-s-f on one diff
git diff --colors                # View the commands to set the recommended colors
git diff --set-defaults          # Configure git-diff to use diff-so-fancy and suggested colors

# Configure git to use d-s-f for *all* diff operations
git config --global core.pager "diff-so-fancy | less --tabs=4 -RFX"

BTW, git diff --colors and git diff --set-defaults are wrong (it's diff-so-fancy that takes those options).

@scottchiefbaker
Copy link
Contributor

@Blaisorblade this should be fixed on the next branch. It will be released officially as 1.2.0 when we're ready. See #251 for more info.

@bryantee
Copy link

Thanks for clearing up @scottchiefbaker

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants