Git-dumper doesn't work in some cases when the git output have HTML content-type #25

DEMON1A · 2021-05-15T03:45:19Z

I found a publicly exposed .git folder on a website, but when I used git-dumper to dump the code from it, I got these errors:

[-] Testing https://example.com/.git/HEAD [200]
[-] https://example.com//.git/HEAD responded with HTML

I checked the website manually and can clearly see that the .git folder content is leaked, but git-dumper refuses to dump it because the response comes back with an HTML content-type. This prevents git-dumper from working in cases like this one.

arthaud · 2021-05-15T15:19:02Z

I think originally I was only checking whether the content contains "<html>", but people had issues with that; see #13.
@DashLt do you know what was the issue with the original check?
In the meantime you can replace line 33 of git_dumper.py with a return False.

DEMON1A · 2021-05-15T15:38:42Z

Yeah, I had already edited that line, but the issue was still there. Then I noticed there's a second layer of validation on line 73 that does the same thing as line 33. I edited that too, and now it's working for me.

DashLt · 2021-05-15T17:07:10Z

Not every site has a <html> tag verbatim. Many have attributes inside the tag, e.g.:

<html class="rwd geo-override no-js vis no-rtl headerfooter-menu3 " lang="en">

It's weird that whatever webserver in the site you're attacking isn't using the application/octet-stream content-type, but it exists so it's definitely an edge case that has to be handled. As a quick and dirty thing you could check for the existence of <html, but even then that tag isn't necessarily required. I think maybe some sort of HEAD file validation is in order?
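A minimal sketch of what such a HEAD file validation could look like, independent of the content-type header. The function name `looks_like_git_head` and the 1 KB size cap are hypothetical and not part of git-dumper; the check assumes HEAD is either a symbolic ref (`ref: refs/...`) or a detached 40-character hex hash, and the ref-name character rules are only approximated.

```python
import re

# Assumption: HEAD holds either "ref: refs/<name>" or a detached SHA-1 hash.
# The character class is a rough approximation of git's ref-name rules.
_SYMBOLIC_REF = re.compile(r"ref: refs/[^\s\x00-\x1f~^:?*\[\\]+")
_DETACHED_SHA = re.compile(r"[0-9a-f]{40}")


def looks_like_git_head(body: bytes, max_size: int = 1024) -> bool:
    """Return True if the response body plausibly is a .git/HEAD file."""
    # A real HEAD file is tiny; a large body is almost certainly a web page.
    if len(body) > max_size:
        return False
    try:
        text = body.decode("utf-8").strip()
    except UnicodeDecodeError:
        return False
    return bool(_SYMBOLIC_REF.fullmatch(text) or _DETACHED_SHA.fullmatch(text))
```

An HTML error page fails both patterns regardless of how its `<html>` tag is written, so no tag matching is needed.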

arthaud · 2021-05-16T04:04:24Z

That's also my conclusion. We would need a reference syntax checker. Or we could just skip the verification on that file and fail later when we parse object files (which need to be compressed with zlib, so that rules out HTML).
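The "fail later on object files" idea can be sketched as follows. This is an illustration under assumptions, not git-dumper's actual code: the name `looks_like_loose_object` is hypothetical, and the check relies on loose objects being zlib-compressed and starting with a `<type> <size>\0` header once decompressed. Only the first bytes are decompressed, so a very large response does not have to be inflated in full.

```python
import re
import zlib

# Decompressed loose objects start with "<type> <size>\0".
_OBJECT_HEADER = re.compile(rb"(blob|tree|commit|tag) \d+\x00")


def looks_like_loose_object(body: bytes) -> bool:
    """Return True if the body decompresses to something shaped like a git object."""
    try:
        # max_length bounds the output, so only the header is inflated.
        prefix = zlib.decompressobj().decompress(body, 64)
    except zlib.error:
        # HTML (or any other non-zlib data) ends up here.
        return False
    return _OBJECT_HEADER.match(prefix) is not None
```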

DEMON1A · 2021-05-16T06:05:30Z

> Not every site has a <html> tag verbatim. Many have attributes inside the tag, e.g.:

You can solve this with regex, Pattern: \<html(|.*)\>

DEMON1A · 2021-05-16T08:01:16Z

If you're going to accept the regex solution, I can submit the fix as a PR if you'd like.

DashLt · 2021-05-16T13:28:25Z

> You can solve this with regex, Pattern: \<html(|.*)\>

https://stackoverflow.com/a/1732454

(In all seriousness, running a regex that matches that much could cause serious slowdowns on pages that can easily reach the hundreds of KB or even MB. You would also be able to send git-dumper back a very large page and make it hang as well. It's in general just a very hacky solution.)
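For comparison, a bounded substring check avoids regex matching altogether and caps the work per response. This is only a heuristic sketch: the name `starts_like_html` and the 4096-byte window are assumptions, and as noted earlier in the thread, an HTML page is not required to contain an `<html` tag at all.

```python
def starts_like_html(body: bytes, limit: int = 4096) -> bool:
    """Heuristic: look for an HTML marker in the first `limit` bytes only."""
    # Slicing first means a multi-megabyte page costs no more than a small one.
    head = body[:limit].lower()
    return b"<html" in head or b"<!doctype html" in head
```

Because `<html` is matched as a prefix, tags with attributes such as `<html class="..." lang="en">` are caught as well.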

DEMON1A · 2021-05-16T15:44:45Z

You seem to be right, but I guess in this case we don't really need the HTML content-type validation if we already know that the response contains content from the .git folder. For example, checking for a known string in /.git/config would be more than enough to keep fetching the other files without caring about the content-type.
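A rough sketch of that idea, assuming the "known string" is a `[core]` section header at the start of a line, which a typical repository config contains. The function name `looks_like_git_config` and the 64 KB cap are hypothetical, and a config without a `[core]` section would be rejected by this check.

```python
import re

# Assumption: a normal .git/config has a "[core]" section header on its own line.
# Section names are case-insensitive in git config files.
_CORE_SECTION = re.compile(rb"^\s*\[core\]", re.MULTILINE | re.IGNORECASE)


def looks_like_git_config(body: bytes, limit: int = 65536) -> bool:
    """Return True if the body contains a [core] section header."""
    # Only scan a bounded prefix so a huge response cannot stall the check.
    return _CORE_SECTION.search(body[:limit]) is not None
```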
