Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java: Add line break sanitizers to java/log-injection #10707

Merged
merged 2 commits into from
Apr 19, 2023

Conversation

atorralba
Copy link
Contributor

@atorralba atorralba commented Oct 6, 2022

Addresses #10702.

Note that this has two shortcomings that could become sources of FNs:

  1. It considers a valid sanitizer a replacement of just \n or \r instead of both.
  2. It considers a valid guard a regex match of anything containing \n or \r and not matching [^%]. This means something like matches("\n") is considered a valid guard, when it isn't.

@Marcono1234
Copy link
Contributor

For the regex checks, should this also consider \\n and \\r in the String values since those are also valid regex constructs? Though I am not sure how often that would be used compared to the String literal escape sequences \n and \r, so this might make the query too complicated.
Also what about the regex construct \R?

@atorralba
Copy link
Contributor Author

At first glance I thought that \\n and \\r would get covered by %\n% and %\r% but in retrospect it makes sense that they didn't. And I definitely overlooked \\R.

Thanks @Marcono1234, good catches! Fixed in 93d898e.

@vlkl-sap
Copy link

vlkl-sap commented Nov 9, 2022

Hello,
in general, log injection is not only about new lines but also about other problematic content (ANSI sequences, Javascript, PHP, log4j interpolators, etc.), which can open doors to further compromise of the analyzed system or of another one. I wonder if it is a conscious decision to disregard these attack vectors.

@atorralba
Copy link
Contributor Author

Hey @vlkl-sap, thanks for your comment.

While we agree that arbitrarily written logs can cause all sort of injection problems in other applications that receive and process them, this query is scoped specifically for log forging in the same application (the threat model of which is consistent with considering new line removal as valid sanitization). We disregard the other attack vectors in this query because we think they should be handled by their own, specific queries with specific sinks (XSS, JNDI injection, SQLi, OS command execution, etc). Of course that would need modeling of log reads as local sources for those queries, but that’s another topic that could be handled separately.

This approach not only would provide a better alert location for developers that need to fix the issue (it’s not realistic to expect validation/sanitization for all sorts of vulnerabilities at the log write, but rather where it’s read and used dangerously), but also allows us to better scope this query with specific sinks and sanitizers.

If the threat model for a particular codebase or team requires treating any logged values from user input as a potential security vulnerability, then a custom query could be created using the sinks defined for this query without any sanitizers.

@vlkl-sap
Copy link

Hi @atorralba! Thanks for your reply! I am wondering what exactly "log forging in the same application" means. Since applications typically do not read their own log files, log forging is almost always a problem for some entity outside of the application.

An interesting exception is log4shell, where injecting non-newline characters into a log causes a problem in the application under analysis. This furthers the point that focusing on newline injection is rather limiting.

I would also say that applications can and should try to protect other entities against a wide class of vulnerabilities by being careful of how they log things.

@jpcmonster
Copy link

@atorralba any sense of when this might get merged? :)
Facing the same problem (we think) and wondering how to make CodeQL happy :)
Is there a change we can make in order to sanitize and clear this alert that will pass CodeQL (while still being correct? :) )

egregius313
egregius313 previously approved these changes Apr 14, 2023
Copy link
Contributor

@egregius313 egregius313 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Handle more regex cases that cover line breaks
@atorralba atorralba force-pushed the atorralba/log-injection-sanitizers branch from 93d898e to f5702f5 Compare April 17, 2023 07:33
@atorralba
Copy link
Contributor Author

DCA looks uneventful. Sorry @egregius313 your review got dismissed after a rebase.

@egregius313 egregius313 self-requested a review April 18, 2023 18:11
Copy link
Contributor

@egregius313 egregius313 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@atorralba atorralba merged commit 62f5a5d into github:main Apr 19, 2023
@atorralba atorralba deleted the atorralba/log-injection-sanitizers branch April 19, 2023 06:22
@atorralba
Copy link
Contributor Author

@jpcmonster:

@atorralba any sense of when this might get merged? :)

It just was :)

Facing the same problem (we think) and wondering how to make CodeQL happy :) Is there a change we can make in order to sanitize and clear this alert that will pass CodeQL (while still being correct? :) )

The sanitizers that were just merged will be published in the next release. Until then, I'm afraid the only way to make the alerts disappear is to not append user input to logs.

@vlkl-sap:

We appreciate the feedback you gave us about this query. Based on that, we have plans to clarify the differences between log injection and log forging, where sanitizers (as well as precision and severity) may apply differently. This distinction will come in future PRs.

@jpcmonster
Copy link

Thanks again @atorralba ; we are now seeing this working in our GHE 3.8.2 instance.
Quick question: I suspect passing the check requires the use of String.replaceAll(), right?
I may have tried to get too fancy with Pattern and Matcher, and while the result is the same, it still upsets CodeQL :) - here's what I mean:

final Matcher matcher = PATTERN.matcher( input );
if( matcher.find() )
{
	final String sanitized = matcher.replaceAll( REPLACEMENT );
	LOG.warn( "possible log tampering input; replacing all '{}' in string: '{}'", PATTERN, sanitized  );
	return sanitized;
}

@atorralba
Copy link
Contributor Author

Hey @jpcmonster, unfortunately it's hard to model all variations of how this vulnerability could get mitigated. You can see examples of what we consider vulnerable and what we consider safe in the query test.

String.replaceAll, String.replace, String.matches, and Pattern.matches are all recognized solutions.

Still, your example is common enough that we could consider supporting it. I'll take note of this potential improvement, thanks :)

@otap63
Copy link

otap63 commented Feb 8, 2024

Hello,
I have a case where I sanitize user inputs using log4j2 Pattern rule to escape '\n' and '\r' using the encode pattern "%encode{%msg}{CRLF}" in order to mitigate a LogInjection high vulnebarility (CWE-117) issue reported by CodeQL in my Java code. The problem is that CodeQL is not happy as it still reports the same set of LogInjection issues after the sanitization.

If I sanitize the user input using the following method, CodeQL is happy.

private String escapeCRLF(String str) {
    return (str == null ? null : str.replace("\n", "\\n" ).replace("\r", "\\r" ));
}

So, apparently, CodeQL has a rule to recognize the 2nd mitigation method but somehow it misses the sanitization provided via log4j2 encoding CRLF rule, which has the exact same functionality as the above escapeCRLF method. I like the 1st solution which is uniform throughout the code base, requiring no code change. So, I am wondering if you would know how to make CodeQL happy if I deploy the Log4j2 solution.

Thanks in advance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants