Add compacting body filters #357
}
}

private Document documentWithoutTextNodes(final String body) {
This seems rather inefficient; a custom SAX or StAX parser should have much less overhead.
There are also some security implications when parsing the request payload. See https://en.wikipedia.org/wiki/XML_external_entity_attack and https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Prevention_Cheat_Sheet#JAXP_DocumentBuilderFactory.2C_SAXParserFactory_and_DOM4J for how to prevent this in Java.
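For illustration, a hardened factory along the lines of the linked cheat sheet might look roughly like the sketch below. The class and method names are hypothetical and not taken from this PR; the feature URIs are the standard JAXP/Xerces ones, and the exact set a project needs may differ.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper, not part of the PR: builds a DocumentBuilderFactory
// that refuses DOCTYPE declarations and external entities.
final class XxeSafeFactorySketch {

    static DocumentBuilderFactory newSecureFactory() throws ParserConfigurationException {
        final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Enable the JAXP secure-processing limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Reject any document that carries a DOCTYPE; this blocks most XXE variants.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth in case DOCTYPEs ever have to be allowed again.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }
}
```

With this configuration, parsing a payload that contains a DOCTYPE fails with a SAXException instead of resolving entities, so the caller has to decide how to handle such bodies (for example, by logging them uncompacted).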
I agree that SAX/StAX parsing is generally more efficient than DOM.
Unfortunately, the potential benefit of introducing a custom parser with supporting rules would be negligible, because we are parsing a string that is already allocated in memory.
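For comparison, the streaming alternative discussed here could be sketched as follows. This is only an assumption about what a StAX-based compactor might look like, not code from the PR: the class name is hypothetical, it drops the XML declaration as well as whitespace-only text, and it has not been benchmarked against the DOM approach.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch, not part of the PR: copies XML events from reader to
// writer and skips whitespace-only character events.
final class StaxCompactorSketch {

    static String compact(final String body) throws XMLStreamException {
        final XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        // Same XXE concern as with DOM: disable DTDs and external entities.
        inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        inputFactory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        final XMLEventReader reader = inputFactory.createXMLEventReader(new StringReader(body));
        final StringWriter out = new StringWriter();
        final XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);

        while (reader.hasNext()) {
            final XMLEvent event = reader.nextEvent();
            if (event.isStartDocument()) {
                continue; // drop the XML declaration
            }
            if (event.isCharacters() && event.asCharacters().isWhiteSpace()) {
                continue; // drop indentation and line breaks between elements
            }
            writer.add(event);
        }
        writer.flush();
        writer.close();
        return out.toString();
    }
}
```

Note that the input is still a fully materialized String in this sketch, which is exactly the point made above: streaming avoids building a DOM tree, but not the allocation of the body itself.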
I have applied the recommended changes regarding XXE.
541b54b to 9b10a7d
But from what I see there is nothing shared, is there?
Well, it is shared not as part of every single formatter, but as part of the infrastructural body filters.
I failed to realize that you moved the compacting from the formatter to the filter. We could think about removing it from the
}

private boolean shouldCompact(@Nullable final String contentType, final String body) {
    return contentTypes.test(contentType) && heuristic.isProbablyJson(body);
&& !jsonCompactor.isCompacted(body)?
}

/**
 * @return configured {@link DocumentBuilderFactory} against XML External Entity (XXE).
Can you add the links provided by Jörn here as @see tags?

@Override
public String filter(@Nullable final String contentType, final String body) {
    return contentTypes.test(contentType) ? compact(body) : body;
Can we cheaply detect whether the body may not even need compacting, e.g. when it doesn't contain a newline?
Due to backward compatibility (
04c271b to 2beb3ed
}

private boolean shouldCompact(final String body) {
    return Stream.of("<?xml", "\n", " ").anyMatch(body::contains);
The check for the XML declaration is redundant and should be covered by the content type check, shouldn't it? Spaces are generally fine, since the whole log message would still fit on a single line.
I'd just change this in the same way as the JSON compactor:
return body.indexOf('\n') != -1;
It's quite possible to have an XML prolog inside the message but no \n.
So not checking for <?xml potentially means keeping it in the compacted version (because compaction is skipped).
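The difference between the two checks can be made concrete with a small sketch. The class and method names below are hypothetical; the two bodies are copied from the diff and from the review suggestion above.

```java
import java.util.stream.Stream;

// Hypothetical side-by-side of the two guards discussed in this thread.
final class ShouldCompactSketch {

    // Variant from the diff: compact if a prolog, a newline, or a space is present.
    static boolean markerCheck(final String body) {
        return Stream.of("<?xml", "\n", " ").anyMatch(body::contains);
    }

    // Variant suggested in review: compact only if the body spans several lines.
    static boolean newlineCheck(final String body) {
        return body.indexOf('\n') != -1;
    }

    public static void main(final String[] args) {
        final String singleLineWithProlog = "<?xml version=\"1.0\"?><a><b>1</b></a>";
        // The marker check triggers compaction, so the prolog can be stripped.
        System.out.println(markerCheck(singleLineWithProlog));
        // The newline check skips compaction, so the prolog stays in the log line.
        System.out.println(newlineCheck(singleLineWithProlog));
    }
}
```

For a single-line body with a prolog, the first check returns true and the second returns false, which is the case being debated here: the cheaper check keeps the log on one line but leaves the prolog and any extra spaces in place.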
But if there are no newlines, what exactly would you expect to happen when compacting? Removing other whitespace?
Well, I get your point. We have different views on minimisation. I consider extra whitespace and the XML prolog too verbose in logs. But I can live with it, because a single line is much more important.
So are there any open tasks blocking this PR?
I released your changes as 1.10.0. 🎉
Description
Add body filters to compact most common content types
Motivation and Context
Custom HttpLogFormatters (other than JsonHttpLogFormatter) also need to minify the message body. To reduce code duplication from formatter to formatter, this logic could be extracted into delegating body filters.
Types of changes
Checklist: