CryptoStreamFactory creates sympathetic chunking OutputStreams #587
For a detailed analysis of the problem, see #586. CipherOutputStream appears to perform _very_ poorly on large buffers, but substantially better when the data is segmented into small chunks, which can be done by looping over the original buffer. With this in place, the openssl wrapper no longer provides any benefit over JCE.
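As an illustration of the chunking idea described above, here is a minimal sketch rather than the code from this PR. The class names `ChunkingSketchOutputStream` and `RecordingOutputStream` and the configurable `chunkSize` are hypothetical; the chunk size actually used by the change is not shown in this excerpt.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: forwards large writes to the delegate in chunks of at most chunkSize bytes. */
final class ChunkingSketchOutputStream extends FilterOutputStream {
    // Assumed to be configurable here; the real chunk size is not given in the excerpt.
    private final int chunkSize;

    ChunkingSketchOutputStream(OutputStream delegate, int chunkSize) {
        super(delegate);
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    @Override
    public void write(byte[] buffer, int offset, int length) throws IOException {
        if (offset < 0 || length < 0 || length > buffer.length - offset) {
            throw new IndexOutOfBoundsException();
        }
        int position = offset;
        int remaining = length;
        // Loop over the original buffer without copying it; each delegate
        // write sees at most chunkSize bytes.
        while (remaining > 0) {
            int bytesToWrite = Math.min(remaining, chunkSize);
            out.write(buffer, position, bytesToWrite);
            position += bytesToWrite;
            remaining -= bytesToWrite;
        }
    }
}

/** Hypothetical test helper: records the size of every write it receives. */
final class RecordingOutputStream extends OutputStream {
    final List<Integer> writeSizes = new ArrayList<>();
    final ByteArrayOutputStream data = new ByteArrayOutputStream();

    @Override
    public void write(int b) {
        writeSizes.add(1);
        data.write(b);
    }

    @Override
    public void write(byte[] buffer, int offset, int length) {
        writeSizes.add(length);
        data.write(buffer, offset, length);
    }
}
```

In this sketch a 10-byte write through a stream with `chunkSize` 4 reaches the delegate as three writes of 4, 4, and 2 bytes. In real use the delegate would be the `CipherOutputStream`, so the cipher never sees a single very large buffer.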
 * in order to prevent degraded performance on large buffers as described in
 * <a href="https://github.com/palantir/hadoop-crypto/pull/586">hadoop-crypto#586</a>.
 */
static final class ChunkingOutputStream extends FilterOutputStream {
Any reason a BufferedOutputStream won't work for you here?
Yep! BufferedOutputStream actually does the opposite of what we want here! Given a large input (beyond the configured buffer size), the BufferedOutputStream will flush any data in its buffer, and write the input buffer directly to the delegate stream as is.
In most cases a BufferedOutputStream is helpful to avoid native overhead for crypto, but I think that's a bit outside of the scope of this change, and introducing a buffered stream around our chunking stream may result in additional unnecessary copies.
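The pass-through behavior described in this reply can be observed with a small sketch. The helper names `WriteSizeRecorder` and `BufferedPassThroughDemo` are hypothetical and exist only for this illustration; the only library class exercised is `java.io.BufferedOutputStream`.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: records the size of each write that reaches the delegate. */
final class WriteSizeRecorder extends OutputStream {
    final List<Integer> sizes = new ArrayList<>();

    @Override
    public void write(int b) {
        sizes.add(1);
    }

    @Override
    public void write(byte[] buffer, int offset, int length) {
        sizes.add(length);
    }
}

final class BufferedPassThroughDemo {
    /** Writes inputSize bytes through a BufferedOutputStream and returns the write sizes the delegate saw. */
    static List<Integer> delegateWriteSizes(int bufferSize, int inputSize) throws IOException {
        WriteSizeRecorder recorder = new WriteSizeRecorder();
        BufferedOutputStream buffered = new BufferedOutputStream(recorder, bufferSize);
        // When inputSize >= bufferSize, the input is handed to the delegate
        // in one call instead of being split into buffer-sized pieces.
        buffered.write(new byte[inputSize], 0, inputSize);
        buffered.flush();
        return recorder.sizes;
    }
}
```

Calling `BufferedPassThroughDemo.delegateWriteSizes(4, 10)` yields a single 10-byte write to the delegate, not writes of 4, 4, and 2 bytes, which is why a `BufferedOutputStream` alone would not protect `CipherOutputStream` from large buffers.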
==COMMIT_MSG==
CryptoStreamFactory creates sympathetic chunking OutputStreams with performance characteristics matching the apache commons-crypto implementation
==COMMIT_MSG==