-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(archiveupload): improve performance of archive uploads and validation #742
fix(archiveupload): improve performance of archive uploads and validation #742
Conversation
mount a testing archives directory on disk rather than using tmpfs
90f5ddd
to
71e18e2
Compare
This reverts commit 71e18e2.
9f2889a
to
6c52c80
Compare
Includes a Netty version bump which fixes a performance regression in multipart form-data decoding
@ebaron I'm not too attached to the new Beta API handler I added since the Vert.x 3.9.9 version upgrade brings the performance of the original handler back up to an acceptable level. In my testing using curl, uploading to either the V1 or the Beta endpoints and using the |
I think if the Beta handler is simply a workaround for the Netty performance issue, then it makes sense to drop it. If it has a purpose going forward then we can keep it. I did notice that the linked issue netty/netty#10508 was supposedly fixed in 4.1.60.Final, which is used by Vert.x 3.9.7. I wonder if there's another bug. |
There was another Netty bug: netty/netty#11188 I updated the linked issue to mention this but forgot to update this PR's description of the problem too, sorry. That Netty fix is in 4.1.65.Final, or Vert.x 3.9.8+. |
From what I can tell, it may have been 10508 that's responsible for the slow apparent upload speed I observed - although that should apparently be fixed in Vert.x 3.9.7, but it seems to match the description of what I see occuring. That's not actually an upload speed limitation but seems to be Netty being slow to process data out of TCP buffers, causing the transfer rate to suffer. There may be another related Vert.x/Netty bug causing the transfer rate to suffer however, other than 10508 - not sure. 11188 causes additional memory pressure when processing the file upload, so that's responsible for (or heavily contributes to) the OOM side of the original issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes look good. I just need to test it with that large JFR file.
I sent you an email with a link to it. |
…ordingName handler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested out in CRC with the provided JFR file. Uploaded quickly.
…tion (#742) * write archive uploads into the existing archives directory mount a testing archives directory on disk rather than using tmpfs * perform faster and lighter validation of uploaded binaries * implement binary file uploading beta recordings handler * document beta api handlers * pause/resume request to ensure all buffer data is correctly handled * Set content-type in example * Allow Content-Type headers in CORS * refactor to use more vertx async io * handle read timeouts * use ctx.fail rather than throwing, to ensure cleanup endhandler is invoked * fix(recordingupload): delete badly named uploaded temp files * fix(templatesupload): delete badly named uploaded files * fix(bodyhandlers): do not handle file uploads where not expected * create and open file in one step rather than two * improve stream-to-file handling * ensure file-uploads dir is created * Revert "improve stream-to-file handling" This reverts commit 71e18e2. * replace System.nanoTime with Clock.getMonotonicTime * replace makeAsyncResult methods with FutureFactory * fixup! ensure file-uploads dir is created * reorder request header and param validations * Update doc with required headers * apply spotless formatting * correct broken unit tests * remove debug logging * delete invalid uploaded files * Upgrade vertx to 3.9.9 Includes a Netty version bump which fixes a performance regression in multipart form-data decoding * Remove POST /api/beta/recordings handler * apply spotless formatting * remove unneeded CORS change after removal of Beta API handler * doc(httpapi): remove reference to Beta POST /api/beta/recordings/:recordingName handler (cherry picked from commit 460e2fd) # Conflicts: # run.sh
…tion (backport #742) (#746) * fix(archiveupload): improve performance of archive uploads and validation (#742) * write archive uploads into the existing archives directory mount a testing archives directory on disk rather than using tmpfs * perform faster and lighter validation of uploaded binaries * implement binary file uploading beta recordings handler * document beta api handlers * pause/resume request to ensure all buffer data is correctly handled * Set content-type in example * Allow Content-Type headers in CORS * refactor to use more vertx async io * handle read timeouts * use ctx.fail rather than throwing, to ensure cleanup endhandler is invoked * fix(recordingupload): delete badly named uploaded temp files * fix(templatesupload): delete badly named uploaded files * fix(bodyhandlers): do not handle file uploads where not expected * create and open file in one step rather than two * improve stream-to-file handling * ensure file-uploads dir is created * Revert "improve stream-to-file handling" This reverts commit 71e18e2. * replace System.nanoTime with Clock.getMonotonicTime * replace makeAsyncResult methods with FutureFactory * fixup! ensure file-uploads dir is created * reorder request header and param validations * Update doc with required headers * apply spotless formatting * correct broken unit tests * remove debug logging * delete invalid uploaded files * Upgrade vertx to 3.9.9 Includes a Netty version bump which fixes a performance regression in multipart form-data decoding * Remove POST /api/beta/recordings handler * apply spotless formatting * remove unneeded CORS change after removal of Beta API handler * doc(httpapi): remove reference to Beta POST /api/beta/recordings/:recordingName handler (cherry picked from commit 460e2fd) # Conflicts: # run.sh * fixup! fix(archiveupload): improve performance of archive uploads and validation (#742) Co-authored-by: Andrew Azores <aazores@redhat.com>
Fixes #740
The primary fix in this PR is in the V1
RecordingPostHandler
- the implementation of the function for validating JFR binary data has been changed from one that parses the data and constructs a list of events - potentially using a threadpool - to one that simply serially parses the JFR binary chunk-by-chunk, thereby ensuring that that binary is parseable as JFR data and contains at least 1 chunk of data. This is simpler, faster, and less resource intensive, and provides strong enough validation of the uploaded data for this use case.Along the way I discovered and fixed some other minor deficiencies in uploaded file handling, for
RecordingPostHandler
and others using the Vert.xBodyHandler
class - namely that body handlers in several instances were accepting file uploads (the default) where only a standard HTTP form upload was expected, or that invalid file uploads would result in a rejected API request but the uploaded file would remain in Vert.x's temporary file-upload storage.Finally, I also determined that with large (50MB+) JFR files,
POST
ing to the V1RecordingPostHandler
is very slow - before the Cryostat handler implementation even gets to run. The VertxBodyHandler
itself is actually quite slow here and consumes significant memory while processing the multipart form-data file upload:1. As an initial experimental workaround I have implemented the Beta API version of theDropped from the PRRecordingPostHandler
, which accepts the JFR data directly as the POST request body without multipart file boundaries etc., and which directly handles the streaming data instead of delegating to a Vert.x body handler. This has some caveats, including that theExpect: 100-continue
header must be sent by the requesting client - this is respected bycURL
by default but is not implemented by Firefox nor Chromium, so it is only reliably useful from the CLI and not the web-client. However, the upload performance using this handler is superior - there is no multipart decode occurring and the upload is handled using streaming IO to the filesystem, and then proceeds to the chunk-by-chunk JFR binary validation.2. Diving deeper into the cause of the poor Vertx Body Handler performance I discovered there was a Netty performance regression in the multipart decoder (netty/netty#10508). Upgrading the Vert.x version from 3.9.7 to 3.9.9 also bumps the Netty version, and the original V1 API upload handler using multipart form-data uploads performs nearly as well as the Beta API binary file upload handler after the upgrade, while still remaining compatible with web browsers (not requiring the
Expect:100-continue
header).This second point is responsible for the vast majority of the performance fix in the PR. Simply using a newer version of Netty (4.1.60.Final -> 4.1.67.Final) makes the
BodyHandler
run at a much more reasonable speed and I haven't triggered the OOM kill with it, either. Combined with the slimmed down JFR parsing, the overall resource overhead is greatly reduced.I have a ~150MB JFR file that I can share privately for reviewers to test this PR with.