
Impl. size-dependent request body compression #4283

Merged

Conversation


@pvannierop pvannierop commented Jun 3, 2022

Problem 1

When loading studies with a large number of samples, many endpoints send extended lists of sample identifiers to the backend. In uncompressed form this can take minutes for ~100k samples. This bottleneck can be removed by compressing the request body before sending. At present this is implemented only for specific endpoints, and I found that certain (new?) endpoints are not listed for compression and therefore lock the frontend in an unresponsive state. Selecting specific endpoints for compression thus does not effectively solve the problem of large request bodies, because the configuration only partially covers the problematic endpoints.

Problem 2

The current request body compression is applied only to endpoints of the public API. Certain endpoints of the internal API are very slow.

Solution

I propose to evaluate the body of every POST request for size and to compress the body when the number of characters in the JSON string exceeds a certain threshold (now arbitrarily set at 500 characters). The compression mechanism covers both the public and internal APIs.

@pvannierop pvannierop requested review from alisman and Luke-Sikina June 3, 2022 12:39
@pvannierop pvannierop self-assigned this Jun 3, 2022
@pvannierop pvannierop force-pushed the sizedependent_req_body_compression branch from 016124a to 8a1d3f3 Compare June 3, 2022 12:43
Diff under review:

  - body = pako.gzip(JSON.stringify(body)).buffer;
  + if (method === 'POST') {
  +     var bodyString = JSON.stringify(body);
  +     if (bodyString.length > REQ_BODY_SIZE_CHAR_LIMIT) {
Luke-Sikina (Member) commented:

How expensive would it be to do new Blob([bodyString]).size? Using char count to estimate size is a bit unreliable.

Collaborator commented:

@Luke-Sikina it's just a heuristic, right? And why is it inaccurate?

@pvannierop (Contributor, Author) commented Jun 3, 2022:

@Luke-Sikina I also reasoned that a precise estimate is not needed here. We just need the least resource-intensive estimate of request size.

@pvannierop pvannierop force-pushed the sizedependent_req_body_compression branch from a1f3e6e to 8a1d3f3 Compare June 3, 2022 16:30
For both Public and Internal API
@pvannierop pvannierop force-pushed the sizedependent_req_body_compression branch from 8a1d3f3 to e2fe4da Compare June 3, 2022 16:38
@alisman alisman merged commit 1ea79f5 into cBioPortal:master Jun 7, 2022