[Proxy] Proxy uses a lot of heap memory when uploading large function jar files #10908

Closed
lhotari opened this issue Jun 11, 2021 · 2 comments · Fixed by #10944
Labels
type/bug The PR fixed a bug or issue reported a bug

Comments

@lhotari
Member

lhotari commented Jun 11, 2021

Describe the bug

Proxy uses a lot of heap memory when uploading large function jar files. This is due to the buffering solution added as part of #5361, which buffers even very large uploads in memory.

To Reproduce

Create a function with a large jar file to reproduce.
The repro for #10906 (https://github.com/lhotari/pulsar-playground/tree/master/proxy-tls-issue-repro-2.7.2) reproduces this issue too.

Expected behavior

The proxy's heap memory usage shouldn't grow without bound. File uploads shouldn't be buffered in heap memory.
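
To illustrate the expected behavior, here is a minimal sketch of forwarding an upload in fixed-size chunks, so that heap usage depends on the chunk size rather than on the size of the file. The class `StreamingCopy` and its `copy` method are hypothetical names made up for this example; this is not the proxy's actual code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper, not part of Pulsar: copies a stream chunk by chunk. */
public class StreamingCopy {

    /**
     * Copies everything from in to out using a single reusable chunk buffer.
     * Only chunkSize bytes are held on the heap at any time, regardless of
     * how large the upload is. Returns the number of bytes copied.
     */
    public static long copy(InputStream in, OutputStream out, int chunkSize) {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
                total += n;
            }
        } catch (IOException e) {
            // Wrapped only to keep the sketch short; real code may prefer "throws IOException".
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

The trade-off is that a purely streamed body cannot be sent again, which is presumably why the replay buffering from #5361 exists in the first place.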

@lhotari lhotari added the type/bug The PR fixed a bug or issue reported a bug label Jun 11, 2021
@lhotari
Member Author

lhotari commented Jun 11, 2021

@addisonj would you like to pick this one?

@lhotari
Member Author

lhotari commented Jun 17, 2021

@addisonj I created #10944 to fix this issue. It also adds a unit test for the "replay" functionality added in #5361. Please review.

sijie pushed a commit that referenced this issue Jun 22, 2021
Fixes #10908

### Motivation

Pulsar Proxy uses a lot of heap memory when uploading large function jar files. This also leads to high GC activity, since a contiguous block of memory (a byte array the size of the upload) is allocated. When the heap is fragmented, the GC has to compact it to find a contiguous block of that size. This is why allocating large arrays is costly from a GC perspective.

The cause is the buffering solution added as part of #5361, which buffers even very large uploads in memory.

### Modifications

* Limit the replay buffer size to a configurable limit which defaults to 5MB. This is configured with the `httpInputMaxReplayBufferSize` proxy configuration parameter.
* Add unit test to see that buffer size gets limited
* Add unit test for #5361
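
The idea of a size-limited replay buffer can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from #10944: the class `BoundedReplayBuffer` and its methods are hypothetical names, and the sketch assumes that once the limit is exceeded the captured bytes are dropped and the request is simply no longer replayable.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch: keeps request bytes for replay only up to maxSize bytes. */
public class BoundedReplayBuffer {
    private final long maxSize;
    private ByteArrayOutputStream captured = new ByteArrayOutputStream();
    private boolean overflowed = false;

    public BoundedReplayBuffer(long maxSize) {
        this.maxSize = maxSize;
    }

    /** Records a chunk that was forwarded upstream, unless the limit is exceeded. */
    public void record(byte[] chunk, int offset, int length) {
        if (overflowed) {
            return; // already gave up on replay; keep nothing
        }
        if ((long) captured.size() + length > maxSize) {
            overflowed = true;
            captured = null; // release the bytes captured so far
            return;
        }
        captured.write(chunk, offset, length);
    }

    /** True while the whole body seen so far still fits within the limit. */
    public boolean isReplayable() {
        return !overflowed;
    }

    /** Returns the captured body; fails if the limit was exceeded. */
    public byte[] replayBytes() {
        if (overflowed) {
            throw new IllegalStateException("body exceeded replay buffer limit of " + maxSize + " bytes");
        }
        return captured.toByteArray();
    }
}
```

With a limit of `5 * 1024 * 1024`, matching the default mentioned above, a small request body stays replayable, while a large jar upload marks the buffer as overflowed after the first 5 MB and holds no further bytes on the heap.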
eolivelli pushed a commit to datastax/pulsar that referenced this issue Jun 23, 2021
(cherry picked from commit 2324618)
yangl pushed a commit to yangl/pulsar that referenced this issue Jun 23, 2021
kaushik-develop pushed a commit to kaushik-develop/pulsar that referenced this issue Jun 24, 2021
nicoloboschi pushed a commit to datastax/pulsar that referenced this issue Feb 28, 2022
(cherry picked from commit 2324618)
(cherry picked from commit 7fa88cc)
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this issue Mar 18, 2022