Monitor memory limits in the Arrow producer #45

Closed
jmacd opened this issue Sep 14, 2023 · 3 comments
@jmacd
Contributor

jmacd commented Sep 14, 2023

We have observed ErrConsumerMemoryLimit being returned in production setups.

// ErrConsumerMemoryLimit indicates that fewer records were decoded than
// payloads were received, i.e. the consumer hit its memory limit.
var ErrConsumerMemoryLimit = fmt.Errorf(
	"The number of decoded records is smaller than the number of received payloads. " +
		"Please increase the memory limit of the consumer.")

I was able to fix this, as recommended, by raising the memory limit, but it would raise confidence if we could pinpoint the earlier point in the code where the record count goes wrong.

Moreover, we need a way to monitor the current allocation against the memory limit, so that I can be confident that MemoryLimit is set correctly.
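
A minimal sketch of what such monitoring could look like, assuming an OpenTelemetry observable gauge in Go; the meter name, metric name, and the currentAllocated counter are hypothetical stand-ins for whatever accounting the Arrow consumer actually keeps:

package example

import (
	"context"
	"sync/atomic"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/metric"
)

// currentAllocated would track the bytes currently accounted against the
// consumer's memory limit (hypothetical; the real consumer keeps its own count).
var currentAllocated atomic.Int64

// registerMemoryGauge exposes that count as an observable gauge so the
// headroom against MemoryLimit can be watched over time.
func registerMemoryGauge() error {
	meter := otel.Meter("otelarrow/consumer")
	_, err := meter.Int64ObservableGauge(
		"arrow.consumer.memory.inuse",
		metric.WithDescription("Bytes accounted against the Arrow consumer memory limit"),
		metric.WithUnit("By"),
		metric.WithInt64Callback(func(_ context.Context, o metric.Int64Observer) error {
			o.Observe(currentAllocated.Load())
			return nil
		}),
	)
	return err
}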

@lquerel
Contributor

lquerel commented Sep 14, 2023

To catch potential misuse at an earlier stage, we need to safeguard the decoder against various forms of attack. Here are the strategies I propose:

  1. Securing the Decompression Layer: Start by fortifying the decompression layer against "zip bomb" attacks at both the gRPC and Arrow levels. Refer to my discussion with the Go zstd library maintainer for further insights: Discussion Link.
  2. Setting a Protobuf Message Size Limit: Define a parameter to specify the maximum size of a protobuf message (a sketch of both limits follows below).
  3. Memory Limiter Adjustments:
    • Increased Default Limit: Once the above measures are in place, consider being more flexible with the memory limiter by setting a higher default limit.
    • Removing the Limiter: Potentially, we could even remove the limiter entirely, since the previous steps should keep memory usage within a safe range.

This approach clearly necessitates further consideration. However, these are my initial thoughts on this issue.
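
For illustration, a sketch of what items 1 and 2 could look like in Go, assuming the klauspost zstd decoder and a gRPC server; the specific byte limits are placeholders, not recommended values:

package example

import (
	"github.com/klauspost/compress/zstd"
	"google.golang.org/grpc"
)

// newBoundedZstdReader caps the memory the zstd decoder may allocate,
// limiting the effect of a "zip bomb" at the decompression layer.
func newBoundedZstdReader() (*zstd.Decoder, error) {
	return zstd.NewReader(nil, // nil source: Reset with a stream before use, or call DecodeAll
		zstd.WithDecoderMaxMemory(64<<20), // refuse frames that need more than 64 MiB
		zstd.WithDecoderConcurrency(1),
	)
}

// newBoundedServer rejects any single protobuf message larger than the
// configured maximum before it reaches the Arrow decoder.
func newBoundedServer() *grpc.Server {
	return grpc.NewServer(grpc.MaxRecvMsgSize(16 << 20)) // 16 MiB
}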

@jmacd
Contributor Author

jmacd commented Oct 6, 2023

Marking closed. The new instrumentation has been effective in detecting a memory accounting leak.

jmacd closed this as completed Oct 6, 2023
@jmacd
Contributor Author

jmacd commented Oct 6, 2023

See #56 and #52.
