[Filebeat][HTTP Endpoint] Add a functionality to decode gzip content and process the ndjson data #31005

Closed
darshan-elastic opened this issue Mar 25, 2022 · 7 comments · Fixed by #31061

Comments

@darshan-elastic
Contributor

This is a request for the following feature to be added to the http_endpoint input.

  1. Decode gzip content when the Content-Encoding header is present with the value gzip.
  2. Add the ability to process ndjson data.
@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@jamiehynds jamiehynds added good first issue Indicates a good issue for first-time contributors enhancement labels Mar 28, 2022
@andrewkroh
Member

Regarding gzip, I would expect incoming POST requests that contain Content-Encoding: gzip to have their request bodies decoded automatically.

Similarly, for requests with Content-Type: application/ndjson or Content-Type: application/x-ndjson, I would expect each JSON document to generate a separate event.
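
As an illustration of the gzip behaviour described above, here is a minimal sketch using only the Go standard library. It is not the http_endpoint implementation; the names `decodeGzip` and `postGzip` are hypothetical and exist only for this example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// decodeGzip (hypothetical name) wraps next so that request bodies sent
// with Content-Encoding: gzip are decompressed before next reads them.
func decodeGzip(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Content-Encoding") == "gzip" {
			zr, err := gzip.NewReader(r.Body)
			if err != nil {
				http.Error(w, "invalid gzip body", http.StatusBadRequest)
				return
			}
			defer zr.Close()
			r.Body = io.NopCloser(zr)
		}
		next.ServeHTTP(w, r)
	})
}

// postGzip compresses body, sends it through decodeGzip, and returns
// what the wrapped handler read from the request.
func postGzip(body string) string {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(body))
	zw.Close()

	req := httptest.NewRequest(http.MethodPost, "/", &buf)
	req.Header.Set("Content-Encoding", "gzip")

	var got string
	h := decodeGzip(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		got = string(b)
	}))
	h.ServeHTTP(httptest.NewRecorder(), req)
	return got
}

func main() {
	fmt.Println(postGzip(`{"message":"hello"}`))
}
```

Doing the decoding in a wrapper keeps the event-producing handler unaware of the transport encoding, which also fits the suggestion to treat gzip and ndjson as independent changes.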

@andrewkroh
Member

The two features are essentially independent and it would be easier to review if the changes were contributed as two separate PRs.

@jamiehynds

@andrewkroh after discussing with the field team and Zscaler, we have a few users keen to get CloudNSS up and running, but we're blocked by the gzip support. Do you think someone could work on this sooner rather than later?

@jamiehynds

Cloudflare Logpush is one of the integrations we have planned, and it will leverage the http_endpoint input. It looks like they require gzip support for their events too: https://developers.cloudflare.com/logs/about/

@andrewkroh andrewkroh self-assigned this Mar 29, 2022
andrewkroh added a commit to andrewkroh/beats that referenced this issue Mar 29, 2022
Accept requests with Content-Encoding: gzip.

Closes elastic#31005
@andrewkroh
Member

> Do you think someone could work on this sooner rather than later?

Yes. Here's a PR to add it. #31061

No changes were required to support NDJSON. That already works today with Content-Type: application/json. It doesn't matter whether the JSON objects are newline separated or not. The input simply consumes every JSON object it reads and creates one event for each.

@efd6
Contributor

efd6 commented Mar 29, 2022

I was thinking about the handling of ndjson. Splitting on newlines is more fault tolerant than using the encoding/json stream decoder: a corrupted message does not break the rest of the stream that follows it, whereas with the stream decoder it does (and, depending on the approach taken to error handling, one corrupted message may cause all messages to be lost). To make use of this observation, when working on ndjson you first need to split the input stream on newlines and then work on each message individually, potentially filling the event with a message.error and event.original when the unmarshalling fails.

andrewkroh added a commit that referenced this issue Apr 4, 2022
Accept requests with Content-Encoding: gzip (or x-gzip).

There is a pool of gzip readers to improve performance by reducing allocations.

Closes #31005
emilioalvap pushed a commit to emilioalvap/beats that referenced this issue Apr 6, 2022
…31061)

Accept requests with Content-Encoding: gzip (or x-gzip).

There is a pool of gzip readers to improve performance by reducing allocations.

Closes elastic#31005
kush-elastic pushed a commit to kush-elastic/beats that referenced this issue May 2, 2022
…31061)

Accept requests with Content-Encoding: gzip (or x-gzip).

There is a pool of gzip readers to improve performance by reducing allocations.

Closes elastic#31005
chrisberkhout pushed a commit that referenced this issue Jun 1, 2023
Accept requests with Content-Encoding: gzip (or x-gzip).

There is a pool of gzip readers to improve performance by reducing allocations.

Closes #31005