[pkg/ottl] Support community ID network flow #34062

Closed
mashhurs opened this issue Jul 12, 2024 · 9 comments
Labels
discussion needed Community discussion needed enhancement New feature or request pkg/ottl

Comments

@mashhurs
Contributor

mashhurs commented Jul 12, 2024

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

What is a community ID and why do we need it?

  • It is a single unique ID based on the network flow info. It is an additional flow identifier and doesn't replace existing flow identification mechanisms already supported by the monitors. See the specification.
  • When monitoring or analyzing network flows, for example in threat-hunting security use cases, it is often necessary to "join" on network source and destination info. A community_id simplifies this and gives a better user experience when analyzing the data (aggregating by community ID to collect statistics, etc.).
  • visual example

The feature is widely used; here are some reference applications:

Describe the solution you'd like

Introduce a converter which calculates the community ID based on the specification.

Describe alternatives you've considered

This requires a discussion of either

Additional context

No response

@mashhurs mashhurs added enhancement New feature or request needs triage New item requiring triage labels Jul 12, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@evan-bradley
Contributor

evan-bradley commented Jul 26, 2024

Thanks for the detailed description @mashhurs. Can you comment on any usage outside Elastic? I'm not familiar with community ID, and while I see a handful of implementations on the specification repository you linked, it's not clear to me whether this function would be useful to a significant portion of OTTL users.

I'd also welcome input from others in the community if they are using community IDs and would like to see this function added to OTTL.

@evan-bradley evan-bradley added discussion needed Community discussion needed and removed needs triage New item requiring triage labels Jul 26, 2024
@evan-bradley
Contributor

Based on the graphic you provided for computing a community ID, it also looks like OTTL could create these IDs if it had a base64 encoding function. Would that be sufficient for you?

@mashhurs
Contributor Author

Thank you @evan-bradley for the feedback.

Can you comment on any usage outside Elastic? I'm not familiar with community ID, and while I see a handful of implementations on the specification repository you linked, it's not clear to me whether this function would be useful to a significant portion of OTTL users.

Community ID is broadly used in networking solutions and services, especially in SIEM. Outside of Elastic, a number of vendors and solutions have adopted community ID; some references:

By creating an OTTL community ID function, we could help downstream services correlate their datasets easily, avoiding multiple joins on tuples, and perform operations (creating alerts, setting up dashboards, etc.) on network flows of interest.

Based on the graphic you provided for computing a community ID, it also looks like OTTL could create these IDs if it had a base64 encoding function. Would that be sufficient for you?

A community ID has the form version:hash-value-of-tuple, where the tuple is the network addresses, ports, and protocol, so it yields a unique ID for the flow. A base64 encoding function would open a way to achieve the goal, but it still requires some computation (I don't think a single-line config would do it).
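
To illustrate the computation involved, here is a minimal sketch in Go, assuming Community ID v1 as described in the specification: a 2-byte seed, the two endpoints in a canonical order, the protocol number, a padding byte, and the ports, hashed with SHA-1 and base64-encoded. The function name `communityIDv1` and its signature are hypothetical (this is not an existing OTTL function), and ICMP type/code handling is left out, so it only covers port-based protocols such as TCP and UDP.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"net/netip"
)

// communityIDv1 is a hypothetical helper (not part of OTTL). It sketches
// Community ID v1 for port-based protocols; ICMP mapping is omitted.
func communityIDv1(srcIP, dstIP string, srcPort, dstPort uint16, proto uint8, seed uint16) (string, error) {
	src, err := netip.ParseAddr(srcIP)
	if err != nil {
		return "", fmt.Errorf("source address: %w", err)
	}
	dst, err := netip.ParseAddr(dstIP)
	if err != nil {
		return "", fmt.Errorf("destination address: %w", err)
	}

	// Network-order bytes: 4 for IPv4 (including IPv4-mapped), 16 for IPv6.
	a, b := src.Unmap().AsSlice(), dst.Unmap().AsSlice()
	aPort, bPort := srcPort, dstPort

	// Order the endpoints so both directions of a flow hash identically.
	if c := bytes.Compare(a, b); c > 0 || (c == 0 && aPort > bPort) {
		a, b = b, a
		aPort, bPort = bPort, aPort
	}

	// seed | addr | addr | proto | pad | port | port, all big-endian.
	buf := make([]byte, 0, 2+len(a)+len(b)+2+4)
	buf = binary.BigEndian.AppendUint16(buf, seed)
	buf = append(buf, a...)
	buf = append(buf, b...)
	buf = append(buf, proto, 0)
	buf = binary.BigEndian.AppendUint16(buf, aPort)
	buf = binary.BigEndian.AppendUint16(buf, bPort)

	sum := sha1.Sum(buf)
	return "1:" + base64.StdEncoding.EncodeToString(sum[:]), nil
}

func main() {
	// Protocol 6 is TCP; seed 0 is the default.
	fwd, err := communityIDv1("128.232.110.120", "66.35.250.204", 34855, 80, 6, 0)
	if err != nil {
		panic(err)
	}
	rev, err := communityIDv1("66.35.250.204", "128.232.110.120", 80, 34855, 6, 0)
	if err != nil {
		panic(err)
	}
	fmt.Println(fwd)
	fmt.Println(rev)
}
```

Both calls in `main` should print the same ID, since request and response traffic of one flow are meant to map to a single value. The endpoint ordering, the byte layout, and the hashing are the steps that a base64 function alone would not cover.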

Since community ID is a known concept in network analysis, I believe an OTTL function to generate community IDs would provide a lot of benefit to downstream systems.

@evan-bradley
Contributor

Thank you for the additional details.

Community ID is version:hash-value-of-tuple (tuple: address, port and protocol)

Where do the address, port, and protocol come from? Are they attached to the data, or is it expected they come from context set by receivers? If they come from context, do you think this functionality may make sense as a separate processor instead of as an OTTL function?

@mashhurs
Contributor Author

mashhurs commented Aug 1, 2024

Thank you for the additional details.

Community ID is version:hash-value-of-tuple (tuple: address, port and protocol)

Where do the address, port, and protocol come from? Are they attached to the data, or is it expected they come from context set by receivers? If they come from context, do you think this functionality may make sense as a separate processor instead of as an OTTL function?

I wonder whether an OTTL function would also be useful in other processors (for example, filtering by community ID, or deleting data when a network flow of interest is found).

Some more thoughts: I am not very familiar with processors and their behavior, but here we are only calculating a community ID, not acting on the context as processors do (such as batch, memory_limiter, etc.), and there is no need for reject or retry mechanisms.

I would be curious to hear your opinion as well.

@evan-bradley
Contributor

Some more thoughts: I am not very familiar with processors and their behavior, but here we are only calculating a community ID, not acting on the context as processors do (such as batch, memory_limiter, etc.), and there is no need for reject or retry mechanisms.

I mostly mean through similar mechanisms like how the k8sattributes processor works: for a given data payload, it looks at the attached connection metadata for which IP, port, etc. sent the payload to the Collector, then uses that to enrich the payload. It sounds like community ID functions in a similar way.

@mashhurs
Contributor Author

Some more thoughts: I am not very familiar with processors and their behavior, but here we are only calculating a community ID, not acting on the context as processors do (such as batch, memory_limiter, etc.), and there is no need for reject or retry mechanisms.

I mostly mean through similar mechanisms like how the k8sattributes processor works: for a given data payload, it looks at the attached connection metadata for which IP, port, etc. sent the payload to the Collector, then uses that to enrich the payload. It sounds like community ID functions in a similar way.

Do you want me to close this issue and open a new one for a new component? Or are you able to update this issue (labels, required fields)?

@mashhurs
Contributor Author

Proposed a communityid processor, closing this issue.
