Allow es sending to non datastream #73

aspacca · 2022-02-14T11:35:56Z

Breaking change

What does this PR do?

Add support for both index and datstream for sending events do elasticsearch

Why is it important?

Not every users might use datastreams, and they need a way to send to an index or an alias: before this change it was not possible.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.md

Author's Checklist

[ ]

How to test this PR locally

Related issues

Closes #72

Use cases

Screenshots

Logs

elasticmachine · 2022-02-14T12:00:20Z

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS

Expand to view the summary

Build stats

Start Time: 2022-02-15T15:50:08.073+0000
Duration: 26 min 32 sec

Test stats 🧪

Test	Results
Failed	0
Passed	59
Skipped	0
Total	59

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.

endorama

I'm ok with the change but I would like to discuss the implementation. I think a different option (like index_name) would be preferred over the composite es_index_or_datastream_name. If set, it overrides data stream naming, otherwise we retain the same behaviour.

This would have a set of advantages:

not a breaking change (old config would still work)
allow to configure the index name if needed
in case data streams are used the current config reduce cognitive load for customers (as they have to specify only data set and namespace and not worry about data stream naming constraints).
clearer configuration logic: either data streams or override (override supports writing to indexes too)

What to you think?

docs/README-AWS.md

Co-authored-by: endorama <526307+endorama@users.noreply.github.com>

aspacca · 2022-02-15T14:12:05Z

I'm ok with the change but I would like to discuss the implementation. I think a different option (like index_name) would be preferred over the composite es_index_or_datastream_name. If set, it overrides data stream naming, otherwise we retain the same behaviour.

This would have a set of advantages:

not a breaking change (old config would still work)

allow to configure the index name if needed

in case data streams are used the current config reduce cognitive load for customers (as they have to specify only data set and namespace and not worry about data stream naming constraints).

clearer configuration logic: either data streams or override (override supports writing to indexes too)

What to you think?

I discussed the option of an extra config param without having a breaking change with @ravikesarwani

the golden rule we agreed on was "let's make it work with defaults and/or plain config for 90% of the use cases"
that's the reason for a single config
yes, the change will be breaking, being in beta and still having users in the POC phase and not in the production one we decided it is fine to do.
regarding the override and more granular configuration (in this PR you cannot override the single value of the namespace for example), sticking to the golden rule, in a next iteration, an extra/overriding way for configuring the 10% of the use cases that don't fit will be provided. this will properly documented in a separated section of the docs as something like advanced config, for users that need to tweak the setup to their very specific cases

do you think that's reasonable?

@ravikesarwani , do you think we should reconsider the implementation according to @endorama suggestions?

aspacca · 2022-02-15T14:19:34Z

@endorama I forgot to link the issue #72

ravikesarwani · 2022-02-15T14:57:56Z

in case data streams are used the current config reduce cognitive load for customers (as they have to specify only data set and namespace and not worry about data stream naming constraints).

While in theory this may seem like it, in practical (and based on our experience talking to folks in GoDaddy) no one really knows what "dataset" & "namespace" really is (even after giving them all the links for data stream naming conventions) and how we internally create the full data stream name by combining "logs-", dataset & namespace.

Having a single value is desirable and many Elasticsearch users understand index or data stream name. This is still an optional value (for AWS service logs we send it to correct data streams and they don't have to specify this) and they will only use this config when they have created their own index template (in complex use cases) and its easier for them to put the full value or the alias here. Breaking it into many pieces is our internal logic and not something I think we need to expose to our customers via way of multiple config values.

endorama · 2022-02-16T07:56:32Z

While in theory this may seem like it, in practical (and based on our experience talking to folks in GoDaddy) no one really knows what "dataset" & "namespace" really is (even after giving them all the links for data stream naming conventions) and how we internally create the full data stream name by combining "logs-", dataset & namespace.

Having a single value is desirable and many Elasticsearch users understand index or data stream name.

@ravikesarwani this is a really interesting comment, thanks for sharing.

Approved, code was already ✔️

Andrea Spacca added 4 commits February 14, 2022 12:31

fix docs

cecf1d3

refactoring on hadler

bf8eccf

refactoring on shippers

860be3d

refactoring on config

ce26eca

aspacca requested review from endorama and kaiyan-sheng February 14, 2022 11:35

aspacca self-assigned this Feb 14, 2022

changelog

f65c074

endorama reviewed Feb 15, 2022

View reviewed changes

docs/README-AWS.md Outdated Show resolved Hide resolved

Update docs/README-AWS.md

de7dff2

Co-authored-by: endorama <526307+endorama@users.noreply.github.com>

Andrea Spacca added 2 commits February 15, 2022 15:23

Merge branch 'main' into allow-es-sending-to-non-datastream

1e5737f

Merge branch 'main' into allow-es-sending-to-non-datastream

bffe6b6

Merge branch 'main' into allow-es-sending-to-non-datastream

5364198

endorama approved these changes Feb 16, 2022

View reviewed changes

aspacca merged commit 182bf2f into elastic:main Feb 16, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow es sending to non datastream #73

Allow es sending to non datastream #73

aspacca commented Feb 14, 2022 •

edited

Loading

elasticmachine commented Feb 14, 2022 •

edited

Loading

Build stats

Test stats 🧪

endorama left a comment

aspacca commented Feb 15, 2022

aspacca commented Feb 15, 2022

ravikesarwani commented Feb 15, 2022

endorama commented Feb 16, 2022

Allow es sending to non datastream #73

Allow es sending to non datastream #73

Conversation

aspacca commented Feb 14, 2022 • edited Loading

What does this PR do?

Why is it important?

Checklist

Author's Checklist

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

elasticmachine commented Feb 14, 2022 • edited Loading

💚 Build Succeeded

Build stats

Test stats 🧪

🤖 GitHub comments

endorama left a comment

Choose a reason for hiding this comment

aspacca commented Feb 15, 2022

aspacca commented Feb 15, 2022

ravikesarwani commented Feb 15, 2022

endorama commented Feb 16, 2022

aspacca commented Feb 14, 2022 •

edited

Loading

elasticmachine commented Feb 14, 2022 •

edited

Loading