Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add synthetic mode as default when creating a datastream #1261

Merged
merged 5 commits into from
Jun 8, 2023

Conversation

mrodm
Copy link
Contributor

@mrodm mrodm commented May 11, 2023

Closes #1026

This PR adds into the template the needed fields to enable synthetic mode when creating a new metrics data stream. New questions are shown to the user.

  • No question is shown if a logs data stream is created/requested:
    • Nothing added to the manifest neither
    Create a new data stream
    ? Data stream name: foo
    ? Data stream title: Foo
    ? Type: logs
    New data stream has been created: foo
    Done
    
  • Example adding just synthetic source:
    • Output
       $ elastic-package create data-stream
      Create a new data stream
      ? Data stream name: foo
      ? Data stream title: Foo
      ? Type: metrics
      ? Enable time series and synthetic source? No
      ? Enable synthetic source? Yes
      New data stream has been created: foo
      Done
      
    • Manifest file generated:
      title: "Foo"
      type: metrics
      streams:
        - input: sample/metrics
          title: Sample metrics
          description: Collect sample metrics
          vars:
            - name: period
              type: text
              title: Period
              default: 10s
      elasticsearch:
        source_mode: synthetic
  • Example adding both synthetic source and time series:
    • Output :
      Create a new data stream
      ? Data stream name: foo
      ? Data stream title: Foo
      ? Type: metrics
      ? Enable time series and synthetic source? Yes
      ? Enable synthetic source? Yes
      New data stream has been created: foo
      Done
      
    • Manifest file generated:
      title: "Foo"
      type: metrics
      streams:
        - input: sample/metrics
          title: Sample metrics
          description: Collect sample metrics
          vars:
            - name: period
              type: text
              title: Period
              default: 10s
      elasticsearch:
        source_mode: synthetic
        index_mode: time_series

Comment on lines 30 to 31
elasticsearch:
source_mode: synthetic
Copy link
Contributor Author

@mrodm mrodm May 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be included also time_series setting as synthetic _source is just GA for index.mode set to time_series ?
https://www.elastic.co/guide/en/elasticsearch/reference/8.7/mapping-source-field.html#synthetic-source

elasticsearch:
  source_mode: synthetic
  index_mode: time_series

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would look like a sane default, yes. But then I wonder if this should be enabled by default in logs data streams, not sure if we want to enable time series mode there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've introduced a new question to ask for enabling time_series

 $ rm -rf data_stream/foo ; elastic-package create data-stream ; cat data_stream/foo/manifest.yml 
Create a new data stream
? Data stream name: foo
? Data stream title: Foo
? Type: metrics
? Enable synthetic source? Yes
? Enable time series? Yes
New data stream has been created: foo
Done
title: "Foo"
type: metrics
streams:
  - input: sample/metrics
    title: Sample metrics
    description: Collect sample metrics
    vars:
      - name: period
        type: text
        title: Period
        default: 10s
elasticsearch:
  source_mode: synthetic
  index_mode: time_series

This last question just is shown when the user selects Yes to "Enable synthetic source?" question.

So, the user could choose:

  • not to use synthetic source
  • use just synthetic source
  • use both synthetic source and time_series

Maybe it's not needed and if so, I could revert the latest changes so this PR just introduces source_mode key. WDYT @jsoriano ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but maybe another option would be to make it more explicit, asking first if they want to enable time series and synthetic source, something like this:

? Enable time series and synthetic source? No
? Enable synthetic source? Yes
New data stream has been created: foo

If the first answer is affirmative, the second question is not done:

? Enable time series and synthetic source? Yes
New data stream has been created: foo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll change it 👍
I'll add both question that by default they are true (yes answer).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Examples:

  • Just synthetic source:
Create a new data stream
? Data stream name: foo
? Data stream title: Foo
? Type: metrics
? Enable time series and synthetic source? No
? Enable synthetic source? Yes
New data stream has been created: foo
Done
title: "Foo"
type: metrics
streams:
  - input: sample/metrics
    title: Sample metrics
    description: Collect sample metrics
    vars:
      - name: period
        type: text
        title: Period
        default: 10s
elasticsearch:
  source_mode: synthetic
  • Both synthetic and time_series:
Create a new data stream
? Data stream name: foo
? Data stream title: Foo
? Type: metrics
? Enable time series and synthetic source? Yes
New data stream has been created: foo
Done
title: "Foo"
type: metrics
streams:
  - input: sample/metrics
    title: Sample metrics
    description: Collect sample metrics
    vars:
      - name: period
        type: text
        title: Period
        default: 10s
elasticsearch:
  source_mode: synthetic
  index_mode: time_series

@mrodm mrodm marked this pull request as ready for review June 6, 2023 16:28
@mrodm mrodm requested a review from jsoriano June 6, 2023 16:28
@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

@mrodm mrodm requested a review from jsoriano June 8, 2023 14:06
@mrodm
Copy link
Contributor Author

mrodm commented Jun 8, 2023

buildkite test integrations

1 similar comment
@mrodm
Copy link
Contributor Author

mrodm commented Jun 8, 2023

buildkite test integrations

@mrodm mrodm self-assigned this Jun 8, 2023
@mrodm mrodm merged commit 4fd1b58 into elastic:main Jun 8, 2023
@mrodm mrodm deleted the add_synthetic_datastream_creation branch June 8, 2023 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Set synthetic source mode on new data streams
3 participants