[baseline/capture] use high-memory nodes with ramDisks #71894

Merged

@spalger spalger commented Jul 15, 2020

We're seeing fairly regular failures in the baseline capture jobs that we suspect are caused by a resource shortage.

Resource graph from a recently failed job:

[image: resource graph]

That sharp spike in I/O wait is probably because these workers aren't using ramDisks: they run on small instances that don't have the RAM for one. Hoping that switching to an s-highmem instance and using a ramDisk will prevent the spike and make these jobs more stable.
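For anyone who wants to check the I/O wait share on a worker directly, here is a minimal sketch. It is not part of this PR. It assumes a Linux host where the aggregate `cpu` line of `/proc/stat` lists the counters in the usual order (user, nice, system, idle, iowait, irq, softirq, steal, ...). The function names `parse_cpu_line` and `iowait_percent` are hypothetical, and the sample counters are made up for illustration.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters from /proc/stat-style text as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labeled exactly "cpu"; per-core lines are "cpu0", "cpu1", ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two /proc/stat samples."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    # Sum user..steal only; guest time is already counted inside user time.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Index 4 is the iowait counter.
    return 100.0 * deltas[4] / total


# Made-up samples, as if taken a few seconds apart on a worker.
SAMPLE_BEFORE = "cpu  100 0 50 800 50 0 0 0 0 0\ncpu0 100 0 50 800 50 0 0 0 0 0\n"
SAMPLE_AFTER = "cpu  200 0 100 1000 500 0 0 0 0 0\ncpu0 200 0 100 1000 500 0 0 0 0 0\n"

print(iowait_percent(SAMPLE_BEFORE, SAMPLE_AFTER))  # 56.25
```

On a real worker the two inputs would be the contents of `/proc/stat` read a few seconds apart. A value that stays high while the job is running would be consistent with the disk-bound behavior suspected above.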

@spalger added the Team:Operations, v8.0.0, release_note:skip, v7.10.0, and v7.9.0 labels Jul 15, 2020
@kibanamachine

💚 Build Succeeded

Build metrics

✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@spalger spalger marked this pull request as ready for review July 15, 2020 21:10
@elasticmachine

Pinging @elastic/kibana-operations (Team:Operations)

@spalger spalger requested a review from a team July 15, 2020 21:11

spalger commented Jul 15, 2020

@spalger spalger merged commit b695d60 into elastic:master Jul 17, 2020
@spalger spalger deleted the implement/larger-baseline-capture-nodes branch July 17, 2020 00:13
spalger added a commit to spalger/kibana that referenced this pull request Jul 17, 2020
Co-authored-by: spalger <spalger@users.noreply.github.com>
spalger added a commit to spalger/kibana that referenced this pull request Jul 17, 2020
Co-authored-by: spalger <spalger@users.noreply.github.com>
spalger added a commit that referenced this pull request Jul 17, 2020
…#72210)

Co-authored-by: spalger <spalger@users.noreply.github.com>
spalger added a commit that referenced this pull request Jul 17, 2020
…#72209)

Co-authored-by: spalger <spalger@users.noreply.github.com>
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 17, 2020
* master: (214 commits)
  replacing hard coded links for ela.st (elastic#72240)
  skip flaky suite (elastic#60865)
  chore(NA): teardown dynamic dll plugin (elastic#72096)
  Register navLink actions for declared applications (elastic#72109)
  Fix value for process.hash.sha256 draggable (elastic#72142)
  Call setupIngest before fleet_install tests (elastic#72214)
  [Security Solution][Detections] Better toast errors (elastic#72205)
  skip flaky suite (elastic#64696)
  [Security Solution][Detections] Disable exceptions for Threshold and ML rules (elastic#72137)
  [Security Solution][Detections,Lists] Miscellaneous post-FF fixes (elastic#71990)
  [baseline/capture] use high-memory nodes with ramDisks (elastic#71894)
  skip flaky suite (elastic#77207)
  [Maps] Fix issue preventing TMS from rendering correctly (elastic#71946)
  using test_user with minimum privs (elastic#71988)
  Fixed Webhook connector doesn't retain added HTTP header settings (elastic#71924)
  [Ingest Manager] Do not show enrolling and unenrolling agents as online in agent counters (elastic#71921)
  [Maps] fix 'New Map' from getting added to recently accessed (elastic#72125)
  [Visualizations] Pass 'aggs' parameter to custom request handlers (elastic#71423)
  [Monitoring] Out of the box alert tweaks (elastic#71942)
  [ML] Fix datafeed start time is incorrect when the job has trailing empty buckets (elastic#71976)
  ...
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 17, 2020
* master: (55 commits)
  updates 'External alerts' tab text (elastic#72237)
  [Security Solution][Case] Fix connector's dropdown with conflicting requests (elastic#72037)
  replacing hard coded links for ela.st (elastic#72240)
  skip flaky suite (elastic#60865)
  chore(NA): teardown dynamic dll plugin (elastic#72096)
  Register navLink actions for declared applications (elastic#72109)
  Fix value for process.hash.sha256 draggable (elastic#72142)
  Call setupIngest before fleet_install tests (elastic#72214)
  [Security Solution][Detections] Better toast errors (elastic#72205)
  skip flaky suite (elastic#64696)
  [Security Solution][Detections] Disable exceptions for Threshold and ML rules (elastic#72137)
  [Security Solution][Detections,Lists] Miscellaneous post-FF fixes (elastic#71990)
  [baseline/capture] use high-memory nodes with ramDisks (elastic#71894)
  skip flaky suite (elastic#77207)
  [Maps] Fix issue preventing TMS from rendering correctly (elastic#71946)
  using test_user with minimum privs (elastic#71988)
  Fixed Webhook connector doesn't retain added HTTP header settings (elastic#71924)
  [Ingest Manager] Do not show enrolling and unenrolling agents as online in agent counters (elastic#71921)
  [Maps] fix 'New Map' from getting added to recently accessed (elastic#72125)
  [Visualizations] Pass 'aggs' parameter to custom request handlers (elastic#71423)
  ...
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 17, 2020
…feature-privileges

* alerting/consumer-based-rbac: (56 commits)
  take into account which features available in the active space
  updates 'External alerts' tab text (elastic#72237)
  [Security Solution][Case] Fix connector's dropdown with conflicting requests (elastic#72037)
  replacing hard coded links for ela.st (elastic#72240)
  skip flaky suite (elastic#60865)
  chore(NA): teardown dynamic dll plugin (elastic#72096)
  Register navLink actions for declared applications (elastic#72109)
  Fix value for process.hash.sha256 draggable (elastic#72142)
  Call setupIngest before fleet_install tests (elastic#72214)
  [Security Solution][Detections] Better toast errors (elastic#72205)
  skip flaky suite (elastic#64696)
  [Security Solution][Detections] Disable exceptions for Threshold and ML rules (elastic#72137)
  [Security Solution][Detections,Lists] Miscellaneous post-FF fixes (elastic#71990)
  [baseline/capture] use high-memory nodes with ramDisks (elastic#71894)
  skip flaky suite (elastic#77207)
  [Maps] Fix issue preventing TMS from rendering correctly (elastic#71946)
  using test_user with minimum privs (elastic#71988)
  Fixed Webhook connector doesn't retain added HTTP header settings (elastic#71924)
  [Ingest Manager] Do not show enrolling and unenrolling agents as online in agent counters (elastic#71921)
  [Maps] fix 'New Map' from getting added to recently accessed (elastic#72125)
  ...
@spalger spalger added the v7.8.1 label Jul 22, 2020
spalger added a commit to spalger/kibana that referenced this pull request Jul 22, 2020
Co-authored-by: spalger <spalger@users.noreply.github.com>
spalger added a commit that referenced this pull request Jul 22, 2020
… and [pipeline/commitStatus] update commit status in baseline-capture job (#72366) (#72981)

* [baseline/capture] use high-memory nodes with ramDisks (#71894)

Co-authored-by: spalger <spalger@users.noreply.github.com>

* [pipeline/commitStatus] update commit status in baseline-capture job (#72366)

Co-authored-by: spalger <spalger@users.noreply.github.com>
(cherry picked from commit a221e04)

Co-authored-by: spalger <spalger@users.noreply.github.com>
Labels: release_note:skip, Team:Operations, v7.8.1, v7.9.0, v7.10.0, v8.0.0
4 participants