Backport high disk utilization fix #9206

Merged
merged 7 commits into from
Dec 7, 2017


jwilder

@jwilder jwilder commented Dec 7, 2017

Backport #9204

Required for all non-trivial PRs
  • Rebased/mergable
  • Tests pass
  • CHANGELOG.md updated
  • Sign CLA (if not already signed)
Required only if applicable

You can erase any checkboxes below this note if they are not applicable to your Pull Request.

  • InfluxQL Spec updated
  • Provide example syntax
  • Update man page when modifying a command
  • Config changes: update sample config (etc/config.sample.toml), server NewDemoConfig method, and Diagnostics methods reporting config settings, if necessary
  • InfluxData Documentation: issue filed or pull request submitted <link to issue or pull request>

O_SYNC was added when writing TSM files to fix an issue where the
final fsync at the end caused the process to stall.  This ended up
increasing disk utilization too much, so this change switches to
multiple fsyncs while writing the TSM file instead of O_SYNC or one
large fsync at the end.

With the recent changes to compactions and snapshotting, the current
default snapshot size can create lots of small level 1 TSM files.  This
increases the default in order to create larger level 1 files and
lower disk utilization.

The default max-concurrent-compactions setting allows up to 50%
of cores to be used for compactions.  When the number of cores is
high (>8), this can lead to high disk utilization.  Capping it at
4, combined with larger snapshot sizes, seems to keep the compaction
backlog reasonable and not tax the disks as much.  Systems with lots
of IOPS, RAM and CPU cores may want to increase these.
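
The default described above can be sketched as follows. This is an assumption-laden illustration rather than the code from this PR: the function name is hypothetical, and it only models "50% of cores, at least 1, capped at 4".

```go
package main

// defaultMaxConcurrentCompactions returns an illustrative default for the
// number of concurrent compactions given a core count: half the cores,
// never less than 1, and never more than 4. The function name is
// hypothetical; the 50% rule and the cap of 4 come from the description.
func defaultMaxConcurrentCompactions(cores int) int {
	const limitCap = 4

	n := cores / 2
	if n < 1 {
		n = 1
	}
	if n > limitCap {
		n = limitCap
	}
	return n
}
```

For example, a 6-core machine would get 3 concurrent compactions, while 16-core and 64-core machines would both get 4 unless the setting is raised explicitly.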

This runs the scheduler every 5s instead of every 1s and reduces
the scope of a level 1 plan.

The disk-based temp index for writing a TSM file was used for
all compactions other than snapshot compactions.  That meant it was
used even for smaller compactions that would not use much memory.
An unintended side effect of this is higher disk IO when copying
the index to the final file.

This switches when to use the disk-based index based on the estimated
size of the new index that will be written.  This isn't exact, but it
seems to kick in at higher cardinality and larger compactions, when it
is necessary to avoid OOMs.
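
A minimal sketch of that kind of size-based decision is shown below. Everything in it is an assumption for illustration: the function names, the per-key and per-block byte costs, and the 64 MiB threshold are not taken from this PR.

```go
package main

// Illustrative constants only; none of these values come from the PR.
const (
	perKeyOverhead   = 5        // assumed fixed bytes stored alongside each series key
	perBlockEntry    = 28       // assumed bytes per block entry in the index
	diskIndexMinSize = 64 << 20 // hypothetical threshold (64 MiB)
)

// estimateIndexSize approximates the size in bytes of the index for a new
// TSM file from the number of series keys, their average length, and the
// total number of blocks. It is deliberately rough.
func estimateIndexSize(keys, avgKeyLen, blocks int64) int64 {
	return keys*(avgKeyLen+perKeyOverhead) + blocks*perBlockEntry
}

// useDiskIndex reports whether the estimated index is large enough that a
// disk-based temporary index is worth its extra IO; smaller indexes stay
// in memory.
func useDiskIndex(estimatedBytes int64) bool {
	return estimatedBytes >= diskIndexMinSize
}
```

With these made-up numbers, a compaction over 1,000 keys and 2,000 blocks would keep its index in memory, while one over 5 million keys and 10 million blocks would switch to the disk-based index.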
@jwilder jwilder added this to the 1.4.3 milestone Dec 7, 2017
@jwilder jwilder requested a review from stuartcarnie December 7, 2017 15:51
@ghost ghost assigned jwilder Dec 7, 2017
@ghost ghost added the review label Dec 7, 2017

@stuartcarnie stuartcarnie left a comment


👍

@jwilder jwilder merged commit 50063f9 into 1.4 Dec 7, 2017
@ghost ghost removed the review label Dec 7, 2017
@jwilder jwilder deleted the jw-14-backport branch December 7, 2017 17:06
@jwilder jwilder mentioned this pull request Dec 13, 2017
4 tasks