Skip to content

KAFKA-9373: Reduce shutdown time by avoiding unnecessary loading of indexes#8346

Merged
ijuma merged 4 commits intoapache:trunkfrom
ijuma:kafka-9373-improve-shutdown-perf-lazy-offset
Mar 26, 2020
Merged

KAFKA-9373: Reduce shutdown time by avoiding unnecessary loading of indexes#8346
ijuma merged 4 commits intoapache:trunkfrom
ijuma:kafka-9373-improve-shutdown-perf-lazy-offset

Conversation

@ijuma
Copy link
Member

@ijuma ijuma commented Mar 25, 2020

KAFKA-7283 enabled lazy mmap on index files by initializing indices
on-demand rather than performing costly disk/memory operations when
creating all indices on broker startup. This helped reducing the startup
time of brokers. However, segment indices are still created on closing
segments, regardless of whether they need to be closed or not.

This is a cleaned up version of #7900, which was submitted by @efeg. It
eliminates unnecessary disk accesses and memory map operations while
deleting, renaming or closing offset and time indexes.

In a cluster with 31 brokers, where each broker has 13K to 20K segments,
@efeg and team observed up to 2 orders of magnitude faster LogManager
shutdown times - i.e. dropping the LogManager shutdown time of each
broker from 10s of seconds to 100s of milliseconds.

To avoid confusion between renameTo and setFile, I replaced the
latter with the more restricted updateParentDir` (it turns out that's
all we need).

Co-authored-by: Adem Efe Gencer agencer@linkedin.com
Co-authored-by: Ismael Juma ismael@juma.me.uk

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ijuma
Copy link
Member Author

ijuma commented Mar 25, 2020

@efeg In #7900, you also had included a change that:

"Prevents illegal accesses to underlying indices of a closed segment, which would lead to memory leaks due to recreation of the underlying memory mapped objects."

Under what scenario could this happen, have you seen it in production? Or is it more of a preventative measure to catch potential bugs?

Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ijuma : Thanks for the PR. Looks good overall. Just a couple of comments below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Identation of *. Also there seems to be tab in front of AbstractIndex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we didn't do that before. Is there a reason that we should hide this exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may call renameTo when the index file has not been created yet with this new approach. So we catch this case and ignore it. We could alternatively check if the file exists before calling atomicMoveWithFallback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. That's true for the active segment.

@ijuma
Copy link
Member Author

ijuma commented Mar 25, 2020

Tests passed.

ijuma and others added 2 commits March 25, 2020 20:56
…ndexes

KAFKA-7283 enabled lazy mmap on index files by initializing indices
on-demand rather than performing costly disk/memory operations when
creating all indices on broker startup. This helped reducing the startup
time of brokers. However, segment indices are still created on closing
segments, regardless of whether they need to be closed or not.

This is a cleaned up version of apache#7900, which was submitted by @efeg. It
eliminates unnecessary disk accesses and memory map operations while
deleting, renaming or closing offset and time indexes.

In a cluster with 31 brokers, where each broker has 13K to 20K segments,
@efeg and team observed up to 2 orders of magnitude faster LogManager
shutdown times - i.e. dropping the LogManager shutdown time of each
broker from 10s of seconds to 100s of milliseconds.

To avoid confusion between `renameTo` and `setFile`, I replaced the
latter with the more restricted updateParentDir` (it turns out that's
all we need).

Co-authored-by: Adem Efe Gencer <agencer@linkedin.com>
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
@ijuma ijuma force-pushed the kafka-9373-improve-shutdown-perf-lazy-offset branch from bf286c1 to 06c44ac Compare March 26, 2020 03:56
Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ijuma : Thanks for the updated PR. LGTM

@ijuma ijuma force-pushed the kafka-9373-improve-shutdown-perf-lazy-offset branch from 423a329 to f3fe9f7 Compare March 26, 2020 06:42
@ijuma
Copy link
Member Author

ijuma commented Mar 26, 2020

One known flake in one job:

kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup

2 in the other:

kafka.api.PlaintextConsumerTest.testLowMaxFetchSizeForRequestAndPartition
kafka.api.SaslSslAdminIntegrationTest.testCreateDeleteTopics

Checked the failures, not related. Merging to trunk.

@ijuma ijuma merged commit 222726d into apache:trunk Mar 26, 2020
@ijuma ijuma deleted the kafka-9373-improve-shutdown-perf-lazy-offset branch March 26, 2020 11:26
@efeg
Copy link
Contributor

efeg commented Mar 26, 2020

@efeg In #7900, you also had included a change that:

"Prevents illegal accesses to underlying indices of a closed segment, which would lead to memory leaks due to recreation of the underlying memory mapped objects."

Under what scenario could this happen, have you seen it in production? Or is it more of a preventative measure to catch potential bugs?

@ijuma I saw this in a unit test (please see LogSegmentTest#testTruncateEmptySegment), but not in production. Hence, this change is preventative.

-- Thanks for this PR!

efeg pushed a commit to linkedin/kafka that referenced this pull request Jun 1, 2020
…ng unnecessary loading of indexes (apache#8346)

TICKET =
LI_DESCRIPTION =

KAFKA-7283 enabled lazy mmap on index files by initializing indices
on-demand rather than performing costly disk/memory operations when
creating all indices on broker startup. This helped reducing the startup
time of brokers. However, segment indices are still created on closing
segments, regardless of whether they need to be closed or not.

This is a cleaned up version of apache#7900, which was submitted by @efeg. It
eliminates unnecessary disk accesses and memory map operations while
deleting, renaming or closing offset and time indexes.

In a cluster with 31 brokers, where each broker has 13K to 20K segments,
@efeg and team observed up to 2 orders of magnitude faster LogManager
shutdown times - i.e. dropping the LogManager shutdown time of each
broker from 10s of seconds to 100s of milliseconds.

To avoid confusion between `renameTo` and `setFile`, I replaced the
latter with the more restricted updateParentDir` (it turns out that's
all we need).

Reviewers: Jun Rao <junrao@gmail.com>, Andrew Choi <a24choi@edu.uwaterloo.ca>

Co-authored-by: Adem Efe Gencer <agencer@linkedin.com>
Co-authored-by: Ismael Juma <ismael@juma.me.uk>

EXIT_CRITERIA = HASH [222726d]
gitlw pushed a commit to linkedin/kafka that referenced this pull request Jun 12, 2020
…ng unnecessary loading of indexes (apache#8346)

TICKET =
LI_DESCRIPTION =

KAFKA-7283 enabled lazy mmap on index files by initializing indices
on-demand rather than performing costly disk/memory operations when
creating all indices on broker startup. This helped reducing the startup
time of brokers. However, segment indices are still created on closing
segments, regardless of whether they need to be closed or not.

This is a cleaned up version of apache#7900, which was submitted by @efeg. It
eliminates unnecessary disk accesses and memory map operations while
deleting, renaming or closing offset and time indexes.

In a cluster with 31 brokers, where each broker has 13K to 20K segments,
@efeg and team observed up to 2 orders of magnitude faster LogManager
shutdown times - i.e. dropping the LogManager shutdown time of each
broker from 10s of seconds to 100s of milliseconds.

To avoid confusion between `renameTo` and `setFile`, I replaced the
latter with the more restricted updateParentDir` (it turns out that's
all we need).

Reviewers: Jun Rao <junrao@gmail.com>, Andrew Choi <a24choi@edu.uwaterloo.ca>

Co-authored-by: Adem Efe Gencer <agencer@linkedin.com>
Co-authored-by: Ismael Juma <ismael@juma.me.uk>

EXIT_CRITERIA = HASH [222726d]
gitlw pushed a commit to linkedin/kafka that referenced this pull request Jun 13, 2020
…ng unnecessary loading of indexes (apache#8346)

TICKET =
LI_DESCRIPTION =

KAFKA-7283 enabled lazy mmap on index files by initializing indices
on-demand rather than performing costly disk/memory operations when
creating all indices on broker startup. This helped reducing the startup
time of brokers. However, segment indices are still created on closing
segments, regardless of whether they need to be closed or not.

This is a cleaned up version of apache#7900, which was submitted by @efeg. It
eliminates unnecessary disk accesses and memory map operations while
deleting, renaming or closing offset and time indexes.

In a cluster with 31 brokers, where each broker has 13K to 20K segments,
@efeg and team observed up to 2 orders of magnitude faster LogManager
shutdown times - i.e. dropping the LogManager shutdown time of each
broker from 10s of seconds to 100s of milliseconds.

To avoid confusion between `renameTo` and `setFile`, I replaced the
latter with the more restricted updateParentDir` (it turns out that's
all we need).

Reviewers: Jun Rao <junrao@gmail.com>, Andrew Choi <a24choi@edu.uwaterloo.ca>

Co-authored-by: Adem Efe Gencer <agencer@linkedin.com>
Co-authored-by: Ismael Juma <ismael@juma.me.uk>

EXIT_CRITERIA = HASH [222726d]
lmr3796 pushed a commit to lmr3796/kafka that referenced this pull request Nov 19, 2021
TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- ~~Avoid appending to the time index during shutdown if the time index has not yet be initialized~~
  This is covered in apache#8346 and apache#10960

EXIT_CRITERIA = TICKET [KAFKA-8667, KAFKA-8668]

The patch is a sqaush of the 2 commits:

== This is the 1st commit [155b4f8] ==

[LI-HOTFIX] Reduce lock retention and improve broker shutdown time:

TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- Avoid appending to the time index during shutdown if the time index has not yet be initialized

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee

== This is the commit #2 [c53fffd] ==

[LI-HOTFIX] Update fetcher thread idle flag in addPartitions

TICKET = KAFKA-8667
LI_DESCRIPTION =

This patch fixes in bug introduced by “[LI-HOTFIX] Reduce lock retention and improve broker shutdown time” HOTFIX where the fetcher thread idle flag is not set in addPartitions, which can cause idle fetcher thread not shutdown in time.

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee
lmr3796 pushed a commit to linkedin/kafka that referenced this pull request Jan 4, 2022
TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- ~~Avoid appending to the time index during shutdown if the time index has not yet be initialized~~
  This is covered in apache#8346 and apache#10960

EXIT_CRITERIA = TICKET [KAFKA-8667, KAFKA-8668]

The patch is a sqaush of the 2 commits:

== This is the 1st commit [155b4f8] ==

[LI-HOTFIX] Reduce lock retention and improve broker shutdown time:

TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- Avoid appending to the time index during shutdown if the time index has not yet be initialized

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee

== This is the commit #2 [c53fffd] ==

[LI-HOTFIX] Update fetcher thread idle flag in addPartitions

TICKET = KAFKA-8667
LI_DESCRIPTION =

This patch fixes in bug introduced by “[LI-HOTFIX] Reduce lock retention and improve broker shutdown time” HOTFIX where the fetcher thread idle flag is not set in addPartitions, which can cause idle fetcher thread not shutdown in time.

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee
lmr3796 pushed a commit to lmr3796/kafka that referenced this pull request Feb 9, 2022
TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- ~~Avoid appending to the time index during shutdown if the time index has not yet be initialized~~
  This is covered in apache#8346 and apache#10960

EXIT_CRITERIA = TICKET [KAFKA-8667, KAFKA-8668]

The patch is a sqaush of the 2 commits:

== This is the 1st commit [155b4f8] ==

[LI-HOTFIX] Reduce lock retention and improve broker shutdown time:

TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- Avoid appending to the time index during shutdown if the time index has not yet be initialized

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee

== This is the commit #2 [c53fffd] ==

[LI-HOTFIX] Update fetcher thread idle flag in addPartitions

TICKET = KAFKA-8667
LI_DESCRIPTION =

This patch fixes in bug introduced by “[LI-HOTFIX] Reduce lock retention and improve broker shutdown time” HOTFIX where the fetcher thread idle flag is not set in addPartitions, which can cause idle fetcher thread not shutdown in time.

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee
lmr3796 pushed a commit to lmr3796/kafka that referenced this pull request Mar 25, 2022
TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- ~~Avoid appending to the time index during shutdown if the time index has not yet be initialized~~
  This is covered in apache#8346 and apache#10960

EXIT_CRITERIA = TICKET [KAFKA-8667, KAFKA-8668]

The patch is a sqaush of the 2 commits:

== This is the 1st commit [155b4f8] ==

[LI-HOTFIX] Reduce lock retention and improve broker shutdown time:

TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- Avoid appending to the time index during shutdown if the time index has not yet be initialized

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee

== This is the commit #2 [c53fffd] ==

[LI-HOTFIX] Update fetcher thread idle flag in addPartitions

TICKET = KAFKA-8667
LI_DESCRIPTION =

This patch fixes in bug introduced by “[LI-HOTFIX] Reduce lock retention and improve broker shutdown time” HOTFIX where the fetcher thread idle flag is not set in addPartitions, which can cause idle fetcher thread not shutdown in time.

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee
lmr3796 pushed a commit to lmr3796/kafka that referenced this pull request Jun 2, 2022
TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- ~~Avoid appending to the time index during shutdown if the time index has not yet be initialized~~
  This is covered in apache#8346 and apache#10960

EXIT_CRITERIA = TICKET [KAFKA-8667, KAFKA-8668]

The patch is a sqaush of the 2 commits:

== This is the 1st commit [155b4f8] ==

[LI-HOTFIX] Reduce lock retention and improve broker shutdown time:

TICKET = [KAFKA-8667, KAFKA-8668]
LI_DESCRIPTION =
- Avoid acquiring partitionMap lock in shutdownIdleFetcherThread
- Avoid appending to the time index during shutdown if the time index has not yet be initialized

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee

== This is the commit #2 [c53fffd] ==

[LI-HOTFIX] Update fetcher thread idle flag in addPartitions

TICKET = KAFKA-8667
LI_DESCRIPTION =

This patch fixes in bug introduced by “[LI-HOTFIX] Reduce lock retention and improve broker shutdown time” HOTFIX where the fetcher thread idle flag is not set in addPartitions, which can cause idle fetcher thread not shutdown in time.

RB=1431408
BUG=LIKAFKA-19361
G=Kafka-Code-Reviews
R=jkoshy,jonlee
A=jkoshy,jonlee
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants