fix: flaky H2ScalaAllPersistenceIdsTest #199

Merged · 7 commits · May 30, 2024

Conversation

@Roiocam (Member) commented May 28, 2024

Resolves: #184

Link to the investigation: #184 (comment)

@@ -119,7 +117,9 @@ class JdbcReadJournal(config: Config, configPath: String)(implicit val system: E
  override def persistenceIds(): Source[String, NotUsed] =
    Source
      .repeat(0)
      .flatMapConcat(_ => delaySource.flatMapConcat(_ => currentPersistenceIds()))
@Roiocam (Member, Author) commented May 28, 2024

I rolled back this change and increased readJournalConfig.refreshInterval to 10x the original value, but the test was still flaky. I then made a change like this:

e91daa0

It seems that throttle is more effective than Source.tick() here, which is also reflected in how frequently the query log line is printed.
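
For illustration, here is a minimal, runnable sketch of throttle-based flow limiting in the style of that change; the object name, the 100-millisecond refreshInterval and the stand-in currentPersistenceIds() are placeholders for this example, not the project's real code or configuration.

    import org.apache.pekko.NotUsed
    import org.apache.pekko.actor.ActorSystem
    import org.apache.pekko.stream.scaladsl.{ Sink, Source }
    import scala.concurrent.duration._

    object ThrottleSketch extends App {
      implicit val system: ActorSystem = ActorSystem("throttle-sketch")
      import system.dispatcher

      // Placeholder for readJournalConfig.refreshInterval.
      val refreshInterval = 100.millis

      // Stand-in for currentPersistenceIds(): a short, finite "query result".
      def currentPersistenceIds(): Source[String, NotUsed] =
        Source(List("pid-1", "pid-2", "pid-3"))

      // throttle limits how often the query is re-run: at most one
      // materialization of currentPersistenceIds() per refreshInterval.
      val ids: Source[String, NotUsed] =
        Source
          .repeat(0)
          .throttle(1, refreshInterval)
          .flatMapConcat(_ => currentPersistenceIds())

      // Bound the demo so it terminates: 9 elements = 3 query executions.
      ids.take(9).runWith(Sink.foreach(println)).onComplete(_ => system.terminate())
    }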

@Roiocam (Member, Author)

With the throttle introduced in this PR, the actual number of db executions during the test is only around 10, as the log counts show:

execute query, total count: 10
2024-05-29 08:52:18,885 - org.apache.pekko.actor.CoordinatedShutdown -> INFO [test-pekko.actor.default-dispatcher-5] [ test CoordinatedShutdown - Running CoordinatedShutdown with reason [ActorSystemTerminateReason]
execute query, total count: 11
2024-05-29 08:53:01,234 - org.apache.pekko.actor.CoordinatedShutdown -> INFO [test-pekko.actor.default-dispatcher-5] [ test CoordinatedShutdown - Running CoordinatedShutdown with reason [ActorSystemTerminateReason]

However, with the original delaySource, the number of db executions was typically above 1,000, roughly 100x that of the new approach. I suspect the original approach never actually limited the flow.

execute query, total count: 1042
2024-05-29 08:53:40,691 - org.apache.pekko.actor.CoordinatedShutdown -> INFO [test-pekko.actor.default-dispatcher-4] [ test CoordinatedShutdown - Running CoordinatedShutdown with reason [ActorSystemTerminateReason]
execute query, total count: 1316
2024-05-29 08:55:42,865 - org.apache.pekko.actor.CoordinatedShutdown -> INFO [test-pekko.actor.default-dispatcher-6] [ test CoordinatedShutdown - Running CoordinatedShutdown with reason [ActorSystemTerminateReason]
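
As a hedged way to reproduce this observation outside the test suite, the sketch below (the AtomicInteger counter and the stand-in query are assumptions of the example, not project code) re-creates the original pattern: because delaySource uses a 0-second initial delay and is re-materialized for every element of Source.repeat(0), the nested query runs again almost immediately, far faster than once per refreshInterval.

    import java.util.concurrent.atomic.AtomicInteger
    import org.apache.pekko.NotUsed
    import org.apache.pekko.actor.ActorSystem
    import org.apache.pekko.stream.scaladsl.{ Sink, Source }
    import scala.concurrent.duration._

    object DelaySourceSketch extends App {
      implicit val system: ActorSystem = ActorSystem("delay-source-sketch")
      import system.dispatcher

      val refreshInterval = 1.second
      val queryCount = new AtomicInteger(0)

      // Stand-in for currentPersistenceIds(); the counter mimics the
      // "execute query, total count: N" log line quoted above.
      def currentPersistenceIds(): Source[String, NotUsed] =
        Source.single(s"query-${queryCount.incrementAndGet()}")

      // Original pattern: initialDelay = 0, so every re-materialization of
      // delaySource emits its single tick almost immediately, and take(1)
      // then completes it, so the nested query runs again right away.
      val delaySource = Source.tick(0.seconds, refreshInterval, 0).take(1)
      val ids =
        Source
          .repeat(0)
          .flatMapConcat(_ => delaySource.flatMapConcat(_ => currentPersistenceIds()))

      val started = System.nanoTime()
      ids.take(50).runWith(Sink.ignore).onComplete { _ =>
        val elapsedMs = (System.nanoTime() - started) / 1000000
        println(s"ran ${queryCount.get()} queries in ${elapsedMs}ms, far faster than one per refreshInterval")
        system.terminate()
      }
    }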


@pjfanning (Contributor) left a comment

lgtm - we might need to backport this to the 1.0.x branch because #182 was also already backported; the delaySource that you removed has already caused us problems.

@Roiocam Roiocam marked this pull request as draft May 29, 2024 02:20
@@ -94,7 +94,7 @@ class JdbcReadJournal(config: Config, configPath: String)(implicit val system: E
      JournalSequenceActor.props(readJournalDao, readJournalConfig.journalSequenceRetrievalConfiguration),
      s"$configPath.pekko-persistence-jdbc-journal-sequence-actor")
  private val delaySource =
-    Source.tick(0.seconds, readJournalConfig.refreshInterval, 0).take(1)
+    Source.tick(readJournalConfig.refreshInterval, readJournalConfig.refreshInterval, 0).take(1)
@Roiocam (Member, Author)

I did some local testing and solved the issue of delaySource not truly limiting the flow with the following replacement.

    Source
      .repeat(0)
+      .buffer(1, OverflowStrategy.backpressure)
+      .throttle(1, readJournalConfig.refreshInterval)
+      .flatMapConcat(_ => currentPersistenceIds())
-     .flatMapConcat(_ => delaySource.flatMapConcat(_ => currentPersistenceIds()))

However, it may introduce another problem: repeat + buffer + throttle may cause the stream to never end (I guess because there is always a buffered element waiting to be processed).

In conclusion, I spent some time investigating and found that in persistenceIds(), each execution of flatMapConcat(_ => delaySource.flatMapConcat(_ => currentPersistenceIds())) actually materializes a new delaySource, because take(1) completes the previous one after a single tick.

Removing take(1), or changing the initialDelay parameter of tick() to refreshInterval, achieves the desired flow-control effect.
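
For comparison, here is a small sketch of the second option, matching the diff above that changes tick()'s initialDelay to refreshInterval; the stand-in query and the interval value are placeholders. Each re-materialized delaySource now waits a full refreshInterval before emitting its single tick, so the nested query runs at most once per interval.

    import org.apache.pekko.NotUsed
    import org.apache.pekko.actor.ActorSystem
    import org.apache.pekko.stream.scaladsl.{ Sink, Source }
    import scala.concurrent.duration._

    object FixedDelaySourceSketch extends App {
      implicit val system: ActorSystem = ActorSystem("fixed-delay-source-sketch")
      import system.dispatcher

      // Placeholder for readJournalConfig.refreshInterval.
      val refreshInterval = 1.second

      // Stand-in for the real DB query.
      def currentPersistenceIds(): Source[String, NotUsed] =
        Source.single("pid-1")

      // Fixed pattern from the diff above: initialDelay = refreshInterval, so
      // every re-materialized delaySource waits a full interval before its
      // single tick, and the nested query runs at most once per interval.
      val delaySource = Source.tick(refreshInterval, refreshInterval, 0).take(1)
      val ids =
        Source
          .repeat(0)
          .flatMapConcat(_ => delaySource.flatMapConcat(_ => currentPersistenceIds()))

      // Five elements should now take roughly five refreshIntervals to arrive.
      ids.take(5).runWith(Sink.foreach(println)).onComplete(_ => system.terminate())
    }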

@Roiocam (Member, Author)

@He-Pin can you take a look? Thanks.

@Roiocam Roiocam marked this pull request as ready for review May 29, 2024 02:55
@Roiocam (Member, Author) commented May 30, 2024

Let's merge this and then move on to the release.

@Roiocam Roiocam merged commit 0d0a85d into apache:main May 30, 2024
23 checks passed
@Roiocam Roiocam deleted the flaky-test-fix branch May 30, 2024 06:10
Roiocam added a commit to Roiocam/incubator-pekko-persistence-jdbc that referenced this pull request May 30, 2024
* fix: flaky H2ScalaAllPersistenceIdsTest

* use Shaping ThrottleMode

* rollback to original stream ways...

* slow down the db query

* fix delaySource flaky

* eager fetch

* not need config override
Development

Successfully merging this pull request may close these issues.

flaky should find persistenceIds for actors in H2ScalaAllPersistenceIdsTest
4 participants