Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky-test: SchemaServiceTest.testSchemaRegistryMetrics #23457

Closed
1 of 2 tasks
lhotari opened this issue Oct 14, 2024 · 4 comments · Fixed by #23566
Closed
1 of 2 tasks

Flaky-test: SchemaServiceTest.testSchemaRegistryMetrics #23457

lhotari opened this issue Oct 14, 2024 · 4 comments · Fixed by #23566

Comments

@lhotari
Copy link
Member

lhotari commented Oct 14, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Example failure

https://github.com/apache/pulsar/actions/runs/11328656119/job/31502934109?pr=23442#step:11:1338

Exception stacktrace

  Error:  Tests run: 30, Failures: 1, Errors: 0, Skipped: 17, Time elapsed: 59.073 s <<< FAILURE! - in org.apache.pulsar.broker.service.schema.SchemaServiceTest
  Error:  org.apache.pulsar.broker.service.schema.SchemaServiceTest.testSchemaRegistryMetrics  Time elapsed: 0.053 s  <<< FAILURE!
  java.lang.AssertionError: expected [tenant/ns] but found [public/ns_73b1a31afce34671a5ddc48fe5ad7fc8]
  	at org.testng.Assert.fail(Assert.java:110)
  	at org.testng.Assert.failNotEquals(Assert.java:1577)
  	at org.testng.Assert.assertEqualsImpl(Assert.java:149)
  	at org.testng.Assert.assertEquals(Assert.java:131)
  	at org.testng.Assert.assertEquals(Assert.java:655)
  	at org.testng.Assert.assertEquals(Assert.java:665)
  	at org.apache.pulsar.broker.service.schema.SchemaServiceTest.testSchemaRegistryMetrics(SchemaServiceTest.java:184)
  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
  	at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:139)
  	at org.testng.internal.invokers.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:47)
  	at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:76)
  	at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:11)
  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
  	at java.base/java.lang.Thread.run(Thread.java:1583)

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@visxu
Copy link
Contributor

visxu commented Nov 5, 2024

Hi, Lari. @lhotari Could you please assign me the issue, I wanna contribute it. :)

@lhotari
Copy link
Member Author

lhotari commented Nov 5, 2024

Hi, Lari. @lhotari Could you please assign me the issue, I wanna contribute it. :)

@visxu Thanks for volunteering. This particular issue is tricky since it seems to be caused by lack of sufficient isolation between test runs. Prometheus metrics use static fields and there could be state left over from other executions in the same JVM. One simple solution for this case is to ignore the results from other namespaces on the line where the failure happens.

Collection<Metric> deleteLatency = metrics.get("pulsar_schema_del_ops_latency_count");
for (Metric metric : deleteLatency) {
Assert.assertEquals(metric.tags.get("namespace"), namespace);
Assert.assertTrue(metric.value > 0);
}
Collection<Metric> getLatency = metrics.get("pulsar_schema_get_ops_latency_count");
for (Metric metric : getLatency) {
Assert.assertEquals(metric.tags.get("namespace"), namespace);
Assert.assertTrue(metric.value > 0);
}
Collection<Metric> putLatency = metrics.get("pulsar_schema_put_ops_latency_count");
for (Metric metric : putLatency) {
Assert.assertEquals(metric.tags.get("namespace"), namespace);
Assert.assertTrue(metric.value > 0);
}

changing the asserts to if. The test is itself not great since it wouldn't fail if there aren't any metrics available so that issue should be addressed too. I'd recommend using AssertJ fluent assertions to handle the assertions.

something like this, perhaps

        assertThat(deleteLatency).anySatisfy(metric -> {
            Assert.assertEquals(metric.tags.get("namespace"), namespace);
            Assert.assertTrue(metric.value > 0);
        });

@lhotari
Copy link
Member Author

lhotari commented Nov 5, 2024

@visxu You can work on flaky test issues without someone assigning the GitHub issue specifically to you. We don't have many contributors working on these, so commenting on the issue is a good way to indicate to others that you are working on it. There will be unnecessary delays if you wait for the assignment from an Apache Pulsar committer.

@visxu
Copy link
Contributor

visxu commented Nov 5, 2024

Hi, Lari. @lhotari Could you please assign me the issue, I wanna contribute it. :)

@visxu Thanks for volunteering. This particular issue is tricky since it seems to be caused by lack of sufficient isolation between test runs. Prometheus metrics use static fields and there could be state left over from other executions in the same JVM. One simple solution for this case is to ignore the results from other namespaces on the line where the failure happens.

Collection<Metric> deleteLatency = metrics.get("pulsar_schema_del_ops_latency_count");
for (Metric metric : deleteLatency) {
Assert.assertEquals(metric.tags.get("namespace"), namespace);
Assert.assertTrue(metric.value > 0);
}
Collection<Metric> getLatency = metrics.get("pulsar_schema_get_ops_latency_count");
for (Metric metric : getLatency) {
Assert.assertEquals(metric.tags.get("namespace"), namespace);
Assert.assertTrue(metric.value > 0);
}
Collection<Metric> putLatency = metrics.get("pulsar_schema_put_ops_latency_count");
for (Metric metric : putLatency) {
Assert.assertEquals(metric.tags.get("namespace"), namespace);
Assert.assertTrue(metric.value > 0);
}

changing the asserts to if. The test is itself not great since it wouldn't fail if there aren't any metrics available so that issue should be addressed too. I'd recommend using AssertJ fluent assertions to handle the assertions.

something like this, perhaps

        assertThat(deleteLatency).anySatisfy(metric -> {
            Assert.assertEquals(metric.tags.get("namespace"), namespace);
            Assert.assertTrue(metric.value > 0);
        });

Thanks for your comments and suggestions.

visxu added a commit to vissxu/pulsar that referenced this issue Nov 6, 2024
Use `AssertJ` fluent assertions instead of `for` iterate.
visxu added a commit to vissxu/pulsar that referenced this issue Nov 6, 2024
Use `AssertJ` fluent assertions instead of `for` iterate.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants