Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0045-subscribe-update / do_test_regex_many_mock sporadic (but frequent) failure #4259

Open
5 of 7 tasks
eig opened this issue Apr 18, 2023 · 0 comments
Open
5 of 7 tasks

Comments

@eig
Copy link

eig commented Apr 18, 2023

Read the FAQ first: https://github.com/edenhill/librdkafka/wiki/FAQ

Do NOT create issues for questions, use the discussion forum: https://github.com/edenhill/librdkafka/discussions

Description

The test do_test_regex_many_mock fails frequently on Linux and Mac. At least 50% of the time.
The error is always the same:

Test "0045_subscribe_update_mock (do_test_regex_many_mock:379: range with 50 topics)" failed at test.c:4104:test_consumer_poll_no_msgs() at Mon Apr 17 18:46:25 2023:

^topic_.* [0] error (offset -1001): Subscribed topic not available: ^topic_.*: Broker: Unknown topic or partition

I'm not sure of the cause, though I suspect some kind of timing / synchronization issue. I've tried several workarounds:

  1. Setting topic.metadata.propagation.max.ms to 60,000
  2. Decreasing topic.metadata.refresh.internal.ms to 3,000
  3. Sleeping for 1, 2 and 3 seconds right after calling "test_consumer_subscribe".
  4. Calling test_wait_metadata_update in the test loop in do_test_regex_many_mock before calling test_consumer_poll_no_msgs.

Of these, only #4 seems to have any effect -- it allows the test to pass on Linux, but the test still fails on Mac.

How to reproduce

Build librdkafka out of the box, no modifications, on Debian:
% uname -a
Linux ah-pwebb-l 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux

Also fails on MacOS, Monterey 12.1.

Mostly passes on Windows, apparently.

Error message in context (with addition of test_wait_metadata_update) :

[0045_subscribe_update_mock / 0.000s] ================= Running test 0045_subscribe_update_mock =================
[0045_subscribe_update_mock / 0.000s] ==== Stats written to file stats_0045_subscribe_update_mock_2328513349970067765.json ====
[0045_subscribe_update_mock / 0.000s] �[35m[ do_test_regex_many_mock:379: range with 50 topics ]
�[0m%5|1681771585.177|CONFWARN|MOCK#producer-35| [thrd:app]: No bootstrap.servers configured: client will not be able to connect to Kafka cluster
�[36m[0045_subscribe_update_mock / 0.004s] Setting test timeout to 1050s * 3.5
�[0m%4|1681771585.181|CONFWARN|0045_subscribe_update_mock#consumer-36| [thrd:app]: Configuration property enable.idempotence is a producer property and will be ignored by this consumer instance
%4|1681771585.182|CONFWARN|0045_subscribe_update_mock#consumer-36| [thrd:app]: Configuration property request.timeout.ms is a producer property and will be ignored by this consumer instance
�[36m[0045_subscribe_update_mock / 0.009s] Created kafka instance 0045_subscribe_update_mock#consumer-36
[0045_subscribe_update_mock / 0.011s] Creating topic topic_0 with 1 partitions
[0045_subscribe_update_mock / 0.014s] rd_kafka_mock_topic_create(mcluster, topic, partitions, 1): duration 0.020ms
[0045_subscribe_update_mock / 0.017s] Waiting for up to 15000ms for metadata update
[0025_timers / 5.555s] rd_kafka_poll(): duration 108.789ms
[0045_subscribe_update_mock / 0.020s] Metadata verification succeeded: 1 desired topics seen, 0 undesired topics not seen
[0045_subscribe_update_mock / 0.026s] All expected topics (not?) seen in metadata
[0045_subscribe_update_mock / 0.029s] METADATA.WAIT: duration 8.876ms
[0045_subscribe_update_mock / 0.033s] POLL: not expecting any messages for 300ms
[0045_subscribe_update_mock / 0.040s] TEST FAILURE

Test "0045_subscribe_update_mock (do_test_regex_many_mock:379: range with 50 topics)" failed at test.c:4104:test_consumer_poll_no_msgs() at Mon Apr 17 18:46:25 2023:

^topic_.* [0] error (offset -1001): Subscribed topic not available: ^topic_.*: Broker: Unknown topic or partition
�[36m[

/ 7.343s] Too many tests running (3 >= 3): postponing 0053_stats_timing start...

test_wait_metadata_update call in context (placed right before call to test_consumer_poll_no_msgs):

                // Verify that topics are reported by metadata 
                exp_mdtopics[i].topic = topic;
                exp_mdtopics[i].partition_cnt = 0;
                test_wait_metadata_update(rk, exp_mdtopics, i+1, NULL, 0,
                                  15 * 1000);

                test_consumer_poll_no_msgs("POLL", rk, 0,
                                           lots_of_topics ? 100 : MAX_TOPICS);

IMPORTANT: Always try to reproduce the issue on the latest released version (see https://github.com/edenhill/librdkafka/releases), if it can't be reproduced on the latest version the issue has been fixed.

Checklist

IMPORTANT: We will close issues where the checklist has not been completed.

Please provide the following information:

  • librdkafka version (release number or git tag): v2.1.0
  • Apache Kafka version: N/A, mock server used by this test
  • librdkafka client configuration: As configured in test suite: security.protocol="plaintext", bootstrap.servers="", partition.assignment.strategy="range", topic.metadata.propagation.max.ms=60000
  • Operating system: MacOS Monterey 12.1 Darwin Kernel Version 21.2.0
  • Provide logs (with debug=.. as necessary) from librdkafka
  • Provide broker log excerpts
  • Critical issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant