x-pack.metricbeat.module.activemq.test_activemq tests are failing with a timeout #35851

Open
cmacknz opened this issue Jun 21, 2023 · 3 comments
Labels
flaky-test: Unstable or unreliable test cases.
Team:Services: (Deprecated) Label for the former Integrations-Services team

Comments

@cmacknz
Member

cmacknz commented Jun 21, 2023

Flaky Test

Stack Trace

self = <test_activemq.ActiveMqTest_0 testMethod=test_queue_metrics_collected>

    @unittest.skipUnless(metricbeat.INTEGRATION_TESTS, 'integration test')
    def test_queue_metrics_collected(self):
>       self.verify_destination_metrics_collection('queue')

module/activemq/test_activemq.py:96: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
module/activemq/test_activemq.py:70: in verify_destination_metrics_collection
    conn.disconnect()
/opt/venv/lib/python3.11/site-packages/stomp/connect.py:185: in disconnect
    self.transport.stop()
/opt/venv/lib/python3.11/site-packages/stomp/transport.py:122: in stop
    self.__receiver_thread_exit_condition.wait()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <Condition(<unlocked _thread.RLock object owner=0 count=0 at 0x7f6f1d4eb980>, 0)>
timeout = None

    def wait(self, timeout=None):
        """Wait until notified or until a timeout occurs.
    
        If the calling thread has not acquired the lock when this method is
        called, a RuntimeError is raised.
    
        This method releases the underlying lock, and then blocks until it is
        awakened by a notify() or notify_all() call for the same condition
        variable in another thread, or until the optional timeout occurs. Once
        awakened or timed out, it re-acquires the lock and returns.
    
        When the timeout argument is present and not None, it should be a
        floating point number specifying a timeout for the operation in seconds
        (or fractions thereof).
    
        When the underlying lock is an RLock, it is not released using its
        release() method, since this may not actually unlock the lock when it
        was acquired multiple times recursively. Instead, an internal interface
        of the RLock class is used, which really unlocks it even when it has
        been recursively acquired several times. Another internal interface is
        then used to restore the recursion level when the lock is reacquired.
    
        """
        if not self._is_owned():
            raise RuntimeError("cannot wait on un-acquired lock")
        waiter = _allocate_lock()
        waiter.acquire()
        self._waiters.append(waiter)
        saved_state = self._release_save()
        gotit = False
        try:    # restore state no matter what (e.g., KeyboardInterrupt)
            if timeout is None:
>               waiter.acquire()
E               Failed: Timeout >90.0s

/usr/lib/python3.11/threading.py:320: Failed
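
The trace shows `disconnect()` blocked in `Condition.wait()` with `timeout = None`. That call only returns when another thread calls `notify()`, so here it is the 90 s pytest limit that ends the test. The following is a minimal sketch of that behavior using plain `threading`; it is not stomp.py's actual code, and `wait_for_exit` is a hypothetical helper. With a bounded wait, the call returns `False` instead of hanging when no notification ever arrives:

```python
import threading
from typing import Optional


def wait_for_exit(condition: threading.Condition, timeout: Optional[float]) -> bool:
    """Wait on the condition; return True if notified, False on timeout.

    Hypothetical helper for illustration only. With timeout=None this
    blocks until some other thread calls notify() on the condition.
    """
    with condition:  # wait() requires the lock to be held
        return condition.wait(timeout=timeout)


cond = threading.Condition()
# No thread ever calls cond.notify(), mimicking a receiver thread that
# never signals that it has exited.
notified = wait_for_exit(cond, timeout=0.2)
print(notified)  # False: the wait timed out instead of blocking forever
```

Whether a bounded wait is appropriate inside the test itself is a separate question. The sketch only shows why an unbounded wait turns a missing notification into a test timeout.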
@cmacknz added the flaky-test and Team:Services labels Jun 21, 2023
@elasticmachine
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@cmacknz
Member Author

cmacknz commented Jun 21, 2023

Possibly this relates to 62374dd

The odd thing is that there were no changes to activemq at all. If you start only the activemq container it never becomes healthy.

cd x-pack/metricbeat/module/activemq
docker-compose up

The health check that is failing is:

HEALTHCHECK --interval=1s --retries=90 CMD nc -w 1 -v 127.0.0.1 $ACTIVEMQ_STOMP </dev/null && \
nc -w 1 -v 127.0.0.1 $ACTIVEMQ_REST </dev/null

If you run this in the container started manually it fails with no output.
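
Because `nc` gives no output here, a more verbose probe may help narrow down which port is not accepting connections. This is a rough sketch of an equivalent check in Python, assuming the two variables hold TCP port numbers. The fallback values 61613 and 8161 are assumptions (ActiveMQ's usual STOMP and web console ports), and `port_is_open` and `healthcheck` are hypothetical names:

```python
import os
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Try a TCP connect with a timeout, roughly like `nc -w 1 host port`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:  # refused, timed out, unreachable, ...
        print(f"{host}:{port} not reachable: {exc!r}")
        return False


def healthcheck() -> bool:
    """Check the STOMP port, then the REST port, like the `&&` chain."""
    # Assumed defaults; the real values come from the container environment.
    stomp_port = int(os.environ.get("ACTIVEMQ_STOMP", "61613"))
    rest_port = int(os.environ.get("ACTIVEMQ_REST", "8161"))
    # all() short-circuits, so the REST port is only probed if STOMP passes.
    return all(port_is_open("127.0.0.1", p) for p in (stomp_port, rest_port))


if __name__ == "__main__":
    print("healthy" if healthcheck() else "unhealthy")
```

Unlike the silent `nc` call, this prints the exception for the first port that fails, which should distinguish a refused connection from a timeout.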

@cmacknz
Member Author

cmacknz commented Jun 21, 2023

> Possibly this relates to 62374dd

Reverting 62374dd does not fix the problem.
