fix: protocol publisher fails to report error #25

eliteprox · 2025-08-25T15:56:33Z

This pull request introduces significant improvements to error handling, background task management, and stream lifecycle events in the pytrickle library and its example usage. The changes focus on making error callbacks consistently asynchronous, improving background task cleanup, and ensuring that stream stop events are handled gracefully. Additionally, the protocol and client classes now propagate errors and shutdown events more reliably, and global exception handling for asyncio is improved to suppress expected errors during shutdown.

Error handling and callback improvements:

All error callbacks are now required to be asynchronous functions, simplifying error propagation and handling throughout the codebase. (pytrickle/__init__.py [1] pytrickle/client.py [2] [3] [4] [5])
The protocol and client now propagate protocol shutdown and subscription end events to the client via the error callback, and the client distinguishes between clean shutdown and error conditions. (pytrickle/client.py [1] [2] [3])
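
As a rough illustration of the async-only callback rule described above, the sketch below rejects synchronous callbacks up front and then awaits the callback for both a clean shutdown and an error. The `ErrorCallback` alias, `require_async`, the event names, and the convention that a missing exception means a clean shutdown are all assumptions made for this example, not pytrickle's actual API.

```python
import asyncio
import inspect
from typing import Awaitable, Callable, Optional

# Hypothetical signature for illustration; pytrickle's real callback type may differ.
ErrorCallback = Callable[[str, Optional[Exception]], Awaitable[None]]


def require_async(callback: ErrorCallback) -> ErrorCallback:
    """Reject plain functions so every caller can simply `await` the callback."""
    if not inspect.iscoroutinefunction(callback):
        raise TypeError("error callback must be declared with 'async def'")
    return callback


async def demo() -> list:
    seen = []

    async def on_error(kind: str, exc: Optional[Exception] = None) -> None:
        # Assumption: a missing exception signals a clean shutdown.
        seen.append((kind, "clean" if exc is None else "error"))

    callback = require_async(on_error)
    await callback("subscription_end", None)
    await callback("publisher", RuntimeError("connection lost"))
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Validating once at registration time means call sites never need to branch on whether the callback is a coroutine function.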

Background task management and cleanup:

Added unified tracking and cleanup of background tasks in TrickleComponent, with automatic removal and exception handling on task completion. (pytrickle/base.py [1] [2])
The example video processor now starts background tasks only when the event loop is running, and cleans them up when the stream stops, using the new on_stream_stop callback. (examples/process_video_example.py [1] [2] [3])
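
The tracking pattern described above can be sketched with a set of tasks and a done-callback. `TaskTracker` and its method names are hypothetical stand-ins written for this example; the real logic lives in TrickleComponent and may be structured differently.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class TaskTracker:
    """Hypothetical stand-in for the background task tracking in TrickleComponent."""

    def __init__(self) -> None:
        self._tasks: set = set()

    def track(self, coro) -> asyncio.Task:
        # Must be called while an event loop is running.
        task = asyncio.create_task(coro)
        self._tasks.add(task)  # strong reference keeps the task alive
        task.add_done_callback(self._on_done)
        return task

    def _on_done(self, task: asyncio.Task) -> None:
        self._tasks.discard(task)  # automatic removal on completion
        if task.cancelled():
            return
        exc = task.exception()  # retrieving it also marks the exception as handled
        if exc is not None:
            logger.error("Background task failed: %r", exc)

    async def cleanup(self) -> None:
        tasks = list(self._tasks)
        for task in tasks:
            task.cancel()
        # Wait for the cancellations to finish without re-raising them.
        await asyncio.gather(*tasks, return_exceptions=True)


async def demo() -> tuple:
    tracker = TaskTracker()
    tracker.track(asyncio.sleep(3600))  # long-running task, cancelled by cleanup()
    tracker.track(asyncio.sleep(0))     # finishes on its own and removes itself
    await asyncio.sleep(0.01)
    pending_before = len(tracker._tasks)
    await tracker.cleanup()
    return pending_before, len(tracker._tasks)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A stream-stop hook could then call `cleanup()` so that no task outlives the stream.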

Asyncio and shutdown robustness:

Introduced a global asyncio exception handler to suppress expected aiohttp connection reset errors during shutdown, reducing noise in logs. (pytrickle/base.py [1] pytrickle/protocol.py [2] [3])
Improved the data sending loop in the client to respond to both stop and error events, ensuring reliable shutdown and error handling. (pytrickle/client.py [1] [2])
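
A loop-level exception handler of the kind described above might look like the following sketch. It assumes the expected shutdown errors surface as the built-in `ConnectionResetError` (or a subclass); the function names are illustrative and the handler in pytrickle/base.py may filter on different criteria.

```python
import asyncio


def is_expected_shutdown_error(context: dict) -> bool:
    """Assumption: expected shutdown noise surfaces as ConnectionResetError."""
    return isinstance(context.get("exception"), ConnectionResetError)


def quiet_shutdown_handler(loop: asyncio.AbstractEventLoop, context: dict) -> None:
    if is_expected_shutdown_error(context):
        return  # swallow the expected error instead of logging it
    loop.default_exception_handler(context)  # everything else is reported as usual


def install(loop: asyncio.AbstractEventLoop) -> None:
    loop.set_exception_handler(quiet_shutdown_handler)
```

Delegating to `default_exception_handler` for everything else keeps unexpected failures visible in the logs.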

Protocol task error handling:

Protocol subscribe and publish tasks are now wrapped with a generic error-handling wrapper, ensuring that any exceptions are logged and propagated via error callbacks. (pytrickle/protocol.py [1] [2] [3])
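
A minimal sketch of such a wrapper, assuming an async error callback that takes a task name and the exception. `run_with_error_callback` is a hypothetical name chosen for this example, not the function in pytrickle/protocol.py.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def run_with_error_callback(name, coro, error_callback):
    """Await `coro`; log any failure and forward it to the async callback."""
    try:
        return await coro
    except asyncio.CancelledError:
        raise  # cancellation is part of normal shutdown, not an error
    except Exception as exc:
        logger.error("%s task failed: %r", name, exc)
        await error_callback(name, exc)


async def demo() -> list:
    reported = []

    async def on_error(name, exc):
        reported.append((name, type(exc).__name__))

    async def failing_publish():
        raise ConnectionError("publisher lost its channel")

    await run_with_error_callback("publish", failing_publish(), on_error)
    return reported


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Re-raising `CancelledError` keeps task cancellation working as usual while every other exception reaches the callback.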

These changes make the library more robust, easier to debug, and safer to use in production environments by improving error visibility and resource cleanup.

add missing `component_name` to pub/sub for trickle health state protocol: track background publisher task

…op callback to stream_processor

ad-astra-video · 2025-08-27T03:50:11Z

pytrickle/client.py

            logger.info("Stopping protocol due to client loops ending")
+
+            # Call the optional on_stream_stop callback before stopping protocol
+            if hasattr(self.frame_processor, 'on_stream_stop') and self.frame_processor.on_stream_stop:


frame_processor does not have on_stream_stop, right?

Correct, it's not an abstract method of frame_processor. It's appended by StreamProcessor on initialization like other callbacks registered to StreamProcessor.

StreamProcessor accepts an on_stream_stop callback as a parameter, similar to param_updater. _InternalFrameProcessor extends the abstract FrameProcessor class and stores the on_stream_stop callback as an attribute.

If we want to include it I think we should add it to FrameProcessor similar to the error_callback. It seems strange to call something from TrickleClient that is setup in a higher level abstraction.

Let me know if I am missing something here tho.

In this case, on_stream_stop is more of an event triggered by the client than a method called by the FrameProcessor. I have now added it as an abstract method so it is available to implement in frame processors (f347e41).

Client still needs to call on_stream_stop at this point in the protocol shutdown sequence. There is no other coordination with FrameProcessors currently afaik.

pytrickle/pytrickle/client.py

Lines 117 to 124 in f347e41

    if self.frame_processor.on_stream_stop:
        try:
            await self.frame_processor.on_stream_stop()
            logger.info("Stream stop callback executed successfully")
        except Exception as e:
            logger.error(f"Error in stream stop callback: {e}")

    await self.protocol.stop()

> It seems strange to call something from TrickleClient that is setup in a higher level abstraction.

Yeah, it sounds off because TrickleClient was originally intended as a class for interacting with the trickle protocol directly in a multimedia context. In a consumer context it's an internal component, so the term "client" is a bit misleading.

I think this will be clearer once we reorganize classes into package namespaces.


ad-astra-video · 2025-08-27T04:09:32Z

pytrickle/client.py

                try:
-                    await asyncio.wait_for(self.stop_event.wait(), timeout=self.send_data_interval)
-                    break  # Stop event was set, exit loop
+                    done, pending = await asyncio.wait(


WDYT about just waiting for stop_event here?

Will error_event cause channels to close down before this could execute one last time?

Good catch!

fixed in 2703cff

I moved the waiting for stop/error to a function. Can you test to confirm still same behavior?

I tested and it works great, no issues found in stopping the publisher. Ran a high-rate publisher test and got ~120 msg/s, which was expected. One message did fail; I did not catch why, but it was likely due to the message size limit from batching.

We could possibly improve this by allowing _wait_for_interval to raise its own error.
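
For readers following this thread, here is a rough sketch of what a helper that waits for either a stop or an error event, bounded by the send interval, could look like. It is not the code from 2703cff: the function name, parameters, and return convention are assumptions made for this example.

```python
import asyncio


async def wait_for_interval(stop_event: asyncio.Event,
                            error_event: asyncio.Event,
                            interval: float) -> bool:
    """Return True if stop or error was set before `interval` seconds elapsed."""
    waiters = [
        asyncio.create_task(stop_event.wait()),
        asyncio.create_task(error_event.wait()),
    ]
    try:
        done, _pending = await asyncio.wait(
            waiters, timeout=interval, return_when=asyncio.FIRST_COMPLETED
        )
        return bool(done)
    finally:
        # Cancel whichever waiter is still pending so no task leaks per iteration.
        for waiter in waiters:
            waiter.cancel()
        await asyncio.gather(*waiters, return_exceptions=True)


async def demo() -> tuple:
    stop, error = asyncio.Event(), asyncio.Event()
    timed_out = await wait_for_interval(stop, error, 0.01)  # neither event is set
    error.set()
    signalled = await wait_for_interval(stop, error, 1.0)   # error already set
    return timed_out, signalled


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Unlike `asyncio.wait_for`, `asyncio.wait` does not raise on timeout, so a send loop can treat an empty `done` set as "interval elapsed, send the next batch".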

…rror_event

…l_shutdown events to client


eliteprox linked an issue Aug 25, 2025 that may be closed by this pull request

runner error state when urls close down #19

Closed

eliteprox force-pushed the fix/trickle-pub-sub-component-name branch from ff709de to 0619a40 Compare August 25, 2025 16:16

eliteprox marked this pull request as ready for review August 25, 2025 16:16

eliteprox force-pushed the fix/trickle-pub-sub-component-name branch 2 times, most recently from 9cb577d to f134054 Compare August 25, 2025 18:51

eliteprox requested review from ad-astra-video and pschroedl August 25, 2025 21:34

eliteprox mentioned this pull request Aug 26, 2025

runner error state when urls close down #19

Closed

fix err handling in pub during client teardowm

108928c

add missing `component_name` to pub/sub for trickle health state protocol: track background publisher task

eliteprox force-pushed the fix/trickle-pub-sub-component-name branch from f134054 to 108928c Compare August 26, 2025 19:39

eliteprox added 2 commits August 26, 2025 16:42

fix publisher error notification to client, add optional on_stream_st…

109868f

…op callback to stream_processor

fix race condition on stop

74bfe69