[Bug]: Python Beam does not fail a job if the main session was present but could not be unpickled #25401
Status: Closed
Labels: bug, dataflow, done & done (issue has been reviewed after it was closed for verification, followups, etc.), P2, python
What happened?
When launching a Beam pipeline in Python, the `--save_main_session` flag can be used to serialize the local session's state and load it on the workers. However, if deserialization of this main session fails, a log line reading "Could not load main session" is printed, but the pipeline continues to execute without its global state, causing `NameError`s in various places.

These `NameError` exceptions give no hint of the actual root cause (that the main session state failed to deserialize, for whatever reason), which causes a great deal of confusion, especially for those unfamiliar with the intricacies of how Beam works under the hood. I resolved a similar issue back in Beam 2.31 (#14706), where a corrupted main session failed to stop a job; this seems similar.
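To illustrate the symptom: when the main session is not restored, user code on a worker runs in a namespace missing its module-level globals, so a lookup that worked locally fails at call time. This is a simplified, Beam-free simulation of that failure mode (the `SCALE` global and `process` function are hypothetical examples, not Beam code):

```python
# Simulate a worker executing user code whose module-level global (SCALE)
# was never restored because the main session failed to load.
worker_namespace = {}  # stands in for the worker's empty module globals
exec("def process(x):\n    return x * SCALE", worker_namespace)

try:
    worker_namespace["process"](2)
except NameError as exc:
    # The error names the missing symbol, not the real cause
    # (the main-session load failure), which is why it is so confusing.
    print(exc)
```

The point is that the traceback points at `SCALE`, far from the actual deserialization failure that happened much earlier during worker startup.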
Expected behaviour: if a main session is present but fails to deserialize, treat this as a hard error that stops the entire Beam job and forces the programmer to deal with the cause of the failure, rather than proceeding to run the job's code in a half-working state.
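The proposed behaviour can be sketched as a loader that re-raises instead of logging and continuing. This is a minimal hypothetical sketch (the `load_main_session` function is illustrative, not Beam's actual internal API), using plain `pickle` in place of Beam's pickler:

```python
import logging
import pickle


def load_main_session(session_bytes):
    """Hypothetical loader showing the desired 'hard error' behaviour:
    a corrupt main session raises instead of being logged and ignored."""
    try:
        return pickle.loads(session_bytes)
    except Exception as exc:
        # Today the failure is only logged ("Could not load main session")
        # and execution continues; the proposal is to fail the job here.
        logging.exception("Could not load main session")
        raise RuntimeError("Failed to deserialize the main session") from exc


# A valid session round-trips normally:
state = load_main_session(pickle.dumps({"my_global": 42}))
assert state["my_global"] == 42

# Corrupt bytes now fail loudly at startup, before any user code runs:
try:
    load_main_session(b"not a pickle")
except RuntimeError as exc:
    print(exc)
```

Failing at session-load time surfaces the real cause immediately, instead of scattering unrelated `NameError`s through the job.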
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)