Is there a way to convert a custom keras.utils.Sequence class to a tf.data pipeline? #39523

Closed
@Abhishaike

When I was building my data pipeline, the TensorFlow docs were very insistent that generators are unsafe for multiprocessing, and that the best way to build a multiprocessing streaming pipeline is to extend tensorflow.keras.utils.Sequence in your own custom class. This is stated here: https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence

So I did that, but now TensorFlow is telling me that Sequence subclasses are ALSO not ideal for multiprocessing, through this warning message: "multiprocessing can interact badly with TensorFlow, causing nondeterministic deadlocks. For high performance data pipelines tf.data is recommended." So now the recommendation is to use tf.data. And, as it happens, I keep running into deadlocks about 4 epochs into training.

Is there no converter between an existing Sequence class and a tf.data pipeline? It seems bizarre that the EXACT thing the Sequence class is recommended for no longer seems to work, and that only a brand new type of data pipeline will do the multiprocessing job. At the very least, the Sequence docs should be updated to say this.
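
For reference, here is a minimal sketch of what such a conversion might look like, using tf.data.Dataset.from_generator. It is not an official converter. The helper names `iter_batches` and `sequence_to_dataset` are made up for illustration, and the sketch assumes TensorFlow 2.4 or later (for the `output_signature` argument and `tf.data.AUTOTUNE`) and a Sequence whose batches all share one structure and dtype.

```python
def iter_batches(sequence):
    """Yield every batch of a Sequence-like object once, in index order.

    `sequence` only needs __len__ and __getitem__, which is the interface
    a keras.utils.Sequence subclass already implements.
    """
    for index in range(len(sequence)):
        yield sequence[index]
    # Keras does not call on_epoch_end on a wrapped Sequence, so trigger it
    # here (if it exists) to keep any per-epoch shuffling logic working.
    on_epoch_end = getattr(sequence, "on_epoch_end", None)
    if callable(on_epoch_end):
        on_epoch_end()


def sequence_to_dataset(sequence, output_signature):
    """Wrap a Sequence in a tf.data.Dataset; one full pass is one epoch.

    `output_signature` is a (possibly nested) structure of tf.TensorSpec
    objects describing a single batch returned by sequence[i].
    """
    # Imported lazily so that iter_batches stays usable without TensorFlow.
    import tensorflow as tf

    dataset = tf.data.Dataset.from_generator(
        lambda: iter_batches(sequence),
        output_signature=output_signature,
    )
    # Prepare upcoming batches in the background while the model trains.
    return dataset.prefetch(tf.data.AUTOTUNE)
```

As a hypothetical usage example, for a Sequence that returns `(x, y)` float32 batches with 32 features per row, the signature would be `(tf.TensorSpec(shape=(None, 32), dtype=tf.float32), tf.TensorSpec(shape=(None,), dtype=tf.float32))`, and the result could be passed to `model.fit(dataset, epochs=...)` without `use_multiprocessing`. Note that the generator still runs in the main Python process, so this avoids the multiprocessing workers but does not by itself parallelize batch loading.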

Labels: comp:data, comp:keras, stale, stat:awaiting response, type:support
