Objectives:
- Memory-efficient loading of categorical data
- Communication-efficient training and evaluation at scale
- Easy to use with existing AI workflows
Features:
Performance:
- Support ORC format in data loading.
- Support data deduplication.
- Improve performance of data transfer.
- Improve performance of loading and shuffling string data.
- Support workers with unbalanced training data via SyncReplicasDataset.
- Support pipeline-based semi-synchronous training.
- Support a hierarchical embedding lookup.
Usability:
- Support standalone evaluation and prediction APIs for Estimator and Keras.
Bugfixes:
- Fix shape calculation of tf.feature_column.shared_embeddings.