RFC: SavedModel Fingerprinting #415
Waiting for approval from Cesar as the sponsor.
@ematejska Still in the proposed phase but already landing in master:
Thank you for pointing that out. Sorry, the commit description should have been clearer and specified that no functional changes are introduced in this commit, since all development for SavedModel fingerprinting is gated behind a flag added in tensorflow/tensorflow@cdbc8d7. I am planning to turn on SM fingerprinting in the TF library once all doc approvals have been received and the feature has been thoroughly tested!
OK, but on the other hand we need to decide whether we want community RFC-related PRs merged before RFC approval. Probably not, so if we can, it is better to be on the same page.
This is now approved.
Steps of canonicalization: (1) Change the names of functions in the GraphDef nodes (the "f" attribute) to a dummy name. (2) Completely clear the GraphDef library, which is another huge source of non-determinism. This is only temporary. In the future (a couple of commits down the road), canonicalization will be smarter. bert1 and bert2 contain only the protobuf, no checkpoint (variable weights). RFC: tensorflow/community#415 PiperOrigin-RevId: 463478483
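The two canonicalization steps in the commit message above can be sketched roughly as follows. This is a minimal illustration on plain Python dicts that stand in for a GraphDef, not the actual TensorFlow code: the dict layout, the helper name `canonicalize_graph`, and the placeholder `DUMMY_NAME` are all assumptions made for the example.

```python
import copy

# Hypothetical placeholder; the real dummy name is not given in this thread.
DUMMY_NAME = "DUMMY_NAME"


def canonicalize_graph(graph):
    """Return a canonicalized copy of a dict-based GraphDef stand-in.

    Assumed layout:
      {"node": [{"name": ..., "op": ..., "attr": {"f": {"func": {"name": ...}}}}],
       "library": {"function": [...]}}
    """
    canonical = copy.deepcopy(graph)  # leave the caller's graph untouched
    for node in canonical.get("node", []):
        f_attr = node.get("attr", {}).get("f")
        if f_attr is not None and "func" in f_attr:
            # Step 1: mask the function name referenced by the "f" attribute.
            f_attr["func"]["name"] = DUMMY_NAME
    # Step 2: clear the function library entirely.
    canonical["library"] = {}
    return canonical


# Two graphs that differ only in generated function names.
graph_a = {
    "node": [{"name": "call", "op": "PartitionedCall",
              "attr": {"f": {"func": {"name": "__inference_fn_123"}}}}],
    "library": {"function": [{"signature": {"name": "__inference_fn_123"}}]},
}
graph_b = {
    "node": [{"name": "call", "op": "PartitionedCall",
              "attr": {"f": {"func": {"name": "__inference_fn_456"}}}}],
    "library": {"function": [{"signature": {"name": "__inference_fn_456"}}]},
}
graphs_match = canonicalize_graph(graph_a) == canonicalize_graph(graph_b)
```

In this sketch, two graphs that differ only in their generated function names compare equal after canonicalization, which is the property a stable fingerprint needs.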
This CL adds another field (graph_def_program_hash) to the FingerprintDef protobuf. RFC: tensorflow/community#415 PiperOrigin-RevId: 464154116
… protobuf. This hash represents the SignatureDef map. First, we "regularize" the SignatureDef map by sorting the keys. Then we serialize and hash each SignatureDef. RFC: tensorflow/community#415 PiperOrigin-RevId: 464835702
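The SignatureDef hashing described in the commit message above can be sketched as follows. This is an illustration under stated assumptions rather than the TensorFlow implementation: SHA-256 from `hashlib` stands in for whichever hash function the library uses, each SignatureDef is represented by already-serialized bytes, and the way per-entry hashes are combined (feeding the key and the entry hash into a running digest) is a guess. The helper name `hash_signature_map` is hypothetical.

```python
import hashlib


def hash_signature_map(signature_map):
    """Hash a mapping of signature key -> serialized SignatureDef bytes.

    Keys are visited in sorted order so that the insertion order of the
    map does not affect the result.
    """
    combined = hashlib.sha256()
    for key in sorted(signature_map):
        # Hash each serialized entry on its own, then fold it into the total.
        entry_hash = hashlib.sha256(signature_map[key]).digest()
        combined.update(key.encode("utf-8"))  # assumption: key is included
        combined.update(entry_hash)
    return combined.hexdigest()


# The same entries inserted in a different order.
first = {"serving_default": b"sig-a", "predict": b"sig-b"}
second = {"predict": b"sig-b", "serving_default": b"sig-a"}
same_hash = hash_signature_map(first) == hash_signature_map(second)
```

Sorting the keys before hashing is what makes the result independent of map ordering, which protobuf maps do not guarantee.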
RFC: tensorflow/community#415 PiperOrigin-RevId: 464866655
…d SavedObjectGraph. This commit only looks at the `concrete_functions` of the SavedObjectGraph, ignoring the `nodes`. RFC: tensorflow/community#415 PiperOrigin-RevId: 465475648
RFC: tensorflow/community#415 PiperOrigin-RevId: 465618721
…s of the graphdef during canonicalization. RFC: tensorflow/community#415 PiperOrigin-RevId: 466798923
For now, we only populate the `producer` field of the VersionDef, which describes the version of the code used to create the fingerprint. The `producer` field will be incremented every time the canonicalization code changes; the granularity of change that warrants an increment is TBD. In the future, we can add a `min_consumer` field if the need arises, or add functions for version checking. RFC: tensorflow/community#415 PiperOrigin-RevId: 467075556
467335264 by A. Unique TensorFlower<gardener@tensorflow.org>: [SavedModel Fingerprinting] Add hash #5, which represents the checkpoint. The `checkpoint_hash` is a hash of the serialized .index file, which is the metadata file of the TensorBundle containing a string-string table of the name of a tensor to its serialized BundleEntryProto. The BundleEntryProto contains a crc32 hash of the tensor contents, but not the contents of the tensor itself. RFC: tensorflow/community#415 -- PiperOrigin-RevId: 467408627
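The `checkpoint_hash` described in the commit message above hashes the checkpoint's `.index` metadata file rather than the tensor data. A rough sketch of that idea, assuming the standard SavedModel layout (`variables/variables.index` under the export directory) and using SHA-256 as a stand-in for the library's actual hash function; the helper name `hash_checkpoint_index` is hypothetical:

```python
import hashlib
import tempfile
from pathlib import Path


def hash_checkpoint_index(export_dir):
    """Hash the TensorBundle .index file; return None if no checkpoint exists."""
    index_path = Path(export_dir) / "variables" / "variables.index"
    if not index_path.is_file():
        return None
    # Only the metadata file is read; tensor contents are covered indirectly
    # through the crc32 values stored in each BundleEntryProto.
    return hashlib.sha256(index_path.read_bytes()).hexdigest()


# Demo on a throwaway directory with made-up .index contents.
with tempfile.TemporaryDirectory() as export_dir:
    no_checkpoint = hash_checkpoint_index(export_dir)
    variables_dir = Path(export_dir) / "variables"
    variables_dir.mkdir()
    (variables_dir / "variables.index").write_bytes(b"placeholder index bytes")
    with_checkpoint = hash_checkpoint_index(export_dir)
```

Hashing only the index keeps the cost small for large models, since the tensor data files are never read.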
…d_model_fingerprint.pb -> fingerprint.pb This follows the design in the RFC. tensorflow/community#415 PiperOrigin-RevId: 469824087
…d_model_checksum. tensorflow/community#415 PiperOrigin-RevId: 489027244
…model_checksum". tensorflow/community#415 PiperOrigin-RevId: 492251934
[SavedModel Fingerprinting] Amend RFC #415 to include public API.
…print given the file path (`export_dir`) of the SavedModel. tensorflow/community#415 PiperOrigin-RevId: 493957227
…_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel. tensorflow/community#415 PiperOrigin-RevId: 493957227
…mentation. tensorflow/community#415 PiperOrigin-RevId: 495996544
…rimental.read_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel. Also, introduce new class `tf.saved_model.experimental.Fingerprint` which contains the fingerprint. tensorflow/community#415 tensorflow/community#439 PiperOrigin-RevId: 493957227
…rimental.read_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel. Also, introduce new class `tf.saved_model.experimental.Fingerprint` which contains the fingerprint. tensorflow/community#415 tensorflow/community#439 PiperOrigin-RevId: 501965978
…the SavedModel guide. Starting in TF 2.11, all SavedModels written using `tf.saved_model.save` will contain a `fingerprint.pb` file. tensorflow/community#415 PiperOrigin-RevId: 520938759
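A short usage sketch of the `tf.saved_model.experimental.read_fingerprint(export_dir)` API named in the commit messages above. It assumes a TensorFlow version that ships this experimental API and a SavedModel directory that contains `fingerprint.pb`. The helper names `summarize_fingerprint` and `read_and_summarize` are hypothetical, and only the hash fields mentioned in this thread are read.

```python
def summarize_fingerprint(fingerprint):
    """Collect the hash fields named in this thread into a plain dict."""
    return {
        "saved_model_checksum": fingerprint.saved_model_checksum,
        "graph_def_program_hash": fingerprint.graph_def_program_hash,
        "checkpoint_hash": fingerprint.checkpoint_hash,
    }


def read_and_summarize(export_dir):
    """Read the fingerprint of the SavedModel at `export_dir` and summarize it."""
    # Imported lazily so summarize_fingerprint works without TensorFlow installed.
    import tensorflow as tf

    fingerprint = tf.saved_model.experimental.read_fingerprint(export_dir)
    return summarize_fingerprint(fingerprint)
```

Calling `read_and_summarize("/path/to/saved_model")` (a placeholder path) would return the three hashes for that model. Comparing two such dicts is one way to check whether two exports serialize the same program and checkpoint.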
This RFC will be open for comment until June 23, 2022.
Objective
Following the convention of other types of electronic files and artifacts, the SavedModel format would benefit from having a fingerprint that uniquely identifies the program it serializes. This fingerprint will better enable users to track their SavedModels in ML pipelines and other infrastructure using native metadata.