RFC: SavedModel Fingerprinting #415

Merged
merged 4 commits into from
Jul 13, 2022

Conversation

monicadsong
Member

@monicadsong monicadsong commented Jun 10, 2022

This RFC will be open for comment until June 23, 2022.

| Status | Approved |
| RFC # | 415 |
| Author(s) | Monica Song (monicadsong@google.com) |
| Sponsor | Cesar Crusius (ccrusius@google.com) |
| Updated | 2022-07-13 |

Objective

Following the convention of other types of electronic files and artifacts, the SavedModel format would benefit from having a fingerprint that uniquely identifies the program it serializes. This fingerprint will make it easier for users to track their SavedModels in ML pipelines and other infrastructure using native metadata.

@ematejska ematejska added the RFC: Proposed RFC Design Document label Jun 13, 2022
@ematejska
Contributor

Waiting for approval from Cesar as the sponsor

@bhack
Contributor

bhack commented Jul 11, 2022

@ematejska Still in proposed phase but already landing in master:

tensorflow/tensorflow@c075b15

@monicadsong
Member Author

@ematejska Still in proposed phase but already landing in master:

tensorflow/tensorflow@c075b15

Thank you for pointing that out. Sorry, the commit description should have been clearer and specified that this commit introduces no functional changes, since all development for SavedModel fingerprinting is gated behind a flag added in tensorflow/tensorflow@cdbc8d7.

I am planning to turn on SM fingerprinting in the TF library once all doc approvals have been received and the feature has been thoroughly tested!

@bhack
Contributor

bhack commented Jul 11, 2022

OK, but on the other hand, we need to decide whether PRs that implement a community RFC should be merged before the RFC is approved.

Probably not, so where possible it is better to stay on the same page.

@ematejska
Contributor

This is now approved.

@ematejska ematejska merged commit 73e791e into tensorflow:master Jul 13, 2022
@ematejska ematejska added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Jul 13, 2022
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jul 27, 2022
Steps of canonicalization
(1) Change the name of functions in the GraphDef nodes ("f" attribute) to a dummy name.
(2) Completely clear the GraphDef library, which is another huge source of non-determinism.

This is only temporary. In the future (a couple commits down the road), canonicalization will be smarter.

bert1 and bert2 contain only the protobuf, no checkpoint (variable weights).

RFC: tensorflow/community#415
PiperOrigin-RevId: 463478483
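
The two canonicalization steps described in this commit can be illustrated with a minimal sketch. This is not the TensorFlow implementation: the graph is modeled as plain Python dicts shaped loosely like a GraphDef, and `PLACEHOLDER_NAME` and `canonicalize` are hypothetical names chosen for illustration.

```python
import copy

# Hypothetical dummy name; the value used by TensorFlow is not stated in the commit.
PLACEHOLDER_NAME = "placeholder_fn"


def canonicalize(graph_def: dict) -> dict:
    """Return a canonicalized copy of a dict-shaped stand-in for a GraphDef."""
    result = copy.deepcopy(graph_def)
    for node in result.get("node", []):
        attrs = node.get("attr", {})
        if "f" in attrs:
            # Step 1: replace the function name held in the "f" attribute.
            attrs["f"]["func"]["name"] = PLACEHOLDER_NAME
    # Step 2: clear the function library entirely.
    result["library"] = {}
    return result


def make_graph(function_name: str) -> dict:
    """Build a toy graph whose only difference is the generated function name."""
    return {
        "node": [
            {
                "name": "call",
                "op": "PartitionedCall",
                "attr": {"f": {"func": {"name": function_name}}},
            }
        ],
        "library": {"function": [function_name]},
    }


graph_a = make_graph("__inference_fn_123")
graph_b = make_graph("__inference_fn_456")

print(graph_a == graph_b)                              # differ before canonicalization
print(canonicalize(graph_a) == canonicalize(graph_b))  # match afterwards
```

Two graphs that differ only in generated function names compare equal after canonicalization, which is the property a stable fingerprint needs. Clearing the whole library also discards real differences between function bodies, which is presumably why the commit calls this step temporary.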
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jul 29, 2022
This CL adds another field (graph_def_program_hash) to the FingerprintDef protobuf.

RFC: tensorflow/community#415
PiperOrigin-RevId: 464154116
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 2, 2022
… protobuf.

This hash represents the SignatureDef map. First, we "regularize" the SignatureDef map by sorting the keys. Then we serialize and hash each SignatureDef.

RFC: tensorflow/community#415
PiperOrigin-RevId: 464835702
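
A rough sketch of the "sort the keys, then serialize and hash each entry" idea follows. It makes several assumptions that are not taken from the commit: SignatureDefs are modeled as plain dicts, JSON stands in for protobuf serialization, SHA-256 stands in for whatever hash function TensorFlow uses, and the way per-entry hashes are combined is invented for illustration. `signature_def_hash` is a hypothetical helper name.

```python
import hashlib
import json


def signature_def_hash(signature_defs: dict) -> str:
    """Hash a map of signature definitions independently of insertion order."""
    combined = hashlib.sha256()
    # "Regularize" the map by visiting keys in sorted order.
    for key in sorted(signature_defs):
        # Serialize one entry deterministically (JSON is a stand-in for protobuf).
        serialized = json.dumps(signature_defs[key], sort_keys=True).encode("utf-8")
        # Hash the entry, then fold its digest into the running hash (assumed scheme).
        combined.update(hashlib.sha256(serialized).digest())
    return combined.hexdigest()


first = {
    "serving_default": {"inputs": ["x"], "outputs": ["y"]},
    "train": {"inputs": ["x", "label"], "outputs": ["loss"]},
}
# Same entries, inserted in the opposite order.
second = {
    "train": {"inputs": ["x", "label"], "outputs": ["loss"]},
    "serving_default": {"inputs": ["x"], "outputs": ["y"]},
}

print(signature_def_hash(first) == signature_def_hash(second))
```

Sorting before hashing removes the dependence on map iteration order, so two maps with the same entries hash identically regardless of how they were built.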
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 2, 2022
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 5, 2022
…d SavedObjectGraph.

This commit only looks at the `concrete_functions` of the SavedObjectGraph, ignoring the `nodes`.

RFC: tensorflow/community#415
PiperOrigin-RevId: 465475648
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 5, 2022
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 10, 2022
…s of the graphdef during canonicalization.

RFC: tensorflow/community#415
PiperOrigin-RevId: 466798923
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 11, 2022
For now, we only populate the `producer` field of the VersionDef, which describes the version of the code used to create the fingerprint. The `producer` field will be incremented every time the canonicalization code changes; the granularity of change that warrants an increment is TBD.

In the future, we can add a `min_consumer` field if the need arises or have additional functions for version checking.

RFC: tensorflow/community#415
PiperOrigin-RevId: 467075556
mihaimaruseac pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 13, 2022
467408627  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update sqlite version in TF

--
467380418  by A. Unique TensorFlower<gardener@tensorflow.org>:

    compat: Update forward compatibility horizon to 2022-08-13

--
467378663  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update GraphDef version to 1222.

--
467363891  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/b750bc2999cf02abca6ad9eeff0a04ec7bf3b683.

--
467363622  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [xla:runtime] NFC: Move constraints documentation from jitrt to xla/runtime/constraints

--
467362586  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [xla:runtime] NFC: Extract JitCompilationContext library from jitrt and move it to xla/runtime

--
467361314  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/0a042cbb5275e6ff9a3a7c2748c74df6dcede09e.

--
467360160  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [xla:runtime] NFC: Extract calling_convention library from jitrt and move it to xla/runtime

--
467341954  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Op documentation update.
    	update of g3doc/_includes/tf_passes.md

--
467341426  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Refactor SELECT_V2 in preparation for porting to TFLM.

--
467340678  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Create some global stat tracking for CompilationEnvironments. This tracking can be used to help debug cases in which multiple CompilationEnvironments are used to compile a single HloModule (which should not happen).

--
467339870  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Automated rollback of changelist 467224197.

--
467339756  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/b20ec05d272477fa6223213687bb22145df92674.

--
467339529  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA] Bugfix for gather index parallel partitioning where the sharded non-parallel dims in indices are not handled.

--
467337900  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [XLA] Minor renamings, refactorings, checks.

--
467337622  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Remove unneeded dependency.

--
467337170  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Integrate LLVM at llvm/llvm-project@2c3ca3b684bb

    Updates LLVM usage to match
    [2c3ca3b684bb](llvm/llvm-project@2c3ca3b684bb)

--
467335264  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [SavedModel Fingerprinting] Add hash #5, which represents the checkpoint.

    The `checkpoint_hash` is a hash of the serialized .index file, which is the metadata file of the TensorBundle containing a string-string table
    of the name of a tensor to its serialized BundleEntryProto. The BundleEntryProto contains a crc32 hash of the tensor contents, but not the contents of the tensor itself.

    RFC: tensorflow/community#415

--
467334010  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/76b3fea4cc9d5e7cb8a85798e41a61a55c301578.

--
467332094  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [xla:runtime] NFC: Extract executable library from jitrt and move it to xla/runtime

--
467324078  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <vector> for 'std::vector'

--
467322782  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #57137: [oneDNN] Skip appending kernel registration to log message for MKL ops

    Imported from GitHub PR #57137

    This PR skips printing kernel registrations for MKL ops since it leads to performance drop for some eager models caused by this commit c04f65d This is a temporary fix and the condition will be removed when support for block format is removed as a more permanent fix.
    Copybara import of the project:

    --
    89c4c20 by Kanvi Khanna <kanvi.khanna@intel.com>:

    [oneDNN] Skip appending kernel registration to log message for MKL ops

    Merging this change closes #57137

--
467322425  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <memory> for 'std::unique_ptr'

--
467321561  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    'int64' is deprecated: Use int64_t instead.

--
467321058  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #57089: [TF-TRT] Adjusting Conv2D Test Tolerance

    Imported from GitHub PR #57089

    This PR adjusts & fixes the unittest tolerance for the test `Conv2DStridedNCHWTest` in INT8 mode.

    Copybara import of the project:

    --
    13e4bff by DEKHTIARJonathan <contact@jonathandekhtiar.eu>:

    [TF-TRT] Adjusting Conv2D Test Tolerance

    Merging this change closes #57089

--
467320826  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #55804: [TF-TRT] Various Cleanups & Python Debugging Assertion Improvements

    Imported from GitHub PR #55804

    This PR cleans a few spots in the code base, improves the debuggability of assertion messages in unittests. And replace `distutils.version.LooseVersion` (deprecated) with `packaging.version.Version` (new recommended API).
    Copybara import of the project:

    --
    a4d15ef by DEKHTIARJonathan <contact@jonathandekhtiar.eu>:

    [TF-TRT] Various Cleanups & Python Debugging Assertion Improvements

    Merging this change closes #55804

--
467320083  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <ostream> for 'std::ostream'

--
467319094  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    'int64' is deprecated: Use int64_t instead.

--
467318151  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <iterator> for 'std::back_inserter'

--
467316931  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    using decl 'IsSubsetOf' is unused

--
467316097  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Move passes under tensorflow/compiler/mlir/tensorflow/.

--
467315812  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <memory> for 'std::unique_ptr'

--
467314236  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <memory> for 'std::unique_ptr'

--
467313254  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <vector> for 'std::vector'
    missing #include <memory> for 'std::make_unique'

--
467312293  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    using decl 'RangeSquareDataset' is unused

--
467311309  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <vector> for 'std::vector'

--
467310637  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #57013: [TF-TRT] Add LogSoftmax Support for TF-TRT

    Imported from GitHub PR #57013

    This PR adds TF-TRT support to `tf.nn.log_softmax` operation.  This is performed using the formula `logsoftmax = logits - log(reduce_sum(exp(logits), axis=-1))` .  The implemented TRT layers are fused into a single op.

    @DEKHTIARJonathan @tfeher : Please review the changes.
    Copybara import of the project:

    --
    1a8eb9a by Pavani Majety <pmajety@nvidia.com>:

    Add LogSoftmax conversion

    Fix Softmax comments

    [TF-TRT] Move LogSoftmax to use OpConverterBase

    Fix compiler errors

    clang-format

    Undo changes to convert_nodes.cc

    Fix comments

    Merging this change closes #57013

--
467310335  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <array> for 'std::array'

--
467310313  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update test config in cross device ops

--
467309032  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/eba528ef667653c3554984e5c05573b152c9893b.

--
467308765  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <vector> for 'std::vector'

--
467307702  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <vector> for 'std::vector'

--
467306473  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    'int64' is deprecated: Use int64_t instead.

--
467306092  by A. Unique TensorFlower<gardener@tensorflow.org>:

    PR #56771: Add return_index_map argument in ssim()

    Imported from GitHub PR #56771

    Closes #53115
    Copybara import of the project:

    --
    8f5a1b1 by CohenAriel <ariel17112005@gmail.com>:

    Add return_index_map argument in ssim()

    Merging this change closes #56771

--
467305190  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    missing #include <unordered_map> for 'std::unordered_map'
    missing #include <vector> for 'std::vector'
    missing #include <memory> for 'std::shared_ptr'

--
467304747  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [tfrt:jitrt] NFC: Remove Executable::KernelContext

    It was added before runtime::KernelContext and is not used anywhere. Remove it to avoid confusion. In the future we should reuse runtime::KernelContext as an extension point for user-defined memory allocation etc.

--
467303335  by A. Unique TensorFlower<gardener@tensorflow.org>:

    #tf-data-service #codehealth Clean up clang-tidy report.

    'int64' is deprecated: Use int64_t instead.

--
467301808  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Changes all local `State` or `TaskState` enum in coordination service into `CoordinatedTaskState` enum in proto.

--
467300580  by A. Unique TensorFlower<gardener@tensorflow.org>:

    lite: enable variable freezing in tf_tfl_translate tester

--
467298890  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Update TFRT dependency to use revision
    http://github.com/tensorflow/runtime/commit/9bb23f7d1ee0e9a55d26c7168790667e5266a74c.

--
467292686  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [xla:runtime] NFC: Move execution_engine library from jitrt to xla/runtime

--
467280901  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [GML] Add tests for concat in the GML tiling and fusion pipeline

--
467276349  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [GML] Implement dim-based shape reification for concat

--
467273958  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Change mutexes under stream_executor/gpu to use absl::Mutex and absl::MutexLock instead of tensorflow::mutex and tensorflow::mutex_lock. Change instance of absl::make_unique to std::make_unique

--
467272897  by A. Unique TensorFlower<gardener@tensorflow.org>:

    [tf.data] Prepend `/bufferedio/` for all paths passed to LoadDataset op.

--

PiperOrigin-RevId: 467408627
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 24, 2022
…d_model_fingerprint.pb -> fingerprint.pb

This follows the design in the RFC.

tensorflow/community#415

PiperOrigin-RevId: 469824087
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Nov 16, 2022
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Dec 1, 2022
theadactyl added a commit that referenced this pull request Dec 5, 2022
[SavedModel Fingerprinting] Amend RFC #415 to include public API.
copybara-service bot pushed a commit to google/tsl that referenced this pull request Dec 14, 2022
…print given the file path (`export_dir`) of the SavedModel.

tensorflow/community#415

PiperOrigin-RevId: 493957227
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Dec 14, 2022
…print given the file path (`export_dir`) of the SavedModel.

tensorflow/community#415

PiperOrigin-RevId: 493957227
copybara-service bot pushed a commit to google/tsl that referenced this pull request Dec 16, 2022
…_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel.

tensorflow/community#415

PiperOrigin-RevId: 493957227
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Dec 17, 2022
copybara-service bot pushed a commit to google/tsl that referenced this pull request Jan 5, 2023
…_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel.

tensorflow/community#415

PiperOrigin-RevId: 493957227
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Jan 5, 2023
…_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel.

tensorflow/community#415

PiperOrigin-RevId: 493957227
copybara-service bot pushed a commit to google/tsl that referenced this pull request Jan 13, 2023
…rimental.read_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel. Also, introduce new class `tf.saved_model.experimental.Fingerprint` which contains the fingerprint.

tensorflow/community#415
tensorflow/community#439

PiperOrigin-RevId: 493957227
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Jan 14, 2023
…rimental.read_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel. Also, introduce new class `tf.saved_model.experimental.Fingerprint` which contains the fingerprint.

tensorflow/community#415
tensorflow/community#439

PiperOrigin-RevId: 501965978
copybara-service bot pushed a commit to google/tsl that referenced this pull request Jan 14, 2023
…rimental.read_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel. Also, introduce new class `tf.saved_model.experimental.Fingerprint` which contains the fingerprint.

tensorflow/community#415
tensorflow/community#439

PiperOrigin-RevId: 501965978
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jan 14, 2023
…rimental.read_fingerprint(export_dir)` for reading a fingerprint given the file path (`export_dir`) of the SavedModel. Also, introduce new class `tf.saved_model.experimental.Fingerprint` which contains the fingerprint.

tensorflow/community#415
tensorflow/community#439

PiperOrigin-RevId: 501965978
copybara-service bot pushed a commit to tensorflow/docs that referenced this pull request Mar 31, 2023
…the SavedModel guide.

Starting in TF 2.11, all SavedModels written using `tf.saved_model.save` will contain a `fingerprint.pb` file.

tensorflow/community#415

PiperOrigin-RevId: 520938759
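
As a hedged illustration of how a pipeline might use this, the sketch below checks whether a SavedModel directory contains `fingerprint.pb` and, only if it does, reads it with the `tf.saved_model.experimental.read_fingerprint(export_dir)` call named in the commits above. The helper names `has_fingerprint` and `read_fingerprint_if_available` are hypothetical, the file is assumed to sit at the top level of the export directory, and whether the `read_fingerprint` API is available depends on the installed TensorFlow version.

```python
import os
import tempfile

FINGERPRINT_FILENAME = "fingerprint.pb"


def has_fingerprint(export_dir: str) -> bool:
    """Return True if the directory contains a fingerprint file (assumed top-level)."""
    return os.path.isfile(os.path.join(export_dir, FINGERPRINT_FILENAME))


def read_fingerprint_if_available(export_dir: str):
    """Return the fingerprint object, or None for SavedModels written without one."""
    if not has_fingerprint(export_dir):
        return None
    # Imported lazily so the presence check works without TensorFlow installed.
    import tensorflow as tf

    return tf.saved_model.experimental.read_fingerprint(export_dir)


# Demonstrate the presence check on a throwaway directory.
with tempfile.TemporaryDirectory() as export_dir:
    before = has_fingerprint(export_dir)
    missing = read_fingerprint_if_available(export_dir)
    # Create an empty stand-in file; a real fingerprint.pb holds a serialized protobuf.
    open(os.path.join(export_dir, FINGERPRINT_FILENAME), "wb").close()
    after = has_fingerprint(export_dir)

print(before, missing, after)
```

Checking for the file first lets the same code handle SavedModels written by older TensorFlow versions, which do not include a fingerprint.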