uploader: fix graph RPC test #3507

wchargin · 2020-04-13T19:10:25Z

Summary:
The graph uploader test wasn’t actually covering the blob writing code,
and was set up incorrectly such that the request iterator would throw
when forced.

Test Plan:
The new assertions pass, but fail if the RPC changes are reverted.

wchargin-branch: uploader-graph-write-test

Summary: The graph uploader test wasn’t actually covering the blob writing code, and was set up incorrectly such that the request iterator would throw when forced. Test Plan: The new assertions pass, but fail if the RPC changes are reverted. wchargin-branch: uploader-graph-write-test

davidsoergel

I certainly agree that this is a more thorough test, but now I don't quite see why it passed and/or why it should not have passed before. Also I think some of the other tests in the same file may suffer from the same issue, in which case we should fix them all at once.

Briefly, I think blob_sequence_id was being set to a default/unconfigured Mock object, because GetOrCreateBlobSequence was not mocked. _write_blob_request_iterator() must have worked anyway, because the test confirmed that WriteBlob was called 10 times. It was indeed being called with somewhat bogus arguments (e.g., including the Mock for blob_sequence_id), but so what? Which code path was actually not being tested that you think should have been?

davidsoergel · 2020-04-13T19:47:25Z

tensorboard/uploader/uploader_test.py

            uploader, "_logdir_loader", mock_logdir_loader
        ), self.assertRaises(AbortUploadError):
            uploader.start_uploading()
        self.assertEqual(1, mock_client.CreateExperiment.call_count)


Could add self.assertEqual(5, mock_client. GetOrCreateBlobSequence.call_count) here while we're at it.

It’s called 10 times, not 5; twice for each run. I’m presuming that that
was intentional? but unless you feel strongly I’d prefer not to add such
an assertion, as (a) it’s unchanged by this patch and (b) I try to limit
mocking and mock assertions wherever possible because of how brittle
they are and how hard they make it to further update the code and tests.

Ah indeed, 10 times with 1 graph each. Anyway, fine to leave the assertion out.

wchargin-branch: uploader-graph-write-test wchargin-source: 35940e6556b4145af8ab5555c291072849246031

wchargin

I certainly agree that this is a more thorough test, but now I don't
quite see why it passed and/or why it should not have passed before.

It passed before because the request iterator was never forced and so
the body of _write_blob_request_iterator was never actually run. To
reveal the underlying failure, it suffices to add:

for call in mock_client.WriteBlob.call_args_list:
    list(call[0][0])

after any assertion that WriteBlob.call_count is a nonzero value.

This raises an error because it tries to create a WriteBlobRequest
with the given blob sequence ID, but the blob sequence ID is just a mock
dummy object because the GetOrCreateBlobSequence RPC was silently
mocked out.

The tests still pass at master even if the code is obviously broken:

$ git log --oneline -1
b2d4712f (HEAD, origin/master, origin/HEAD) benchmark: remove unused deps (#3505)
$ git diff
diff --git a/tensorboard/uploader/uploader.py b/tensorboard/uploader/uploader.py
index e5168aef..bd7a1282 100644
--- a/tensorboard/uploader/uploader.py
+++ b/tensorboard/uploader/uploader.py
@@ -774,24 +774,8 @@ class _BlobRequestSender(object):
                 raise
 
     def _write_blob_request_iterator(self, blob_sequence_id, seq_index, blob):
-        # For now all use cases have the blob in memory already.
-        # In the future we may want to stream from disk; that will require
-        # refactoring here.
-        # TODO(soergel): compute crc32c's to allow server-side data validation.
-        for offset in range(0, len(blob), BLOB_CHUNK_SIZE):
-            chunk = blob[offset : offset + BLOB_CHUNK_SIZE]
-            finalize_object = offset + BLOB_CHUNK_SIZE >= len(blob)
-            request = write_service_pb2.WriteBlobRequest(
-                blob_sequence_id=blob_sequence_id,
-                index=seq_index,
-                data=chunk,
-                offset=offset,
-                crc32c=None,
-                finalize_object=finalize_object,
-                final_crc32c=None,
-                blob_bytes=len(blob),
-            )
-            yield request
+        raise NotImplementedError("hmm")
+        yield
 
 
 @contextlib.contextmanager
$ bazel test //tensorboard/uploader/... --nocache_test_results
INFO: Analyzed 22 targets (0 packages loaded, 0 targets configured).
INFO: Found 14 targets and 8 test targets...
INFO: Elapsed time: 2.698s, Critical Path: 2.44s
INFO: 8 processes: 8 linux-sandbox.
INFO: Build completed successfully, 9 total actions
//tensorboard/uploader:auth_test                                         PASSED in 0.5s
//tensorboard/uploader:exporter_test                                     PASSED in 0.5s
//tensorboard/uploader:flags_parser_test                                 PASSED in 0.3s
//tensorboard/uploader:formatters_test                                   PASSED in 0.4s
//tensorboard/uploader:logdir_loader_test                                PASSED in 1.6s
//tensorboard/uploader:server_info_test                                  PASSED in 0.6s
//tensorboard/uploader:uploader_test                                     PASSED in 2.4s
//tensorboard/uploader:util_test                                         PASSED in 0.5s

Executed 8 out of 8 tests: 8 tests pass.
INFO: Build completed successfully, 9 total actions

…so fixing this seemed important to me. Also, I want to add tests that
actually assert on the contents of the written blobs; when I copied the
existing test structure to do so, it was confusing to figure out why the
tests were failing as soon as I needed to read the requests.

Also I think some of the other tests in the same file may suffer from
the same issue, in which case we should fix them all at once.

Sure; moved into _create_mock_client, since it looks like that
function already adds some mocks to the returned client.

_write_blob_request_iterator() must have worked anyway, because the
test confirmed that WriteBlob was called 10 times.

No—see above. Generator functions suspend immediately until forced, and
the test never forced them.

It was indeed being called with somewhat bogus arguments (e.g.,
including the Mock for blob_sequence_id), but so what?

The mock for blob_sequence_id is not harmless. It causes a failure
when the request object is constructed:

Traceback (most recent call last):
  File "/usr/lib/python3.7/unittest/mock.py", line 1255, in patched
    return func(*args, **keywargs)
  File ".../tensorboard/uploader/uploader_test.py", line 306, in test_start_uploading_graphs
    list(call[0][0])
  File ".../tensorboard/uploader/uploader.py", line 792, in _write_blob_request_iterator
    blob_bytes=len(blob),
  File ".../python/google/protobuf/internal/python_message.py", line 553, in init
    _ReraiseTypeErrorWithFieldName(message_descriptor.name, field_name)
  File ".../python/google/protobuf/internal/python_message.py", line 470, in _ReraiseTypeErrorWithFieldName
    six.reraise(type(exc), exc, sys.exc_info()[2])
  File ".../six.py", line 695, in reraise
    raise value.with_traceback(tb)
  File ".../python/google/protobuf/internal/python_message.py", line 551, in init
    setattr(self, field_name, field_value)
  File ".../python/google/protobuf/internal/python_message.py", line 695, in field_setter
    new_value = type_checker.CheckValue(new_value)
  File ".../python/google/protobuf/internal/type_checkers.py", line 182, in CheckValue
    raise TypeError(message)
TypeError: <MagicMock name='mock.GetOrCreateBlobSequence().blob_sequence_id' id='140155430870992'> has type <class 'unittest.mock.MagicMock'>, but expected one of: (<class 'bytes'>, <class 'str'>) for field WriteBlobRequest.blob_sequence_id

Does this answer your questions?

wchargin · 2020-04-13T20:59:12Z

tensorboard/uploader/uploader_test.py

            uploader, "_logdir_loader", mock_logdir_loader
        ), self.assertRaises(AbortUploadError):
            uploader.start_uploading()
        self.assertEqual(1, mock_client.CreateExperiment.call_count)


It’s called 10 times, not 5; twice for each run. I’m presuming that that
was intentional? but unless you feel strongly I’d prefer not to add such
an assertion, as (a) it’s unchanged by this patch and (b) I try to limit
mocking and mock assertions wherever possible because of how brittle
they are and how hard they make it to further update the code and tests.

davidsoergel · 2020-04-13T21:21:34Z

tensorboard/uploader/uploader_test.py

            uploader, "_logdir_loader", mock_logdir_loader
        ), self.assertRaises(AbortUploadError):
            uploader.start_uploading()
        self.assertEqual(1, mock_client.CreateExperiment.call_count)


Ah indeed, 10 times with 1 graph each. Anyway, fine to leave the assertion out.

Summary: Fixes an oversight in #3507: we can’t assert that the raw bytes are what was expected because the code under test does a proto serialization roundtrip, which is permitted to permute keys. Test Plan: Tests still pass, and this should fix an internal sync error. wchargin-branch: uploader-test-proto-equal

Summary: Fixes an oversight in #3507: we can’t assert that the raw bytes are what was expected because the code under test does a proto serialization roundtrip, which is permitted to permute keys. Test Plan: Tests still pass; a test sync shows that internal tests pass now, too. wchargin-branch: uploader-test-proto-equal

Summary: The graph uploader test wasn’t actually covering the blob writing code, and was set up incorrectly such that the request iterator would throw when forced. Test Plan: The new assertions pass, but fail if the RPC changes are reverted. wchargin-branch: uploader-graph-write-test

Summary: Fixes an oversight in tensorflow#3507: we can’t assert that the raw bytes are what was expected because the code under test does a proto serialization roundtrip, which is permitted to permute keys. Test Plan: Tests still pass; a test sync shows that internal tests pass now, too. wchargin-branch: uploader-test-proto-equal

Summary: The graph uploader test wasn’t actually covering the blob writing code, and was set up incorrectly such that the request iterator would throw when forced. Test Plan: The new assertions pass, but fail if the RPC changes are reverted. wchargin-branch: uploader-graph-write-test

Summary: Fixes an oversight in #3507: we can’t assert that the raw bytes are what was expected because the code under test does a proto serialization roundtrip, which is permitted to permute keys. Test Plan: Tests still pass; a test sync shows that internal tests pass now, too. wchargin-branch: uploader-test-proto-equal

wchargin added the core:uploader label Apr 13, 2020

wchargin requested a review from davidsoergel April 13, 2020 19:10

googlebot added the cla: yes label Apr 13, 2020

davidsoergel reviewed Apr 13, 2020

View reviewed changes

[update patch]

647bca6

wchargin-branch: uploader-graph-write-test wchargin-source: 35940e6556b4145af8ab5555c291072849246031

wchargin commented Apr 13, 2020

View reviewed changes

davidsoergel approved these changes Apr 13, 2020

View reviewed changes

wchargin merged commit 114211a into master Apr 13, 2020

wchargin deleted the wchargin-uploader-graph-write-test branch April 13, 2020 21:28

wchargin mentioned this pull request Apr 14, 2020

uploader_test: check logical equality of protos #3519

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

uploader: fix graph RPC test #3507

uploader: fix graph RPC test #3507

Uh oh!

wchargin commented Apr 13, 2020

Uh oh!

davidsoergel left a comment

Uh oh!

davidsoergel Apr 13, 2020

Uh oh!

wchargin Apr 13, 2020

Uh oh!

davidsoergel Apr 13, 2020

Uh oh!

wchargin left a comment

Uh oh!

wchargin Apr 13, 2020

Uh oh!

davidsoergel Apr 13, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

uploader: fix graph RPC test #3507

uploader: fix graph RPC test #3507

Uh oh!

Conversation

wchargin commented Apr 13, 2020

Uh oh!

davidsoergel left a comment

Choose a reason for hiding this comment

Uh oh!

davidsoergel Apr 13, 2020

Choose a reason for hiding this comment

Uh oh!

wchargin Apr 13, 2020

Choose a reason for hiding this comment

Uh oh!

davidsoergel Apr 13, 2020

Choose a reason for hiding this comment

Uh oh!

wchargin left a comment

Choose a reason for hiding this comment

Uh oh!

wchargin Apr 13, 2020

Choose a reason for hiding this comment

Uh oh!

davidsoergel Apr 13, 2020

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants