Skip to content

Conversation

@wchargin
Copy link
Contributor

Summary:
The graph uploader test wasn’t actually covering the blob writing code,
and was set up incorrectly such that the request iterator would throw
when forced.

Test Plan:
The new assertions pass, but fail if the RPC changes are reverted.

wchargin-branch: uploader-graph-write-test

Summary:
The graph uploader test wasn’t actually covering the blob writing code,
and was set up incorrectly such that the request iterator would throw
when forced.

Test Plan:
The new assertions pass, but fail if the RPC changes are reverted.

wchargin-branch: uploader-graph-write-test
Copy link
Member

@davidsoergel davidsoergel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I certainly agree that this is a more thorough test, but now I don't quite see why it passed and/or why it should not have passed before. Also I think some of the other tests in the same file may suffer from the same issue, in which case we should fix them all at once.

Briefly, I think blob_sequence_id was being set to a default/unconfigured Mock object, because GetOrCreateBlobSequence was not mocked. _write_blob_request_iterator() must have worked anyway, because the test confirmed that WriteBlob was called 10 times. It was indeed being called with somewhat bogus arguments (e.g., including the Mock for blob_sequence_id), but so what? Which code path was actually not being tested that you think should have been?

uploader, "_logdir_loader", mock_logdir_loader
), self.assertRaises(AbortUploadError):
uploader.start_uploading()
self.assertEqual(1, mock_client.CreateExperiment.call_count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add self.assertEqual(5, mock_client. GetOrCreateBlobSequence.call_count) here while we're at it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s called 10 times, not 5; twice for each run. I’m presuming that that
was intentional? but unless you feel strongly I’d prefer not to add such
an assertion, as (a) it’s unchanged by this patch and (b) I try to limit
mocking and mock assertions wherever possible because of how brittle
they are and how hard they make it to further update the code and tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah indeed, 10 times with 1 graph each. Anyway, fine to leave the assertion out.

wchargin-branch: uploader-graph-write-test
wchargin-source: 35940e6556b4145af8ab5555c291072849246031
Copy link
Contributor Author

@wchargin wchargin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I certainly agree that this is a more thorough test, but now I don't
quite see why it passed and/or why it should not have passed before.

It passed before because the request iterator was never forced and so
the body of _write_blob_request_iterator was never actually run. To
reveal the underlying failure, it suffices to add:

for call in mock_client.WriteBlob.call_args_list:
    list(call[0][0])

after any assertion that WriteBlob.call_count is a nonzero value.

This raises an error because it tries to create a WriteBlobRequest
with the given blob sequence ID, but the blob sequence ID is just a mock
dummy object because the GetOrCreateBlobSequence RPC was silently
mocked out.

The tests still pass at master even if the code is obviously broken:

$ git log --oneline -1
b2d4712f (HEAD, origin/master, origin/HEAD) benchmark: remove unused deps (#3505)
$ git diff
diff --git a/tensorboard/uploader/uploader.py b/tensorboard/uploader/uploader.py
index e5168aef..bd7a1282 100644
--- a/tensorboard/uploader/uploader.py
+++ b/tensorboard/uploader/uploader.py
@@ -774,24 +774,8 @@ class _BlobRequestSender(object):
                 raise
 
     def _write_blob_request_iterator(self, blob_sequence_id, seq_index, blob):
-        # For now all use cases have the blob in memory already.
-        # In the future we may want to stream from disk; that will require
-        # refactoring here.
-        # TODO(soergel): compute crc32c's to allow server-side data validation.
-        for offset in range(0, len(blob), BLOB_CHUNK_SIZE):
-            chunk = blob[offset : offset + BLOB_CHUNK_SIZE]
-            finalize_object = offset + BLOB_CHUNK_SIZE >= len(blob)
-            request = write_service_pb2.WriteBlobRequest(
-                blob_sequence_id=blob_sequence_id,
-                index=seq_index,
-                data=chunk,
-                offset=offset,
-                crc32c=None,
-                finalize_object=finalize_object,
-                final_crc32c=None,
-                blob_bytes=len(blob),
-            )
-            yield request
+        raise NotImplementedError("hmm")
+        yield
 
 
 @contextlib.contextmanager
$ bazel test //tensorboard/uploader/... --nocache_test_results
INFO: Analyzed 22 targets (0 packages loaded, 0 targets configured).
INFO: Found 14 targets and 8 test targets...
INFO: Elapsed time: 2.698s, Critical Path: 2.44s
INFO: 8 processes: 8 linux-sandbox.
INFO: Build completed successfully, 9 total actions
//tensorboard/uploader:auth_test                                         PASSED in 0.5s
//tensorboard/uploader:exporter_test                                     PASSED in 0.5s
//tensorboard/uploader:flags_parser_test                                 PASSED in 0.3s
//tensorboard/uploader:formatters_test                                   PASSED in 0.4s
//tensorboard/uploader:logdir_loader_test                                PASSED in 1.6s
//tensorboard/uploader:server_info_test                                  PASSED in 0.6s
//tensorboard/uploader:uploader_test                                     PASSED in 2.4s
//tensorboard/uploader:util_test                                         PASSED in 0.5s

Executed 8 out of 8 tests: 8 tests pass.
INFO: Build completed successfully, 9 total actions

…so fixing this seemed important to me. Also, I want to add tests that
actually assert on the contents of the written blobs; when I copied the
existing test structure to do so, it was confusing to figure out why the
tests were failing as soon as I needed to read the requests.

Also I think some of the other tests in the same file may suffer from
the same issue, in which case we should fix them all at once.

Sure; moved into _create_mock_client, since it looks like that
function already adds some mocks to the returned client.

_write_blob_request_iterator() must have worked anyway, because the
test confirmed that WriteBlob was called 10 times.

No—see above. Generator functions suspend immediately until forced, and
the test never forced them.

It was indeed being called with somewhat bogus arguments (e.g.,
including the Mock for blob_sequence_id), but so what?

The mock for blob_sequence_id is not harmless. It causes a failure
when the request object is constructed:

Traceback (most recent call last):
  File "/usr/lib/python3.7/unittest/mock.py", line 1255, in patched
    return func(*args, **keywargs)
  File ".../tensorboard/uploader/uploader_test.py", line 306, in test_start_uploading_graphs
    list(call[0][0])
  File ".../tensorboard/uploader/uploader.py", line 792, in _write_blob_request_iterator
    blob_bytes=len(blob),
  File ".../python/google/protobuf/internal/python_message.py", line 553, in init
    _ReraiseTypeErrorWithFieldName(message_descriptor.name, field_name)
  File ".../python/google/protobuf/internal/python_message.py", line 470, in _ReraiseTypeErrorWithFieldName
    six.reraise(type(exc), exc, sys.exc_info()[2])
  File ".../six.py", line 695, in reraise
    raise value.with_traceback(tb)
  File ".../python/google/protobuf/internal/python_message.py", line 551, in init
    setattr(self, field_name, field_value)
  File ".../python/google/protobuf/internal/python_message.py", line 695, in field_setter
    new_value = type_checker.CheckValue(new_value)
  File ".../python/google/protobuf/internal/type_checkers.py", line 182, in CheckValue
    raise TypeError(message)
TypeError: <MagicMock name='mock.GetOrCreateBlobSequence().blob_sequence_id' id='140155430870992'> has type <class 'unittest.mock.MagicMock'>, but expected one of: (<class 'bytes'>, <class 'str'>) for field WriteBlobRequest.blob_sequence_id

Does this answer your questions?

uploader, "_logdir_loader", mock_logdir_loader
), self.assertRaises(AbortUploadError):
uploader.start_uploading()
self.assertEqual(1, mock_client.CreateExperiment.call_count)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s called 10 times, not 5; twice for each run. I’m presuming that that
was intentional? but unless you feel strongly I’d prefer not to add such
an assertion, as (a) it’s unchanged by this patch and (b) I try to limit
mocking and mock assertions wherever possible because of how brittle
they are and how hard they make it to further update the code and tests.

uploader, "_logdir_loader", mock_logdir_loader
), self.assertRaises(AbortUploadError):
uploader.start_uploading()
self.assertEqual(1, mock_client.CreateExperiment.call_count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah indeed, 10 times with 1 graph each. Anyway, fine to leave the assertion out.

@wchargin wchargin merged commit 114211a into master Apr 13, 2020
@wchargin wchargin deleted the wchargin-uploader-graph-write-test branch April 13, 2020 21:28
wchargin added a commit that referenced this pull request Apr 14, 2020
Summary:
Fixes an oversight in #3507: we can’t assert that the raw bytes are what
was expected because the code under test does a proto serialization
roundtrip, which is permitted to permute keys.

Test Plan:
Tests still pass, and this should fix an internal sync error.

wchargin-branch: uploader-test-proto-equal
wchargin added a commit that referenced this pull request Apr 14, 2020
Summary:
Fixes an oversight in #3507: we can’t assert that the raw bytes are what
was expected because the code under test does a proto serialization
roundtrip, which is permitted to permute keys.

Test Plan:
Tests still pass; a test sync shows that internal tests pass now, too.

wchargin-branch: uploader-test-proto-equal
bileschi pushed a commit to bileschi/tensorboard that referenced this pull request Apr 15, 2020
Summary:
The graph uploader test wasn’t actually covering the blob writing code,
and was set up incorrectly such that the request iterator would throw
when forced.

Test Plan:
The new assertions pass, but fail if the RPC changes are reverted.

wchargin-branch: uploader-graph-write-test
bileschi pushed a commit to bileschi/tensorboard that referenced this pull request Apr 15, 2020
Summary:
Fixes an oversight in tensorflow#3507: we can’t assert that the raw bytes are what
was expected because the code under test does a proto serialization
roundtrip, which is permitted to permute keys.

Test Plan:
Tests still pass; a test sync shows that internal tests pass now, too.

wchargin-branch: uploader-test-proto-equal
bileschi pushed a commit that referenced this pull request Apr 15, 2020
Summary:
The graph uploader test wasn’t actually covering the blob writing code,
and was set up incorrectly such that the request iterator would throw
when forced.

Test Plan:
The new assertions pass, but fail if the RPC changes are reverted.

wchargin-branch: uploader-graph-write-test
bileschi pushed a commit that referenced this pull request Apr 15, 2020
Summary:
Fixes an oversight in #3507: we can’t assert that the raw bytes are what
was expected because the code under test does a proto serialization
roundtrip, which is permitted to permute keys.

Test Plan:
Tests still pass; a test sync shows that internal tests pass now, too.

wchargin-branch: uploader-test-proto-equal
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants