Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[per-stream cdk] Update Source Acceptance Tests to support new outbound state messages format #16059

Closed
Tracked by #16054
brianjlai opened this issue Aug 29, 2022 · 1 comment · Fixed by #16686
Closed
Tracked by #16054
Assignees

Comments

@brianjlai
Copy link
Contributor

brianjlai commented Aug 29, 2022

Two tests in source-acceptance-tests/source_acceptance_tests/tests/test_incremental.py will have to be updated to support interpreting outbound state messages in the new format. The tests that have to be updated are:

At a high level what the two tests do:

  • test_two_sequential_reads: Make a full incremental sync. Then get the final state message emitted and make a second read() request and verify no new records

  • test_read_sequential_slices: Make a full incremental sync. Then construct batches of requests by splitting them apart by state message. We sequentially make a series of read() requests using a sampling of batches and verify the resulting records are older than the state

Issues

  • The current SAT tests make assertions against a state message's data field. These tests don't validate per-stream correctness. Once we deprecate it, these tests will also fail.
  • Now that incremental sync requests only emit the state of the stream being synced, when making subsequent read() requests, we do not have the full context and state of prior incremental syncs that came before it. In the current form, it might work, but we'll be over extracting data since most streams will end up pulling more data

Solution

  • Depending on the format of the state message received, read from state.data (legacy) or state.stream.stream_state (per-stream) to get the state_value for a record.
  • During each of these tests, we need to maintain a dictionary of the last observed state for each stream. Then as we loop over the message batches, we'll need to update the dictionary and use that as input to the next read() request. This should filter out any records that came before in prior syncs.
@brianjlai brianjlai changed the title Update Source Acceptance Tests to support new outbound state messages format [per-stream cdk] Update Source Acceptance Tests to support new outbound state messages format Aug 29, 2022
@brianjlai
Copy link
Contributor Author

brianjlai commented Sep 6, 2022

grooming notes:

  • encapsulate state management behind a class in the test to manage setting/retrieving state

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant