-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Connector builder: support for test read with message grouping per slices #23925
Conversation
… Builder. Also implements `resolve_manifest` handler
@@ -0,0 +1,503 @@ | |||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _emitted_at(): | ||
return int(datetime.now().timestamp()) * 1000 | ||
|
||
def _create_configure_catalog(stream_name: str) -> ConfiguredAirbyteCatalog: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
creating the catalog on the Java side would allow us to keep the same read interface as connectors
|
||
|
||
def create_source(config: Mapping[str, Any]) -> ManifestDeclarativeSource: | ||
manifest = config.get("__injected_declarative_manifest") | ||
return ManifestDeclarativeSource(manifest) | ||
return ManifestDeclarativeSource(manifest, True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set debug to True so the source returns raw requests and responses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @girarda! A couple of questions but no blockers.
|
||
def read_stream(source: DeclarativeSource, config: Mapping[str, Any], configured_catalog: ConfiguredAirbyteCatalog) -> AirbyteMessage: | ||
try: | ||
if "__test_read_config" not in config: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Feels like this should be optional since we're not requiring any of the keys to be present, and are handling the case where it isn't set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. fixed
class MessageGrouper: | ||
logger = logging.getLogger("airbyte.connector-builder") | ||
|
||
def __init__(self, max_pages_per_slice: int, max_slices: int, max_record_limit: int = 1000): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit (also I know this may just be copy/pasted): Would it be preferable to enforce the maximums in this class, instead of allowing them to be fully configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the main reason why I didn't enforce maximums here is that we'll need to update the value in two places if we decide to increase it. I think the risk of not enforcing a limit here is negligible because we control the caller
…ices (airbytehq#23925) * New connector_builder module for handling requests from the Connector Builder. Also implements `resolve_manifest` handler * Automated Commit - Formatting Changes * Rename ConnectorBuilderSource to ConnectorBuilderHandler * Update source_declarative_manifest README * Reorganize * read records * paste unit tests from connector builder server * compiles but tests fail * first test passes * Second test passes * 3rd test passes * one more test * another test * one more test * test * return StreamRead * test * test * rename * test * test * test * main seems to work * Update * Update * Update * Update * update * error message * rename * update * Update * CR improvements * fix test_source_declarative_manifest * fix tests * Update * Update * Update * Update * rename * rename * rename * format * Give connector_builder its own main.py * Update * reset * delete dead code * remove debug print * update test * Update * set right stream * Add --catalog argument * Remove unneeded preparse * Update README * handle error * tests pass * more explicit test * reset * format * fix merge * raise exception * fix * black format * raise with config * update * fix flake * __test_read_config is optional * fix * Automated Commit - Formatting Changes * fix * exclude_unset --------- Co-authored-by: Catherine Noll <noll.catherine@gmail.com> Co-authored-by: clnoll <clnoll@users.noreply.github.com> Co-authored-by: girarda <girarda@users.noreply.github.com>
…ices (#23925) * New connector_builder module for handling requests from the Connector Builder. Also implements `resolve_manifest` handler * Automated Commit - Formatting Changes * Rename ConnectorBuilderSource to ConnectorBuilderHandler * Update source_declarative_manifest README * Reorganize * read records * paste unit tests from connector builder server * compiles but tests fail * first test passes * Second test passes * 3rd test passes * one more test * another test * one more test * test * return StreamRead * test * test * rename * test * test * test * main seems to work * Update * Update * Update * Update * update * error message * rename * update * Update * CR improvements * fix test_source_declarative_manifest * fix tests * Update * Update * Update * Update * rename * rename * rename * format * Give connector_builder its own main.py * Update * reset * delete dead code * remove debug print * update test * Update * set right stream * Add --catalog argument * Remove unneeded preparse * Update README * handle error * tests pass * more explicit test * reset * format * fix merge * raise exception * fix * black format * raise with config * update * fix flake * __test_read_config is optional * fix * Automated Commit - Formatting Changes * fix * exclude_unset --------- Co-authored-by: Catherine Noll <noll.catherine@gmail.com> Co-authored-by: clnoll <clnoll@users.noreply.github.com> Co-authored-by: girarda <girarda@users.noreply.github.com>
…ices (#23925) * New connector_builder module for handling requests from the Connector Builder. Also implements `resolve_manifest` handler * Automated Commit - Formatting Changes * Rename ConnectorBuilderSource to ConnectorBuilderHandler * Update source_declarative_manifest README * Reorganize * read records * paste unit tests from connector builder server * compiles but tests fail * first test passes * Second test passes * 3rd test passes * one more test * another test * one more test * test * return StreamRead * test * test * rename * test * test * test * main seems to work * Update * Update * Update * Update * update * error message * rename * update * Update * CR improvements * fix test_source_declarative_manifest * fix tests * Update * Update * Update * Update * rename * rename * rename * format * Give connector_builder its own main.py * Update * reset * delete dead code * remove debug print * update test * Update * set right stream * Add --catalog argument * Remove unneeded preparse * Update README * handle error * tests pass * more explicit test * reset * format * fix merge * raise exception * fix * black format * raise with config * update * fix flake * __test_read_config is optional * fix * Automated Commit - Formatting Changes * fix * exclude_unset --------- Co-authored-by: Catherine Noll <noll.catherine@gmail.com> Co-authored-by: clnoll <clnoll@users.noreply.github.com> Co-authored-by: girarda <girarda@users.noreply.github.com>
What
Sample command
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json
Test read
The test read method will run if the
"__command"
field is set to"test_read"
. The source will read data from stream defined in the configured catalog. Only one stream at a time is currently supported.The test_read command can be configured with the optional
"__test_read_config"
field in the configThe
"__test_read_config"
should have the following fieldssample config for a test read:
configured catalog:
sample configured catalog for resolving the manifest:
How
Interface change
Test read
Recommended reading order
airbyte-cdk/python/source_declarative_manifest/main.py
airbyte-cdk/python/connector_builder/connector_builder_handler.py
airbyte-cdk/python/connector_builder/message_grouper.py
airbyte-cdk/python/connector_builder/models.py
airbyte-cdk/python/airbyte_cdk/sources/source.py
airbyte-cdk/python/unit_tests/connector_builder/test_message_grouper.py
airbyte-cdk/python/airbyte_cdk/sources/source.py
airbyte-cdk/python/unit_tests/connector_builder/utils.py