estuary-cdk and new connectors #1293

Merged
merged 14 commits into from
Mar 4, 2024
Conversation

jgraettinger
Member

@jgraettinger jgraettinger commented Feb 28, 2024

See individual commits.

Workflow steps:

(How does one use this feature, and how has it changed)

Documentation links affected:

(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)

Notes for reviewers:

(anything that might help someone review this PR)



@jgraettinger jgraettinger force-pushed the johnny/sdk-proto branch 12 times, most recently from 0383cde to 6af61ac Compare March 1, 2024 00:13
Member

@williamhbaker williamhbaker left a comment

LGTM

Had a couple of questions, and a few other trivial notes. I must admit that a lot of the python stuff is still over my head, but everything generally made sense. Others with more python-specific knowledge might have more useful feedback on specifics of the code.

.github/actions/setup/action.yaml (outdated)
estuary-cdk/estuary_cdk/capture/common.py (outdated)
estuary-cdk/estuary_cdk/capture/common.py (outdated)
estuary-cdk/estuary_cdk/capture/common.py
estuary-cdk/estuary_cdk/http.py (outdated)
estuary-cdk/estuary_cdk/http.py (outdated)
estuary-cdk/estuary_cdk/pydantic_polyfill.py
estuary-cdk/estuary_cdk/shim_airbyte_cdk.py
source-hubspot-native/tests/test_snapshots.py (outdated)

The Estuary CDK differs from the earlier flow-sdk in fundamental ways:

* It leans heavily into Pydantic V2 (with a polyfill for V1),
  which is used for validation and schema generation,
  married with Flow's schema inference capabilities.

* It has a framework -> connector -> library structure.
  The framework is maximally unopinionated as to how a connector is
  built. But, the CDK offers library routines (the `common` module)
  which encapsulate the common patterns for fetching snapshot or
  incremental resources.

* It's async at its core.
  All work proceeds concurrently across all bindings.
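
As a rough illustration of "all work proceeds concurrently across all bindings", here is a minimal sketch using only `asyncio` from the standard library. The names `capture_binding` and `run_all_bindings` are hypothetical and are not taken from the CDK; the sleep stands in for network I/O.

```python
import asyncio


async def capture_binding(name: str, delay: float, finished: list[str]) -> None:
    """Hypothetical per-binding task: simulate I/O, then record completion."""
    await asyncio.sleep(delay)  # stand-in for an API call
    finished.append(name)


async def run_all_bindings(bindings: dict[str, float]) -> list[str]:
    """Run every binding concurrently; return binding names in completion order."""
    finished: list[str] = []
    await asyncio.gather(
        *(capture_binding(name, delay, finished) for name, delay in bindings.items())
    )
    return finished


if __name__ == "__main__":
    # The slow binding is listed first but does not block the fast one.
    print(asyncio.run(run_all_bindings({"slow": 0.05, "fast": 0.01})))
```

Because the tasks share one event loop, a slow binding only delays itself while the others keep making progress.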
Still missing a bunch of entities, but functional.
…wocuments or a LogCursor

This spares implementations from having to know the maximum LogCursor
until they're done reading documents, and avoids nested async
generators.
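
One way to picture a single generator that yields either documents or a cursor is sketched below. This is not the CDK's actual signature: `LogCursor` is assumed to be an `int`, and `fetch_changes`, `drain`, and the sample rows are hypothetical.

```python
import asyncio
from typing import AsyncGenerator, Union

LogCursor = int  # assumption: the real type may differ

# Hypothetical change log, ordered by sequence number.
ROWS = [{"id": 1, "seq": 5}, {"id": 2, "seq": 7}]


async def fetch_changes(cursor: LogCursor) -> AsyncGenerator[Union[dict, LogCursor], None]:
    """Yield documents newer than `cursor`, then the new cursor, from one flat generator."""
    latest = cursor
    for row in ROWS:
        if row["seq"] > cursor:
            latest = max(latest, row["seq"])
            yield row  # a document
    if latest != cursor:
        yield latest  # the cursor, known only after reading everything


async def drain(cursor: LogCursor) -> tuple[list[dict], LogCursor]:
    """Consume the generator, separating documents from the trailing cursor."""
    docs: list[dict] = []
    next_cursor = cursor
    async for item in fetch_changes(cursor):
        if isinstance(item, dict):
            docs.append(item)
        else:
            next_cursor = item
    return docs, next_cursor


if __name__ == "__main__":
    print(asyncio.run(drain(5)))
```

The implementation never has to announce the maximum cursor up front; it simply emits it once the documents have been read.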

Also add a BasicAuth credential type.
request_stream() is an AsyncGenerator over arbitrary stream chunks.
Then, request() and a new request_lines() are implemented in terms of
request_stream().

This allows callers to efficiently process unbounded responses.
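
The layering described above can be sketched as follows. This is a simplified stand-in, not the CDK's `http` module: the functions here take an in-memory list of chunks instead of a URL so that the example runs without a network, and their signatures are assumptions.

```python
import asyncio
from typing import AsyncGenerator, Iterable


async def request_stream(chunks: Iterable[bytes]) -> AsyncGenerator[bytes, None]:
    """Stand-in for a streamed HTTP body: yields arbitrarily sized chunks."""
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, as real network I/O would
        yield chunk


async def request(chunks: Iterable[bytes]) -> bytes:
    """Buffer the whole body, built on request_stream()."""
    return b"".join([chunk async for chunk in request_stream(chunks)])


async def request_lines(
    chunks: Iterable[bytes], delim: bytes = b"\n"
) -> AsyncGenerator[bytes, None]:
    """Yield delimiter-separated lines, holding only the partial line in memory."""
    buffer = b""
    async for chunk in request_stream(chunks):
        buffer += chunk
        while (idx := buffer.find(delim)) != -1:
            yield buffer[:idx]
            buffer = buffer[idx + len(delim):]
    if buffer:
        yield buffer  # trailing line without a delimiter


async def collect_lines(chunks: Iterable[bytes]) -> list[bytes]:
    """Helper for the example: gather all lines into a list."""
    return [line async for line in request_lines(chunks)]


if __name__ == "__main__":
    body = [b'{"a":1}\n{"b"', b':2}\n{"c":3}']
    print(asyncio.run(collect_lines(body)))
```

Only the current partial line is buffered in `request_lines()`, which is what makes unbounded responses tractable.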
…ming

Rework the contract to allow implementations to yield checkpoints at
times of their choosing. This allows for more ergonomic handling of
long-lived push streams of documents.
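
A sketch of what "yield checkpoints at times of their choosing" might look like for a push stream. The `Checkpoint` type, the `stream_events` function, and the every-N policy are all hypothetical, chosen only to illustrate the idea.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, Union


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical marker: everything up to `cursor` may be committed."""
    cursor: int


async def stream_events(
    events: list[str], checkpoint_every: int
) -> AsyncGenerator[Union[str, Checkpoint], None]:
    """Yield documents as they arrive, interleaving checkpoints when convenient."""
    pending = 0
    for seq, event in enumerate(events, start=1):
        yield event
        pending += 1
        if pending == checkpoint_every:
            yield Checkpoint(cursor=seq)
            pending = 0
    if pending:
        yield Checkpoint(cursor=len(events))  # flush the tail


async def collect(events: list[str], checkpoint_every: int) -> list:
    """Helper for the example: gather the interleaved output."""
    return [item async for item in stream_events(events, checkpoint_every)]


if __name__ == "__main__":
    print(asyncio.run(collect(["a", "b", "c"], checkpoint_every=2)))
```

The implementation decides when a checkpoint is safe, so a long-lived stream does not have to end before progress is recorded.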
Begin to factor out common setup steps into reusable composite actions.

Add a common estuary-cdk Dockerfile
The runtime currently uses a non-UUID placeholder internally, which causes spurious schema violations
Establish a convention that `log: Logger` is the first parameter.
We're going to be threading these through everywhere -- which is
desirable, because it gives us a tightly-scoped structured log context
that tells us as much as possible about the surrounding task -- so let's
standardize how it should be passed so we don't have to think hard about
it.
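
A minimal sketch of the `log: Logger` first-parameter convention, using the standard `logging` module. The function names and the URL are hypothetical; `Logger.getChild` is used here as one simple way to narrow the log context per task, which may not match how the CDK scopes its loggers.

```python
import logging
from logging import Logger


def fetch_page(log: Logger, url: str, page: int) -> str:
    """`log` comes first; structured fields ride along via `extra`."""
    log.debug("fetching page", extra={"url": url, "page": page})
    return f"{url}?page={page}"


def capture_binding(log: Logger, binding: str, url: str) -> list[str]:
    """Derive a narrower logger for this binding and thread it downward."""
    task_log = log.getChild(binding)  # e.g. "connector.contacts"
    return [fetch_page(task_log, url, page) for page in range(2)]


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    root = logging.getLogger("connector")
    print(capture_binding(root, "contacts", "https://example.invalid/api"))
```

With the logger always in the same position, call sites stay uniform and each layer can add context before passing it on.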

Also refactor `http` module to clarify APIs which are stable, vs
portions that are very likely to be refactored.

A few other code-review cleanups as well.
We don't need it yet, so let's not have it.
@jgraettinger jgraettinger merged commit fca51f0 into main Mar 4, 2024
48 of 52 checks passed
@jgraettinger jgraettinger deleted the johnny/sdk-proto branch March 4, 2024 19:20

2 participants