🎉 JDBC sources: emit cursor counts #15535
/test connector=connectors/source-postgres
Build Passed
/test connector=connectors/source-oracle-strict-encrypt
Build Passed
/publish connector=connectors/source-postgres run-tests=false
If you have connectors that successfully published but failed definition generation, follow step 4 here
/publish connector=connectors/source-postgres-strict-encrypt run-tests=false
If you have connectors that successfully published but failed definition generation, follow step 4 here
/test connector=connectors/source-mssql
Build Passed
…vation
* master: (98 commits)
  🐛 Source Bing Ads - Fix Campaigns stream misses Audience and Shopping (#17873)
  Source S3 - fix schema inference (#17991)
  🎉 JDBC sources: store cursor record count in db state (#15535)
  Introduce webhook configs into workspace api and persistence (#17950)
  ci: upload test results to github for analysis (#17953)
  Trigger the connectors build if there are worker changes. (#17976)
  Add additional sync timing information (#17643)
  Use page_token_option instead of page_token (#17892)
  capture metrics around json messages size (#17973)
  🐛 Correct kube annotations variable as per the docs. (#17972)
  🪟 🎉 Add /connector-builder page with embedded YAML editor (#17482)
  fix `est_num_metrics_emitted_by_reporter` not being emitted (#17929)
  Update schema dumps (#17960)
  Remove the bump in the value.yml (#17959)
  Ensure database initialization in test container (#17697)
  Remove typo line from incremental reads docs (#17920)
  DocS: Update authentication.md (#17931)
  Use MessageMigration for Source Connection Check. (#17656)
  fixed links (#17949)
  remove usages of YamlSeedConfigPersistence (#17895)
  ...
* Add cursor_record_count to db stream state
* Add cursor record count to cursor info
* Emit max cursor record count
* Add original cursor record count
* Unify logging format
* Add backward compatible methods
* Update unit tests for state decorating iterator
* Update test (not done yet)
* Fix one more unit test
* Change where clause operator according to record count
* Add branch for null cursor
* Skip saving record count when it is 0
* Fix log wording
* Set mock record count in test
* Check cursor value instead of cursor info
* Fix source jdbc test
* Read record count from state
* Fix tests
* Add an acceptance test case
* Fix npe
* Change record count from int to long to avoid type conversion
* Fix references
* Fix oracle container
* Use uppercase for snowflake
* Use uppercase for db2
* Fix and use uppercase
* Update test case to include the edge case
* Format code
* Remove extra assertion in clickhouse
* Merge ms sql incremental query method
* Log query for debugging
* Clean up name_and_timestamp table
* Fix db2 tests
* Fix mssql tests
* Fix oracle tests
* Fix oracle tests
* Fix cockroachdb tests
* Fix snowflake tests
* Add changelog
* Fix mssql tests
* Fix db2-strict-encrypt tests
* Fix oracle-strict-encrypt tests
* Bump postgres version
* Fix oracle-strict-encrypt tests
* auto-bump connector version [ci skip]
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
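The commit list above mentions emitting the max cursor record count and skipping the count when it is 0. A minimal sketch of that bookkeeping, assuming a numeric cursor. The class and method names (`CursorCountTrackerSketch`, `observe`, `recordCountToSave`) are hypothetical and are not taken from `StateDecoratingIterator.java`:

```java
import java.util.OptionalLong;

/** Hypothetical sketch: tracks the largest cursor value seen and how many records carried it. */
final class CursorCountTrackerSketch {

  private Long maxCursor = null;          // null until the first record is observed
  private long maxCursorRecordCount = 0L; // number of records whose cursor equals maxCursor

  /** Call once per emitted record with that record's cursor value. */
  void observe(final long cursor) {
    if (maxCursor == null || cursor > maxCursor) {
      // A new maximum: restart the count for this cursor value.
      maxCursor = cursor;
      maxCursorRecordCount = 1L;
    } else if (cursor == maxCursor) {
      // Another record with the current maximum cursor value.
      maxCursorRecordCount++;
    }
  }

  Long maxCursor() {
    return maxCursor;
  }

  /** Empty when the count is 0, so nothing would be written to state (assumed behavior). */
  OptionalLong recordCountToSave() {
    return maxCursorRecordCount > 0L
        ? OptionalLong.of(maxCursorRecordCount)
        : OptionalLong.empty();
  }

  public static void main(final String[] args) {
    final CursorCountTrackerSketch tracker = new CursorCountTrackerSketch();
    for (final long cursor : new long[] {1L, 3L, 3L, 2L}) {
      tracker.observe(cursor);
    }
    System.out.println(tracker.maxCursor() + " x " + tracker.recordCountToSave().getAsLong());
  }
}
```

The count is kept as a `long`, in line with the "Change record count from int to long" commit above.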
@jdbranham I think this missed a bunch of version updates? I only see an update to postgres. Does that mean all the other connectors are missing this change until a new version is deployed?
@tuliren ^
@Kopiczek Yes. All the other connectors are tested, but not published. The change will be available with the next version bump. Do you need this for another connector? We can publish that connector immediately. This is intentional because I think this record count is not a critical feature. It is only useful for a subset of an edge case.
So at the end of the day, only a small number of users will benefit from it, and it is not worth the effort of publishing all the connectors. To guarantee zero data loss, CDC is a better way.
@tuliren
Changes were made in this PR to fix a bug when new data is inserted at the same time the sync starts. However, they were never published #15535
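One way to handle rows that arrive with the same cursor value while a sync is running is to compare the record count saved in state with the count currently in the table, and pick the comparison operator accordingly (the commit list mentions changing the where clause operator according to record count). A minimal sketch under that assumption. `IncrementalOperatorSketch`, its methods, and the table and column names are hypothetical and are not the code in `AbstractJdbcSource.java`:

```java
/** Hypothetical sketch: choose the incremental WHERE operator from the saved record count. */
final class IncrementalOperatorSketch {

  /**
   * @param savedCount  record count stored in state for the saved cursor value (0 if none was saved)
   * @param actualCount number of rows in the table that currently have that cursor value
   */
  static String chooseOperator(final long savedCount, final long actualCount) {
    if (savedCount <= 0L) {
      // No count in state (for example, state written before this change): keep "strictly greater".
      return ">";
    }
    // Same count: every row at the saved cursor value was already synced, so skip them.
    // Different count: rows appeared at that cursor value after it was saved, so read them again.
    return actualCount == savedCount ? ">" : ">=";
  }

  /** Builds the query text; the cursor value stays a bind parameter for a PreparedStatement. */
  static String buildQuery(final String table, final String cursorField, final String operator) {
    return String.format(
        "SELECT * FROM %s WHERE %s %s ? ORDER BY %s ASC",
        table, cursorField, operator, cursorField);
  }

  public static void main(final String[] args) {
    final String operator = chooseOperator(2L, 3L);
    System.out.println(buildQuery("name_and_timestamp", "updated_at", operator));
  }
}
```

With `>=`, rows at the saved cursor value that were already synced are read again, so this approach trades possible duplicates for not missing late rows.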
* Bump up snowflake source version
Changes were made in this PR to fix a bug when new data is inserted at the same time the sync starts. However, they were never published #15535
* auto-bump connector version
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
What
source-postgres and source-postgres-strict-encrypt.
🚨 User Impact 🚨
Review order
db_models.yaml
AbstractJdbcSource.java
AbstractJdbcSourceTest.java - specifically testIncrementalWithConcurrentInsertion
StateManager.java and implementations of the abstract state manager
StateDecoratingIterator.java
TODO
Change the record count in CursorInfo to long to avoid the type conversion.
Integration tests
source-scaffold-java-jdbc
source-snowflake