Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dagster: Add sentry logging #28822

Merged
merged 13 commits into from
Aug 3, 2023
Merged

Dagster: Add sentry logging #28822

merged 13 commits into from
Aug 3, 2023

Conversation

bnchrch
Copy link
Contributor

@bnchrch bnchrch commented Jul 28, 2023

Overview

  1. Adds sentry
  2. Creates a sentry decorator to capture exceptions from ops and assets

closes #28673

Notes for reviewers

  1. Unfortuantely dagster eats any errors so we have to wrap the functions ourselves
  2. Datadog is not an option until we move to an ec2 deployment

Before merge

In prod dagster set

  • SENTRY_DSN="..."
  • SENTRY_ENVIRONMENT="production"
  • TRACES_SAMPLE_RATE=1.0

@bnchrch bnchrch requested a review from a team July 28, 2023 02:50
Copy link
Contributor

@pedroslopez pedroslopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Left some questions.

Also - we had said in the original issue that we wanted error rates, but I don't think this gets us that. Sentry does show some error rate stuff for transactions, but I'm not sure those will show up in the current setup. Adding sentry might still be useful though. Could you confirm what your thinking is around the error rates? Did you manage to see transactions in sentry?

Comment on lines +17 to +24
from sentry_sdk.integrations.argv import ArgvIntegration
from sentry_sdk.integrations.atexit import AtexitIntegration
from sentry_sdk.integrations.dedupe import DedupeIntegration
from sentry_sdk.integrations.logging import LoggingIntegration, ignore_logger
from sentry_sdk.integrations.modules import ModulesIntegration
from sentry_sdk.integrations.stdlib import StdlibIntegration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From reading this, it seems we're doing this to ignore the logger? What exactly are we quieting down?

If we don't ignore the logger does this not also help surface the errors and avoid having us to manually report the exceptions to sentry? (assuming the exceptions go to the logger)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout

So essentially here we had a trade off.

If we use sentry out of the box there is one large issue

  1. Dagster eats all the user code errors and doesnt let them make it to the sentry level hook

However if we work around this (like we did in this PR)

We end up in a situation where the dagster logger is reporting an exception for every asset in the graph that failed (very noisy)
Screenshot 2023-07-31 at 2 53 06 PM

Ontop of this the dagster logs dont have the sentry tags applied which makes grouping and fingerprinting a more difficult task

So this ignore is meant to reduce the noise, only log sentry errors weve caught and have applied tags to in one way or another.

The reason im comfortable enough with the leaky bucket to recommend it is we still have slack errors to fallback on

Comment on lines 47 to 103
def log_asset_or_op_context(context: OpExecutionContext):
"""
Capture Dagster OP context for Sentry Error handling
"""
sentry_sdk.add_breadcrumb(
category="dagster",
message=f"{context.job_name} - {context.op_def.name}",
level="info",
data={
"run_config": context.run_config,
"job_name": context.job_name,
"op_name": context.op_def.name,
"run_id": context.run_id,
"retry_number": context.retry_number,
},
)

sentry_sdk.set_tag("job_name", context.job_name)
sentry_sdk.set_tag("op_name", context.op_def.name)
sentry_sdk.set_tag("run_id", context.run_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can run multiple of these and exceptions can occur from outside the "asset or op context", I would suggest wrapping this in a sentry scope so we ensure the tags are only set while executing the relevant code. I guess this scope belongs in the decorator method below.

This prevents tags from showing up on unrelated errors, leading things to incorrectly show up when filtering or leading us down the wrong path when debugging/investigating errors.

There's an example with scopes here

with sentry_sdk.configure_scope() as scope:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use scopes


SENTRY_DSN = os.environ.get("SENTRY_DSN")
SENTRY_ENVIRONMENT = os.environ.get("SENTRY_ENVIRONMENT")
TRACES_SAMPLE_RATE = float(os.environ.get("SENTRY_TRACES_SAMPLE_RATE", 0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm actually not sure if this is helpful for these - I have it disabled for now on airbyte-ci pipelines but 🤷🏼

Did you see a use for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use generally is that Ill leave it set to 1.0 in production to get performance data and error rates.

We can change this to a bool, just figured Id leave it a more direct number

Comment on lines 51 to 99
sentry_sdk.add_breadcrumb(
category="dagster",
message=f"{context.job_name} - {context.op_def.name}",
level="info",
data={
"run_config": context.run_config,
"job_name": context.job_name,
"op_name": context.op_def.name,
"run_id": context.run_id,
"retry_number": context.retry_number,
},
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this information better served as a context? https://docs.sentry.io/platforms/python/enriching-events/context/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I grabbed it from an example, I thought it was useful, however looking at the docs it seems like its more of a debug statement in case you forget to log context 🤔

Will remove

@pedroslopez
Copy link
Contributor

Also! question: what does the release in sentry show up as when an error is caught?

@bnchrch bnchrch force-pushed the bnchrch/dagster/add-sentry branch from a40bfc6 to 054f419 Compare August 1, 2023 02:30
@bnchrch
Copy link
Contributor Author

bnchrch commented Aug 1, 2023

@pedroslopez just wanted to follow up on the error reporting comment

Also - we had said in the original issue that we wanted error rates, but I don't think this gets us that. Sentry does show some error rate stuff for transactions, but I'm not sure those will show up in the current setup. Adding sentry might still be useful though. Could you confirm what your thinking is around the error rates? Did you manage to see transactions in sentry?

So the reason I went with Sentry here was because until we decide to move to a self hosted / hybrid dagster deployment we are unable to send logs directly to datadog (as we wont be able to spin up the datadog agent).

With this implementation though we should still get full application monitoring through sentry and get an error rate for each job

Heres one example: https://airbytehq.sentry.io/performance/summary/?project=4504087736090624&query=&referrer=performance-transaction-summary&statsPeriod=30d&transaction=generate_cloud_registry&unselectedSeries=p100%28%29&unselectedSeries=avg%28%29

You can also get a rough waterfall view / flame graph by individual asset
https://airbytehq.sentry.io/performance/sentry-dev-test:54520cc347b5451780e9dd8a7c5f308d/?project=4504087736090624&query=&referrer=performance-transaction-summary&statsPeriod=30d&transaction=generate_cloud_registry&unselectedSeries=p100%28%29&unselectedSeries=avg%28%29

However each asset in a job at this time still has a separate trace id which means we dont have any information directly related to average run time at the job level.

This is because sentry doesnt have an explicit/non hacky way of setting trace id at the transaction level manually yet.

Also! question: what does the release in sentry show up as when an error is caught?
it shows as the git sha currently

https://airbytehq.sentry.io/issues/4356903801/?project=4504087736090624&query=&referrer=issue-stream&statsPeriod=14d&stream_index=10

@bnchrch bnchrch requested a review from pedroslopez August 1, 2023 04:35
@bnchrch bnchrch force-pushed the bnchrch/dagster/add-sentry branch from 9eb1a6e to 36242e3 Compare August 1, 2023 04:37
@bnchrch bnchrch force-pushed the bnchrch/dagster/add-sentry branch from 36242e3 to 8a1322a Compare August 1, 2023 04:39
@bnchrch bnchrch requested a review from a team August 1, 2023 23:06
Copy link
Contributor

@pedroslopez pedroslopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining- Let's do it! :shipit:


sentry_logger.info(f"Initializing Sentry Transaction for Dagster Op/Asset {job_name} - {op_name}")
transaction = sentry_sdk.Hub.current.scope.transaction
sentry_logger.info(f"Current Sentry Transaction: {transaction}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not spammy? Could be a debug log level thing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call!

@bnchrch bnchrch merged commit 464409a into master Aug 3, 2023
@bnchrch bnchrch deleted the bnchrch/dagster/add-sentry branch August 3, 2023 16:01
jbfbell pushed a commit that referenced this pull request Aug 5, 2023
* Add sentry

* add sentry decorator

* Add traces

* Use sentry trace

* Improve duplicate logging

* Add comments

* DNC

* Fix up issues

* Move to scopes

* Remove breadcrumb

* Update lock
jbfbell added a commit that referenced this pull request Aug 9, 2023
* Add everything for BQ but migrate, refactor interface after practical work

* Make new default methods, refactor to single implemented method

* MigrationInterface and BQ impl created

* Trying to integrate with standard inserts

* remove unnecessary NameAndNamespacePair class

* Shimmed in

* Java Docs

* Initial Testing Setup

* Tests!

* Move Migrator into TyperDeduper

* Functional Migration

* Add Integration Test

* Pr updates

* bump version

* bump version

* version bump

* Update to airbyte-ci-internal (#29026)

* 🐛 Source Github, Instagram, Zendesk-support, Zendesk-talk: fix CAT tests fail on `spec` (#28910)

* connectors-ci: better modified connectors detection logic (#28855)

* connectors-ci: report path should always start with `airbyte-ci/` (#29030)

* make report path always start with airbyte-ci

* revert report path in orchestrator

* add more test cases

* bump version

* Updated docs (#29019)

* CDK: Embedded reader utils (#28873)

* relax pydantic dep

* Automated Commit - Format and Process Resources Changes

* wip

* wrap up base integration

* add init file

* introduce CDK runner and improve error message

* make state param optional

* update protocol models

* review comments

* always run incremental if possible

* fix

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (#28657)

* fix tests

* format

* review comments

* Automated Commit - Formatting Changes

* review comments

* review comments

* review comments

* log all messages

* log all message

* review comments

* review comments

* Automated Commit - Formatting Changes

* add comment

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🐛 Source Github, Instagram, Zendesk Support / Talk - revert `spec` changes and improve (#29031)

* Source oauth0: new streams and fix incremental (#29001)

* Add new streams Organizations,OrganizationMembers,OrganizationMemberRoles

* relax schema definition to allow additional fields

* Bump image tag version

* revert some changes to the old schemas

* Format python so gradle can pass

* update incremental

* remove unused print

* fix unit test

---------

Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>

* 🐛 Source Mongo: Fix failing acceptance tests (#28816)

* Fix failing acceptance tests

* Fix failing strict acceptance tests

* Source-Greenhouse: Fix unit tests for new CDK version (#28969)

Fix unit tests

* Add CSV options to the CSV parser (#28491)

* remove invalid legacy option

* remove unused option

* the tests pass but this is quite messy

* very slight clean up

* Add skip options to csv format

* fix some of the typing issues

* fixme comment

* remove extra log message

* fix typing issues

* skip before header

* skip after header

* format

* add another test

* Automated Commit - Formatting Changes

* auto generate column names

* delete dead code

* update title and description

* true and false values

* Update the tests

* Add comment

* missing test

* rename

* update expected spec

* move to method

* Update comment

* fix typo

* remove unused import

* Add a comment

* None records do not pass the WaitForDiscoverPolicy

* format

* remove second branch to ensure we always go through the same processing

* Raise an exception if the record is None

* reset

* Update tests

* handle unquoted newlines

* Automated Commit - Formatting Changes

* Update test case so the quoting is explicit

* Update comment

* Automated Commit - Formatting Changes

* Fail validation if skipping rows before header and header is autogenerated

* always fail if a record cannot be parsed

* format

* set write line_no in error message

* remove none check

* Automated Commit - Formatting Changes

* enable autogenerate test

* remove duplicate test

* missing unit tests

* Update

* remove branching

* remove unused none check

* Update tests

* remove branching

* format

* extract to function

* comment

* missing type

* type annotation

* use set

* Document that the strings are case-sensitive

* public -> private

* add unit test

* newline

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>

* Dagster: Add sentry logging (#28822)

* Add sentry

* add sentry decorator

* Add traces

* Use sentry trace

* Improve duplicate logging

* Add comments

* DNC

* Fix up issues

* Move to scopes

* Remove breadcrumb

* Update lock

* ✨Source Shortio: Migrate Python CDK to Low-code CDK (#28950)

* Migrate Shortio to Low-Code

* Update abnormal state

* Format

* Update Docs

* Fix metadata.yaml

* Add pagination

* Add incremental sync

* add incremental parameters

* update metadata

* rollback update version

* release date

---------

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Update to new verbiage (#29051)

* [skip ci] Metadata: Remove leading underscore (#29024)

* DNC

* Add test models

* Add model test

* Remove underscore from metadata files

* Regenerate models

* Add test to check for key transformation

* Allow additional fields on metadata

* Delete transform

* Proof of concept parallel source stream reading implementation for MySQL (#26580)

* Proof of concept parallel source stream reading implementation for MySQL

* Automated Change

* Add read method that supports concurrent execution to Source interface

* Remove parallel iterator

* Ensure that executor service is stopped

* Automated Commit - Format and Process Resources Changes

* Expose method to fix compilation issue

* Use concurrent map to avoid access issues

* Automated Commit - Format and Process Resources Changes

* Ensure concurrent streams finish before closing source

* Fix compile issue

* Formatting

* Exclude concurrent stream threads from orphan thread watcher

* Automated Commit - Format and Process Resources Changes

* Refactor orphaned thread logic to account for concurrent execution

* PR feedback

* Implement readStreams in wrapper source

* Automated Commit - Format and Process Resources Changes

* Add readStream override

* Automated Commit - Format and Process Resources Changes

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* Debug logging

* Reduce logging level

* Replace synchronized calls to System.out.println when concurrent

* Close consumer

* Flush before close

* Automated Commit - Format and Process Resources Changes

* Remove charset

* Use ASCII and flush periodically for parallel streams

* Test performance harness patch

* Automated Commit - Format and Process Resources Changes

* Cleanup

* Logging to identify concurrent read enabled

* Mark parameter as final

---------

Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>

* connectors-ci: disable dependency scanning (#29033)

* updates (#29059)

* Metadata: skip breaking change validation on prerelease (#29017)

* skip breaking change validation

* Move ValidatorOpts higher in call

* Add prerelease test

* Fix test

* ✨ Source MongoDB Internal POC: Generate Test Data (#29049)

* Add script to generate test data

* Fix prose

* Update credentials example

* PR feedback

* Bump Airbyte version from 0.50.12 to 0.50.13

* Bump versions for mssql strict-encrypt (#28964)

* Bump versions for mssql strict-encrypt

* Fix failing test

* Fix failing test

* 🎨 Improve replication method selection UX (#28882)

* update replication method in MySQL source

* bump version

* update expected specs

* update registries

* bump strict encrypt version

* make password always_show

* change url

* update registries

* 🐛 Avoid writing records to log (#29047)

* Avoid writing records to log

* Update version

* Rollout ctid cdc (#28708)

* source-postgres: enable ctid+cdc implementation

* 100% ctid rollout for cdc

* remove CtidFeatureFlags

* fix CdcPostgresSourceAcceptanceTest

* Bump versions and release notes

* Fix compilation error due to previous merge

---------

Co-authored-by: subodh <subodh1810@gmail.com>

* connectors-ci: fix `unhashable type 'set'` (#29064)

* Add Slack Alert lifecycle to Dagster for Metadata publish (#28759)

* DNC

* Add slack lifecycle logging

* Update to use slack

* Update slack to use resource and bot

* Improve markdown

* Improve log

* Add sensor logging

* Extend sensor time

* merge conflict

* PR Refactoring

* Make the tests work

* remove unnecessary classes, pr feedback

* more merging

* Update airbyte-integrations/bases/base-typing-deduping-test/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/BaseSqlGeneratorIntegrationTest.java

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* snowflake updates

---------

Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>
Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com>
Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>
Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com>
Co-authored-by: terencecho <terencecho@users.noreply.github.com>
Co-authored-by: Lake Mossman <lake@airbyte.io>
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
Co-authored-by: subodh <subodh1810@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
harrytou pushed a commit to KYVENetwork/airbyte that referenced this pull request Sep 1, 2023
* Add everything for BQ but migrate, refactor interface after practical work

* Make new default methods, refactor to single implemented method

* MigrationInterface and BQ impl created

* Trying to integrate with standard inserts

* remove unnecessary NameAndNamespacePair class

* Shimmed in

* Java Docs

* Initial Testing Setup

* Tests!

* Move Migrator into TyperDeduper

* Functional Migration

* Add Integration Test

* Pr updates

* bump version

* bump version

* version bump

* Update to airbyte-ci-internal (airbytehq#29026)

* 🐛 Source Github, Instagram, Zendesk-support, Zendesk-talk: fix CAT tests fail on `spec` (airbytehq#28910)

* connectors-ci: better modified connectors detection logic (airbytehq#28855)

* connectors-ci: report path should always start with `airbyte-ci/` (airbytehq#29030)

* make report path always start with airbyte-ci

* revert report path in orchestrator

* add more test cases

* bump version

* Updated docs (airbytehq#29019)

* CDK: Embedded reader utils (airbytehq#28873)

* relax pydantic dep

* Automated Commit - Format and Process Resources Changes

* wip

* wrap up base integration

* add init file

* introduce CDK runner and improve error message

* make state param optional

* update protocol models

* review comments

* always run incremental if possible

* fix

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (airbytehq#28657)

* fix tests

* format

* review comments

* Automated Commit - Formatting Changes

* review comments

* review comments

* review comments

* log all messages

* log all message

* review comments

* review comments

* Automated Commit - Formatting Changes

* add comment

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🐛 Source Github, Instagram, Zendesk Support / Talk - revert `spec` changes and improve (airbytehq#29031)

* Source oauth0: new streams and fix incremental (airbytehq#29001)

* Add new streams Organizations,OrganizationMembers,OrganizationMemberRoles

* relax schema definition to allow additional fields

* Bump image tag version

* revert some changes to the old schemas

* Format python so gradle can pass

* update incremental

* remove unused print

* fix unit test

---------

Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>

* 🐛 Source Mongo: Fix failing acceptance tests (airbytehq#28816)

* Fix failing acceptance tests

* Fix failing strict acceptance tests

* Source-Greenhouse: Fix unit tests for new CDK version (airbytehq#28969)

Fix unit tests

* Add CSV options to the CSV parser (airbytehq#28491)

* remove invalid legacy option

* remove unused option

* the tests pass but this is quite messy

* very slight clean up

* Add skip options to csv format

* fix some of the typing issues

* fixme comment

* remove extra log message

* fix typing issues

* skip before header

* skip after header

* format

* add another test

* Automated Commit - Formatting Changes

* auto generate column names

* delete dead code

* update title and description

* true and false values

* Update the tests

* Add comment

* missing test

* rename

* update expected spec

* move to method

* Update comment

* fix typo

* remove unused import

* Add a comment

* None records do not pass the WaitForDiscoverPolicy

* format

* remove second branch to ensure we always go through the same processing

* Raise an exception if the record is None

* reset

* Update tests

* handle unquoted newlines

* Automated Commit - Formatting Changes

* Update test case so the quoting is explicit

* Update comment

* Automated Commit - Formatting Changes

* Fail validation if skipping rows before header and header is autogenerated

* always fail if a record cannot be parsed

* format

* set write line_no in error message

* remove none check

* Automated Commit - Formatting Changes

* enable autogenerate test

* remove duplicate test

* missing unit tests

* Update

* remove branching

* remove unused none check

* Update tests

* remove branching

* format

* extract to function

* comment

* missing type

* type annotation

* use set

* Document that the strings are case-sensitive

* public -> private

* add unit test

* newline

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>

* Dagster: Add sentry logging (airbytehq#28822)

* Add sentry

* add sentry decorator

* Add traces

* Use sentry trace

* Improve duplicate logging

* Add comments

* DNC

* Fix up issues

* Move to scopes

* Remove breadcrumb

* Update lock

* ✨Source Shortio: Migrate Python CDK to Low-code CDK (airbytehq#28950)

* Migrate Shortio to Low-Code

* Update abnormal state

* Format

* Update Docs

* Fix metadata.yaml

* Add pagination

* Add incremental sync

* add incremental parameters

* update metadata

* rollback update version

* release date

---------

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Update to new verbiage (airbytehq#29051)

* [skip ci] Metadata: Remove leading underscore (airbytehq#29024)

* DNC

* Add test models

* Add model test

* Remove underscore from metadata files

* Regenerate models

* Add test to check for key transformation

* Allow additional fields on metadata

* Delete transform

* Proof of concept parallel source stream reading implementation for MySQL (airbytehq#26580)

* Proof of concept parallel source stream reading implementation for MySQL

* Automated Change

* Add read method that supports concurrent execution to Source interface

* Remove parallel iterator

* Ensure that executor service is stopped

* Automated Commit - Format and Process Resources Changes

* Expose method to fix compilation issue

* Use concurrent map to avoid access issues

* Automated Commit - Format and Process Resources Changes

* Ensure concurrent streams finish before closing source

* Fix compile issue

* Formatting

* Exclude concurrent stream threads from orphan thread watcher

* Automated Commit - Format and Process Resources Changes

* Refactor orphaned thread logic to account for concurrent execution

* PR feedback

* Implement readStreams in wrapper source

* Automated Commit - Format and Process Resources Changes

* Add readStream override

* Automated Commit - Format and Process Resources Changes

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* Debug logging

* Reduce logging level

* Replace synchronized calls to System.out.println when concurrent

* Close consumer

* Flush before close

* Automated Commit - Format and Process Resources Changes

* Remove charset

* Use ASCII and flush periodically for parallel streams

* Test performance harness patch

* Automated Commit - Format and Process Resources Changes

* Cleanup

* Logging to identify concurrent read enabled

* Mark parameter as final

---------

Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>

* connectors-ci: disable dependency scanning (airbytehq#29033)

* updates (airbytehq#29059)

* Metadata: skip breaking change validation on prerelease (airbytehq#29017)

* skip breaking change validation

* Move ValidatorOpts higher in call

* Add prerelease test

* Fix test

* ✨ Source MongoDB Internal POC: Generate Test Data (airbytehq#29049)

* Add script to generate test data

* Fix prose

* Update credentials example

* PR feedback

* Bump Airbyte version from 0.50.12 to 0.50.13

* Bump versions for mssql strict-encrypt (airbytehq#28964)

* Bump versions for mssql strict-encrypt

* Fix failing test

* Fix failing test

* 🎨 Improve replication method selection UX (airbytehq#28882)

* update replication method in MySQL source

* bump version

* update expected specs

* update registries

* bump strict encrypt version

* make password always_show

* change url

* update registries

* 🐛 Avoid writing records to log (airbytehq#29047)

* Avoid writing records to log

* Update version

* Rollout ctid cdc (airbytehq#28708)

* source-postgres: enable ctid+cdc implementation

* 100% ctid rollout for cdc

* remove CtidFeatureFlags

* fix CdcPostgresSourceAcceptanceTest

* Bump versions and release notes

* Fix compilation error due to previous merge

---------

Co-authored-by: subodh <subodh1810@gmail.com>

* connectors-ci: fix `unhashable type 'set'` (airbytehq#29064)

* Add Slack Alert lifecycle to Dagster for Metadata publish (airbytehq#28759)

* DNC

* Add slack lifecycle logging

* Update to use slack

* Update slack to use resource and bot

* Improve markdown

* Improve log

* Add sensor logging

* Extend sensor time

* merge conflict

* PR Refactoring

* Make the tests work

* remove unnecessary classes, pr feedback

* more merging

* Update airbyte-integrations/bases/base-typing-deduping-test/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/BaseSqlGeneratorIntegrationTest.java

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* snowflake updates

---------

Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>
Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com>
Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>
Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com>
Co-authored-by: terencecho <terencecho@users.noreply.github.com>
Co-authored-by: Lake Mossman <lake@airbyte.io>
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
Co-authored-by: subodh <subodh1810@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add instrumentation to Dagster to display chart of error rate
2 participants