Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AirbyteLogMessage.stack_trace for logging messages with related (non-fatal) errors #16479

Merged
merged 12 commits into from
Sep 12, 2022

Conversation

evantahler
Copy link
Contributor

@evantahler evantahler commented Sep 8, 2022

This PR is inspired by the conversation in #15668 (comment). Please read this thread first. The goal of this PR is to allow connectors to pass non-fatal stack traces along with an AirbyteLogMessage to be logged by the worker. This is helpful when the connector catches an exception and handles it, e.g. log("caught HTTP exception, trying again in 10 seconds", exception);. The message and exception will be logged together for debugging.

For example:

{
  TYPE: "LOG", 
  log: {
    level: "INFO", 
    message: "The connector had a problem, but we are retrying in 10 seconds...",
    stack_trace: "OH NO on line 1\r\n\tBECAUSE OF foo on line 2"
}}

Will be rendered in the logs like:

2022-09-08 20:38:54 source > The connector had a problem, but we are retrying in 10 seconds...
Stack Trace: OH NO on line 1
	BECAUSE OF foo on line 2

This PR replaces #16317.

Rationale

  • We are ok with adding this one additional property to AirbyteLogMessage because this one additional named property's use-case is very clear and unlikely to be misinterpreted.
  • We shouldn’t add wildcard objects to the protocol (e.g. AirbyeteLogMessage.data) because people will do silly with things that we can’t guarantee - e.g. data.foo could be used for triggering other operations which we can’t promise will work in the future.
  • Loggers in general seem to understand stack_trace as a first class object. This elevates this concept above a wildcard .data object
  • Google says "stack trace" is 2 words, and that’s what we already use for AirbyteErrorTraceMessage so we will use the underscore
  • This allows AirbyteNotificaitonTraceMessage to remain reserved for notifying UI consumers of connector behavior.

Notes

As this is a change to the Airbyte Protocol, do not merge this PR until the Protocol Change Process has been completed.

To test this PR, source-faker has also been modified to emit additional LOG messages.

We use stack_trace (underscore) in the protocol.

@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation area/platform issues related to the platform area/protocol area/worker Related to worker labels Sep 8, 2022
@evantahler evantahler temporarily deployed to more-secrets September 8, 2022 19:55 Inactive
@github-actions github-actions bot added the CDK Connector Development Kit label Sep 8, 2022
@evantahler evantahler temporarily deployed to more-secrets September 8, 2022 20:50 Inactive
@evantahler
Copy link
Contributor Author

evantahler commented Sep 8, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/3018237976
✅ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/3018237976
Python tests coverage:

Name                       Stmts   Miss  Cover
----------------------------------------------
source_faker/__init__.py       2      0   100%
source_faker/source.py       130      2    98%
----------------------------------------------
TOTAL                        132      2    98%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       121     25    79%   21-23, 29-31, 36-43, 48-61, 208-216
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     17    78%   15-16, 24-30, 47-54, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       110     48    56%   23-26, 32, 36, 39-64, 67-69, 72-74, 77-79, 82-84, 87-89, 92-110, 144-146
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1322    463    65%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:60: Skipping TestIncremental.test_two_sequential_reads because not found in the config
======================== 26 passed, 1 skipped in 25.77s ========================

@evantahler
Copy link
Contributor Author

evantahler commented Sep 8, 2022

Here's what it looks like in the UI:

Screen Shot 2022-09-08 at 2 00 14 PM

@evantahler evantahler temporarily deployed to more-secrets September 8, 2022 21:00 Inactive
@evantahler evantahler marked this pull request as ready for review September 8, 2022 21:00
@evantahler evantahler requested review from a team as code owners September 8, 2022 21:00
@evantahler evantahler temporarily deployed to more-secrets September 8, 2022 21:26 Inactive
@evantahler evantahler temporarily deployed to more-secrets September 8, 2022 21:46 Inactive
Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding from our IRL convo @evantahler was we wanted to a way to do structured logging, and that adding a generic data field would allow us to do that. Why has this goal changed? It seemed like a worthwhile target. It seems the main reason was to prevent consumers from making assumptions about the structure of the logs, but I guess I don't see why that is any more true than it is today? (people can make assumptions about the content of the message field)

I thought the previous direction made sense, so I'm just trying to better understand the change

@evantahler
Copy link
Contributor Author

@sherifnada yeah, this is a reversal. We ended up deciding that the right path forward was to be a strictly typed as possible with the protocol in order to discourage un-expected use cases. A single "stack_trace" property we can guarantee, and because it's a known term, expect people will use it only for this purpose. The "data" blob appeared to be to enticing for others to put structured data into it what would be used downstream (e.g. parse the logs for {next_step:go} and use that to power dagster or something similar).

Copy link
Contributor

@edgao edgao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice :shipit:

no comments about the protocol change itself, looks like it aligns with the slack discussions over the last few days. Minor comment about the implementation but I wouldn't block on it.

@evantahler evantahler temporarily deployed to more-secrets September 9, 2022 16:05 Inactive
Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale seems fair enough

Is there a protocol change log that needs to be updated or something?

@evantahler
Copy link
Contributor Author

evantahler commented Sep 9, 2022

Is there a protocol change log that needs to be updated or something?

There isn't! I'll add one in a separate PR. There might be some conversation about the version numbers.

Edit: That PR is here - #16560

Copy link
Contributor

@jdpgrailsdev jdpgrailsdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@evantahler
Copy link
Contributor Author

evantahler commented Sep 9, 2022

/publish connector=connectors/source-faker

🕑 Publishing the following connectors:
connectors/source-faker
https://github.com/airbytehq/airbyte/actions/runs/3025790475


Connector Did it publish? Were definitions generated?
connectors/source-faker

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@evantahler evantahler temporarily deployed to more-secrets September 9, 2022 22:55 Inactive
@pedroslopez
Copy link
Contributor

pedroslopez commented Sep 9, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/3025874097
❌ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/3025874097
🐛 https://gradle.com/s/x4ihuuje3mnqy

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contai...
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:60: Skipping TestIncremental.test_two_sequential_reads because not found in the config
=================== 2 failed, 24 passed, 1 skipped in 24.40s ===================

@evantahler
Copy link
Contributor Author

evantahler commented Sep 10, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/3026231967
❌ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/3026231967
🐛 https://gradle.com/s/3wujjd2zs7toc

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contai...
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:60: Skipping TestIncremental.test_two_sequential_reads because not found in the config
=================== 2 failed, 24 passed, 1 skipped in 23.89s ===================

@evantahler evantahler temporarily deployed to more-secrets September 10, 2022 01:12 Inactive
@evantahler evantahler temporarily deployed to more-secrets September 10, 2022 01:12 Inactive
@evantahler
Copy link
Contributor Author

evantahler commented Sep 12, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/3040851106
✅ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/3040851106
Python tests coverage:

Name                       Stmts   Miss  Cover
----------------------------------------------
source_faker/__init__.py       2      0   100%
source_faker/source.py       130      2    98%
----------------------------------------------
TOTAL                        132      2    98%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       121     25    79%   21-23, 29-31, 36-43, 48-61, 208-216
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     17    78%   15-16, 24-30, 47-54, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       110     48    56%   23-26, 32, 36, 39-64, 67-69, 72-74, 77-79, 82-84, 87-89, 92-110, 144-146
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1325    463    65%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:60: Skipping TestIncremental.test_two_sequential_reads because not found in the config
======================== 26 passed, 1 skipped in 25.80s ========================

@evantahler evantahler temporarily deployed to more-secrets September 12, 2022 22:00 Inactive
@evantahler
Copy link
Contributor Author

evantahler commented Sep 12, 2022

/publish connector=connectors/source-faker

🕑 Publishing the following connectors:
connectors/source-faker
https://github.com/airbytehq/airbyte/actions/runs/3041012392


Connector Did it publish? Were definitions generated?
connectors/source-faker

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@evantahler evantahler merged commit dcfcb75 into master Sep 12, 2022
@evantahler evantahler deleted the evan/AirbyteLogMessage-with-stacktrace branch September 12, 2022 23:01
robbinhan pushed a commit to robbinhan/airbyte that referenced this pull request Sep 29, 2022
…n-fatal) errors (airbytehq#16479)

* Test log message from faker

* AirbyteLogMessage gains stack_trace

* Fixup spacing

* bump python protocol

* fixup additionalProperties in faker spec

* bump faker version

* update docs

* use lineSeparator vs \r\n

* auto-bump connector version [ci skip]

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
…n-fatal) errors (airbytehq#16479)

* Test log message from faker

* AirbyteLogMessage gains stack_trace

* Fixup spacing

* bump python protocol

* fixup additionalProperties in faker spec

* bump faker version

* update docs

* use lineSeparator vs \r\n

* auto-bump connector version [ci skip]

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation area/platform issues related to the platform area/protocol area/worker Related to worker CDK Connector Development Kit connectors/source/faker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants