
Database sources should not log to stderr when things are OK #13583

Closed
evantahler opened this issue Jun 7, 2022 · 6 comments · Fixed by #14801

Comments

@evantahler
Contributor

evantahler commented Jun 7, 2022

Try docker run airbyte/source-mysql spec 1>stdout.log 2>stderr.log and you'll see some warnings emitted on stderr:

stdout.log

2022-06-07 22:46:39 INFO i.a.i.s.p.PostgresSource(main):409 - starting source: class io.airbyte.integrations.source.postgres.PostgresSource
2022-06-07 22:46:39 INFO i.a.i.b.IntegrationCliParser(parseOptions):118 - integration args: {spec=null}
2022-06-07 22:46:39 INFO i.a.i.b.IntegrationRunner(runInternal):123 - Running integration: io.airbyte.integrations.base.ssh.SshWrappedSource
2022-06-07 22:46:39 INFO i.a.i.b.IntegrationRunner(runInternal):124 - Command: SPEC
2022-06-07 22:46:39 INFO i.a.i.b.IntegrationRunner(runInternal):125 - Integration config: IntegrationConfig{command=SPEC, configPath='null', catalogPath='null', statePath='null'}

{"type":"SPEC","spec":{"documentationUrl":"https://docs.airbyte.com/integrations/sources/postgres","connectionSpecification":{"$schema":"http://json-schema.org/draft-07/schema#","title":"Postgres Source Spec","type":"object","required":["host","port","database","username"],"additionalProperties":false,"properties":{"host":{"title":"Host","description":"Hostname of the database.","type":"string","order":0},"port":{"title":"Port","description":"Port of the database.","type":"integer","minimum":0,"maximum":65536,"default":5432,"examples":["5432"],"order":1},"database":{"title":"DB Name","description":"Name of the database.","type":"string","order":2},"schemas":{"title":"Schemas","description":"The list of schemas to sync from. Defaults to user. Case sensitive.","type":"array","items":{"type":"string"},"minItems":0,"uniqueItems":true,"default":["public"],"order":3},"username":{"title":"User","description":"Username to use to access the database.","type":"string","order":4},"password":{"title":"Password","description":"Password associated with the username.","type":"string","airbyte_secret":true,"order":5},"jdbc_url_params":{"description":"Additional properties to pass to the JDBC URL string when connecting to the database formatted as 'key=value' pairs separated by the symbol '&'. 
(example: key1=value1&key2=value2&key3=value3).","title":"JDBC URL Params","type":"string","order":6},"ssl":{"title":"Connect using SSL","description":"Encrypt client/server communications for increased security.","type":"boolean","default":false,"order":7},"replication_method":{"type":"object","title":"Replication Method","description":"Replication method to use for extracting data from the database.","order":8,"oneOf":[{"title":"Standard","additionalProperties":false,"description":"Standard replication requires no setup on the DB side but will not be able to represent deletions incrementally.","required":["method"],"properties":{"method":{"type":"string","const":"Standard","enum":["Standard"],"default":"Standard","order":0}}},{"title":"Logical Replication (CDC)","additionalProperties":false,"description":"Logical replication uses the Postgres write-ahead log (WAL) to detect inserts, updates, and deletes. This needs to be configured on the source database itself. Only available on Postgres 10 and above. Read the <a href=\"https://docs.airbyte.com/integrations/sources/postgres\">Postgres Source</a> docs for more information.","required":["method","replication_slot","publication"],"properties":{"method":{"type":"string","const":"CDC","enum":["CDC"],"default":"CDC","order":0},"plugin":{"type":"string","title":"Plugin","description":"A logical decoding plug-in installed on the PostgreSQL server. `pgoutput` plug-in is used by default.\nIf replication table contains a lot of big jsonb values it is recommended to use `wal2json` plug-in. 
For more information about `wal2json` plug-in read <a href=\"https://docs.airbyte.com/integrations/sources/postgres\">Postgres Source</a> docs.","enum":["pgoutput","wal2json"],"default":"pgoutput","order":1},"replication_slot":{"type":"string","title":"Replication Slot","description":"A plug-in logical replication slot.","order":2},"publication":{"type":"string","title":"Publication","description":"A Postgres publication used for consuming changes.","order":3}}}]},"tunnel_method":{"type":"object","title":"SSH Tunnel Method","description":"Whether to initiate an SSH tunnel before connecting to the database, and if so, which kind of authentication to use.","oneOf":[{"title":"No Tunnel","required":["tunnel_method"],"properties":{"tunnel_method":{"description":"No ssh tunnel needed to connect to database","type":"string","const":"NO_TUNNEL","order":0}}},{"title":"SSH Key Authentication","required":["tunnel_method","tunnel_host","tunnel_port","tunnel_user","ssh_key"],"properties":{"tunnel_method":{"description":"Connect through a jump server tunnel host using username and ssh key","type":"string","const":"SSH_KEY_AUTH","order":0},"tunnel_host":{"title":"SSH Tunnel Jump Server Host","description":"Hostname of the jump server host that allows inbound ssh tunnel.","type":"string","order":1},"tunnel_port":{"title":"SSH Connection Port","description":"Port on the proxy/jump server that accepts inbound ssh connections.","type":"integer","minimum":0,"maximum":65536,"default":22,"examples":["22"],"order":2},"tunnel_user":{"title":"SSH Login Username","description":"OS-level username for logging into the jump server host.","type":"string","order":3},"ssh_key":{"title":"SSH Private Key","description":"OS-level user account ssh key credentials in RSA PEM format ( created with ssh-keygen -t rsa -m PEM -f myuser_rsa )","type":"string","airbyte_secret":true,"multiline":true,"order":4}}},{"title":"Password 
Authentication","required":["tunnel_method","tunnel_host","tunnel_port","tunnel_user","tunnel_user_password"],"properties":{"tunnel_method":{"description":"Connect through a jump server tunnel host using username and password authentication","type":"string","const":"SSH_PASSWORD_AUTH","order":0},"tunnel_host":{"title":"SSH Tunnel Jump Server Host","description":"Hostname of the jump server host that allows inbound ssh tunnel.","type":"string","order":1},"tunnel_port":{"title":"SSH Connection Port","description":"Port on the proxy/jump server that accepts inbound ssh connections.","type":"integer","minimum":0,"maximum":65536,"default":22,"examples":["22"],"order":2},"tunnel_user":{"title":"SSH Login Username","description":"OS-level username for logging into the jump server host","type":"string","order":3},"tunnel_user_password":{"title":"Password","description":"OS-level password for logging into the jump server host","type":"string","airbyte_secret":true,"order":4}}}]}}},"supportsNormalization":false,"supportsDBT":false,"supported_destination_sync_modes":[]}}

2022-06-07 22:46:39 INFO i.a.i.b.IntegrationRunner(runInternal):171 - Completed integration: io.airbyte.integrations.base.ssh.SshWrappedSource
2022-06-07 22:46:39 INFO i.a.i.s.p.PostgresSource(main):411 - completed source: class io.airbyte.integrations.source.postgres.PostgresSource

stderr.log

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/airbyte/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/airbyte/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

The command worked, and the stderr messages are more-or-less meaningless to an end user. I propose that messages which are not fatal (i.e., not causing a crash/exception/exit) should not be emitted on stderr.

This was also observed in airbyte/source-postgres. There are likely other connectors that behave the same way.
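The proposed behavior can be sketched as a quick shell check. This is a minimal sketch: `fake_connector` is a hypothetical stand-in for the real `docker run airbyte/source-postgres spec` invocation, used only to illustrate the stream separation being asked for.

```shell
# A well-behaved connector writes its payload to stdout and, on a
# successful run, nothing to stderr. `fake_connector` is a stand-in
# for `docker run airbyte/source-postgres spec`.
fake_connector() { echo '{"type":"SPEC"}'; }

fake_connector >stdout.log 2>stderr.log

# On success, stdout.log should be non-empty and stderr.log empty.
[ -s stdout.log ] && echo "stdout has the spec payload"
[ -s stderr.log ] && echo "unexpected stderr output" || echo "stderr is empty"
```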

@evantahler evantahler added type/bug Something isn't working needs-triage labels Jun 7, 2022
@evantahler evantahler changed the title database sources emit stderr when things still work Database sources should not log to stderr when things are working OK Jun 7, 2022
@evantahler
Contributor Author

cc @grishick & @tuliren - what do you think about this as a bug?

@tuliren
Contributor

tuliren commented Jun 8, 2022

what do you think about this as a bug?

Yes, this is probably not intended and a bug.

@evantahler evantahler changed the title Database sources should not log to stderr when things are working OK Database sources should not log to stderr when things are OK Jun 8, 2022
@etsybaev etsybaev self-assigned this Jul 11, 2022
@etsybaev
Contributor

etsybaev commented Aug 2, 2022

Performed some investigation.
This is indeed a bug, and it is expected that the messages appear on stderr: the root cause is conflicting SLF4J logger bindings on the classpath (https://www.baeldung.com/slf4j-classpath-multiple-bindings).
Should be fixed as part of this PR from @DoNotPanicUA
#14801
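For context, SLF4J emits that stderr warning when more than one binding jar ends up on the classpath; it then picks one binding and warns about the rest. A rough sketch of the condition, using the jar names from the stderr log above (the pattern list is illustrative, not exhaustive):

```shell
# Count SLF4J binding jars among the classpath entries seen in the log.
# SLF4J warns on stderr whenever it finds more than one binding.
classpath_jars="log4j-slf4j-impl-2.17.1.jar slf4j-log4j12-1.7.30.jar"
bindings=0
for jar in $classpath_jars; do
  case "$jar" in
    log4j-slf4j-impl-*|slf4j-log4j12-*|logback-classic-*) bindings=$((bindings + 1)) ;;
  esac
done
[ "$bindings" -gt 1 ] && echo "multiple SLF4J bindings found: exclude all but one in the build"
```

The fix, accordingly, is a build-level change that leaves only one binding on the classpath.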

@etsybaev etsybaev linked a pull request Aug 2, 2022 that will close this issue
@evantahler
Contributor Author

Awesome! Are there other DB connectors we should publish to get this fix in?

@edgao
Contributor

edgao commented Aug 3, 2022

destination-gcs and destination-s3 didn't publish successfully in #14801, and destination-databricks may also need to be published; reopening. @DoNotPanicUA it might make sense to just try and get those published in a followup PR, what do you think?

(I merged the PR early to unblock some exciting postgres publishes, but otherwise have no context on this issue :P )

@edgao edgao reopened this Aug 3, 2022
@etsybaev etsybaev removed their assignment Aug 5, 2022
@DoNotPanicUA
Contributor

S3, GCS, and Databricks successfully published with the fix.

7 participants