
Status of testing Providers that were prepared on September 28, 2022 #26752

Closed
44 of 89 tasks
potiuk opened this issue Sep 28, 2022 · 29 comments
Labels
kind:meta (High-level information important to the community), testing status (Status of testing releases)

Comments

@potiuk
Member

potiuk commented Sep 28, 2022

Body

I have a kind request for all the contributors to the latest provider packages release.
Could you please help us to test the RC versions of the providers?

Let us know in a comment whether the issue is addressed.

These are the providers that require testing, as some substantial changes were introduced:

Provider alibaba: 2.1.0rc1

Provider amazon: 6.0.0rc1

Provider apache.drill: 2.2.1rc1

Provider apache.druid: 3.2.1rc1

Provider apache.hive: 4.0.1rc1

Provider apache.pinot: 3.2.1rc1

Provider databricks: 3.3.0rc1

Provider dbt.cloud: 2.2.0rc1

Provider docker: 3.2.0rc1

Provider elasticsearch: 4.2.1rc1

Provider exasol: 4.0.1rc1

Provider google: 8.4.0rc1

Provider jdbc: 3.2.1rc1

Provider jira: 3.1.0rc1

Provider microsoft.azure: 4.3.0rc1

Provider microsoft.mssql: 3.2.1rc1

Provider mysql: 3.2.1rc1

Provider odbc: 3.1.2rc1

Provider oracle: 3.4.0rc1

Provider postgres: 5.2.2rc1

Provider qubole: 3.2.1rc1

Provider sftp: 4.1.0rc1

Provider slack: 6.0.0rc1

Provider snowflake: 3.3.0rc1

Provider ssh: 3.2.0rc1

Provider trino: 4.1.0rc1

Provider vertica: 3.2.1rc1

The guidelines on how to test providers can be found in Verify providers by contributors.

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@potiuk potiuk added the kind:meta High-level information important to the community label Sep 28, 2022
@eladkal
Contributor

eladkal commented Sep 28, 2022

Links are wrong? All are RC1, not RC2.

@potiuk
Member Author

potiuk commented Sep 28, 2022

Thanks Elad :). Fixing :)

@potiuk
Member Author

potiuk commented Sep 28, 2022

Copy-pasted a wrong command :). All links are fixed now.

@patricker
Contributor

I've tested #26277, it's good.

I was testing #26285, and there is something off with the way I wrote the value to XCom. The max value shows correctly in the text logs, but when it's written to XCom there is some kind of serialization issue, and it throws an error in the web UI when I try to view it.

  File "/usr/local/lib/python3.9/site-packages/airflow/models/xcom.py", line 600, in deserialize_value
    return pickle.loads(result.value)
  File "/usr/local/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 1409, in __getattr__
    value = self._xxx_field_to_index.get(name)
  File "/usr/local/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 1409, in __getattr__
    value = self._xxx_field_to_index.get(name)
  File "/usr/local/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 1409, in __getattr__
    value = self._xxx_field_to_index.get(name)
  [Previous line repeated 957 more times]
RecursionError: maximum recursion depth exceeded
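
For context, here is a hypothetical minimal reproduction of this failure mode (a sketch, not the actual google-cloud-bigquery source; the Row class below is made up). BigQuery's Row implements __getattr__ in terms of an instance attribute; pickle.loads() recreates the object without running __init__ and then probes it for __setstate__, so the attribute lookup re-enters __getattr__ until the recursion limit is hit:

import pickle


class Row:
    def __init__(self, values):
        self._field_to_index = {"f0_": 0}  # only ever set by __init__
        self.values = values

    def __getattr__(self, name):
        # During unpickling _field_to_index does not exist yet, so this
        # line calls __getattr__ again, forever.
        index = self._field_to_index.get(name)
        if index is None:
            raise AttributeError(name)
        return self.values[index]


data = pickle.dumps(Row((1,)))
pickle.loads(data)  # RecursionError: maximum recursion depth exceeded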

@c2zwdjnlcg
Contributor

Tested trino: 4.1.0rc1 and did not hit any issues

@potiuk
Member Author

potiuk commented Sep 28, 2022

I was testing #26285, and there is something off with the way I wrote the value to XCom. The max value shows correctly in the text logs, but when it's written to XCom there is some kind of serialization issue, and it throws an error in the web UI when I try to view it.

Hmm. I think we need to take a closer look. Can you describe the way you did it? It kinda looks like you have a "table" object serialized somehow, rather than a primitive.

@jscheffl
Contributor

I'd like to contribute to the RC tests, but for all 4 provider packages touched by my PR I only did code checks for consistency of the logging API. I have no back-end environment for alibaba, azure, google and amazon available to me :-( Is somebody besides me able to test?
The test is simple if you have a running environment: use any existing longer-running DAG and check the log view, as described in PR #26169.

@patricker
Contributor

I was testing #26285, and there is something off with the way I wrote the value to XCom. The max value shows correctly in the text logs, but when it's written to XCom there is some kind of serialization issue, and it throws an error in the web UI when I try to view it.

Hmm. I think we need to take a closer look. Can you describe the way you did it? It kinda looks like you have a "table" object serialized somehow, rather than a primitive.

@potiuk Looking at the code again, and at the actual log message, it looks like I accidentally tried to serialize a whole table row instead of the value of its first column.

Loaded BQ data with max data-landing.api.users_20220928_.updated_at=Row((datetime.datetime(2022, 9, 28, 16, 57, 18, 752695, tzinfo=datetime.timezone.utc),), {'f0_': 0})

This serialized row then causes issues in XCom. I'll see about fixing it.
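
A minimal sketch of the fix being described (illustrative only: the query, project and column names below are made up, and this is not the provider's actual code). The point is to push the first column's plain value to XCom rather than the Row object:

from google.cloud import bigquery

client = bigquery.Client()
job = client.query("SELECT MAX(updated_at) AS f0_ FROM `project.dataset.users`")
row = list(job.result())[0]  # a bigquery.table.Row like the one in the log above

max_value = row[0]  # plain datetime; safe to write to XCom, unlike the Row itself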

@potiuk
Member Author

potiuk commented Sep 28, 2022

This serialized row then causes issues in XCom. I'll see about fixing it.

Thanks for flagging/testing/fixing. We can remove the google provider and re-release it as rc2 (but let's have others continue testing it for now).

@potiuk
Member Author

potiuk commented Sep 28, 2022

I'd like to contribute to the RC tests, but for all 4 provider packages touched by my PR I only did code checks for consistency of the logging API. I have no back-end environment for alibaba, azure, google and amazon available to me :-( Is somebody besides me able to test? The test is simple if you have a running environment: use any existing longer-running DAG and check the log view, as described in PR #26169.

Yep. If others can test it, that would be great (especially while testing other issues). Thanks for notifying :)

@pauldalewilliams
Contributor

I tested both #26666 and #26576 - both are good.

@patricker
Contributor

patricker commented Sep 29, 2022

@potiuk I'll try and get my other two tested tomorrow. Here is the PR with the fix for #26285: #26768.

@bharanidharan14
Contributor

bharanidharan14 commented Sep 29, 2022

Tested both Azure Service Bus (Create and Delete) Topic Operators #25436, working fine 👍
(screenshots attached)

@nshetty15
Contributor

#26653 - (apache-airflow-providers-docker 3.2.0rc1) Looks good. Thanks!

@patricker
Contributor

patricker commented Sep 29, 2022

I tested #26096 and #26190. Both tested fine.
I ran into an issue while testing #26096, but it isn't a "bug" in the original PR's implementation, just a new bug in a similar part of the code. I cut a PR for it (#26786), but it doesn't have to go into RC2 if it doesn't make it.

@Taragolis
Contributor

Taragolis commented Sep 29, 2022

#26442 - Could only prove that the changed code persists. It is hardly possible to make the loggers fail on hook creation.
#26628 - Checked that amazon and databricks do not change anything in airflow.providers_manager.ProvidersManager after the changes (related to the removed listed hooks).
#25980 - As expected, the deprecated s3 connection can be used in aws hooks, with warnings. The connection type no longer appears in the UI.
#26464 - Working as expected.

@Taragolis
Contributor

#26169 - @jens-scheffler-bosch @potiuk It works with S3TaskHandler in Breeze, with some small artifacts: while fetching logs, *** Falling back to local log is continuously printed:

[2022-09-29, 23:02:50 UTC] {subprocess.py:93} INFO - 70
[2022-09-29, 23:02:51 UTC] {subprocess.py:93} INFO - 71
[2022-09-29, 23:02:52 UTC] {subprocess.py:93} INFO - 72
*** Falling back to local log
[2022-09-29, 23:02:53 UTC] {subprocess.py:93} INFO - 73
[2022-09-29, 23:02:54 UTC] {subprocess.py:93} INFO - 74
*** Falling back to local log
[2022-09-29, 23:02:55 UTC] {subprocess.py:93} INFO - 75
[2022-09-29, 23:02:56 UTC] {subprocess.py:93} INFO - 76
*** Falling back to local log
[2022-09-29, 23:02:57 UTC] {subprocess.py:93} INFO - 77
[2022-09-29, 23:02:58 UTC] {subprocess.py:93} INFO - 78
*** Falling back to local log
[2022-09-29, 23:02:59 UTC] {subprocess.py:93} INFO - 79
[2022-09-29, 23:03:00 UTC] {subprocess.py:93} INFO - 80

If something is wrong with the connection, it also spams this message:

log = f'*** Failed to verify remote log exists {remote_loc}.\n{error}\n'

@alexkruc
Contributor

#25675 - verified

@Taragolis
Contributor

#25852: Tried to test in Airflow 2.2.5 and found issues: it does not work correctly with extras in the UI:

  1. It does not allow creating a connection with an empty value for timeout
  2. If a field is not specified, an empty value is written to the DB; as a result, slack_sdk can't build a correct URL

Need to check whether this affects 2.3, because in 2.4 there is no problem. There is no problem with the connection if it is created via the API or provided as a URL (env var, secrets backend). A sketch of a possible workaround follows at the end of this comment.

return {
    prefixed_extra_field("timeout", cls.conn_type): IntegerField(
        lazy_gettext("Timeout"),
        widget=BS3TextFieldWidget(),
        description="Optional. The maximum number of seconds the client will wait to connect "
        "and receive a response from Slack API.",
    ),
    prefixed_extra_field("base_url", cls.conn_type): StringField(
        lazy_gettext('Base URL'),
        widget=BS3TextFieldWidget(),
        description="Optional. A string representing the Slack API base URL.",
    ),
    prefixed_extra_field("proxy", cls.conn_type): StringField(
        lazy_gettext('Proxy'),
        widget=BS3TextFieldWidget(),
        description="Optional. Proxy to make the Slack API call.",
    ),
}

#26648: Same as the previous one.
#26118: Working fine.
#26459: Unsafe imports are not present. Fine.
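
One possible shape for such a workaround (a hypothetical sketch, not the provider's actual fix, and assuming the blank extras written by the form are the culprit): drop empty extra values before they ever reach slack_sdk, so the client falls back to its own defaults.

from slack_sdk import WebClient

# Extras as the buggy 2.2.5 connection form stores them in the DB.
extras = {"timeout": "", "base_url": "", "proxy": ""}

# Keep only extras that actually carry a value.
clean = {key: value for key, value in extras.items() if value not in (None, "")}

client = WebClient(token="xoxb-...", **clean)  # placeholder token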

@josh-fell
Contributor

#25604 looks good!

@potiuk potiuk added the testing status Status of testing releases label Oct 1, 2022
@potiuk
Member Author

potiuk commented Oct 1, 2022

#25852: Tried to test in Airflow 2.2.5 and found issues: it does not work correctly with extras in the UI:

  1. It does not allow creating a connection with an empty value for timeout
  2. If a field is not specified, an empty value is written to the DB; as a result, slack_sdk can't build a correct URL

Need to check whether this affects 2.3, because in 2.4 there is no problem. There is no problem with the connection if it is created via the API or provided as a URL (env var, secrets backend).

OK. I will wait for the check before releasing an RC2, and I will remove Slack alongside Google from the release so we will have their RC2s together.

@potiuk
Member Author

potiuk commented Oct 1, 2022

If something is wrong with the connection, it also spams this message

@Taragolis I think it's a small thing; the messages indicate that something is wrong with the configuration. We might want to configure it in Breeze to make it less annoying.

@potiuk
Member Author

potiuk commented Oct 1, 2022

I ran into an issue while testing #26096, but it isn't a "bug" in the original PR's implementation, just a new bug in a similar part of the code. I cut a PR for it (#26786), but it doesn't have to go into RC2 if it doesn't make it.

We are going to have RC2 of Google and I just merged #26786 so it will make it :)

@potiuk
Member Author

potiuk commented Oct 1, 2022

Thanks to all who tested! As usual, we have great help from our community.

And thanks a lot for all the "critical" checks and doubts. Keep having them. It helps us make the right decisions for the release!

I am releasing all but google and slack. I am at ApacheCon now, and I will look for @Taragolis's comments on the Slack connection to release the RC2s.

@potiuk potiuk closed this as completed Oct 1, 2022
@Taragolis
Contributor

@Taragolis I think it's a small thing; the messages indicate that something is wrong with the configuration. We might want to configure it in Breeze to make it less annoying.

I also think it is a small thing. I use Breeze for testing because the part related to actually printing the log during task execution lives in core, and it seems it has not made it into any current release yet (the milestone is Airflow 2.5.0).

The issue itself is how S3TaskHandler (as well as the other object-storage handlers) works:

  1. During task execution, logs are received from the worker; only after task execution are they written to S3
  2. If logs are received from the local logger, it prints the message *** Falling back to local log
  3. If the user has also configured the AWS credentials incorrectly, it also writes *** Failed to verify remote log exists

So auto-tailing the logs just continuously executes _read, which writes this message (only in the UI; this information is not present in the actual logs). A simplified sketch follows below.
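
A simplified sketch of that fallback pattern (this is not the actual S3TaskHandler source; the helper functions are hypothetical stubs standing in for the real hook calls):

import os


def s3_key_exists(remote_loc: str) -> bool:
    return False  # hypothetical stub; the real handler asks the S3 hook


def read_from_s3(remote_loc: str) -> str:
    return ""  # hypothetical stub


def read_local(local_loc: str) -> str:
    # hypothetical stub for reading the worker-local log file
    return open(local_loc).read() if os.path.exists(local_loc) else ""


def _read(remote_loc: str, local_loc: str) -> str:
    # While the task is still running, the remote key does not exist yet,
    # so every poll re-emits the fallback notice (the spam seen in the UI).
    log = ""
    try:
        if s3_key_exists(remote_loc):
            return read_from_s3(remote_loc)
        log += "*** Falling back to local log\n"
    except Exception as error:
        log += f"*** Failed to verify remote log exists {remote_loc}.\n{error}\n"
    return log + read_local(local_loc)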

@Taragolis
Contributor

I am releasing all but google and slack. I am at ApacheCon now, and I will look for @Taragolis's comments on the Slack connection to release the RC2s.

@potiuk Let me check whether it might be related only to a specific version of Airflow, e.g. only 2.2.5.
I have an idea for a workaround if I can confirm it is related to all of 2.2.x.

Also, as an option, this version of the provider might be released after October 11 with a minimum Airflow version of 2.3.
In that case we would not need to create a workaround only to remove it in the next version of the provider.

@potiuk
Member Author

potiuk commented Oct 1, 2022

It might still be cool to fix it as the 'last' version working for 2.2.5 :)

@hankehly
Contributor

hankehly commented Oct 2, 2022

Manually tested the following on 2022/10/02, both OK
#26003
#26005

@Taragolis
Contributor

It might still be cool to fix it as the 'last' version working for 2.2.5 :)

I will try to check it again today to determine which version actually has the problem (personally, I see the problem with creating connections in all providers that define additional fields in get_connection_form_widgets on various 2.2.5 environments, and no problem on 2.3.4).

And I will create a workaround which allows creating the Slack connection in the UI.
