Skip to content

Conversation

@dominikhei
Copy link
Contributor

@dominikhei dominikhei commented Jun 15, 2025


I added JWT Authentication to the TableauHook. If password and login are set, it is preferred over JWT Authentication.
There are two options for using a token: Passing it as a string to the connection extras or using a path to a location on disk to a file. If both are set, the file will be used.
closes: #51453

@eladkal
Copy link
Contributor

eladkal commented Jun 15, 2025

You probably need to run
breeze static-checks --type update-providers-dependencies --all-files

@dominikhei
Copy link
Contributor Author

@eladkal The problem is between apache-airflow-providers-amazon[aiobotocore] and apache-airflow-providers-tableau:

  • botocore requires urllib3 versions between 1.25.4 and 1.27 for Python < 3.10
  • tableauserverclient>=0.27 (which we need for the JWT authentication) requires urllib3>=2.0.6

It seems like this issue is not a problem with Python>=3.10 only with < 3.10

@eladkal
Copy link
Contributor

eladkal commented Jun 15, 2025

I think we can set the provider to be Python 3.10+ only. Python 3.9 is EOL very soon and anyway we don't introduce that many changes to the tableau provider.
cc @potiuk WDYT?

@dominikhei
Copy link
Contributor Author

dependency issues seem to be resolved with python3.10 when testing locally. However breeze static-checks --type update-providers-dependencies --all-files always uses a python3.9 ci-image, no matter what I try to change.

@eladkal
Copy link
Contributor

eladkal commented Jun 16, 2025

Can you push the changes? Lets see
Probably need to set min Python version for the provider, not sure if we allow this per provider. Lets see what the CI says.
We can also limit usage of the new feature for Python 3.10+ if needed without changing the min Python vesrion.

@eladkal
Copy link
Contributor

eladkal commented Jun 17, 2025

@dominikhei try to run:
breeze ci-image build --upgrade-to-newer-dependencies

@dominikhei
Copy link
Contributor Author

dominikhei commented Jun 17, 2025

@dominikhei try to run: breeze ci-image build --upgrade-to-newer-dependencies

Already tried it, still tries to run the checks with the 3.9 ci-image. maybe its hardcoded somewhere because I can not find a way to use the 3.10 image, even though I have configured breeze to use it.

@eladkal
Copy link
Contributor

eladkal commented Jun 19, 2025

@dominikhei try to run: breeze ci-image build --upgrade-to-newer-dependencies

Already tried it, still tries to run the checks with the 3.9 ci-image. maybe its hardcoded somewhere because I can not find a way to use the 3.10 image, even though I have configured breeze to use it.

We have a case of excluded Python 3.9 from a provider. apply similar logic as shown here:

requires-python = "~=3.9,!=3.9"
# The dependencies should be modified in place in the generated file.
# Any change in the dependencies is preserved when the file is regenerated
# Make sure to run ``breeze static-checks --type update-providers-dependencies --all-files``
# After you modify the dependencies, and rebuild your Breeze CI image with ``breeze ci-image build``
dependencies = [
"apache-airflow>=2.10.0",
# Even though 3.9 is excluded below, we need to make this python_version aware so that `uv` can generate a
# full lock file when building lock file from provider sources
"ibmcloudant==0.9.1;python_version>=\"3.10\"",
]

@dominikhei
Copy link
Contributor Author

Apart from the failing doc checks, which i'll adjust, the CI still runs with 3.9 leading to tableauserverclient now not being included as a dependency (tableauserverclient>=0.27;python_version>=\"3.10\"). Locally the breeze static-checks --type update-providers-dependencies --all-files worked.

@eladkal
Copy link
Contributor

eladkal commented Jun 20, 2025

Looks like we need also to make changes in global_constants.py add tableau to all Python 3.9 entries in remove-providers:

PROVIDERS_COMPATIBILITY_TESTS_MATRIX: list[dict[str, str | list[str]]] = [
{
"python-version": "3.9",
"airflow-version": "2.10.5",
"remove-providers": "cloudant common.messaging fab git keycloak",
"run-tests": "true",
},
{
"python-version": "3.9",
"airflow-version": "2.11.0",
"remove-providers": "cloudant common.messaging fab git keycloak",
"run-tests": "true",
},
{
"python-version": "3.9",
"airflow-version": "3.0.2",
"remove-providers": "cloudant",
"run-tests": "true",
},
]

might also need change in test-providers.yaml but not 100% sure, lets apply the above fix first and see if we need to change here too

- name: Remove Python 3.9-incompatible provider distributions
run: |
echo "Removing Python 3.9-incompatible provider: cloudant"
rm -vf dist/*cloudant*

We just need to read the errors to get direction of what is the issue and then look up for cloudant key word in the code base to see what special treatment it gets. The issue of excluding Python version in a specific provider is not common thus we don't have 1 line fix for this yet.

@dominikhei
Copy link
Contributor Author

Might be helpful for later once we made it work, to also adjust the Contributors Guide with a section on this.

@eladkal
Copy link
Contributor

eladkal commented Jun 20, 2025

airflow-core/tests/unit/always/test_providers_manager.py:256: in test_hook_values
    raise AssertionError("There are warnings generated during hook imports. Please fix them")
E   AssertionError: There are warnings generated during hook imports. Please fix them

failure is because we are missing similar:

excluded-python-versions:
# ibmcloudant transitively brings in urllib3 2.x, but the snowflake provider has a dependency that pins
# urllib3 to 1.x on Python 3.9; thus we exclude those Python versions from taking the update
# to ibmcloudant.
# See #21004, #41555, and https://github.com/snowflakedb/snowflake-connector-python/issues/2016
- "3.9"

@eladkal
Copy link
Contributor

eladkal commented Jun 20, 2025

Might be helpful for later once we made it work, to also adjust the Contributors Guide with a section on this.

You will be the best person for the job. Ideally, if we can find automations I think it's better.
we already have the excluded-python-versions per provider so I think it would be better to try to automate the tests to account for this

@eladkal
Copy link
Contributor

eladkal commented Jun 20, 2025

OK looks like all cutrent failures are docs/examples and tableau tests

@dominikhei
Copy link
Contributor Author

dominikhei commented Jun 20, 2025

OK looks like all cutrent failures are docs/examples and tableau tests

Seems like airflow/providers/tableau/docs/tableau.rstwas not previously tracked and I added it by accident

The failing tests are still linked to the python version:

_ ERROR collecting providers/tableau/tests/unit/tableau/sensors/test_tableau.py _
ModuleNotFoundError: No module named 'tableauserverclient'

@eladkal
Copy link
Contributor

eladkal commented Jun 20, 2025

can you fix the other tests? Sometimes handling the other errors solves the doc problems

@eladkal
Copy link
Contributor

eladkal commented Jun 20, 2025

Almost :)
We need to skip tests on Python 3.9 runners - this needs to be done on all Tableau test files

if sys.version_info < (3, 10):
pytestmark.append(
pytest.mark.skip(
f"Skipping {__name__} as the cloudant provider is not supported on Python 3.9, see #41555."
)
)
else:
from airflow.providers.cloudant.hooks.cloudant import CloudantHook

@dominikhei
Copy link
Contributor Author

@eladkal the only failing static check is ts-compile-lint-ui due to some code quality violations in the frontend code. I just checked and the code responsible for one of the errors was last touched on 22.04.2025.

Do you know why this is failing now?

@gopidesupavan
Copy link
Member

@eladkal the only failing static check is ts-compile-lint-ui due to some code quality violations in the frontend code. I just checked and the code responsible for one of the errors was last touched on 22.04.2025.

Do you know why this is failing now?

we reverted dependaboat bumps, please rebase it should work now?

@eladkal
Copy link
Contributor

eladkal commented Jun 22, 2025

I started [DISCUSS] Dropping Python 3.9 support
lets see if we can get it passed soon thus making this PR easier

@dominikhei
Copy link
Contributor Author

dominikhei commented Jun 22, 2025

I started [DISCUSS] Dropping Python 3.9 support lets see if we can get it passed soon thus making this PR easier

Should we just wait until we upgrade to 3.10, as the feedback was positive or add it for 3.9 with the workarounds? (Would need to rebase due to #51930)

@eladkal
Copy link
Contributor

eladkal commented Jun 27, 2025

#52072 is merged. You can now proceed without the trouble of Python 3.9

@dominikhei dominikhei marked this pull request as ready for review June 29, 2025 21:46
@dominikhei dominikhei requested a review from eladkal June 30, 2025 13:11
@eladkal
Copy link
Contributor

eladkal commented Jul 3, 2025

Nice!

@eladkal eladkal merged commit c09853b into apache:main Jul 3, 2025
84 checks passed
@armandduijn
Copy link

Thank you for implementing JWT authentication. This opens the door to migrating from PAT to Connected Apps.

I noticed the current implementation assumes the Tableau connection already contains a valid JWT token. Is there a reason you opted for this approach over generating the JWT within the Tableau hook? As far as I'm aware, it isn't possible to generate a long-lived JWT, so users would need to regenerate the JWT quite frequently (every couple of minutes).

@dominikhei
Copy link
Contributor Author

Thank you for implementing JWT authentication. This opens the door to migrating from PAT to Connected Apps.

I noticed the current implementation assumes the Tableau connection already contains a valid JWT token. Is there a reason you opted for this approach over generating the JWT within the Tableau hook? As far as I'm aware, it isn't possible to generate a long-lived JWT, so users would need to regenerate the JWT quite frequently (every couple of minutes).

Hi, let me have a look at it and I'll get back to you.

@dominikhei
Copy link
Contributor Author

Thank you for implementing JWT authentication. This opens the door to migrating from PAT to Connected Apps.

I noticed the current implementation assumes the Tableau connection already contains a valid JWT token. Is there a reason you opted for this approach over generating the JWT within the Tableau hook? As far as I'm aware, it isn't possible to generate a long-lived JWT, so users would need to regenerate the JWT quite frequently (every couple of minutes).

So from my perspective, as the JWT token comes from an external identity provider, tableauserverclient can only consume pre-signed JWT's. Fetching this is dependent on the identity provider, and I don't think logic to fetch this belongs in the tableau provider. If I am not mistaken you can sign the token yourself with a secret configured in Tableau Connected Apps? But I am unsure if logic for this belongs in here. What you could do for now is build a custom task using python that generates or fetches a token and then pass it. @potiuk what is your thought?

@armandduijn
Copy link

armandduijn commented Jan 12, 2026

Thank you for implementing JWT authentication. This opens the door to migrating from PAT to Connected Apps.
I noticed the current implementation assumes the Tableau connection already contains a valid JWT token. Is there a reason you opted for this approach over generating the JWT within the Tableau hook? As far as I'm aware, it isn't possible to generate a long-lived JWT, so users would need to regenerate the JWT quite frequently (every couple of minutes).

So from my perspective, as the JWT token comes from an external identity provider, tableauserverclient can only consume pre-signed JWT's. Fetching this is dependent on the identity provider, and I don't think logic to fetch this belongs in the tableau provider. If I am not mistaken you can sign the token yourself with a secret configured in Tableau Connected Apps? But I am unsure if logic for this belongs in here. What you could do for now is build a custom task using python that generates or fetches a token and then pass it. @potiuk what is your thought?

Thank you for sharing your thoughts. You're correct, users would need to obtain a secret from a Tableau Connected App and pass it to the operator. This does add an extra step, but one could argue that generating a PAT follows a similar pattern. It also requires the user to create a token in Tableau and then pass both its name and value into the Tableau connection in Airflow. The main difference is that a Connected App must be created by a site administrator.

@dominikhei
Copy link
Contributor Author

Thank you for implementing JWT authentication. This opens the door to migrating from PAT to Connected Apps.
I noticed the current implementation assumes the Tableau connection already contains a valid JWT token. Is there a reason you opted for this approach over generating the JWT within the Tableau hook? As far as I'm aware, it isn't possible to generate a long-lived JWT, so users would need to regenerate the JWT quite frequently (every couple of minutes).

So from my perspective, as the JWT token comes from an external identity provider, tableauserverclient can only consume pre-signed JWT's. Fetching this is dependent on the identity provider, and I don't think logic to fetch this belongs in the tableau provider. If I am not mistaken you can sign the token yourself with a secret configured in Tableau Connected Apps? But I am unsure if logic for this belongs in here. What you could do for now is build a custom task using python that generates or fetches a token and then pass it. @potiuk what is your thought?

Thank you for sharing your thoughts. You're correct, users would need to obtain a secret from a Tableau Connected App and pass it to the operator. This does add an extra step, but one could argue that generating a PAT follows a similar pattern. It also requires the user to create a token in Tableau and then pass both its name and value into the Tableau connection in Airflow. The main difference is that a Connected App must be created by a site administrator.

In this case you would want to store the secret to sign the token in Airflow and then generate the token in Airflow and sign it using the secret? In my opinion, this is logic that should be handled outside of Airflow. I think it would be worthwhile to get input from someone with more experience before proceeding in any way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add JWT sign in method for Tableau provider

4 participants