Skip to content

Conversation

@eladkal
Copy link
Contributor

@eladkal eladkal commented May 11, 2025

The workaround of set xmlsec and lxml is relevant only for aws auth manager thus we need these as part of the python3-saml extra rather than as direct dependency of the provider

@eladkal eladkal requested a review from o-nikolas as a code owner May 11, 2025 07:36
@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels May 11, 2025
@potiuk
Copy link
Member

potiuk commented May 11, 2025

Hm.. I think this one will fail provider tests - selective checks did not trigger provider tests on this one (fix is coming to that one) but I have a feeling this one will fail if we run full tests. Let's see.

@potiuk potiuk added the full tests needed We need to run full set of tests for this PR to merge label May 11, 2025
@potiuk potiuk closed this May 11, 2025
@potiuk potiuk reopened this May 11, 2025
@potiuk
Copy link
Member

potiuk commented May 11, 2025

Closed/reopened to trigger full tests

@potiuk
Copy link
Member

potiuk commented May 11, 2025

Yep. That's what I thought. You will need to add python3-saml extra to the apache-airflow meta-project pyproject.toml (the one in the root package, similarly to the s3fs one:

"s3fs" = [
    # This is required for support of S3 file system which uses aiobotocore
    # which can have a conflict with boto3 as mentioned in aiobotocore extra
    "apache-airflow-providers-amazon[s3fs]",
]

Like this:

"python3-saml" = [
   "apache-airflow-providers-amazon[python3-saml]",
] 

And it should be added as an extra installed when all extras are installed (same pyproject.toml) :

"all" = [
    "apache-airflow[aiobotocore,apache-atlas,apache-webhdfs,async,cloudpickle,github-enterprise,google-auth,graphviz,kerberos,ldap,otel,pandas,polars,rabbitmq,s3fs,sentry,statsd,uv]",

Otherwise CI image wil not install this extra. We use all extra to build the ci image - and if the extra is not specified in all - directly or transitively, it will not be installed when ci image is built.

@potiuk
Copy link
Member

potiuk commented May 11, 2025

Fix to selective check #50451

@eladkal
Copy link
Contributor Author

eladkal commented May 11, 2025

"python3-saml" = [
"apache-airflow-providers-amazon[python3-saml]",
]

I named it aws-auth to avoid confusion

@potiuk
Copy link
Member

potiuk commented May 11, 2025

Not sure if that's a good idea to add different name - it's better to keep it consistent, so maybe we should change both to be aws-auth ?

Also some docs need to be updated.

@potiuk
Copy link
Member

potiuk commented May 11, 2025

Also compatibility tests will have problem in this case - we might need to add some special case for the compat tests to pass -- the extra is not defined in Airflow 3.0.0, so we cannot really install it via extra - we should likely add separate installation of the missing dependencies in case of 3.0 tests - until at least the 3.0 * version tested in back-compat has the same extra defined.

Or maybe just conditionally skipping the tests with importorskip - that woudl also work.

@eladkal
Copy link
Contributor Author

eladkal commented May 12, 2025

Not sure if that's a good idea to add different name - it's better to keep it consistent, so maybe we should change both to be aws-auth ?

We have google-auth. I don't think we can have python3-saml as Airflow extra. We can have it as amazon extra.

Also some docs need to be updated.

why? We still have python3-saml as extra for amazon. What we add is aws-auth extra for apache-airflow.

@potiuk
Copy link
Member

potiuk commented May 12, 2025

Also some docs need to be updated.

why? We still have python3-saml as extra for amazon. What we add is aws-auth extra for apache-airflow.

Becuase we have apache-airlfow extras documented here https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html#core-airflow-extras - including google-auth. Likely this page should be restructured a bit - because we split the extras to "core" and "meta-package" extras though.

@potiuk
Copy link
Member

potiuk commented May 12, 2025

Restructuring PR here #50495

@potiuk
Copy link
Member

potiuk commented May 12, 2025

We have google-auth. I don't think we can have python3-saml as Airflow extra. We can have it as amazon extra.

Sure - indeed we have other cases like that.

@eladkal eladkal force-pushed the aws branch 4 times, most recently from ab6b616 to d863e4b Compare May 20, 2025 11:55
@eladkal
Copy link
Contributor Author

eladkal commented May 21, 2025

@potiuk I am missing something here
I see

==================================== ERRORS ====================================
_ ERROR collecting providers/amazon/tests/unit/amazon/aws/auth_manager/routes/test_login.py _
ImportError while importing test module '/opt/airflow/providers/amazon/tests/unit/amazon/aws/auth_manager/routes/test_login.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
providers/amazon/tests/unit/amazon/aws/auth_manager/routes/test_login.py:29: in <module>
    from onelogin.saml2.idp_metadata_parser import OneLogin_Saml2_IdPMetadataParser
E   ModuleNotFoundError: No module named 'onelogin'
------ generated xml file: /files/test_result-providers_amazon-sqlite.xml ------
=========================== short test summary info ============================
ERROR providers/amazon/tests/unit/amazon/aws/auth_manager/routes/test_login.py
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
=================== 1 skipped, 1 warning, 1 error in 33.28s ====================

This is due to

from onelogin.saml2.idp_metadata_parser import OneLogin_Saml2_IdPMetadataParser

but this import should be available when the apache-airflow-providers-amazon[python3-saml] is installed in the CI. So I guess I am missing something to make this extra installed in the CI

@potiuk
Copy link
Member

potiuk commented May 21, 2025

from onelogin.saml2.idp_metadata_parser import OneLogin_Saml2_IdPMetadataParser

but this import should be available when the apache-airflow-providers-amazon[python3-saml] is installed in the CI. So I guess I am missing something to make this extra installed in the CI

@eladkal - Yes. The problem is that in the compatibility tests and amazon tests with lowest dependencies we do not use the extra when installing the provider. In those tests we are installing just provider - no optional extras.

Since this is now effectively an optional feature we can make the tests optional via - importorskip

@eladkal
Copy link
Contributor Author

eladkal commented May 22, 2025

@potiuk CI is green

@potiuk potiuk merged commit 5fed2eb into apache:main May 22, 2025
96 checks passed
@eladkal eladkal deleted the aws branch May 22, 2025 12:48
dadonnelly316 pushed a commit to dadonnelly316/airflow that referenced this pull request May 26, 2025
sanederchik pushed a commit to sanederchik/airflow that referenced this pull request Jun 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers full tests needed We need to run full set of tests for this PR to merge provider:amazon AWS/Amazon - related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants