Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Hive macros to the provider #28538

Merged
merged 1 commit into from
Dec 27, 2022
Merged

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Dec 22, 2022

The Hive macros are now moved to the provider.

Fixes: #19445


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Dec 22, 2022

This one will remove I think last "real" dependency of Airflow code on providers.

I am not 100% sure on two things:

  • will that actually work (do we have to add some code to discover such hive macros ?
  • do we care?
  • should we add a min-hive-provider version to "apache.hive" airflow extra ?

Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasn't able to link the issue but this closes #19445

airflow/providers/apache/hive/macros/__init__.py Outdated Show resolved Hide resolved
@Taragolis
Copy link
Contributor

ds_add, ds_format, datetime_diff_for_humans commonly use by users. Also them listed into the Templates reference: https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html#macros

@Taragolis
Copy link
Contributor

And unfortunetly airflow macros could only distributed by Plugins, not by Providers

@potiuk
Copy link
Member Author

potiuk commented Dec 22, 2022

And unfortunetly airflow macros could only distributed by Plugins, not by Providers

Provider can also deliver plugins - it's just the matter of implementing plugin endpoint.

@potiuk potiuk force-pushed the deprecate-hive-plugins branch 2 times, most recently from 536975b to 4e3f465 Compare December 22, 2022 17:03
@potiuk
Copy link
Member Author

potiuk commented Dec 22, 2022

But yeah. Obviously I moved too much out of the "macros" package :). Bringing it back

@Taragolis
Copy link
Contributor

Provider can also deliver plugins - it's just the matter of implementing plugin endpoint.

If it stored into Provider it can't automatically injected into Jinja Environment.

@Taragolis
Copy link
Contributor

Or you mean we could add additional entrypoint into provider package?

@potiuk
Copy link
Member Author

potiuk commented Dec 22, 2022

Or you mean we could add additional entrypoint into provider package?

Yep. Already done so

@Taragolis
Copy link
Contributor

Whoa!!! I never thought from this side.

That mean potentially we could have CeleryExecutor into celery package and K8S Executors inside cncf.kubernetes package?

@potiuk
Copy link
Member Author

potiuk commented Dec 22, 2022

I am not sure (tired?) I thought the "airflow.macros" macros also belong to hive and I moved them together with hive ones (reverted now).

I believe with the approach I proposed now - we do not even need to add any deprecations. If we properly expose the hive macros via additional entrypoint in the hive provider, we can just move "macros.hive" entrely to the provider - because users interact with them through JINJA only an the macros will be available as hive.max_partition for example.

I think the only thing left (besides testing of course) is to make sure that we document this as potentially breaking change in the docs (users will need to upgrade hive provider) and add >= <new_provider_version> in the hive extra for airlfow (and the docs of course mentioned by @eladkal ).

@potiuk potiuk marked this pull request as draft December 22, 2022 17:23
@potiuk potiuk changed the title Deprecate Hive macros Move Hive macros to the provider Dec 22, 2022
@potiuk
Copy link
Member Author

potiuk commented Dec 22, 2022

Whoa!!! I never thought from this side.

That mean potentially we could have CeleryExecutor into celery package and K8S Executors inside cncf.kubernetes package?

Executors are not plugins - you can run your own executors without doclaring them via plugin - you need just to configure airflow to use them via configuration. So we could move them regardless - but I am not really sure we want to move them to providers because they are so "essential" to airflow. However after AIP-51 it actually would be possible.

@Taragolis
Copy link
Contributor

Oh... I just more interested in a hypothetical possibility rather than move existed one or implement new one.

Anyway this approach could add ability of add provider specific filters and macros

@potiuk
Copy link
Member Author

potiuk commented Dec 22, 2022

Oh... I just more interested in a hypothetical possibility rather than move existed one or implement new one.

Anyway this approach could add ability of add provider specific filters and macros

Very much so. That's why I wanted to try it out finally - it's been on my list for quite some time

@potiuk
Copy link
Member Author

potiuk commented Dec 22, 2022

Still need to sort out versioning, and test it, but it's close to be ready IMHO

@potiuk potiuk marked this pull request as ready for review December 23, 2022 15:41
@potiuk potiuk requested a review from mik-laj as a code owner December 23, 2022 15:41
@potiuk
Copy link
Member Author

potiuk commented Dec 23, 2022

Ok . I tested it and it works well now. Including installing the new provider on earlier version of airflow that has hive macros built-in and the macros from the provider nicely override the hive macros that come from Airflow.

I've added >= 5.1.0 exception for Hive in airflow's [hive] extra so we can be sure that the new version of Airflow pulls the new version of hive when installed with extrra.

The only potentially dangerous scenario is when someone will upgrade just airflow to > 2.6.0 (I assume this change will be out in 2.6.0) and still has the old hive provider installed. This should geenerally not happen, because of constraints and it has an easy upgrede mechanism if it happens (I added newsfragment describing it).

@potiuk potiuk force-pushed the deprecate-hive-plugins branch 3 times, most recently from 17a0901 to 043ed36 Compare December 23, 2022 15:57
The Hive macros are now moved to the apache.hive provider.

Fixes: apache#19445
@eladkal
Copy link
Contributor

eladkal commented Dec 23, 2022

The only potentially dangerous scenario is when someone will upgrade just airflow to > 2.6.0 (I assume this change will be out in 2.6.0) and still has the old hive provider installed. This should geenerally not happen, because of constraints and it has an easy upgrede mechanism if it happens (I added newsfragment describing it).

constraints + newsfragment is great.

It is tricky though usualy we deprecate first but I'm not sure if there is a way to raise deprecation warning for macros?

@potiuk
Copy link
Member Author

potiuk commented Dec 23, 2022

It is tricky though usualy we deprecate first but I'm not sure if there is a way to raise deprecation warning for macros?

Technically it's not "deprecation" - it should work identically to old way but only when the provider is installed. This is yet another case where things are not breaking - as long as we upgrade dependencies.

We could - potentially raise a warning here - it but not just yet. After releasing Hive Provider 5.1.0 (but only after that, because otherwise the user would not have a way to get rid of the warning) we could raise such warnings - but this will have to be in Airflow 2.5.n "airflow.macros.hive" - methods - and the way to raise the warning would be to ask the user to upgrade the Hive Provider to 5.1.0+. It will have to be done directly in v2-5-test branch though.

@potiuk potiuk merged commit 4e545c6 into apache:main Dec 27, 2022
@potiuk potiuk deleted the deprecate-hive-plugins branch December 27, 2022 07:24
@pierrejeambrun pierrejeambrun added this to the Airflow 2.5.1 milestone Jan 9, 2023
@pierrejeambrun pierrejeambrun added the type:misc/internal Changelog: Misc changes that should appear in change log label Jan 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers type:misc/internal Changelog: Misc changes that should appear in change log
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move Hive macros out of Airflow core to Apache provider
6 participants