-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make pandas dependency optional for Amazon Provider #28505
Make pandas dependency optional for Amazon Provider #28505
Conversation
e83c95c
to
b4017e1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ 🚀
This means it is a breaking change. |
Not that it matters in this case (we are releasing breaking version anyway), but let me venture into academic discussion (just for the sake of discussion). And more for "food for thoughts". I do not want to argue vehemotly on this, I am happy to move it now to breaking, I would like more to present a different view on that. It really depends what WE consider as breaking change. It's never 0/1 (breaking/not breaking). As I mentioned multiple time and as Hyrum's law nicely states: "With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody". Every single change in our case we do is breaking someone's workflow. We should accept this. And whether we classify something as breaking or not depends purely on our assesment of the situation on how breaking it is. We earlier agreed that just dependency change is not breaking on its own (and we released a number of providers with dependency changes without increasing major. What you described is a possibility of course, but more likely than not pandas will already be installed as dependency anyway there for mutliple reasons. And even if not it is just the matter of adding "[pandas]" in their scripts. No DAG modificaitons needed, no changes in the code just "environment" change. We are just dicussing what is what "Airflow Public Interface" - #28300 and I have not seen anyone asking to add "list of dependencies to install automatically" as part of the public API. Once we merge this change, this kind of change is not going to be considered as "breaking" on the list. So - super easy way to recover and error message will be very clear. Also there are other things we should consider - do you think the users who install airflow automatically in the way do not have a test harness to verify if their new install is working ? There might be many more problems than missing pandas if every time they rebuild it from scratch, so in your scenario - they for sure have a test harness in place. They have a staging system to test it. And even if they don't have test harness - do we realistically think they will limit their installation and not install 7.0.0 version of the provider gets released in the same automated way? Do you think they have a separate test harness that they will test against such breaking relase of every single provider they use? Will that make a difference for them whether that description is put in "breaking" or "feature" changes" (because that will be the only difference in this case) ? I am not sure, but I think this is quite unrealistic. And of course we can be wrong in our assesment. Happens. We can always produce a bugfix for that if many of our users will start complaining. And we should of course try to avoid this, but ultimately, we will break some workflows of some people - this is unavoidable. And there is no way to 100% protect against it. And finally - yes, I think this is a border-line for me this is really the case where things aren't clearly 0/1. And we can either err on a side of caution or be a bit more bold This is all purely academic discussion of course - I have 0 problem with putting it into breaking section - it does not really matter in this case of course. Just wanted to explain that putting everything that potentially changes some particular workflow is not necessarily breaking. It's a very simplistic view IMHO and we should not only asses if the change "actually" breaks something but also how disruptive it really is and whether it is simple to recover for the user at early stage, because it is realy harmful. |
Pandas dependency is only used for a few features in Amazon providers amd it is a huge "drag" on a package in general - we use the existing Airflow 2.3+ feature of Optional Provider Feature to make pandas an optional dependency (with "[pandas]" extra). This is not a breaking change, since those who already installed previous provider version have pandas installed already. Only people who install provider manually will have to install "pandas" extra either for Airflow core package or for the provider. Fixes: apache#28468
b4017e1
to
026150c
Compare
Moved the description up to breaking changes. |
@potiuk I actually agree with you on all fronts. I just think that whenever we are not 100% sure we should lean more to the safest option. If this also means no hassle for us then even more reason! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @potiuk
Pandas dependency is only used for a few features in Amazon providers amd it is a huge "drag" on a package in general - we use the existing Airflow 2.3+ feature of Optional Provider Feature to make pandas an optional dependency (with "[pandas]" extra).
This is not a breaking change, since those who already installed previous provider version have pandas installed already. Only people who install provider manually will have to install "pandas" extra either for Airflow core package or for the provider.
Fixes: #28468
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.