Avoid Kedro fsspec requirements being mutually incompatible with pandas 1.1.0 #489
Comments
Happy to make a pull request with my suggested implementation if desired, although it is only a one-line edit, so it barely saves any labour.
N.B. This issue is a superset of issue #488. Modin compatibility will be fully resolved with an update to fsspec and pandas 1.1.0.
@JMBurley would it also be possible to update to fsspec 0.8.0 if you create a PR, or do you see any problems with the latest version?
@fjp I could update fsspec in the kedro setup files, but I don't think I should. Kedro has a lot of I/O functionality that could intersect with fsspec in many ways. Unless QuantumBlack is very confident in their hooks and tests, I don't think I should be making that change (particularly as I don't understand fsspec very deeply). On the other hand, I am a (minor) pandas contributor and understand that library well enough to know that 1.1.0 is a good upgrade and will not break data pipelines if the fsspec dependency is correctly managed (gcsfs and similar have been stable for long enough that it shouldn't be an issue for kedro).
Hi guys, thanks a lot for the interest here. Unfortunately
Thanks @lorenabalan, if I'm reading The reason QuantumBlack unpinned pandas (pandas-dev/pandas#34467) is solved, and the fsspec problem that would have been introduced by a pandas update is now fixed. We would only need to change the appropriate pandas-determining lines in requirements.txt and setup.py.
Oops. FYI, I'm running into an incompatibility with
I'm having this same issue. It's keeping me from creating my environment. A lower version of s3fs is not really a good option for me.
Hey @JMBurley, I just noticed with Kedro 0.16.5, I get new
Hi @lorenabalan, thanks for the update. I appreciate having a higher pandas version available, that's great. However, if I'm reading your comment correctly, kedro 0.16.5 has created the exact problem I was forewarning against. If kedro enforces QuantumBlack can keep track of pandas dependencies here: https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html PS. To clear up any confusion, I mentioned parquet as it was the blocker on kedro allowing higher pandas versions. It is not the key part of this issue, which is flagging up kedro compatibility issues if it allows higher pandas versions without higher fsspec.
@JMBurley
Kedro has not created a problem. Kedro declares its dependencies and the corresponding acceptable version ranges using the standard mechanism -
Lastly, correct dependency version resolution is not the responsibility of the Kedro library, but rather of package management tools like
Hello again! We hear you all on the
@DmitriiDeriabinQB has beaten me to it. 😂 ⏲️
Thanks both. Would I be correct in understanding the official position here as: I understand the position if so, although I think there are good reasons to prototype EDA in a project using pandas s3 functionality rather than hopping in and out of the data catalog to create potentially disposable items. Regardless, looking forward to v0.17.0! PS. I agree that anyone who doesn't use virtual environments deserves the problems they encounter.
@JMBurley, we would be happy not to break Using pandas directly for EDA makes perfect sense; we will have a look at what can be done here.
The changes will be available in the upcoming Kedro 0.17.0 release.
Description
Kedro 0.16.4 enforces
fsspec<0.7.0,>=0.5.1
and PANDAS = "pandas>=0.24, <1.0.4"
in setup.py. However,
pip install kedro
If kedro allows pandas 1.1.0, then you are going to hit an incompatibility with fsspec, as pandas 1.1.0 requires fsspec>=0.7.4.
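To make the conflict concrete, here is a minimal sketch that checks whether any fsspec version can satisfy both ranges quoted above. It is a simplified illustration only: `parse_version` and `satisfies` are hypothetical helpers that handle plain numeric versions, not full PEP 440, and the candidate list is an arbitrary sample chosen for illustration. A real tool would more likely rely on the `packaging` library.

```python
import re

def parse_version(text, width=4):
    """Turn '0.7.4' into (0, 7, 4, 0). Only plain numeric releases are handled."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (width - len(parts))  # pad so '0.7' compares equal to '0.7.0'
    return tuple(parts)

# Comparison operators supported by this simplified sketch.
_OPS = {
    "<": lambda v, bound: v < bound,
    "<=": lambda v, bound: v <= bound,
    ">": lambda v, bound: v > bound,
    ">=": lambda v, bound: v >= bound,
    "==": lambda v, bound: v == bound,
}

def satisfies(version, spec):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    v = parse_version(version)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(<=|>=|==|<|>)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not _OPS[op](v, parse_version(bound)):
            return False
    return True

KEDRO_FSSPEC = "<0.7.0,>=0.5.1"  # range quoted from kedro's setup.py in this issue
PANDAS_FSSPEC = ">=0.7.4"        # minimum stated for pandas 1.1.0 in this issue

# Sample version strings for illustration (assumption, not an exhaustive list).
candidates = ["0.5.1", "0.6.3", "0.7.0", "0.7.4", "0.8.0"]

shared = [
    v for v in candidates
    if satisfies(v, KEDRO_FSSPEC) and satisfies(v, PANDAS_FSSPEC)
]
print(shared)  # [] : no sampled version satisfies both ranges
```

Because kedro's upper bound (0.7.0) sits below the pandas lower bound (0.7.4), the two ranges cannot overlap, regardless of which versions are sampled.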
Context
You cannot run kedro and pandas>=1.1.0 in the same environment. Pandas needs fsspec>=0.7.4.
There are meaningful improvements in newer pandas, so I would like to be able to run them together out-of-the-box.
The current reason to not allow higher pandas versions, as per setup.py in the kedro source code (pandas-dev/pandas#34467), is no longer applicable, so I think it is time to make this change unless fsspec has deal-breaking problems in later versions.
Possible Implementation
Can you just update the fsspec version requirement to
fsspec<=0.7.4,>=0.5.1
? I'm not aware of any major problems in the newer versions.
Possible Alternatives
Raise warnings or errors if kedro co-exists with pd>=1.1.0