-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: update code to pandas 2 compatibility #132
Conversation
description = "A projection package" | ||
authors = ["Octopize <help@octopize.io>"] | ||
license = "Apache-2.0" | ||
|
||
[tool.poetry.dependencies] | ||
python = ">=3.9,<3.12" | ||
pandas = "^1.3" | ||
pandas = ">=1.5.3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allows a package depending on saiph to work with pandas
v2.
The rest of the code has been updated to work with pandas 2, and pandas 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, this allows us to not break the API/core if we don't update to pandas straight away.
@@ -249,6 +249,7 @@ def get_variable_correlation( | |||
df_quali = pd.get_dummies( | |||
df[model.original_categorical].astype("category"), | |||
prefix_sep=DUMMIES_SEPARATOR, | |||
dtype=np.uint8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas v1 had default np.uint8
, pandas v2 has it as bool
.
The latter might break the missing values handling that add a _True
and _False
string suffix.
series = pd.read_json( | ||
data, orient="index", typ="series", dtype=json_dict["dtype"] | ||
StringIO(data), orient="index", typ="series", dtype=json_dict["dtype"] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing a string (data
) to pd.read_json
is deprecated in pandas v2/
[[package]] | ||
name = "pandas" | ||
version = "1.5.3" | ||
version = "2.2.2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, pandas is at v2. poetry.lock
is what is being used when developing locally. It is not put on pypi
for the dependent package to install.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :)
description = "A projection package" | ||
authors = ["Octopize <help@octopize.io>"] | ||
license = "Apache-2.0" | ||
|
||
[tool.poetry.dependencies] | ||
python = ">=3.9,<3.12" | ||
pandas = "^1.3" | ||
pandas = ">=1.5.3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not forcing ^2.0.0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because saiph
is a dependency and it's a pain when dependencies have hard constraints on another package.
Also, this allows us to update to pandas at a later stage in the API, while still benefiting from new features in saiph.
I've tested that this update works with the API and core with
pandas v1.5.3
.Next step is to update pandas to v2 in API, but at least
saiph
still works withpandas 1.5.3
if we don't update straight away, and is now compatible withpandas 2.2.2
.