Putting Series in Project #356

Closed
MarieS-WiMLDS opened this issue Sep 18, 2024 · 13 comments · Fixed by #378
@MarieS-WiMLDS
Contributor

MarieS-WiMLDS commented Sep 18, 2024

I would like to be able to store pandas Series objects in a project.

Here is a piece of code that doesn't work for now:

# %%
import pandas as pd
from skore import load

# %%
# !python -m skore create

# %%
project = load("project.skore")

# %%
my_df = pd.Series(data=[1, 2, 3])

# %%
type(my_df)
# %%
project.put("my_df", my_df)

I get:

{
	"name": "AttributeError",
	"message": "module 'matplotlib' has no attribute 'figure'",
	"stack": "---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File /home/marie/Documents/ML_use_cases/binary_class_tabular/test.py:2
      1 # %%
----> 2 project.put(\"my_df\", my_df)

File ~/anaconda3/envs/skore/lib/python3.12/site-packages/skore/project.py:57, in Project.put(self, key, value)
     55 def put(self, key: str, value: Any):
     56     \"\"\"Add a value to the Project.\"\"\"
---> 57     item = object_to_item(value)
     58     self.put_item(key, item)

File ~/anaconda3/envs/skore/lib/python3.12/site-packages/skore/project.py:36, in object_to_item(o)
     34 elif isinstance(o, altair.vegalite.v5.schema.core.TopLevelSpec):
     35     return MediaItem.factory_altair(o)
---> 36 elif isinstance(o, matplotlib.figure.Figure):
     37     return MediaItem.factory_matplotlib(o)
     38 elif isinstance(o, PIL.Image.Image):

File ~/anaconda3/envs/skore/lib/python3.12/site-packages/matplotlib/_api/__init__.py:217, in caching_module_getattr.<locals>.__getattr__(name)
    215 if name in props:
    216     return props[name].__get__(instance)
--> 217 raise AttributeError(
    218     f\"module {cls.__module__!r} has no attribute {name!r}\")

AttributeError: module 'matplotlib' has no attribute 'figure'"

Thanks!

@MarieS-WiMLDS MarieS-WiMLDS added the enhancement New feature or request label Sep 18, 2024
@augustebaum
Contributor

augustebaum commented Sep 19, 2024

The traceback you're getting comes from a bug unrelated to our ability to store Series. It was fixed by commit 27cf3c5, and the fix is part of the new pre-release 0.0.1rc3.

@augustebaum
Contributor

Here is the error I get with the newer version:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[4], line 1
----> 1 p.put("he", pd.Series([1,2]))

File ~/Desktop/skore/src/skore/project.py:50, in Project.put(self, key, value)
     48 def put(self, key: str, value: Any):
     49     """Add a value to the Project."""
---> 50     item = object_to_item(value)
     51     self.put_item(key, item)

File ~/Desktop/skore/src/skore/project.py:34, in object_to_item(o)
     32     return MediaItem.factory_pillow(o)
     33 else:
---> 34     raise NotImplementedError(f"Type {o.__class__.__name__} is not supported yet.")

NotImplementedError: Type Series is not supported yet.

@MarieS-WiMLDS
Contributor Author

That seems more logical indeed!

@rouk1
Contributor

rouk1 commented Sep 19, 2024

Open question: what would be a good widget to visualize a series? A plot?
I see that pandas can build a plot from a series, by the way.

@MarieS-WiMLDS
Contributor Author

That's a very good question. I would expect it to be displayed as it is in Python (see examples here). It's not pretty, but I must say I have no other idea.
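
For reference, here is a minimal sketch of the plain-text rendering pandas already provides for a series, which is roughly what "displayed as it is in Python" would look like. The variable names are illustrative, and nothing here is skore's API.

```python
import pandas as pd

series = pd.Series([1, 2, 3], name="my_series")

# repr() is what a notebook or REPL shows: one "index value" row per
# element, followed by a summary line with the name and dtype.
as_repr = repr(series)

# to_string() renders only the "index value" rows by default.
as_text = series.to_string()

print(as_repr)
print(as_text)
```

Either string could be shown as-is in a monospace block, which avoids having to pick a plot type for arbitrary data.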

@thomass-dev
Collaborator

@MarieS-WiMLDS in your workflow, do you need to retrieve a series from the project and use it in your script?

@tuscland
Member

@thomass-dev do you agree that this message is kind of confusing:

Type Series is not supported yet
It suggests that we might support it later.

I propose to remove the ending "yet": "Type Series is not supported."

@MarieS-WiMLDS
Contributor Author

@MarieS-WiMLDS in your workflow, do you need to retrieve a series from the project and use it in your script?

Yes, I have also started using skore as a link between notebooks when I have several of them in a project. That is why I want to store series. I hadn't thought about displaying them, as I don't really know how that would be useful in a report.

@thomass-dev
Collaborator

thomass-dev commented Sep 19, 2024

Would it be acceptable to say that a series can be converted to a list and rebuilt from a list?
If users want to persist a series, they can convert it to a list themselves.

project.put("myserie", myserie.to_list())
[...]
myserie = project.get("myserie")
myserie = pandas.Series(myserie)

What do you think?
I'm just afraid that we would otherwise have to support a great many list-like types: pandas.Series, pandas.Index, etc.
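
To make the trade-off concrete, here is a minimal sketch of the list round trip and of what it drops. A plain dict stands in for the project store; it is a hypothetical placeholder, not skore's API, and all variable names are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for a project store (NOT skore's API).
store = {}

target = pd.Series([0, 1, 1], index=["a", "b", "c"], name="target")

# Naive round trip through a list: the values survive,
# but the index and the name are lost.
store["target"] = target.to_list()
restored = pd.Series(store["target"])

# Storing the index and name next to the values keeps the round trip lossless.
store["target_full"] = {
    "values": target.to_list(),
    "index": target.index.to_list(),
    "name": target.name,
}
saved = store["target_full"]
restored_full = pd.Series(saved["values"], index=saved["index"], name=saved["name"])

print(restored.index.to_list(), restored.name)
print(restored_full.equals(target), restored_full.name)
```

The naive version is enough when only the values matter. When the index carries meaning (for example, row identifiers for a target column), the user has to store it separately, which is part of the case for native Series support.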

@MarieS-WiMLDS
Contributor Author

MarieS-WiMLDS commented Sep 20, 2024

Yesterday, when I saw your comment, I thought that you were right and that we indeed don't want to have to support many types.
Then I thought about it overnight, and now I think that Series are very important in a data science project: in classical projects, the target is almost always stored as a Series. Could we add it to the short list of data types we will maintain?

@augustebaum
Contributor

I propose to remove the ending "yet": "Type Series is not supported."

This is done

@tuscland
Member

How can @MarieS-WiMLDS test it?
In this issue, there is no reference to a PR or a branch.

@thomass-dev
Collaborator

thomass-dev commented Sep 23, 2024

@augustebaum's comment was only about rewording the exception message, i.e. removing "yet".
That was done before your comment, in #355.

@thomass-dev thomass-dev self-assigned this Sep 23, 2024
@thomass-dev thomass-dev linked a pull request Sep 23, 2024 that will close this issue