feat(python!): Use Altair in DataFrame.plot #17995

MarcoGorelli · 2024-08-01T14:03:32Z

Some context behind this: since vega/altair#3452, Altair supports Polars natively, without any extra heavy dependencies (no pandas, no NumPy, no PyArrow). Altair is a very popular and widely used library, with excellent docs and static typing - hence, I think it'd be best suited as Polars' default plotting backend

DataFrame.plot was marked as "unstable" so this change can technically be made in Polars ~~1.4.0~~ 1.5.0. What I've implemented here is a very thin layer on top of Altair, so it should be both convenient to users and easy to maintain

For existing users wishing to preserve HvPlot plots, all they need to do is apply the diff

+ import hvplot.polars
- df.plot.line
+ df.hvplot.line

So, the impact on users should be fairly small

HvPlot maintainers have been extra-friendly and helpful (especially with answering user questions in Discord). I think it'd be good to still mention them in the docstring (and also to help users for whom this represents an API change), and recommend their library in the "visualisation" section of the user guide

Demo

DataFrame (here source is a polars.DataFrame):

Series plots work too:

Tab-complete works well, making this well-suited to EDA:

TODO

add more definitions than just line and point, so users get good tab completion
~~figure out static typing~~ done

F.A.Q.: what about other plotting backends?

Maybe, in the future, the plotting backend could be configurable in pl.Config. But I think that's an orthogonal issue and can be done/discussed separately. Plotting will stay "unstable" for the time being

MarcoGorelli · 2024-08-01T14:07:52Z

py-polars/polars/dataframe/plotting.py

+    def line(
+        self,
+        x: str | Any | None = None,
+        y: str | Any | None = None,
+        color: str | Any | None = None,
+        order: str | Any | None = None,
+        tooltip: str | Any | None = None,
+        *args: Any,
+        **kwargs: Any,
+    ) -> alt.Chart:


Hi @dangotbanned - may I ask for your input here please?

which do you think are the most common types of plots which are worth explicitly making functions for? Functionality would be unaffected, they would just work better with tab completion

how would you suggest typing the various arguments? Does Altair have public type hints?

Any Altair maintainers you'd suggest looping into the discussion?

Thanks 🙏

Thanks for the ping, happy to help where I can @MarcoGorelli

Couple of resources up top that I think could be useful:

Examples Gallery

An archived version of this idea

User Guide - Marks

User Guide - long vs wide form

Will respond to each question in another comment 👍

1

which do you think are the most common types of plots which are worth explicitly making functions for? Functionality would be unaffected, they would just work better with tab completion

Can't speak for everyone, but for a reduced selection:

Bar

Line (Line, Trail)

Scatter (Circle, Point, Square, Image, Text)

Area (Area, Rect)

Tick

Boxplot

Geoshape

Field-dependent, but super important for those who need it

Looking at hvPlot, there are a few methods/chart types I'd need to do some digging to work out the equivalent in altair (if there is one).

However, my suggestion would be using the names defined there, both for compatibility when switching backends and to reduce the number of methods.

Examples

Haven't covered everything here, but it's a start:

hvPlotTabular -> altair.Chart

(bar|barh) -> mark_bar

box -> mark_boxplot

scatter -> mark_(circle|point|square|image|text)

labels -> mark_text

points -> mark_point

line -> mark_(line|trail)

(polygons|paths) -> mark_geoshape

(area|heatmap) -> mark_area

2

how would you suggest typing the various arguments? Does Altair have public type hints?

I might update this later after thinking on it some more.

Yeah they've been there since 5.2.0 but will be improved for altair>=5.4.0 with https://github.com/vega/altair/blob/main/altair/vegalite/v5/schema/_typing.py

For altair the model is quite different to matplotlib-style functions, but .encode() would be where to start.

Something like:

# Annotation from `.encode()`
# y: Optional[str | Y | Map | YDatum | YValue] = Undefined

# Don't name it this pls
TypeForY = str | Mapping[str, Any] | Any

I wouldn't worry about any altair-specific types here.
Spelling them out won't have an impact on attribute access of the result

3

Any Altair maintainers you'd suggest looping into the discussion?

For typing @binste but really anyone from vega/altair#3452 I think would be interested (time-permitting)

@mattijn, @joelostblom, @jonmmease

dangotbanned · 2024-08-01T17:39:04Z

2

how would you suggest typing the various arguments? Does Altair have public type hints?

I might update this later after thinking on it some more.

Back again after thinking @MarcoGorelli

Feel free to rename things, but I came up with this for the typing

Super long code block

from __future__ import annotations

from typing import TYPE_CHECKING, Any, Mapping, Union

from typing_extensions import TypeAlias, TypedDict, Unpack

if TYPE_CHECKING:
    import altair as alt
    import narwhals.stable.v1 as nw

ChannelType: TypeAlias = Union[str, Mapping[str, Any], Any]

class EncodeKwds(TypedDict, total=False):
    angle: ChannelType
    color: ChannelType
    column: ChannelType
    description: ChannelType
    detail: ChannelType | list[Any]
    facet: ChannelType
    fill: ChannelType
    fillOpacity: ChannelType
    href: ChannelType
    key: ChannelType
    latitude: ChannelType
    latitude2: ChannelType
    longitude: ChannelType
    longitude2: ChannelType
    opacity: ChannelType
    order: ChannelType | list[Any]
    radius: ChannelType
    radius2: ChannelType
    row: ChannelType
    shape: ChannelType
    size: ChannelType
    stroke: ChannelType
    strokeDash: ChannelType
    strokeOpacity: ChannelType
    strokeWidth: ChannelType
    text: ChannelType
    theta: ChannelType
    theta2: ChannelType
    tooltip: ChannelType | list[Any]
    url: ChannelType
    x: ChannelType
    x2: ChannelType
    xError: ChannelType
    xError2: ChannelType
    xOffset: ChannelType
    y: ChannelType
    y2: ChannelType
    yError: ChannelType
    yError2: ChannelType
    yOffset: ChannelType


class Plot:
    chart: alt.Chart

    def __init__(self, df: nw.DataFrame) -> None:
        import altair as alt

        self.chart = alt.Chart(df)

    def line(
        self,
        x: ChannelType | None = None,
        y: ChannelType | None = None,
        color: ChannelType | None = None,
        order: ChannelType | list[Any] | None = None,
        tooltip: ChannelType | list[Any] | None = None,
        /,
        **kwargs: Unpack[EncodeKwds],
    ) -> alt.Chart: ...

Which checks out below.

You can use x, y, color, order, tooltip as positional-only or keyword-only, but not both:

def test_plot_typing() -> None:
    from typing import cast
    from typing_extensions import reveal_type

    plot = cast(Plot, "test")
    reveal_type(plot) # Type of "plot" is "Plot"

    example_1 = plot.line(x="col 1")
    reveal_type(example_1) # Type of "example_1" is "Chart"

    example_2 = plot.line("col 1", "col 2")
    reveal_type(example_2) # Type of "example_2" is "Chart"

    example_err = plot.line("col 1", "col 2", x="col 3")
    reveal_type(example_err) # Type of "example_err" is "Any"

At least for VSCode, you get the expanded docs on hover:

You could then repeat the /, **kwargs: Unpack[EncodeKwds] for the other methods - maybe changing the positional-only ones if needed
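One runtime detail behind the positional-only signature above: because `x`, `y`, etc. are positional-only, Python allows the same names to show up again inside `**kwargs`, so an implementation has to merge the two sources itself. A minimal sketch of one way to do that, assuming duplicates should be rejected; `merge_encodings` is a hypothetical helper (not part of Polars or Altair) and the plain `line` function is only a stand-in for `Plot.line`:

```python
from __future__ import annotations

from typing import Any


def merge_encodings(positional: dict[str, Any], keywords: dict[str, Any]) -> dict[str, Any]:
    """Combine positional-only and keyword channels, rejecting duplicates."""
    # Drop channels the caller did not pass positionally.
    given = {name: value for name, value in positional.items() if value is not None}
    clashes = sorted(given.keys() & keywords.keys())
    if clashes:
        msg = f"channels passed both positionally and by keyword: {clashes}"
        raise TypeError(msg)
    return {**given, **keywords}


def line(x: Any = None, y: Any = None, color: Any = None, /, **kwargs: Any) -> dict[str, Any]:
    # Stand-in for `Plot.line`: returns the merged encodings instead of a chart.
    return merge_encodings({"x": x, "y": y, "color": color}, kwargs)
```

With this, `line("col 1", "col 2")` and `line(x="col 1")` both work, while `line("col 1", "col 2", x="col 3")` raises `TypeError` at runtime, matching what the type checker reports.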

codecov · 2024-08-01T22:26:01Z

Codecov Report

Attention: Patch coverage is 46.39175% with 52 lines in your changes missing coverage. Please review.

Project coverage is 79.79%. Comparing base (6f5851d) to head (40a0e31).
Report is 5 commits behind head on main.

Files	Patch %	Lines
py-polars/polars/dataframe/plotting.py	38.00%	16 Missing and 15 partials ⚠️
py-polars/polars/series/plotting.py	51.42%	11 Missing and 6 partials ⚠️
py-polars/polars/dataframe/frame.py	60.00%	1 Missing and 1 partial ⚠️
py-polars/polars/series/series.py	60.00%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #17995      +/-   ##
==========================================
- Coverage   79.80%   79.79%   -0.01%     
==========================================
  Files        1497     1499       +2     
  Lines      200379   200464      +85     
  Branches     2841     2864      +23     
==========================================
+ Hits       159913   159966      +53     
- Misses      39941    39952      +11     
- Partials      525      546      +21


binste

Thanks for the ping! I'm of course already a big fan of this PR ;) Let me know if I can help!

I find @dangotbanned's suggestion regarding typing a reasonable compromise so that there are some useful type hints but using the Any escape hatch instead of typing out all of them explicitly. However, if you fully want to mirror the type hints in altair with all altair-specific classes, I think we could expose those. Maybe something like altair.typing.XChannelType, ...

py-polars/polars/dataframe/plotting.py

MarcoGorelli · 2024-08-02T08:38:54Z

Thanks both for comments!

One slight hesitation I have about adding such a large class as EncodeKwds is its maintainability - how would Polars ensure it stays up-to-date? Or would it be in-scope for Altair to expose such a class, that Polars could just import?

binste · 2024-08-02T09:11:15Z

So far we did not explicitly expose the types in Altair, even keeping many of them in private modules, as we first wanted to use them for a while before others rely on it. But I think now, also thanks to the recent improvements done by @dangotbanned, we could expose the most relevant ones. This could include a TypedDict such as EncodeKwds with each channel being typed the same as .encode(). This part of the code is autogenerated anyway based on the Vega-Lite jsonschema and so it would be no maintenance effort for us and for you :)

Any thoughts on this @dangotbanned? I think I could spend some time this weekend to think through which types we'd want to expose and how but I'd also very much appreciate your input and help if you want to. Maybe we can even get it into Altair 5.4 and release that in the next 1-2 weeks and so Polars 1.4 could have that as a minimum dependency and use the types.

dangotbanned · 2024-08-02T09:47:55Z

So far we did not explicitly expose the types in Altair, even keeping many of them in private modules, as we first wanted to use them for a while before others rely on it. But I think now, also thanks to the recent improvements done by @dangotbanned, we could expose the most relevant ones. This could include a TypedDict such as EncodeKwds with each channel being typed the same as .encode(). This part of the code is autogenerated anyway based on the Vega-Lite jsonschema and so it would be no maintenance effort for us and for you :)

Any thoughts on this @dangotbanned?

Fully agree on the autogenerated TypedDict @binste, you beat me to the suggestion 😉

I think I could spend some time this weekend to think through which types we'd want to expose and how but I'd also very much appreciate your input and help if you want to. Maybe we can even get it into Altair 5.4 and release that in the next 1-2 weeks and so Polars 1.4 could have that as a minimum dependency and use the types.

Happy to discuss in an altair issue and work with you on a PR

One slight hesitation I have about adding such a large class as EncodeKwds is its maintainability - how would Polars ensure it stays up-to-date? Or would it be in-scope for Altair to expose such a class, that Polars could just import?

@MarcoGorelli
I think this approach would work better if Plot (or a version of) were a Protocol, that altair and any other library could handle the implementation of.

Maybe with fewer positional-args, but this was what I had in mind back in vega/altair#3452 (comment)

Code block

from __future__ import annotations

import sys
from typing import Any, Generic, TypeVar

import narwhals.stable.v1 as nw

if sys.version_info >= (3, 12):
    from typing import Protocol, runtime_checkable
else:
    from typing_extensions import Protocol, runtime_checkable

T_Plot = TypeVar("T_Plot")

@runtime_checkable
class SupportsPlot(Generic[T_Plot], Protocol):
    chart: T_Plot

    def __init__(self, df: nw.DataFrame) -> None: ...

    def bar(
        self,
        x: Any | None = None,
        y: Any | None = None,
        color: Any | None = None,
        tooltip: Any | None = None,
        /,
        **kwargs: Any,
    ) -> T_Plot: ...
    def line(
        self,
        x: Any | None = None,
        y: Any | None = None,
        color: Any | None = None,
        order: Any | None = None,
        tooltip: Any | None = None,
        /,
        **kwargs: Any,
    ) -> T_Plot: ...

So on the polars-side, you can focus more on switching between backends and less on maintaining how the plots are made.
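To illustrate the structural-typing idea, here is a small sketch in which a backend satisfies a protocol without inheriting from it. Everything in it is hypothetical: `SupportsLine` is a cut-down version of the `SupportsPlot` protocol above, and `DictBackend` is a toy backend whose "chart" is just a dict.

```python
from __future__ import annotations

from typing import Any, Protocol, TypeVar, runtime_checkable

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class SupportsLine(Protocol[T_co]):
    """Anything with a `line` method returning its own chart type."""

    def line(self, x: Any = None, y: Any = None, /, **kwargs: Any) -> T_co: ...


class DictBackend:
    """Toy backend: does not inherit from the protocol, only matches its shape."""

    def __init__(self, rows: list[dict[str, Any]]) -> None:
        self.rows = rows

    def line(self, x: Any = None, y: Any = None, /, **kwargs: Any) -> dict[str, Any]:
        return {"mark": "line", "x": x, "y": y, **kwargs, "n_rows": len(self.rows)}


def quick_line(backend: SupportsLine[T], x: str, y: str) -> T:
    # The caller only relies on the protocol; the return type follows the backend.
    return backend.line(x, y)
```

Note that `isinstance(obj, SupportsLine)` only checks that a `line` attribute exists; signatures and return types are verified by the static type checker, not at runtime.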

It would also allow decisions like #17995 (comment) to be made in altair, since there may be non-trivial ways to produce some of the hvPlot charts - that some of the maintainers are aware of

MarcoGorelli · 2024-08-02T10:21:12Z

Ooh i like where things are going 😍

I think this approach would work better if Plot (or a version of) were a Protocol

This sounds good, we just need to be careful to learn lessons from the pandas plotting backends and why altair-pandas was abandoned. I think it was probably because:

the API there was too tied to pandas' existing plotting API, which is a bit at odds with how most dataframe libraries handle plots
it was never clear what the boundaries were - if you switched backend, how could you know what would be supported and what wouldn't?

that altair and any other library could handle the implementation of

How would that work? In this PR we're essentially deferring the whole implementation to Altair - if you have time/interest, do you fancy opening a separate PR to show how it would work? If you'd like to talk things over (which may be a good idea if we're coordinating changes across projects, which is never easy), feel free to book some time at https://calendly.com/marcogorelli

mattijn · 2024-08-02T11:13:25Z

Implementation-wise, I cannot contribute much and while not involved I have been following historical developments in pandas and Vega-Altair from the sidelines:

There have been requests to introduce methods in Altair that could be used in other packages but had no usage within Altair itself. This approach should be avoided here as well. This issue is summarized in this comment: pandas-dev/pandas#27488, and also discussed in pandas-dev/pandas#26747.
If we can update Altair to define methods that are both useful for Altair and beneficial for other packages, we should consider introducing them in Altair.
Entrypoints could be utilized by polars to introduce a method to support various plotting backends. Altair supports this functionality. But honestly, I don't know the details. See also https://github.com/altair-viz/altair_pandas.
Regarding the types of plots, there are experiments and research in creating common plots and interactivity approaches for exploratory data analysis for Altair. @joelostblom is/was working on this within Altair Ally, see https://github.com/vega/altair_ally, and @dwootton is/was exploring this with Altair Express.
From my observations, it is still challenging to define a default setting that would be sensible for most users.
How customizable are the results? For example, there have been questions within pdvega about how to add horizontal or vertical rules to the resulting plot (similar to ax.hlines and ax.vlines in Matplotlib), see "How to add vertical and horizontal lines to figures?" in altair-viz/pdvega#21 (comment). Is that possible with the resulting charts here? Additionally, can the interactivity be turned off?

dangotbanned · 2024-08-02T11:54:23Z

@MarcoGorelli @mattijn Really appreciate your thorough responses, I'll do my homework reading up on all of your links and follow up

For now, I can say I'd want to go in with the most minimal + simple definition for a Protocol - nowhere close to #17995 (comment)

No default implementation
No requirement of inheritance/ABCs
Purely defining a structural type
- with some methods that all return the library equivalent of Chart
- E.g. plotly.Figure

For altair, all I'd want is a quick way to get from df -> chart and then customize from there:

import altair as alt
import polars as pl

df = pl.DataFrame()
...
chart = alt.Chart(df).mark_line().encode(...) # <----

dangotbanned · 2024-08-02T20:24:12Z

TLDR: Simple idea got complex 👎

Edit: Leaving this here for future reference, but no longer pushing for this path.
Skip to #17995 (comment)

Ooh i like where things are going 😍

I think this approach would work better if Plot (or a version of) were a Protocol

This sounds good, we just need to be careful to learn lessons from the pandas plotting backends and why altair-pandas was abandoned. I think it was probably because:

the API there was too tied to pandas' existing plotting API, which is a bit at odds with how most dataframe libraries handle plots

it was never clear what the boundaries were - if you switched backend, how could you know what would be supported and what wouldn't?

that altair and any other library could handle the implementation of

How would that work? In this PR we're essentially deferring the whole implementation to Altair - if you have time/interest, do you fancy opening a separate PR to show how it would work?

@MarcoGorelli I'm starting to think I've bitten off more than I can chew with this one 😞

I guess I'll run through some stuff, maybe it sparks an idea for someone else.

importlib.metadata.entry_points seems to be the route to achieve multiple backends.
pandas is using the stdlib equivalent of what was suggested in pandas-dev/pandas#27488 (comment) (thanks @mattijn) - that was a third-party library at the time (2019).

Entrypoints could be utilized by polars to introduce a method to support various plotting backends.
Altair supports this functionality. But honestly, I don't know the details. See also altair-viz/altair_pandas.
@mattijn

We've got lots of examples of this in altair derived from https://github.com/vega/altair/blob/6c4c7856a5b134103d3db1205035d08a83fc3aa6/altair/utils/plugin_registry.py

However I don't think this is sufficient for the task, given that each backend would be returning vastly different objects.
It would be a pretty bad UX returning an imprecise type - breaking any autocomplete, etc.

It's probably a fair assumption that a user would be calling pl.DataFrame.plot in an interactive environment, much as hvPlot seems to prioritize an IPython environment.
Personally think that shouldn't come at the cost of experience in an IDE (e.g. vega/altair#3466) - but it is an option.

Digging through ibis I found some examples of combining entry_points and runtime typing.
Pretty atypical use of the typing system, was interesting to read through though:

AFAIK this would still rely on lots of library-specific code and some IR, which I was hoping to avoid.

Something I hadn't seen, but thought could be explored is using library-specific stubs.

Did some experimenting with altair and seaborn (since there are stubs https://github.com/python/typeshed/tree/main/stubs/seaborn).
Maybe there is something to this?

Code block

# hypothetical `.pyi`, located external to `polars`

# ruff: noqa: F401
import sys
import typing as t
import typing_extensions as te
from typing import Any, Generic, TypeVar

import narwhals.stable.v1 as nw
import polars as pl
import seaborn as sns
from matplotlib.axes import Axes

import altair as alt

if sys.version_info >= (3, 12):
    from typing import Protocol, runtime_checkable
else:
    from typing_extensions import Protocol, runtime_checkable

if t.TYPE_CHECKING:
    import matplotlib as mpl
    import seaborn.categorical as sns_c
    from matplotlib.axes import Axes

    ChannelType: te.TypeAlias = str | t.Mapping[str, Any] | Any

    class EncodeKwds(te.TypedDict, total=False):
        angle: ChannelType
        color: ChannelType
        column: ChannelType
        description: ChannelType
        detail: ChannelType | list[Any]
        facet: ChannelType
        fill: ChannelType
        fillOpacity: ChannelType
        href: ChannelType
        key: ChannelType
        latitude: ChannelType
        latitude2: ChannelType
        longitude: ChannelType
        longitude2: ChannelType
        opacity: ChannelType
        order: ChannelType | list[Any]
        radius: ChannelType
        radius2: ChannelType
        row: ChannelType
        shape: ChannelType
        size: ChannelType
        stroke: ChannelType
        strokeDash: ChannelType
        strokeOpacity: ChannelType
        strokeWidth: ChannelType
        text: ChannelType
        theta: ChannelType
        theta2: ChannelType
        tooltip: ChannelType | list[Any]
        url: ChannelType
        x: ChannelType
        x2: ChannelType
        xError: ChannelType
        xError2: ChannelType
        xOffset: ChannelType
        y: ChannelType
        y2: ChannelType
        yError: ChannelType
        yError2: ChannelType
        yOffset: ChannelType

T = TypeVar("T")

@runtime_checkable
class SupportsPlot(Generic[T], Protocol):
    backend: t.ClassVar[te.LiteralString]
    chart: T

    def __init__(self, df: nw.DataFrame, /) -> None: ...
    def area(self, *args: Any, **kwargs: Any) -> T: ...
    def bar(self, *args: Any, **kwargs: Any) -> T: ...
    def line(self, *args: Any, **kwargs: Any) -> T: ...
    def scatter(self, *args: Any, **kwargs: Any) -> T: ...

@runtime_checkable
class AltairPlot(SupportsPlot[alt.ChartType]):
    backend: t.ClassVar[te.LiteralString] = "altair"
    chart: alt.ChartType

    def __init__(self, df: nw.DataFrame, /) -> None: ...
    def area(
        self,
        x: ChannelType | None = None,
        y: ChannelType | None = None,
        color: ChannelType | None = None,
        tooltip: ChannelType | list[Any] | None = None,
        /,
        **kwargs: te.Unpack[EncodeKwds],
    ) -> alt.ChartType: ...
    def bar(
        self,
        x: ChannelType | None = None,
        y: ChannelType | None = None,
        color: ChannelType | None = None,
        tooltip: ChannelType | list[Any] | None = None,
        /,
        **kwargs: te.Unpack[EncodeKwds],
    ) -> alt.ChartType: ...
    def line(
        self,
        x: ChannelType | None = None,
        y: ChannelType | None = None,
        color: ChannelType | None = None,
        order: ChannelType | list[Any] | None = None,
        tooltip: ChannelType | list[Any] | None = None,
        /,
        **kwargs: te.Unpack[EncodeKwds],
    ) -> alt.ChartType: ...
    def scatter(
        self,
        x: ChannelType | None = None,
        y: ChannelType | None = None,
        color: ChannelType | None = None,
        size: ChannelType | None = None,
        tooltip: ChannelType | list[Any] | None = None,
        /,
        **kwargs: te.Unpack[EncodeKwds],
    ) -> alt.ChartType: ...

@runtime_checkable
class SeabornPlot(SupportsPlot[Axes]):
    backend: t.ClassVar[te.LiteralString] = "seaborn"
    chart: Axes

    def __init__(self, df: nw.DataFrame, /) -> None: ...
    def area(
        self,
        *,
        x: sns_c.ColumnName | sns_c._Vector | None = None,
        y: sns_c.ColumnName | sns_c._Vector | None = None,
        hue: sns_c.ColumnName | sns_c._Vector | None = None,
        **kwargs: Any,
    ) -> Axes: ...
    def bar(
        self,
        *,
        x: sns_c.ColumnName | sns_c._Vector | None = None,
        y: sns_c.ColumnName | sns_c._Vector | None = None,
        hue: sns_c.ColumnName | sns_c._Vector | None = None,
        **kwargs: Any,
    ) -> Axes: ...
    def line(
        self,
        *,
        x: sns_c.ColumnName | sns_c._Vector | None = None,
        y: sns_c.ColumnName | sns_c._Vector | None = None,
        hue: sns_c.ColumnName | sns_c._Vector | None = None,
        **kwargs: Any,
    ) -> Axes: ...
    def scatter(
        self,
        *,
        x: sns_c.ColumnName | sns_c._Vector | None = None,
        y: sns_c.ColumnName | sns_c._Vector | None = None,
        hue: sns_c.ColumnName | sns_c._Vector | None = None,
        **kwargs: Any,
    ) -> Axes: ...

Not sure how you'd convince a type checker of which SupportsPlot.backend to use, if this came from pl.Config (and not from the user directly)?

Final idea is to call in @max-muoto 👋 for thoughts on the soundness of any of the above.
Having seen you on other polars issues and in typeshed, maybe you have a fresh take?

MarcoGorelli · 2024-08-03T09:19:14Z

Thanks all for comments 🙏! I do like how issues such as this one bring different projects together

💯 Totally agree on not adding code to Altair which isn't directly useful to Altair itself. The only request I'd have is public types as mentioned in #17995 (comment), but even then, it's hardly essential

Regarding customisability of results 🔧 : I'd say that if anyone wants fully customisable results, they should use Altair (or their favourite plotting lib) directly. The advantage of DataFrame.plot being a really thin layer is that moving between the two should be easy, e.g.:

user wants to quickly visualise their data, and they call df.plot.line(x='date', y='price', color='symbol')
they realise they need further customisation, or turn off interactivity, or whatever else, and so they swap out df.plot.line with alt.Chart(df).mark_line().encode and go from there
the fact that df.plot.foo always just maps to alt.Chart(df).mark_foo().encode would make the transition predictable and free from surprises, whilst making the most common interactive case easy to find

Furthermore, having some built-in df.plot method signals to users that the default plotting backend is known to work well with Polars 🤝

I think the fully-customisable backends part is becoming too complex too quickly. No other plotting library is close to (as far as I can tell) supporting Polars natively without extra heavy dependencies. I'd suggest to:

start with Altair
keep plotting marked as unstable
if/when other plotting libraries like Seaborn / PlotNine / etc reach this level, we discuss a more pluggable solution or some "dataframe plotting standard" - but not now, it feels too soon

dangotbanned · 2024-08-03T09:33:15Z

I think the fully-customisable backends part is becoming too complex too quickly.

Absolutely agree @MarcoGorelli, will do my best to support you with this in altair 😄

dangotbanned · 2024-08-03T09:50:03Z

Regarding customisability of results 🔧 : I'd say that if anyone wants fully customisable results, they should use Altair (or their favourite plotting lib) directly. The advantage of DataFrame.plot being a really thin layer is that moving between the two should be easy

You could provide a link to https://altair-viz.github.io/user_guide/customization.html#chart-themes in the docs, for users who simply want different (but consistent) defaults

binste · 2024-08-03T11:32:50Z

Very interesting reading through all the comments and links 😄 +1 on Marco's summary: expose some types publicly in Altair, wait with standardisation until other plotting libraries are being considered as well. I'll work on the public types soon.

Regarding customisability of results 🔧 : I'd say that if anyone wants fully customisable results, they should use Altair (or their favourite plotting lib) directly. The advantage of DataFrame.plot being a really thin layer is that moving between the two should be easy

You could provide a link to https://altair-viz.github.io/user_guide/customization.html#chart-themes in the docs, for users who simply want different (but consistent) defaults

I think this is something useful to consider early on! The default theme of Altair/Vega-Lite feels a bit dated for my taste but changing it in Altair should be well thought through and be part of a major release. In Polars, we'd have the opportunity to spruce it up a bit from the beginning. Personally, I use something close to https://gist.github.com/binste/b4042fa76a89d72d45cbbb9355ec6906 which only requires minimal modifications. Streamlit have their own theme as well enabled by default

joelostblom · 2024-08-03T18:14:53Z

Cool to see this being implemented in Polars and an interesting discussion to follow! I would be inclined to agree with what @MarcoGorelli said regarding fully-customisable backends becoming too complex too quickly, and I think it is a good idea to outsource any type of customization as much as possible.

In addition to switching from df.plot... to alt.Chart..., also note the configure_* methods which can be used on any Altair chart. So users could do something like df.plot.line(x='date', y='price', color='symbol').configure_axis(grid=False) to turn off gridlines. I thought of leveraging these in altair_ally, but one of the issues is that you can't set everything you would like to set (e.g. I don't think it is possible to set the actual axis title via configure_, then you have to use .title() on e.g. alt.X, alt.Y, etc), so it might still be better to just point to alt.Chart() for all configuration needs to keep it simple.

binste · 2024-08-11T08:26:43Z

FYI, Altair 5.4.0 is out now including the removal of the dependencies on numpy, pandas, and toolz + with a new altair.typing module 🥳

py-polars/pyproject.toml

py-polars/requirements-dev.txt

MarcoGorelli · 2024-08-18T18:57:08Z

py-polars/tests/unit/operations/namespaces/test_plot.py

-# Calling `plot` the first time is slow
-# https://github.com/pola-rs/polars/issues/13500
-pytestmark = pytest.mark.slow


importing Altair is about 70 times faster* than importing hvplot, so I think we can remove this slow marker

*timed by performing time python -c 'import altair' and time python -c 'import hvplot' 7 times each, and finding the ratio of the smallest "real time" results for both
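The measurement described in the footnote can be approximated in Python as below. This is a sketch of the same idea, not the exact commands used: it times a fresh interpreter for each run (so, like `time`, it includes interpreter startup) and keeps the smallest value. The function names are made up for illustration.

```python
from __future__ import annotations

import subprocess
import sys
import time


def min_import_time(module: str, repeats: int = 7) -> float:
    """Smallest wall-clock time in seconds to run `python -c 'import <module>'`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Fresh interpreter each time, so nothing is cached in sys.modules.
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


def import_time_ratio(slow: str, fast: str, repeats: int = 7) -> float:
    """Ratio of the best import time of `slow` to the best import time of `fast`."""
    return min_import_time(slow, repeats) / min_import_time(fast, repeats)
```

With both libraries installed, `import_time_ratio("hvplot", "altair")` would give a figure comparable to the one quoted above; the exact number depends on the machine and installed versions.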

ritchie46

Thanks a lot @MarcoGorelli and @altair team. Really great effort and great addition. 🙌

v1gnesh · 2024-08-29T02:41:22Z

One advantage of hvplot / bokeh is the free interactivity.
Altair does interactive too but I think only in a subset.
Please consider having a config for interactivity.

MarcoGorelli · 2024-08-29T06:34:00Z

thanks @v1gnesh - coming soon 😉 vega/altair#3394

In the meantime, if you'd like to keep using hvplot, you can just add import hvplot.polars at the top of your script/notebook and change .plot to .hvplot

dwootton · 2024-08-29T11:27:56Z

@v1gnesh what type of interactivity is important to you? For example tooltips, panning/zooming, brushes, etc

v1gnesh · 2024-08-29T15:37:36Z

@dwootton The collection of functions on the side, e.g. zoom, reset zoom, selection tool, hand tool.

joelostblom · 2024-08-29T16:10:59Z

@v1gnesh Just a heads up that you can already achieve zoom, reset zoom, and panning ("hand tool" in hvplot) in altair via the .interactive() method, see e.g. https://altair-viz.github.io/gallery/scatter_tooltips.html (they work without being selected in a side bar). The addition of box zoom ("selection tool" in hvplot) is being tracked in this issue vega/vega-lite#4742

When you say "free interactivity", do you mean that you would like this to be the default behavior without having to type .interactive() yourself?

v1gnesh · 2024-08-30T02:18:02Z

@joelostblom Thanks for the links.

Yup, whenever it makes sense for it to be the default, at least I would prefer interactive becoming the default.
It may not be acceptable for everyone, so whenever .interactive() is as mature as you'd like, bringing this up as a poll/discussion in altair's repo will help understand what users want.

mjmdavis · 2024-10-26T01:06:42Z

The new default plots from Altair have reduced the interactivity of plots. I've gone back to using hvplot because I struggled to get useful images out of Altair.

Was this a premature move, considering Altair is missing these essential features?

zoom to selection
resize plot
hover to show datapoint

Was the main intent to deliver a basic plotting experience without adding many dependencies?

MarcoGorelli · 2024-10-26T06:53:05Z

that's totally fair - it's easy to go back to hvplot if that works for you

import hvplot.polars
use df.hvplot instead of df.plot

having said that - stay tuned, more developments may be on their way 👀 😉

joelostblom · 2024-10-26T16:25:05Z

@mjmdavis Thanks for the feedback! While box zoom is not available (vega/vega-lite#4742), you should be able to hover a data point to show additional info in a tooltip as per #18625. If that's not what you mean, could you elaborate on what you expect to happen when hovering?

I'm also curious exactly what you are referring to with "resize plot": do you mean something like dragging the corner of the plot to resize it? You are currently able to resize plots with e.g. .properties(width=400) as per the polars documentation.

mjmdavis · 2024-10-26T19:40:13Z

So, my use case is mostly data exploration in jupyter notebooks. For this, I've become quite fond of ipympl.

My biggest problem there is that it can be tricky to get it to work with different kernels, and there's frequently a 10-minute dance to get things working.

The default plot however has the basic zoom to selection functionality that is very useful when dealing with complex signals. And it's convenient to not have to re-run code to change the size of the plot as you resize your screen.

Vega-Lite and hvplot definitely benefit from being able to show plots in saved notebooks!

There are a lot of considerations here, so it's hard to please everyone. My 2c is that it's nice to be able to do some quick GUI-based exploration when plotting with default settings.

github-actions bot added the title needs formatting label Aug 1, 2024

feat(python!): Use Altair in DataFrame.plot

9ed8836

MarcoGorelli force-pushed the altair branch from 872d784 to 9ed8836 Compare August 1, 2024 14:05

missing file

00f7413

MarcoGorelli commented Aug 1, 2024

MarcoGorelli added 3 commits August 1, 2024 22:15

use ChannelType

f0c806f

typing

eaafc23

requirements

db6d8f7

binste reviewed Aug 2, 2024

py-polars/polars/dataframe/plotting.py (3 review threads, outdated and resolved)

dangotbanned mentioned this pull request Aug 3, 2024

polars.DataFrame.plot support tracking vega/altair#3514

Closed

dangotbanned mentioned this pull request Aug 3, 2024

feat: Adds 4 missing carbon themes, provide autocomplete vega/altair#3516

Merged

This was referenced Aug 4, 2024

feat(typing): Adds public altair.typing module vega/altair#3515

Merged

Improving Themes vega/altair#3519

Open

dangotbanned reviewed Aug 11, 2024

py-polars/pyproject.toml (outdated, resolved)

dangotbanned reviewed Aug 11, 2024

py-polars/requirements-dev.txt (outdated, resolved)

MarcoGorelli added 2 commits August 17, 2024 22:30

wip

4bc052f

add Series.plot

e043a35

MarcoGorelli marked this pull request as draft August 18, 2024 12:45

MarcoGorelli added 6 commits August 18, 2024 13:51

Merge remote-tracking branch 'upstream/main' into altair

65fae74

add Series.plot

ec57fb0

add missing page, add scatter as alias

28ac596

lint

d5167f1

rename, better bar plot example, simplify

efed5c9

assorted improvements

381d481

MarcoGorelli commented Aug 18, 2024

assorted docs and typing improvements

ea018b5

MarcoGorelli marked this pull request as ready for review August 19, 2024 08:43

Merge remote-tracking branch 'upstream/main' into altair

40a0e31

ritchie46 approved these changes Aug 27, 2024

ritchie46 merged commit 0f1edda into pola-rs:main Aug 27, 2024
19 checks passed

joelostblom mentioned this pull request Aug 27, 2024

docs: Remove ecosystem viz section since there is one in misc already #18408

Merged

ahuang11 mentioned this pull request Aug 28, 2024

Remove tests for polars.plot holoviz/hvplot#1403

Merged

This was referenced Sep 9, 2024

feat(python): Add nicer default plot configuration, link to Altair Chart Configuration docs #18609

Closed

Make it easier for downstream libraries to *safely* contribute themes vega/altair#3586

Open

dangotbanned mentioned this pull request Oct 26, 2024

Add more features to .interactive()? vega/altair#3393

Open

