-
Notifications
You must be signed in to change notification settings - Fork 795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: add method-based attribute setting to encoding classes #1629
Conversation
I think this is awesome! I wanted to raise a few questions/concerns to make sure I fully understand the proposal. Do we support the method-style API for nested properties or just encodings?One thing I am worried about is how people access nested properties. For example, if I want to customize the scheme. This is how I set the Scale's scheme property. alt.Chart(data).mark_point().encode(
alt.Color('color:Q').scale(scheme='viridis')
) If I now want to customize the scheme, I need to switch from the method style to the argument style. I hope it's not going to be too surprising for people. alt.Chart(data).mark_point().encode(
alt.Color('color:Q').scale(scheme=alt.Scheme(name='viridis', extent=[0, 0.9]))
) I assume that Altair supports this method syntax for everything, not just encodings, right? alt.Chart(data).mark_point().encode(
alt.Color('color:Q').scale(alt.Scheme().name('viridis').extent([0, 0.9]))
) Otherwise, I worry about making a special syntax for encodings. Formatting long specsThe other thing I think about is formatting. Is this valid Python? alt.Chart(data).mark_point().encode(
alt.X('column:Q')
.axis(None)
.scale(zero=False)
.title('my title'),
alt.Color('color:N')
.legend(orient='bottom')
.scale(scheme='viridis')
) In the current syntax, one can write this. alt.Chart(data).mark_point().encode(
alt.X(
'column:Q',
axis=alt.Axis(None),
scale=alt.Scale(zero=False),
title='my title'),
alt.Color(
'color:N',
legend=alt.Legend(orient='bottom'),
scale=alt.Scale(scheme='viridis')
)
) Help people use the new wayAnother concern is that Altair now has two ways to express the same thing. I definitely think that we want to be backward compatible but I wonder how we can help people transition to the new API without too much confusion. How difficult would it be to write a converter? Kind of like 2to3 for Python 3? |
Thanks @domoritz – a few thoughts:
Up until now we've supported limited method-style APIs for top-level objects (e.g. alt.Color('column',
scale=alt.Scale().scheme('viridis', extent=[0, 0.9])
) but it's not clear to me that's better than alt.Color('column').scale(
scheme=alt.Scheme('viridis', extent=[0, 0.9])
) particularly since nested settings like this are so much less common than their unnested counterparts. A question: how does the JS API deal with nested settings like this?
Roughly, in Python whitespace doesn't matter if you're inside a parentheses block. So for example this is not valid Python alt.Chart()
.mark_point()
.encode(x='foo') but it's really nice, which is why sometimes you come across code written like this, which is valid Python: (alt.Chart()
.mark_point()
.encode(x='foo')) Because these encoding properties generally appear within parentheses, they can take advantage of this, and this is indeed valid Python: alt.Chart(data).mark_point().encode(
alt.X('column:Q')
.axis(None)
.scale(zero=False)
.title('my title'),
alt.Color('color:N')
.legend(orient='bottom')
.scale(scheme='viridis')
)
The old way would of course still be valid, so most code will not break (the exception is, e.g., if someone tries to access |
Would it make sense to consider using “magic underscore” style arguments for nested properties beyond the method’s depth? The Plotly.py library recently implemented that syntax based on successful usage in the Julia version of Plotly. Read more about the Plotly.py implementation in this issue. That would turn: alt.Chart(data).mark_point().encode(
alt.Color('color:Q').scale(scheme=alt.Scheme(name='viridis', extent=[0, 0.9]))
) into: alt.Chart(data).mark_point().encode(
alt.Color('color:Q').scale(scheme_name='viridis', scheme_extent=[0, 0.9])
) |
Maybe: alt.Chart(data).mark_point().encode(
alt.Color('color:Q', alt.Scale(alt.Scheme(name='viridis', extent=[0, 0.9]))
) |
@iliatimofeev that sort of specification was available in Altair v1, but we abandoned it in later versions. |
@jakevdp for encoding channels it still works in your prototype :). The idea of “magic underscore” is cool too but it will break compatibility with vega-lite documentation. I guess type based notation is more universal Do you remember the reasons why it is not supported in v2? |
It's always worked for arguments of The main reason we dropped it is because it became difficult to implement in a consistent way when we moved from a traitlet-based prescriptive type system to a jsonschema-based proscriptive type system. |
I know two cases where "types" style wouldn't work: tooltips channel and conditions. |
tl;dr - I like method chaining, but let's not break the attribute API in the process. The original idea was for the chained method API to provide a thin optional layer on top of the attribute based approach, in particular for "single expression" interactive usage. However, for deeper API usage where you are closing, reusing, mutating, etc. Chart objects across a library, the attribute based approach remains powerful and expressive. Because of this, the main concern I have here is breaking the attribute based API. I think that will cause a very high level of pain for users and major API breakage. The reason the existing |
Thanks for the input, @ellisonbg. Philosophically, I like the "verb/noun" construct we started with, but note that it breaks down almost immediately when applied to the Vega-Lite schema (transform/transform and bin/bin come to mind). And what is the verb form of a setting like What this PR comes down to is this: I think I'm willing to forego philosophical purity of a rarely-used API (nested attribute access in charts) in favor of simplicity for an often-used API (chart construction via uncluttered method chaining). I guess my concrete question for you would be: given the constraints you're concerned about, what API would you propose for this chart? Or are you OK keeping the status quo? alt.Chart(data).mark_point().encode(
alt.X('column:Q', axis=alt.Axis(None), scale=alt.Scale(zero=False), title='my title'),
alt.Color('color:N', legend=alt.Legend(orient='bottom'), scale=alt.Scale(scheme='viridis'))
) |
Will think on this a bit - it is a good question as it is definitely
verbose. A few other questions to explore as well:
* If we go with the method chaining, what is the API for mutating an
existing spec. Let's say I want to take the spec above, and change the x
title.
* If we go with the method chaining, how do I ask a spec the value of an
attribute?
I don't view my objection as one of philosophical purity - only that the
purity of encode/encoding let us avoid the conflict between the attribute
and method APIs. I too am fine with pushing the boundary, bluring the
edges, etc. for practicality's sake (I am aligned with you and the zen of
python on this front).
…On Mon, Aug 26, 2019 at 7:57 AM Jake Vanderplas ***@***.***> wrote:
Thanks for the input, @ellisonbg <https://github.com/ellisonbg>.
Philosophically, I like the "verb/noun" construct we started with, but note
that it breaks down almost immediately when applied to the Vega-Lite schema
(transform/transform and bin/bin come to mind). And what is the verb form
of a setting like scale? I suppose set_scale() makes sense, but I find
explicit getters & setters like this to be a bit cluttery.
What this PR comes down to is this: I think I'm willing to forego
philosophical purity of a rarely-used API (nested attribute access in
charts) in favor of simplicity for an often-used API (chart construction
via uncluttered method chaining).
I guess my concrete question for you would be: given the constraints
you're concerned about, what API would you propose for this chart? Or are
you OK keeping the status quo?
alt.Chart(data).mark_point().encode(
alt.X('column:Q', axis=alt.Axis(None), scale=alt.Scale(zero=False), title='my title'),
alt.Color('color:N', legend=alt.Legend(orient='bottom'), scale=alt.Scale(scheme='viridis'))
)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1629>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAGXUDY5CZXORQNBCUKTTTQGPVNRANCNFSM4IHAKFGQ>
.
--
Brian E. Granger
Principal Technical Program Manager, AWS AI Platform (brgrange@amazon.com)
On Leave - Professor of Physics and Data Science, Cal Poly
@ellisonbg on GitHub
|
Because
That's what changes here... |
One option, which is admittedly a bit weird, would be to initialize each value with a setter function, and have that function replace itself with the value once it's called. Then this would work: x = alt.X('foo').title('foo')
x.title == 'foo' # True |
But this wouldn't, right? x = alt.X('foo').title('foo')
xx = x.title('bar') |
Right, that wouldn't work. |
I think that broadly, the problem is that Altair was originally built on top of traitlets, and we've been slowly moving away from a traitlets-like model because it's fundamentally incompatible with a proscriptive schema-based object validation. Now we're trying to evolve the chart construction API, and we're running into conflicts with vestiges of the original traitlets-based approach that still exist, but for no good reason other than a modicum of backward compatibility between Altair v1 and v2. If it comes down to a choice between the two, I'd personally lean toward jettisoning the traitlets-inspired model altogether in favor of a better plotting API. |
Our choice of traitlets and the attribute based model are correlated, but
the direction of causation was the opposite. Namely, we wanted the API to
be pythonic (don't hide things behind getters/setters, and dict based
access is overly verbose) so we choose Traitlets as a way to provide a
validation layer for an attribute based API. Even though we are not using
traitlets, I think the underlying logic still holds.
It is a good point that we do have dict ([key]) based access as a separate
API available so we can cleanly separate out methods from values, but I am
still not convinced it is worth breaking the entire API for this. To be
clear though, my argument is not a backwards compatibility one (I am
generally in favor of moving APIs forward to serve users). I am not
convinced this change will be a net improvement. I would propose we get
creative to figure out a way to improve the method based API, while
preserving the attribute based one. What about having a special attribute
on the objects that serves as a namespace for the method based API?
…On Mon, Aug 26, 2019 at 10:47 AM Jake Vanderplas ***@***.***> wrote:
I think that broadly, the problem is that Altair was originally built on
top of traitlets, and we've been slowly moving away from a traitlets-like
model because it's fundamentally incompatible with a proscriptive
schema-based object validation.
Now we're trying to evolve the chart construction API, and we're running
into conflicts with vestiges of the original traitlets-based approach that
still exist, but for no good reason other than a modicum of backward
compatibility between Altair v1 and v2. If it comes down to a choice
between the two, I'd personally lean toward jettisoning the
traitlets-inspired model altogether in favor of a better plotting API.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1629>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAGXUHFUEDN2UPW6LUADEDQGQJJ7ANCNFSM4IHAKFGQ>
.
--
Brian E. Granger
Principal Technical Program Manager, AWS AI Platform (brgrange@amazon.com)
On Leave - Professor of Physics and Data Science, Cal Poly
@ellisonbg on GitHub
|
Thanks Brian - one possiblity I thought of: we could make any undefined attributes evaluate as a callable setter function. But then I could imagine users getting pretty confused by that if they try to override an existing value... |
Another possibility is to have method-based setters only for channel classes, and to replace these setters with the attributes once they're used in the chart. So this would work correctly: chart = alt.Chart(data).encode(
x=alt.X('x').bin(maxbins=20)
)
assert chart.x.bin.maxbins == 20 |
In the last example, would this work? x = alt.X('x').bin(maxbins=20)
chart1 = alt.Chart(data).encode(
x=x
)
assert chart1.x.bin.maxbins == 20
xx = x.bin(maxbins=30) # does this work???
chart2 = alt.Chart(data).encode(
x=xx
)
assert chart2.x.bin.maxbins == 30 |
@domoritz - yes, it could be made so that would work. |
It seems like we're getting close to a new major vega-lite release again. I'm wondering if it is worth revisiting this change to the API. I for one, would love to see this change. I think that it would be a usability improvement and make it easier to move between Altair and vega-lite-api. I'd be happy to help translating docs, if you decide to go forward with it. |
This would be a super cool enhancement that I would love to see as well! (but I understand probably quite time consuming to implement) I would be happy to help out with testing, writing docs, or something of similar difficulty/skill level if needed. |
We're still stalled on wrapping Vega-Lite 4.9... I think it's going to be a while before we can get to the newer Vega-Lite features, and a change like this is even more major. |
Would the next release of Altair be a natural time to add this? I would be very happy to experiment with this PR if there is interest! |
I would love to see this too and have been thinking that a 5.0 release would be an opportune moment to introduce a big addition like this! I started working on this a little bit myself a few weeks ago and I just uploaded my branch to #2592, feel free to build on it or just copy the files and start over, whatever is easier. As far as I understand, the biggest outstanding item from the discussion above is whether the new method-based syntax warrants losing the ability of getting and setting values via the syntax For me personally, the added value of using the syntax proposed here outweighs the inconvenience of having to use the dictionary-like syntax to get/set values, but I think it is up to @jakevdp and @ellisonbg to decide on which direction they prefer for Altair and what would be the most helpful way to contribute here. |
I agree this api would be super nice and that breaking the access as you suggested makes sense. I guess there may be some benefit to maintaining backwards compatibility and we could do so with a warning. |
* Add method based syntax from #1629 to v5 * Add correct function signatures and docstrings, and initial autocompletion parameter support * Add general param info to docstrings and pass through adding the helper class docstring when not appropriate * Deal with docstring missing "Attributes" tag * Fix black formatting * Fix black formatting * Update relative imports syntax * Only generate new schema for the latest version of vega lite to avoid backported changes that we do not test on older versions * Type hints for tab completion * Fix NoneType notation * Replace Type[None] with None * Basic documentation for attribute setter methods * Apply suggestions from code review Co-authored-by: Stefan Binder <binder_stefan@outlook.com> * define test_examples for both syntax * renaming iter_examples for methods syntax * add test examples using methods syntax * improve comments * exclude method examples dir from flake8 and black * include commit credits original prototype: #1629 Co-authored-by: Jake VanderPlas <781659+jakevdp@users.noreply.github.com> Co-authored-by: Joel Ostblom <joel.ostblom@gmail.com> Co-authored-by: Stefan Binder <binder_stefan@outlook.com> Co-authored-by: mattijn <mattijn@gmail.com> Co-authored-by: Jake VanderPlas <781659+jakevdp@users.noreply.github.com>
Implemented in #2795 |
Do not merge until Altair 4.0.
This is a new prototype of how to set encoding attributes via methods rather than arguments. So rather than
you could instead do
note that the old way still works as well.
The tradeoff is that attributes of encoding classes can no longer be accessed directly via
getattr
, but this was not a very common pattern anyway.Still TODO:
If you'd like to try this in its current state, you can run