Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use object specific counter instead of a global counter for unnamed parameter & view names #3416

Open
lukasmasuch opened this issue May 7, 2024 · 7 comments

Comments

@lukasmasuch
Copy link

What is your suggestion?

Unnamed parameters and sub-charts are automatically named by using a global counter:

Parameter:

_counter: int = 0
@classmethod
def _get_name(cls) -> str:
cls._counter += 1
return f"param_{cls._counter}"

Subcharts/Views:

_counter: int = 0
@classmethod
def _get_name(cls) -> str:
cls._counter += 1
return f"view_{cls._counter}"

This will lead to different specs for the same charts in a multi-threaded/multi-user environment (e.g., Streamlit). In Streamlit, we rely on the spec to be stable to calculate a deterministic element identity across app reruns. Having this global counter makes the usage of Altair complicated. It would be awesome if the global counter would be replaced with a counter that's specific to the chart object -> which would make the chart spec stable independent of the global state.

Have you considered any alternative solutions?

No response

@jonmmease
Copy link
Contributor

Hi @lukasmasuch, thanks for the report.

Unfortunately it's not quite as straightforward as moving the counters from class-level to instance-level. The purpose of these is to make sure that subcharts and params have unique names across concatenation.

For example:

chart1.to_dict()
chart2.to_dict()
(chart1 | chart2).to_dict()

In the result of (chart1 | chart2).to_dict(), the params associated with chart1 must be named differently than those associated with chart2.

To avoid a global counter, we might need to refactor to delay assigning these names until the conversion to a dictionary.

@lukasmasuch
Copy link
Author

@jonmmease Thanks for the quick response :)

To avoid a global counter, we might need to refactor to delay assigning these names until the conversion to a dictionary.

This sounds like a reasonable solution.

Our temporary solution in Streamlit will probably be some hacky replacements in the vega-lite spec, but it would be much better if it were eventually solved with a counter that is applied only in the to_dict() logic.

@mattijn
Copy link
Contributor

mattijn commented May 8, 2024

It's the same issue with data referencing, that one was stabilized by using the hash value of the dataset. We might be able to do this by introducing a hash method on the chart specification, and use that somehow when referencing views and params. Thanks for raising!

lukasmasuch added a commit to streamlit/streamlit that referenced this issue May 8, 2024
## Describe your changes

Altair uses global counters to name unnamed parameter & view/subchart
names (see this issue
[here](vega/altair#3416)). This leads to
different vega-lite chart specs across reruns for the same charts since
the auto-named parameters will change the counter on every new chart
creation. Some of our features like forward message cache & the upcoming
chart selections will require a stable identity for the chart element.

Therefore, we are applying a (temporary) slightly hacky solution to
replace these generated names with names that are stable across reruns.
The relevant naming patterns are: `view_<number>` and `param_<number>`.
Eventually, we can replace this hack if it gets addressed in Altair (see
issue [here](vega/altair#3416))

I performance-tested the `_stabilize_vega_json_spec` method with a
bigger concatenated chart spec by running it 10k times. The total
runtime was ~0.3s, about ~0.03 milliseconds per execution. There are a
couple of ways to make this a lot more performant, but I think it would
be fine to put that into a follow-up PR after the selections release.

## Testing Plan

- Added unit tests.

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
@binste
Copy link
Contributor

binste commented May 12, 2024

I agree, it would be cleaner and useful for web apps if there is no global counter. We have the same issue in dash-vega-components (although less severe as there is no "rerun" of the whole app) and maybe there are other use cases which would benefit from this too.

I'm worried that delaying the parameter naming breaks user code. In the documentation, and many examples on other sites, use the pattern of accessing parameter names in expressions via the .name attribute. Search for .name in https://altair-viz.github.io/user_guide/interactions.html, e.g.:

bind_range = alt.binding_range(min=100, max=300, name='Slider value:  ')
param_width = alt.param(bind=bind_range)

# Examples of how to write both js and python expressions
param_color_js_expr = alt.param(expr=f"{param_width.name} < 200 ? 'red' : 'black'")

So I think names already need to be available when a parameter is instantiated. We could port the fix of @lukasmasuch in streamlit/streamlit#8628. Doing this on the JSON string feels hacky but I also can't come up with something better, see https://github.com/streamlit/streamlit/pull/8628/files#diff-611496cd0d1726cf41aed099fd6723e35d40833db11afbafef9be914fc425f26R270 for reasoning on why working on the dict does not work. Same goes for chart objects. Basically, parameters can just be referenced in too many places and even in strings as shown above and hence the replacement needs to happen on strings anyway.

It could be a new argument to to_dict which by default should be False as it slows down the conversion and would be a breaking change. chart.to_dict(reset_unnamed_parameter_names=True)` (need a better name for it) would do the following:

  • chart.to_json()
  • The fix from @lukasmasuch
  • return json.loads(fixed_spec) <- this can be skipped if to_json was called which is what Streamlit would need.

For Streamlit, not much would change compared to the current fix apart from that it's something officially supported and maintained in Altair, but the upside would be that it's also available for other libraries.

Thoughts?

@jonmmease
Copy link
Contributor

Thanks @binste, I like the solution of adding an optional argument to to_dict/to_json. Maybe something like stable_names=True?

@mattijn
Copy link
Contributor

mattijn commented Sep 6, 2024

Streamlit has to introduce custom code that utilise internal altair functions. This might be one of reasons that #3554 is popping up as an issue. I say, might, because this specific issue might not be related.
But if we want to restructure the altair internal code, it is a good idea to first address this issue so we produce vega-lite output of altair specifications which are stable when they include parameters.

benjamin-awd pushed a commit to benjamin-awd/streamlit that referenced this issue Sep 29, 2024
## Describe your changes

Altair uses global counters to name unnamed parameter & view/subchart
names (see this issue
[here](vega/altair#3416)). This leads to
different vega-lite chart specs across reruns for the same charts since
the auto-named parameters will change the counter on every new chart
creation. Some of our features like forward message cache & the upcoming
chart selections will require a stable identity for the chart element.

Therefore, we are applying a (temporary) slightly hacky solution to
replace these generated names with names that are stable across reruns.
The relevant naming patterns are: `view_<number>` and `param_<number>`.
Eventually, we can replace this hack if it gets addressed in Altair (see
issue [here](vega/altair#3416))

I performance-tested the `_stabilize_vega_json_spec` method with a
bigger concatenated chart spec by running it 10k times. The total
runtime was ~0.3s, about ~0.03 milliseconds per execution. There are a
couple of ways to make this a lot more performant, but I think it would
be fine to put that into a follow-up PR after the selections release.

## Testing Plan

- Added unit tests.

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
@dangotbanned
Copy link
Member

dangotbanned commented Oct 12, 2024

Streamlit has to introduce custom code that utilise internal altair functions. This might be one of reasons that #3554 is popping up as an issue. I say, might, because this specific issue might not be related.

We found a different source for that particular bug, which was fixed in #3637

(@mattijn you can ignore this, just making sure it is recorded here)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants