Determine length of column-oriented dataframe? #1088

@Fil

Suppose the data is given as an object of columns:

df = {x: [1, 2, 3], y: [1, 2, 3]}

(This is how Quarto returns dataframes, and arquero does something similar.)

To use this in a mark we can call:

Plot.barX(data, {x: df.x, y: df.y, …})

But there is no good way to specify data:

  • If we specify it as {length: n}, it gets materialized at some point, which is wasteful when the dataframe has millions of rows.
  • If we pass df.x as data, it is semantically incorrect.
  • Technically, new Array(df.x.length) is fine, but it is a mental stretch.
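As a rough sketch of the first workaround, the length can be derived from the columns themselves instead of being hard-coded. The helper name columnLength is hypothetical and not part of Plot; it only assumes that every column is an array or typed array.

```javascript
// Hypothetical helper (not part of Plot): returns the shared length of all
// columns in a column-oriented dataframe, and throws if the columns disagree.
function columnLength(df) {
  const lengths = Object.values(df).map((column) => column.length);
  if (lengths.length === 0) return 0;
  const n = lengths[0];
  if (lengths.some((length) => length !== n)) {
    throw new Error("columns have different lengths");
  }
  return n;
}

const df = {x: [1, 2, 3], y: [1, 2, 3]};

// Array-like placeholder: it carries only a length, no row values.
const data = {length: columnLength(df)};

// With Plot loaded, the mark call would then read:
// Plot.barX(data, {x: df.x, y: df.y})
```

This keeps the intent explicit (the data is "n rows", the values live in the channels), but it does not avoid the materialization cost mentioned in the first bullet.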

I wonder if we could have either data = n (a number), which would be read as new Array(n), or a special symbol that would say "use the channels' length". Another useful possibility would be for "dataframe objects" to have some sort of length property.
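To make the two ideas concrete, here is a minimal sketch of how a row count could be resolved from either form. Everything in it is hypothetical: channelLength and resolveLength are illustrative names, not existing Plot exports, and this is not how Plot handles data today.

```javascript
// Hypothetical sentinel meaning "use the channels' length".
const channelLength = Symbol("channelLength");

// True for plain arrays and typed arrays; strings (field names) are excluded.
function isColumn(value) {
  return Array.isArray(value) || ArrayBuffer.isView(value);
}

// Hypothetical resolver: returns the number of rows implied by `data`.
function resolveLength(data, channels = {}) {
  if (typeof data === "number") {
    // Proposal 1: data = n is read like new Array(n).
    if (!Number.isInteger(data) || data < 0) {
      throw new RangeError("data length must be a non-negative integer");
    }
    return data;
  }
  if (data === channelLength) {
    // Proposal 2: take the length of the first channel given as a column.
    const first = Object.values(channels).find(isColumn);
    return first === undefined ? 0 : first.length;
  }
  // Ordinary arrays and array-likes such as {length: n}.
  return data.length;
}
```

Under these assumptions, resolveLength(df.x.length) and resolveLength(channelLength, {x: df.x, y: df.y}) would both give the number of rows without allocating a placeholder array.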

cc: @allisonhorst ; discussion after reading https://allisonhorst.github.io/posts/2022-10-14-bird-attacks/


Labels: enhancement, question
