Mutability #10

Open
@TomAugspurger

On the call yesterday, the topic of mutability came up in the vaex demo.

The short version is that it may be difficult or impossible for some systems to implement in-place mutation of dataframes. For example, I believe that neither vaex nor Dask implements the following:

In [8]: df = pd.DataFrame({"A": [1, 2]})

In [9]: df
Out[9]:
   A
0  1
1  2

In [10]: df.loc[0, 'A'] = 0

In [11]: df
Out[11]:
   A
0  0
1  2

I think that, in the name of simplicity, the API standard should just not define any methods that mutate existing data in place.
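
To make the trade-off concrete, here is a minimal sketch of how the `df.loc[0, 'A'] = 0` update above could be written without mutating anything, using pandas' existing `Series.mask` and `DataFrame.assign`. The variable names `new_a` and `updated` are illustrative only, and this is not meant as a proposal for how the standard should spell the operation.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2]})

# Build a new column in which the row labelled 0 is replaced by 0.
# `mask` returns a new Series; the original column is left alone.
new_a = df["A"].mask(df.index == 0, 0)

# `assign` returns a new DataFrame instead of modifying `df`.
updated = df.assign(A=new_a)

print(updated)
print(df)  # still holds the original values
```

The cost is that a single-element update is expressed as a whole-column operation, which is presumably easier for lazy or out-of-core systems to support than element-wise writes.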

There is one mutation-adjacent area that might be considered: using DataFrame.__setitem__ to add an additional column:

In [12]: df['B'] = [1, 2]

In [13]: df
Out[13]:
   A  B
0  0  1
1  2  2

Or perhaps to update the contents of an entire column:

In [14]: df['B'] = [3, 4]

In [15]: df
Out[15]:
   A  B
0  0  3
1  2  4

In these cases, no values are actually being mutated in place. Is that acceptable?
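
For comparison, both column-level operations can already be expressed in pandas without touching the original object at all, via `DataFrame.assign`. The sketch below is only an illustration of that alternative; the names `with_b` and `replaced` are made up for the example, and whether the standard should offer `__setitem__`, something like `assign`, or both is exactly the open question.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 2]})

# Adding a column: `assign` returns a new frame; `df` keeps only "A".
with_b = df.assign(B=[1, 2])

# Replacing a whole column: again a new frame is returned,
# and `with_b` still holds the old values for "B".
replaced = with_b.assign(B=[3, 4])

print(list(df.columns))
print(with_b["B"].tolist())
print(replaced["B"].tolist())
```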
