pl.concat on a series of series fails with: TypeError: 'Series' object cannot be converted to 'Sequence' #6656

Closed
2 tasks done
omgrnd opened this issue Feb 3, 2023 · 5 comments
Labels
bug Something isn't working python Related to Python Polars

Comments

@omgrnd

omgrnd commented Feb 3, 2023

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

After upgrading from 0.16.1 to 0.16.2, applying pl.concat to a series of series started to fail. Where it previously yielded a concatenated series, it now raises the following error:

TypeError: 'Series' object cannot be converted to 'Sequence'

Searching through past issues pointed to two (somewhat) related threads:

If this is now the expected behavior, could you please advise on the best (most scalable) way to concatenate a series of series?

Thank you!

Reproducible example

import polars as pl

pl.concat(pl.Series([pl.Series([0, 1]), pl.Series([2, 3])]))

# TypeError: 'Series' object cannot be converted to 'Sequence'

Expected behavior

>>> pl.concat(pl.Series([pl.Series([0, 1]), pl.Series([2, 3])]))
shape: (4,)
Series: '' [i64]
[
        0
        1
        2
        3
]

Installed versions

---Version info---
Polars: 0.16.2
Index type: UInt32
Platform: Linux-5.4.0-137-generic-x86_64-with-glibc2.29
Python: 3.8.10 (default, Nov 14 2022, 12:59:47)
[GCC 9.4.0]
---Optional dependencies---
pyarrow: 11.0.0
pandas: 1.5.3
numpy: 1.24.1
fsspec: 2023.1.0
connectorx: <not installed>
xlsx2csv: <not installed>
deltalake: <not installed>
matplotlib: 3.6.3

omgrnd added the bug and python labels on Feb 3, 2023

@ritchie46
Member

You are passing a single Series, but the types indicate that you should pass a sequence of Series, e.g.:

pl.concat(
    [
        pl.Series([pl.Series([0, 1]), pl.Series([2, 3])])
    ]
)

@alexander-beedie
Collaborator

alexander-beedie commented Feb 4, 2023

As Ritchie says, you are passing the wrong type into the function.
(If it worked before somehow, it really shouldn't have done... ;)

You want one of the following:

  • Concatenate the two series -

    pl.concat( [pl.Series([0, 1]), pl.Series([2, 3])] )
    # shape: (4,)
    # Series: '' [i64]
    # [
    #     0
    #     1
    #     2
    #     3
    # ]  
  • Concatenate the series' components -

    pl.Series( [pl.Series([0, 1]), pl.Series([2, 3])] )
    # shape: (2,)
    # Series: '' [list[i64]]
    # [
    #     [0, 1]
    #     [2, 3]
    # ]

@omgrnd
Author

omgrnd commented Feb 6, 2023

@ritchie46 - thank you for the quick reply, but what you propose does not achieve the goal here, which is to concatenate the inner series (which I probably should have clarified, my apologies). Both in 0.16.2 and in earlier versions, it leaves them unconcatenated:

>>> pl.concat(
...     [
...         pl.Series([pl.Series([0, 1]), pl.Series([2, 3])])
...     ]
... )
shape: (2,)
Series: '' [list[i64]]
[
        [0, 1]
        [2, 3]
]

@alexander-beedie - thank you for your reply and the insight! This clarifies that a Series is NOT a Sequence and SHOULD NOT be used as one. So, supposing I have no control over receiving the data as a series of series, I take it that your recommendation is to re-wrap it as a Sequence, e.g. by using the unpacking operator:

series_of_series = pl.Series([pl.Series([0, 1]), pl.Series([2, 3])])
pl.concat([*series_of_series])

@ritchie46
Member

You can use reshape to reshape the list. Or call flatten/explode.

@omgrnd
Author

omgrnd commented Feb 6, 2023

You can use reshape to reshape the list.

Ah - right!

Then my:

pl.concat([*series_of_series])

is equivalent to:

count = sum(map(len, series_of_series))
series_of_series.reshape((1, count))[0]

Or call flatten/explode.

I'm not sure how to take that path though :)
