Program hangs during termination #6538

Closed
2 tasks done
Gigioliva opened this issue Jan 28, 2023 · 4 comments
Labels
bug (Something isn't working), python (Related to Python Polars)

Comments

@Gigioliva

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

I use Polars to read several Parquet files. The reads run inside an asyncio event loop through run_in_executor. The code is roughly this:

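import asyncio

import polars

def _parquet_file_to_df(file_path: str, columns: tuple[str, ...]) -> polars.DataFrame:
    # An empty tuple means "read every column".
    return polars.read_parquet(
        file_path, columns=list(columns) if columns else None, use_pyarrow=True
    )

async def main() -> None:
    loop = asyncio.get_running_loop()
    for file in [...]:
        # None selects the loop's default executor.
        # Assumption: the callable run there is the reader above, with all columns.
        await loop.run_in_executor(None, _parquet_file_to_df, file, ())

if __name__ == "__main__":
    asyncio.run(main())
    print("done")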

Sometimes my program does not terminate even after printing done. Here are some pieces of the stack trace:

Screenshot 2023-01-28 alle 19 05 30

Screenshot 2023-01-28 alle 19 06 18

Screenshot 2023-01-28 alle 19 07 35

Screenshot 2023-01-28 alle 19 09 52

I am not sure the problem is 100% related to Polars. Can you help me understand the bug?

Reproducible example

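import asyncio

import polars

def _parquet_file_to_df(file_path: str, columns: tuple[str, ...]) -> polars.DataFrame:
    # An empty tuple means "read every column".
    return polars.read_parquet(
        file_path, columns=list(columns) if columns else None, use_pyarrow=True
    )

async def main() -> None:
    loop = asyncio.get_running_loop()
    for file in [...]:
        # None selects the loop's default executor.
        # Assumption: the callable run there is the reader above, with all columns.
        await loop.run_in_executor(None, _parquet_file_to_df, file, ())

if __name__ == "__main__":
    asyncio.run(main())
    print("done")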

Expected behavior

The program exits after printing done.

Installed versions

---Version info---
Polars: 0.15.16
Index type: UInt32
Platform: Linux-5.13.0-52-generic-x86_64-with-glibc2.35
Python: 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
---Optional dependencies---
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
pyarrow: 10.0.1
pandas: 1.5.3
numpy: 1.24.1
fsspec: <not installed>
connectorx: <not installed>
xlsx2csv: <not installed>
deltalake: <not installed>
matplotlib: 3.6.3
Gigioliva added the bug and python labels on Jan 28, 2023
@ritchie46
Member

Does your event loop run in threads or process pools?

@Gigioliva
Author

> Does your event loop run in threads or process pools?

@ritchie46 The default one, which is a ThreadPoolExecutor.
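
For readers following along: passing None to run_in_executor uses the loop's default executor, which is thread based. A minimal sketch that makes the pool explicit is below. The names load_file and read_all are hypothetical, and load_file is only a stand-in for the blocking Polars call, so this illustrates the setup and is not a diagnosis of the hang.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def load_file(path: str) -> int:
    # Hypothetical stand-in for the blocking polars.read_parquet call.
    return len(path)


async def read_all(paths: list[str]) -> list[int]:
    loop = asyncio.get_running_loop()
    # Leaving the with-block calls shutdown(wait=True), so the worker
    # threads are joined before read_all returns.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return [await loop.run_in_executor(pool, load_file, p) for p in paths]


if __name__ == "__main__":
    print(asyncio.run(read_all(["a.parquet", "b.parquet"])))
```

Owning the pool explicitly makes it easier to tell whether any executor worker is still alive when the program should be exiting.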

@ritchie46
Member

Can you make an MWE? I have no clue.
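
One possible starting point for such an MWE, using only the standard library, is sketched below. blocking_read is a hypothetical stand-in that would be swapped for the real Polars call, and lingering_threads is an illustrative helper, not part of any library. The 30 second timeout is an arbitrary assumption. The idea is to list non-daemon threads that are still alive after asyncio.run returns and to ask faulthandler for a text stack dump if the process is still running later.

```python
import asyncio
import faulthandler
import threading
import time


def blocking_read(path: str) -> str:
    # Hypothetical stand-in; replace with the real blocking call under test.
    time.sleep(0.01)
    return path.upper()


async def main(paths: list[str]) -> list[str]:
    loop = asyncio.get_running_loop()
    return [await loop.run_in_executor(None, blocking_read, p) for p in paths]


def lingering_threads() -> list[str]:
    # Non-daemon threads other than the main thread are joined at
    # interpreter shutdown, so they can keep the process from exiting.
    main_thread = threading.main_thread()
    return [
        t.name
        for t in threading.enumerate()
        if t is not main_thread and t.is_alive() and not t.daemon
    ]


if __name__ == "__main__":
    # Dump every thread's Python stack to stderr if still running after 30 s.
    faulthandler.dump_traceback_later(30, exit=False)
    results = asyncio.run(main(["a.parquet", "b.parquet"]))
    print("done", results)
    print("non-daemon threads still alive:", lingering_threads())
```

If the list is empty and the process still does not exit, the hang is probably outside Python-level threads, and a native stack trace would be the next thing to collect.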

@Gigioliva
Author

@ritchie46 it seems that my issue is this. So not related to Polars 🙏🏼

zundertj closed this as completed on Mar 7, 2023