Make it easier to create a Pandas dataframe from DataFusion query results #139

@andygrove

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
DataFrame.collect returns a list of PyArrow record batches. Each batch can be turned into a Pandas dataframe, but I do not know how to efficiently create a single Pandas dataframe that contains the data from all of the batches.

Describe the solution you'd like
Either an example for this, or new features to help with this. Perhaps a DataFrame.collect_single_batch could work.

Describe alternatives you've considered
None

Additional context
None

Labels: enhancement (New feature or request)