Missing examples of connecting and loading pandas dataframes #246

Closed
yimingli opened this issue Nov 26, 2024 · 3 comments
Labels
question Further information is requested

Comments

@yimingli

❓Search before asking

I have searched for issues similar to this one.

❓Description

I'm not sure whether this is a feature request or a documentation enhancement, so I'm posting it as a question for now.

I looked through all the examples in the repo but only found ones that use CsvConnector. How do I connect to and load data that is not CSV, for example a pandas DataFrame? This is a common use case, because we often read data from a database or cloud storage (i.e., non-CSV sources).

@yimingli yimingli added the question Further information is requested label Nov 26, 2024
@jalr4ever
Collaborator

@yimingli Hi there.

I think PR #247 contains the DataFrameConnector feature you need. It's coming soon!

@cyantangerine
Contributor

Until that PR is merged, I think you can use GeneratorConnector instead.

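# Assumed import path, following the layout of the other sdgx connectors
from sdgx.data_connectors.generator_connector import GeneratorConnector

def generator():
    # Yield a copy so the connector never touches the original DataFrame
    yield df.copy()

touse = GeneratorConnector(generator)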
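
For a larger DataFrame, the same workaround could yield the frame in row slices rather than as one full copy. This is only a sketch: it assumes GeneratorConnector accepts a zero-argument callable that returns a generator of DataFrames (as the snippet above suggests) and that it consumes several yielded frames in order, which is not confirmed in this thread. `make_chunk_generator` and `chunk_size` are hypothetical names, not part of sdgx.

```python
import pandas as pd


def make_chunk_generator(df: pd.DataFrame, chunk_size: int = 1000):
    """Return a zero-argument callable that yields `df` in row slices."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    def generator():
        for start in range(0, len(df), chunk_size):
            # Copy each slice so downstream code cannot mutate the original frame.
            yield df.iloc[start:start + chunk_size].copy()

    return generator


# Small illustrative frame; in practice this would come from a database or cloud storage.
df = pd.DataFrame({"id": range(5), "value": list("abcde")})
generator = make_chunk_generator(df, chunk_size=2)

# Assumption: the callable is passed to the connector like the plain function above.
# connector = GeneratorConnector(generator)
```

Each call to `generator()` starts a fresh pass over the data, so the callable can be consumed more than once.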

@jalr4ever
Collaborator

DataFrameConnector is now supported as of release 0.2.4. You can use it like this:

from pathlib import Path

from sdgx.data_connectors.csv_connector import CsvConnector
from sdgx.data_loader import DataLoader

path_obj = Path(file_path)

# Create a data connector and data loader for large CSV data.
# SDG loads the data in chunks, which reduces memory usage.
data_connector = CsvConnector(path=path_obj)
# For small data you can use DataFrameConnector instead:
# from sdgx.data_connectors.dataframe_connector import DataFrameConnector
# data_connector = DataFrameConnector(dataframe)
data_loader = DataLoader(data_connector)

Check this out for full code!
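
For the database case raised in the question, the DataFrame could be built with pandas first and then handed to the connector. A minimal sketch, assuming an in-memory SQLite table as a stand-in for a real database; the `users` table and its columns are made up for illustration, and the sdgx lines stay commented out because they only repeat the calls shown in this thread.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, age INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, 34), (2, 27), (3, 45)])
conn.commit()

# Read the query result straight into a DataFrame; no CSV file is involved.
dataframe = pd.read_sql_query("SELECT id, age FROM users ORDER BY id", conn)
conn.close()

# With sdgx 0.2.4 or later, the frame would then be wrapped as described in this thread:
# from sdgx.data_connectors.dataframe_connector import DataFrameConnector
# data_connector = DataFrameConnector(dataframe)
# data_loader = DataLoader(data_connector)
```

The same pattern should apply to any source pandas can read (Parquet files, cloud storage objects, other SQL databases), since the connector only sees the resulting DataFrame.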
