can the worksheet be converted into csv without reading row by row? #15

Closed
lbhtran opened this issue Apr 4, 2019 · 4 comments

lbhtran commented Apr 4, 2019

Hi, I work with xlsb files that have a large number of rows, so I wonder if there is a way to avoid reading the data row by row, as it's taking a long time.

Owner

willtrnr commented Apr 5, 2019

Given how the data is laid out in the xlsb file, reading row by row one way or another is the only option.

I didn't get much time to work on it, but I'm generally trying to improve performance. After all, reading huge files was the original motivation for writing this.
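
For readers who land here, a minimal sketch of streaming a worksheet to CSV row by row, so the whole sheet never has to sit in memory. It assumes the `open_workbook` / `get_sheet` / `rows()` calls shown in the pyxlsb README; the helper names `rows_to_csv` and `xlsb_sheet_to_csv` are hypothetical and not part of pyxlsb.

```python
import csv


def rows_to_csv(rows, csv_path):
    """Write an iterable of pyxlsb-style rows to csv_path; return the row count."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for row in rows:
            # Each cell exposes its value as `.v`; empty cells hold None,
            # which csv.writer writes as an empty field.
            writer.writerow([cell.v for cell in row])
            count += 1
    return count


def xlsb_sheet_to_csv(xlsb_path, csv_path, sheet=1):
    """Stream one worksheet to CSV. Sheet indexes in pyxlsb are 1-based."""
    # Imported lazily so rows_to_csv can be used without pyxlsb installed.
    from pyxlsb import open_workbook

    with open_workbook(xlsb_path) as wb:
        with wb.get_sheet(sheet) as ws:
            return rows_to_csv(ws.rows(), csv_path)
```

This still reads row by row, as described above; it only avoids building an intermediate list of all rows before writing.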

Author

lbhtran commented Apr 5, 2019

I got a workaround which involves using PowerShell to convert the xlsb to csv and then reading it into a pandas DataFrame. It's working for now, but it would be nice to improve performance when reading files with pyxlsb.

chfw commented Apr 5, 2019

pyexcel with pyexcel-xlsb, which uses pyxlsb, can help with the conversion to csv.

I suppose you would then use pandas to read the csv chunk by chunk.
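
A rough sketch of the chunked read, assuming the CSV has already been produced by one of the conversions mentioned in this thread. It relies on the `chunksize` parameter of `pandas.read_csv`; the function name `total_rows_in_chunks` and the default chunk size are made up for illustration.

```python
import pandas as pd


def total_rows_in_chunks(csv_path, chunksize=100_000):
    """Read csv_path in chunks and return the total number of data rows."""
    total = 0
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of loading the whole file at once.
    for chunk in pd.read_csv(csv_path, chunksize=chunksize):
        total += len(chunk)  # replace with the real per-chunk processing
    return total
```

Each chunk is an ordinary DataFrame, so filtering or aggregating per chunk keeps peak memory bounded by the chunk size rather than the file size.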

Owner

willtrnr commented Apr 6, 2019

I'll close this in favor of a general performance issue I've opened: #16.

willtrnr closed this as completed Apr 6, 2019