Passing in a URL to an excel file produces undesirable results #103

Open
andylolz opened this issue Dec 4, 2013 · 12 comments

@andylolz
Collaborator

andylolz commented Dec 4, 2013

e.g. this:
http://datapipes.okfnlabs.org/none?url=https://github.com/okfn/messytables/raw/master/horror/simple.xls

@davidmiller
Contributor

I would very much like to pass in this Excel sheet [1] as the URL and then drop the nonsense headers with a datapipes transform...

[1] https://indicators.ic.nhs.uk/download/GP%20Practice%20data/summaries/demography/Practice%20Addresses%20Final.xls

@rufuspollock
Member

@davidmiller issue is we need excel parsing in node and it doesn't seem to exist (maybe for xlsx) ...
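
Until a parser is wired in, one option is to recognise binary spreadsheet input from its leading bytes and fail early instead of treating it as text. A minimal sketch, assuming the first few bytes of the fetched body are available as a `Buffer`; `sniffContainer` is a hypothetical helper name and not part of datapipes:

```javascript
// Sketch only: sniff the container format from the first bytes of a body.
// `sniffContainer` is a hypothetical name, not an existing datapipes function.

// OLE2 compound file signature (legacy .xls, but also .doc and .ppt).
const OLE2_MAGIC = Buffer.from([0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1]);
// ZIP local file header signature (.xlsx is a ZIP archive, as are many other formats).
const ZIP_MAGIC = Buffer.from([0x50, 0x4b, 0x03, 0x04]);

function startsWith(buf, magic) {
  return buf.length >= magic.length && buf.subarray(0, magic.length).equals(magic);
}

function sniffContainer(head) {
  if (startsWith(head, OLE2_MAGIC)) return 'ole2'; // possibly .xls
  if (startsWith(head, ZIP_MAGIC)) return 'zip';   // possibly .xlsx
  return 'unknown';                                // e.g. CSV or other text
}
```

The signatures only identify the container, so a match means "possibly Excel" rather than "definitely Excel"; it is enough to avoid feeding obviously binary data to a CSV parser.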

@davidmiller
Contributor

Pass off to http://okfnlabs.org/dataconverters/ As-A-Service?

@rufuspollock
Member

@davidmiller sure but we need that deployed "as a service" :-) (easy to do but needs a small bit of work i imagine).

@SheetJSDev

> we need excel parsing in node and it doesn't seem to exist (maybe for xlsx) ...

@rgrp shameless plug: xlsjs on npm is an XLS parser (the javascript also works in-browser: http://oss.sheetjs.com/js-xls/ )

@rufuspollock
Member

@SheetJSDev that is awesome :-) We'd love to use this if that was ok :-)

@SheetJSDev

It's Apache 2.0 licensed and the source is on github ( https://github.com/SheetJS/js-xls ) so there really shouldn't be a problem.

@rufuspollock
Member

@SheetJSDev this is absolutely fantastic. Please say what kind of credit you'd like us to have on the site.

@rufuspollock
Member

@davidmiller would you be up for having a go at an incoming parser based on this?

@davidmiller
Contributor

Entirely possible.

What's the status of implementing all the transforms etc as fail-early streams - that was the major issue last time I was paying close attention?

@rufuspollock
Member

@davidmiller fail-early streams are ongoing in #110, but that wouldn't be a blocker for this (I mean, we can't stream an Excel file in the true sense anyway, since IIRC you need to read the whole file before you can use it).

@davidmiller
Contributor

Also - you're less likely to get 12GB excel files - so less of an issue here one suspects.

We can turn the parsed result into a stream for downstream consumption as a reasonable compromise.
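
A rough sketch of that compromise, for illustration only: buffer the whole file, parse it once, then hand the rows downstream one at a time. Everything named here is an assumption: `rowsFromSpreadsheet` is a hypothetical helper, `parseWorkbook` is a stand-in for a real parser (for example one built on js-xls) that is assumed to take a `Buffer` and return an array of rows, and the 50 MB default limit is arbitrary.

```javascript
// Sketch only: read everything, parse once, then yield rows incrementally.
// `parseWorkbook` is a hypothetical stand-in: (Buffer) => Array of rows.
async function* rowsFromSpreadsheet(source, parseWorkbook, maxBytes = 50 * 1024 * 1024) {
  const chunks = [];
  let size = 0;

  // `source` can be any async iterable of chunks, e.g. a Node readable stream.
  for await (const chunk of source) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    size += buf.length;
    // Fail early on oversized input rather than buffering without bound.
    if (size > maxBytes) {
      throw new Error('spreadsheet exceeds ' + maxBytes + ' bytes');
    }
    chunks.push(buf);
  }

  // The parser needs the whole file, so this step cannot be streamed.
  const rows = parseWorkbook(Buffer.concat(chunks));

  // Downstream consumers still receive one row at a time.
  for (const row of rows) {
    yield row;
  }
}
```

Because the result is an async iterable, it can be consumed with `for await`, or wrapped with `Readable.from(...)` from Node's `stream` module if later transforms expect an object-mode stream.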
