CSV Handling Issues #44
Comments
Thanks for laying out the problem. JSON in CSV was pioneered by @jlord @maxogden and is used as the output format from http://filebakery.com. I'd like to keep supporting it. It looks like it is under discussion by other CSV folks too: https://issues.apache.org/jira/browse/CSV-116 /cc @maxogden - do you have a geocsv parser/validator in pure js we could leverage? If this exists and we see any major differences to the Mapnik one I would be inclined to change Mapnik's behavior to conform.
Ahh I see. The best solution would be just to make a legit geocsv OGR driver that we could use consistently across mapnik-omnivore, node-gdal, mapnik, etc. It would also open the door for getting data out of this format easily with ogr2ogr. I'm not sure if I have the time to devote to this though... I'd imagine it would be a substantial undertaking.
For now I think using a normal node CSV parser will work fine. Here's a hackish way to deal with the JSON geometry if all mapnik-omnivore needs is the extent:
var gdal = require('gdal');

function getJSONExtent(str) {
  // example inputs:
  // "{\"type\":\"Point\",\"coordinates\":[30.0,10.0]}" // escaped "
  // '{"type":"Point","coordinates":[30.0,10.0]}' // single quotes no need for escaping "
  // "{""type"":""Point"",""coordinates"":[30.0,10.0]}" // filebakery.com style ""
  // Assumption: coordinate array will be the only array in a JSON geometry object
  // 1. Strip away all characters but array text
  // 2. Remove any brackets/whitespace and repeated commas (may be unnecessary if no empty subarrays)
  // 3. Parse as JSON
  str = str.substring(str.indexOf('['), str.lastIndexOf(']'));
  str = str.replace(/[\[\]\s]/g, '').replace(/,{2,}/g, ',');
  var arr = JSON.parse('[' + str + ']');
  var len = arr.length;
  // Heuristic: a flattened length divisible by 3 is treated as XYZ.
  // This is ambiguous when a 2D geometry flattens to a multiple of 6 values.
  var dim = len % 3 == 0 ? 3 : 2;
  var minX = Number.POSITIVE_INFINITY;
  var minY = Number.POSITIVE_INFINITY;
  var maxX = Number.NEGATIVE_INFINITY;
  var maxY = Number.NEGATIVE_INFINITY;
  for (var i = 0; i < len; i += dim) {
    var x = arr[i];
    var y = arr[i + 1];
    minX = Math.min(minX, x);
    minY = Math.min(minY, y);
    maxX = Math.max(maxX, x);
    maxY = Math.max(maxY, y);
  }
  return new gdal.Envelope({minX: minX, minY: minY, maxX: maxX, maxY: maxY});
}

function getWKTExtent(str) {
  return gdal.Geometry.fromWKT(str).getEnvelope();
}
Speed comparison
@springmeyer I don't have any geocsv specific parsers in JS, sorry! But that's not a bad idea...
Ah, okay! Seems like we should formalize a quick spec for geocsv that spells out it is valid csv:
And then the implementations would follow. What do you think?
@brandonreavis - I agree, a quick js CSV parser I think is a good workaround for now.
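As a rough illustration of what such a workaround has to handle, here is a minimal sketch of splitting one CSV record whose geometry column uses the filebakery.com-style doubled quotes, then parsing that column as JSON. The function name `splitCsvRecord` and the sample record are hypothetical, and the sketch assumes records with no embedded newlines; a real CSV library would cover more cases.

```javascript
// Minimal sketch: split one CSV record into fields, honoring quoting where a
// literal quote inside a quoted field is written as "".
// Assumption: the record contains no embedded newlines.
function splitCsvRecord(line, delimiter) {
  delimiter = delimiter || ',';
  var fields = [];
  var field = '';
  var inQuotes = false;
  for (var i = 0; i < line.length; i++) {
    var ch = line[i];
    if (inQuotes) {
      if (ch === '"') {
        if (line[i + 1] === '"') {
          field += '"'; // doubled quote is an escaped literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch; // delimiters inside quotes are plain text
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === delimiter) {
      fields.push(field);
      field = '';
    } else {
      field += ch;
    }
  }
  fields.push(field);
  return fields;
}

// Hypothetical record: a name column followed by a JSON geometry column.
var record = 'Somewhere,"{""type"":""Point"",""coordinates"":[30.0,10.0]}"';
var fields = splitCsvRecord(record);
var geometry = JSON.parse(fields[1]);
console.log(geometry.type, geometry.coordinates);
```

Once the field is unescaped this way, the JSON geometry needs no special treatment for the `""` style.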
Good idea. Another idea I had would be to use emscripten to generate a pure js geocsv parser from Mapnik's impl: which would be a bit round-about but a decent medium term solution for node.js apps wanting to stay light.
Alright, the quick and lightweight js geoCSV parser for finding basic information is over here: https://github.com/naturalatlas/geocsv-info I'll close this and we can bring up more specific issues at:
By removing mapnik (#43), we will need to figure out how to read and validate CSVs the same way mapnik does.
JSON geometry column
Does mapnik-omnivore need to support JSON geometry within a CSV like mapnik does? Do some CSVs actually do this?
GDAL doesn't really have a great way to go from JSON straight to a geometry object yet, so it would have to be a purely JS conversion which seems nasty and slow. The only "fast" alternative I've thought of is dumping the json column into a Buffer object, then opening a VSI memory file with the GeoJSON driver: naturalatlas/node-gdal#80.
I suppose if only the extent is needed from the spatial column, you could just flatten the coordinates array in each JSON object and then manually come up with the extent... but, again, this seems pretty slow for just getting an extent / center.
cc: @springmeyer
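For reference, the pure-JS extent idea could look roughly like the sketch below. It walks the nested `coordinates` array instead of flattening it, so position dimension (XY vs XYZ) never has to be guessed. The name `extentOfGeometry` and the plain-object return shape are assumptions for illustration, and it only handles geometries that carry a `coordinates` member (not a GeometryCollection).

```javascript
// Minimal sketch: compute the XY extent of a GeoJSON-style geometry string by
// recursively visiting its nested coordinates array.
// Assumption: the geometry has a "coordinates" member.
function extentOfGeometry(jsonText) {
  var geom = JSON.parse(jsonText);
  var extent = {
    minX: Infinity,
    minY: Infinity,
    maxX: -Infinity,
    maxY: -Infinity
  };

  function visit(coords) {
    if (typeof coords[0] === 'number') {
      // A single position: [x, y] or [x, y, z]; only x and y matter here.
      extent.minX = Math.min(extent.minX, coords[0]);
      extent.minY = Math.min(extent.minY, coords[1]);
      extent.maxX = Math.max(extent.maxX, coords[0]);
      extent.maxY = Math.max(extent.maxY, coords[1]);
      return;
    }
    // Otherwise this is an array of positions, rings, or polygons.
    for (var i = 0; i < coords.length; i++) {
      visit(coords[i]);
    }
  }

  visit(geom.coordinates);
  return extent;
}

// Hypothetical sample input.
var line = '{"type":"LineString","coordinates":[[30,10],[10,30],[40,40]]}';
var box = extentOfGeometry(line);
console.log(box);
```

This still pays for a full `JSON.parse` per row, so the speed concern raised above applies.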