Tracker: GIS Category Loaders / Flat Binary Array support #685
Comments
If I'm not mistaken, the current flow of the
So in order to speed this up, I see two options:
I wrote up a quick prototype of the latter, specifically taking predefined
Code
// Placeholder defaults (assumed values; the original snippet left these undefined)
const DEFAULT_COLOR = [0, 0, 0];
const DEFAULT_WIDTH = 1;

// used to filter specified features within each layer
const geometryIndices = {
  waterway: [0, 2, 4]
};

// used to override default styling properties
const properties = {
  waterway: {
    0: {
      color: [100, 150, 200],
      width: 1
    }
  }
};

function toPathLayer({tile, geometryIndices, properties}) {
  let startIndexCounter = 0;
  const startIndices = [0];
  const positions = [];
  const colors = [];
  const widths = [];
  for (const layerName in geometryIndices) {
    const layerIndices = geometryIndices[layerName];
    for (const index of layerIndices) {
      const feature = tile.layers[layerName].feature(index);
      const geometry = feature.loadGeometry();
      // Add geometry (only the first line of each feature is used)
      let nVertices = 0;
      for (const point of geometry[0]) {
        positions.push(point.x);
        positions.push(point.y);
        nVertices += 1; // count vertices, not individual coordinates
      }
      startIndexCounter += nVertices;
      startIndices.push(startIndexCounter);
      // Find properties for this feature
      let color = DEFAULT_COLOR;
      let width = DEFAULT_WIDTH;
      const overrides = properties[layerName] && properties[layerName][index];
      if (overrides) {
        if (overrides.color) color = overrides.color;
        if (overrides.width) width = overrides.width;
      }
      // Add properties to arrays, one entry per vertex
      for (let i = 0; i < nVertices; i++) {
        colors.push(...color);
        widths.push(width);
      }
    }
  }
  // data object ready to be passed to PathLayer
  return {
    length: startIndices.length - 1, // number of paths
    startIndices: startIndices,
    positionFormat: 'XY',
    attributes: {
      getPath: Float32Array.from(positions),
      getColor: Uint8ClampedArray.from(colors),
      getWidth: Float32Array.from(widths)
    }
  };
}
Remarks:
From visgl/deck.gl#3935 (comment)
Yes, I'm working under the assumption that a separate flat array would be created for each type of geometry. This would make it easier to pass each response type to a different deck.gl layer without any further processing.
While I agree that avoiding intermediate GeoJSON object creation is the desirable long-term setup, given the number of moving parts here, it could be worth not dealing with that issue initially.
I would start by creating a generic function that converts GeoJSON (already parsed/extracted) to binary arrays. If you then call that function in the worker loader, you will be able to pass back typed arrays instead of JavaScript data structures, avoiding
You can then test the resulting typed arrays with deck.gl and make sure everything works.
We can finally go back to each loader that returns GeoJSON-formatted data and see if there is a way to avoid the creation of intermediary GeoJSON inside the loader.
You can see where I am going with this: avoiding the intermediate GeoJSON requires custom work for each loader and is really the final touch when this all works end-to-end.
PS - love your expandable code section... Neat trick!
The deck.gl binary attribute support has been evolving for a while now. It is maturing but is still quite deck.gl specific (i.e. the keys are called
Given that loaders.gl is strongly advertised as being framework independent, I would make a function that returns a very general set of typed arrays and offer a function that converts to the most current deck.gl binary data object. Maybe similar to the point cloud example?
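One way to picture that split is a framework-neutral result plus a thin adapter. The sketch below is only an illustration: the generic `lines` object and the `toDeckPathData` helper are hypothetical names, and the output keys (`length`, `startIndices`, `attributes.getPath`) reflect my understanding of deck.gl's binary data format for `PathLayer`, so they should be checked against the deck.gl version in use.

```javascript
// Hypothetical framework-neutral output: flat coordinates plus path offsets.
// pathIndices holds vertex offsets and has one more entry than there are paths.
const lines = {
  positions: new Float32Array([0, 0, 1, 1, 2, 0, 5, 5, 6, 6]),
  pathIndices: new Uint32Array([0, 3, 5]),
  size: 2 // coordinates per vertex (assumed to be known by the caller)
};

// Hypothetical adapter: reshape the generic arrays into a deck.gl-style
// binary data object. The key names are an assumption, not a confirmed API.
function toDeckPathData({positions, pathIndices, size}) {
  return {
    length: pathIndices.length - 1, // number of paths
    startIndices: pathIndices, // where each path begins, in vertex units
    attributes: {
      getPath: {value: positions, size}
    }
  };
}

const deckData = toDeckPathData(lines);
console.log(deckData.length, Array.from(deckData.startIndices));
```

Keeping the adapter separate means the generic arrays never need to change when the deck.gl key names do.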
While this may not be the most optimized version for deck.gl, the most natural way to represent this in binary IMHO is with multiple arrays that contain the indices.
Points:
Lines:
Polygons:
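To make the index-array idea concrete, here is a small sketch of how the rings of one polygon could be read back out of such flat arrays. The array names and the choice of counting offsets in vertices (with 2D coordinates) are assumptions for illustration, not a settled format.

```javascript
// Assumed layout: 2D coordinates, offsets counted in vertices (not array slots).
const SIZE = 2;
const polygons = {
  // polygon 0: outer ring + hole (4 vertices each); polygon 1: a single ring
  positions: new Float32Array([
    0, 0, 10, 0, 10, 10, 0, 0,
    2, 2, 4, 2, 4, 4, 2, 2,
    20, 20, 30, 20, 30, 30, 20, 20
  ]),
  polygonIndices: new Uint32Array([0, 8, 12]), // polygon boundaries
  primitivePolygonIndices: new Uint32Array([0, 4, 8, 12]) // ring boundaries
};

// Return the rings of polygon `i` as views into the shared positions array.
function getRings(polygons, i) {
  const start = polygons.polygonIndices[i];
  const end = polygons.polygonIndices[i + 1];
  const ringOffsets = polygons.primitivePolygonIndices;
  const rings = [];
  for (let r = 0; r < ringOffsets.length - 1; r++) {
    // A ring belongs to the polygon if it lies entirely inside its vertex range
    if (ringOffsets[r] >= start && ringOffsets[r + 1] <= end) {
      rings.push(polygons.positions.subarray(ringOffsets[r] * SIZE, ringOffsets[r + 1] * SIZE));
    }
  }
  return rings;
}

console.log(getRings(polygons, 0).length, getRings(polygons, 1).length);
```

Because `subarray` returns views, no coordinates are copied when a polygon is looked up.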
Ok, here's a stab at it. Essentially filling data arrays while iterating over an array of
Code
// Note: Array.prototype.push returns the new array length, so every index array
// below holds offsets into the flat positions array.
function featuresToArrays({ features }) {
  const points = {
    positions: [],
    objectIds: []
  };
  const lines = {
    pathIndices: [0],
    positions: [],
    objectIds: []
  };
  const polygons = {
    polygonIndices: [0],
    primitivePolygonIndices: [0],
    positions: [],
    objectIds: []
  };
  let featureCounter = 0;
  for (const feature of features) {
    const geometry = feature.geometry;
    if (geometry.type === "Point") {
      points.objectIds.push(featureCounter);
      points.positions.push(...geometry.coordinates);
    } else if (geometry.type === "MultiPoint") {
      points.objectIds.push(featureCounter);
      points.positions.push(...geometry.coordinates.flat());
    } else if (geometry.type === "LineString") {
      lines.objectIds.push(featureCounter);
      const index = lines.positions.push(...geometry.coordinates.flat());
      lines.pathIndices.push(index);
    } else if (geometry.type === "MultiLineString") {
      lines.objectIds.push(featureCounter);
      for (const line of geometry.coordinates) {
        const index = lines.positions.push(...line.flat());
        lines.pathIndices.push(index);
      }
    } else if (geometry.type === "Polygon") {
      polygons.objectIds.push(featureCounter);
      for (const linearRing of geometry.coordinates) {
        const index = polygons.positions.push(...linearRing.flat());
        polygons.primitivePolygonIndices.push(index);
      }
      // A polygon ends after its last ring, so record the offset once all rings are written
      polygons.polygonIndices.push(polygons.positions.length);
    } else if (geometry.type === "MultiPolygon") {
      polygons.objectIds.push(featureCounter);
      for (const polygon of geometry.coordinates) {
        for (const linearRing of polygon) {
          const index = polygons.positions.push(...linearRing.flat());
          polygons.primitivePolygonIndices.push(index);
        }
        // Same as above: close each polygon after its last ring
        polygons.polygonIndices.push(polygons.positions.length);
      }
    }
    featureCounter++;
  }
  return {
    points: {
      positions: Float32Array.from(points.positions),
      objectIds: Uint32Array.from(points.objectIds)
    },
    lines: {
      pathIndices: Uint32Array.from(lines.pathIndices),
      positions: Float32Array.from(lines.positions),
      objectIds: Uint32Array.from(lines.objectIds)
    },
    polygons: {
      polygonIndices: Uint32Array.from(polygons.polygonIndices),
      primitivePolygonIndices: Uint32Array.from(
        polygons.primitivePolygonIndices
      ),
      positions: Float32Array.from(polygons.positions),
      objectIds: Uint32Array.from(polygons.objectIds)
    }
  };
}
Remarks:
Yes, this looks awesome, just along the lines I was thinking! (micro-nit: not sure I ever saw a better use case for a
If you look at the loaders.gl module structure, you will see that some loader categories have their own module with helper classes (
Setting up a new module is straightforward but requires a bit of scaffolding. Let me know if you are interested in doing this or, if not, whether you are interested in contributing the code above if we scaffold this for you. You could also start by adding this as an extra experimental (underscore) export to the mvt loader module.
For 2D/3D coordinates I think this should be an input parameter to your conversion routine. At maturity, you probably want to provide a sniffer/detector function that scans the geojson first
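A sniffer along those lines could be as small as the sketch below. `detectCoordinateSize` is a hypothetical helper name; it only looks at `geometry.coordinates`, so `GeometryCollection` features (which use `geometries` instead) are ignored here.

```javascript
// Hypothetical helper: scan GeoJSON features and report 3 if any position
// carries a third coordinate, otherwise 2.
function detectCoordinateSize(features) {
  let size = 2;
  // Walk arbitrarily nested coordinate arrays down to individual positions
  const visit = (coords) => {
    if (typeof coords[0] === 'number') {
      if (coords.length > 2) size = 3;
      return;
    }
    for (const child of coords) visit(child);
  };
  for (const feature of features) {
    if (feature.geometry && feature.geometry.coordinates) {
      visit(feature.geometry.coordinates);
    }
  }
  return size;
}

const flatFeatures = [{geometry: {type: 'Point', coordinates: [1, 2]}}];
const mixedFeatures = [
  {geometry: {type: 'Point', coordinates: [1, 2]}},
  {geometry: {type: 'LineString', coordinates: [[0, 0], [1, 1, 50]]}}
];
console.log(detectCoordinateSize(flatFeatures), detectCoordinateSize(mixedFeatures));
```

The result can then be passed to the conversion routine as the explicit 2D/3D input parameter.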
About
I can try to scaffold; I'll see how it goes.
I wasn't sure if it would be faster to iterate through
Right now I don't touch properties at all, because it seems like typed arrays for properties would only be useful for specific deck.gl classes. Since properties could be arbitrary types, I'm not sure the best way to handle them.
Ok, this makes sense. The value of
Great, don't overwork it, just put something up, I'll land your PR asap and make any necessary cleanup.
Always hard to predict performance for sure, but my guess is that 2-pass is on average as fast or faster once you factor in that you can essentially avoid any temporary object creation, increased memory pressure etc., and overall usability/robustness will be better. Also the conversion logic will be simpler when you don't have to worry about suddenly finding an unexpected 3D coordinate etc., especially once you start adding more features such as props.
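For illustration, a two-pass version restricted to `LineString` features might look like the sketch below. The function name `lineStringsToBinary` and the output shape are hypothetical; offsets are counted in vertices and the coordinate size is assumed to be known up front.

```javascript
// Two-pass sketch for LineString features only (other geometry types omitted).
function lineStringsToBinary(features, size = 2) {
  // Pass 1: count paths and vertices so the typed arrays can be allocated once
  let pathCount = 0;
  let vertexCount = 0;
  for (const {geometry} of features) {
    if (geometry.type === 'LineString') {
      pathCount++;
      vertexCount += geometry.coordinates.length;
    }
  }
  const positions = new Float32Array(vertexCount * size);
  const pathIndices = new Uint32Array(pathCount + 1); // first entry stays 0

  // Pass 2: write coordinates directly into the preallocated arrays
  let vertex = 0;
  let path = 0;
  for (const {geometry} of features) {
    if (geometry.type !== 'LineString') continue;
    for (const coord of geometry.coordinates) {
      for (let k = 0; k < size; k++) {
        positions[vertex * size + k] = coord[k] || 0; // a missing z falls back to 0
      }
      vertex++;
    }
    pathIndices[++path] = vertex;
  }
  return {positions, pathIndices};
}

const sample = [
  {geometry: {type: 'LineString', coordinates: [[0, 0], [1, 1], [2, 2]]}},
  {geometry: {type: 'Point', coordinates: [5, 5]}},
  {geometry: {type: 'LineString', coordinates: [[3, 3], [4, 4]]}}
];
const binary = lineStringsToBinary(sample);
console.log(Array.from(binary.pathIndices));
```

No intermediate JavaScript arrays are built, and an unexpected 3D coordinate cannot change the layout halfway through.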
Yes, unless we add significant complexity it probably only makes sense to support props that are always numbers (think price, elevation, ...). It seems to me that numbers could be supported fairly easily: if you have a sniffer function, you could scan for property types which are always numbers and offer these. Like the objectIds it may make most sense to make this match the length of the positions array so that they can easily be used in shaders. You could have filter options for props to avoid creating such arrays if not needed.
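As a rough sketch of that idea, the two hypothetical helpers below first detect properties that are numeric on every feature, then expand one of them to a per-vertex array. The names `findNumericProps` and `expandPropPerVertex`, and the `vertexCounts` input, are assumptions made for this example.

```javascript
// Hypothetical sniffer: property names whose value is a finite number on every feature.
function findNumericProps(features) {
  let candidates = null;
  for (const feature of features) {
    const props = feature.properties || {};
    const numeric = Object.keys(props).filter((key) => Number.isFinite(props[key]));
    candidates =
      candidates === null
        ? new Set(numeric)
        : new Set(numeric.filter((key) => candidates.has(key)));
  }
  return candidates ? Array.from(candidates) : [];
}

// Expand one numeric property to one value per vertex so it lines up with positions.
// vertexCounts[i] is the number of vertices contributed by features[i].
function expandPropPerVertex(features, name, vertexCounts) {
  const total = vertexCounts.reduce((sum, n) => sum + n, 0);
  const out = new Float32Array(total);
  let offset = 0;
  features.forEach((feature, i) => {
    out.fill(feature.properties[name], offset, offset + vertexCounts[i]);
    offset += vertexCounts[i];
  });
  return out;
}

const roads = [
  {properties: {price: 10, name: 'a', elevation: 1}},
  {properties: {price: 20, name: 'b'}}
];
const numericProps = findNumericProps(roads);
const prices = expandPropPerVertex(roads, 'price', [2, 3]);
console.log(numericProps, Array.from(prices));
```

A filter option could simply intersect `numericProps` with a user-supplied list before any arrays are allocated.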
Yes you want
The candidates for this index are the index in the original geojson array or the index into the "filtered" point/path/polygon-only array. You could create both; if not, I vote for the original geojson index as you suggest, since it is most likely to be meaningful to the end user.
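Creating both indices is cheap if it is done in the same loop. The sketch below handles `LineString` features only and writes one id per vertex; `buildLineFeatureIds`, `globalFeatureIds` and `featureIds` are illustrative names chosen for this example.

```javascript
// For each line vertex, record both the index in the original GeoJSON array
// (globalFeatureIds) and the index within the lines-only subset (featureIds).
function buildLineFeatureIds(features) {
  const globalFeatureIds = [];
  const featureIds = [];
  let localIndex = 0;
  features.forEach((feature, globalIndex) => {
    if (feature.geometry.type !== 'LineString') return;
    for (let v = 0; v < feature.geometry.coordinates.length; v++) {
      globalFeatureIds.push(globalIndex);
      featureIds.push(localIndex);
    }
    localIndex++;
  });
  return {
    globalFeatureIds: Uint32Array.from(globalFeatureIds),
    featureIds: Uint32Array.from(featureIds)
  };
}

const mixed = [
  {geometry: {type: 'Point', coordinates: [0, 0]}},
  {geometry: {type: 'LineString', coordinates: [[0, 0], [1, 1]]}},
  {geometry: {type: 'Point', coordinates: [2, 2]}},
  {geometry: {type: 'LineString', coordinates: [[3, 3], [4, 4], [5, 5]]}}
];
const ids = buildLineFeatureIds(mixed);
console.log(Array.from(ids.globalFeatureIds), Array.from(ids.featureIds));
```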
It might be a few days before I get to it, so I wanted to keep track of the ideas from #703
It looks like the
Now if parsing and JSON -> binary arrays can happen on a worker thread, it would be nice to add (optional) support for workers to the
Also, it might be worth it to add
Yes. There are a few reasons:
Just ideating here, there are probably quite a few more wrinkles, so please keep the discussion going...
Good points. Since JSON is arbitrary without any data schema, while GeoJSON has a well-defined schema, I think you're right to suggest that
On the binary format itself, I've been reading more about Arrow, and it's quite exciting. While the GeoJSON to binary code is still new, should there be discussion about the binary format stability? If deck.gl and loaders.gl support Arrow more directly in the future, it might be worthwhile to at some point have a "GeoJSON to Arrow" utility, where maybe each geometry type is its own Arrow Table. Is it of interest to support conversion to two separate binary formats? If not, is it ok to start with the current binary arrays and move to Arrow when it's more supported?
Also, I think it's worthwhile to start thinking about how these binary arrays could be integrated directly into deck.gl. E.g. as directly supported by the
Yes. I am certainly suggesting two separate loaders though they could go into the same module. But since jsonloader is generic, it seems unfair that every JSONLoader user should need to import geojson specific "bloatware", so probably two modules :)
Yes. That is the plan! I believe that we can simply create a generic converter from maps of
There is a
If you mean the maps of
Yes, loading geojson into three arrow tables makes sense, just like we are currently loading them into three binary object maps (also note that some data sources, while technically geojson, only contain one type of geojson primitive, meaning only one table is used; we may be able to offer some options to simplify such usage).
Yes. Note that even though we generate binary attributes, those cannot always be used directly by the GPU. They still need to be tessellated, at least for polygons (I believe some normalization may also be needed for paths). The good news is that a lot of the binary groundwork has already been done in deck, so I recommend we progress a bit more here, then we can start dealing with that.
Yes, simple JSON loading shouldn't have to deal with this bloat... Also that means we can workerize only the GeoJSONLoader.
Interesting. I think I overlooked the potential for keeping so much of the existing code and generating Arrow columns from the typed array output. So my question about binary format stability was mainly based on thinking that much of the code would need to be rewritten for Arrow, and then whether both output formats were desired. The one other question is whether, when converting to Arrow, it's desired to include string properties, so that all properties can be included in the Arrow Table. I haven't looked too deeply into how strings are encoded; are they always encoded as a Dictionary type?
True. Also a
Ah, I didn't realize the tessellation would be computed on CPU and not GPU. From the
So what's the next step here? Create a GeoJSONLoader and add a binary conversion option to that and the
No. There is a
Yes, GeoJSONLoader is a good start. For orthogonality, we should probably support this feature (binary output and worker loaders) for all GIS category loaders (includes at least
Mostly implemented in 3.0
The loaders in the GIS category, specifically
MVTLoader
could benefit from being able to parse data into flat binary arrays.