-
Notifications
You must be signed in to change notification settings - Fork 944
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New @turf/clusters module (getCluster, clusterEach, clusterReduce) #847
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this module would be mostly used after turf-cluster-kmeans or another geometrical clustering algorithm, so it's kind of confusing on its own, because it essentially is a filter module? Seems like something that's mostly to be solved in the documentation, like ideally this would show how it interacts with the modules that assign clusters, rather than creating geojson with cluster attributes from scratch, since I think using it as a step after kmeans would be more common than what the docs currently show.
packages/turf-clusters/index.js
Outdated
if (!geojson) throw new Error('geojson is required'); | ||
if (filter === undefined || filter === null) throw new Error('filter is required'); | ||
if (geojson.type === 'FeatureCollection') geojson = geojson.features; | ||
if (!Array.isArray(geojson)) throw new Error('invalid geojson'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like an appropriate place to use turf/meta to handle any sort of data. This currently wouldn't support valid GeoJSON constructs like bare feature objects, and would support invalid geojson constructs, like an array of features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@turf/invariant
right? To test for a valid FeatureCollection.
would support invalid geojson constructs, like an array of features.
Agreed.. 👍 That's my "cheeky" style, i'll drop it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like an appropriate place to use turf/meta to handle any sort of data.
Oh, you mean featureEach
in @turf/meta
? To add support for Feature
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, noticed that the docs say it's a FeatureCollection specifically - so, yep! turf/invariant would be useful to enforce that requirement. turf/meta would be useful if the docs promised that the input was any geojson object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally I had it as FeatureCollection|Feature[]
(FeatureCollection|Array<Feature>
), now that we've dropped the Array of features (since it's not a valid GeoJSON), there isn't much complex validation to do which would require using @turf/invariant
. Using geojsonType
is already being handled here:
if (geojson.type !== 'FeatureCollection') throw new Error('geojson must be a FeatureCollection');
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did however add featureEach
from @turf/meta
to handle the iteration of the FeatureCollection.
Agreed, this module will be used after Note:
Agreed! 😄 Kinda wanted to call it
Agreed, once we have |
Based on @tmcw's review #847 (review)
Based on @tmcw's review #847 (comment)
@tmcw 👍 Thanks for the review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chapeau @DenisCarriere! Great implementation 👍 😃
**Parameters** | ||
|
||
- `geojson` **[FeatureCollection](http://geojson.org/geojson-spec.html#feature-collection-objects)** GeoJSON Features | ||
- `filter` **Any** Filter used on GeoJSON properties to get Cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I think about it, only in k-means
the number of clusters is known before applying the module (which often is considered its drawbacks), so you can easily "query" for a specific cluster id
for example; on the other side if you apply dbscan
or other algorithms you don't know a priori how many clusters will be identified.
It would be useful, I guess, to have a method that returns the total number of clusters in the FeatureCollection
(basically how many groups of points have the same property with different values); it could be this one when passed null
or 'all'
as filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type of questions would be better answered using clusterEach
or clusterReduce
.
Calculating how many total clusters you could use clusterReduce
and previousValue++
which would give you the total, or if you want to figure all of the values which created a bin you could also use clusterReduce
.
I'll add some more examples to
clusterReduce
&clusterEach
]); | ||
|
||
// Create a cluster using K-Means (adds `cluster` to GeoJSON properties) | ||
var clustered = turf.clustersKmeans(geojson); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although, as @tmcw said, their primary use is definitely on actual clustered points, I mean as output of a clustering module, these functions could be useful to identify any group of points with a common property, even among points where not all have said property (like identify all points with a certain marker-symbol
).
I think it could be beneficial to the user to see examples that are not necessarily related to clustered (in the above mentioned meaning) points, in order to suggest other uses and applications for this module's functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, feel free to change some of these examples or add a 2nd example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ Added
var geojson = turf.featureCollection([
turf.point([0, 0], {'marker-symbol': 'circle'}),
turf.point([2, 4], {'marker-symbol': 'star'}),
turf.point([3, 6], {'marker-symbol': 'star'}),
turf.point([5, 1], {'marker-symbol': 'square'}),
turf.point([4, 2], {'marker-symbol': 'circle'})
]);
// Create a cluster using K-Means (adds `cluster` to GeoJSON properties)
var clustered = turf.clustersKmeans(geojson);
// Retrieve first cluster (0)
var cluster = turf.getCluster(clustered, {cluster: 0});
//= cluster
// Retrieve cluster based on custom properties
turf.getCluster(clustered, {'marker-symbol': 'circle'}).length;
//= 2
turf.getCluster(clustered, {'marker-symbol': 'square'}).length;
//= 1
packages/turf-clusters/package.json
Outdated
"author": "Turf Authors", | ||
"contributors": [ | ||
"Denis Carriere <@DenisCarriere>", | ||
"Stefano Borghi <@stebogit>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't really do anything in here, it's all your own work 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got preserved from the last module, I can drop you ;) lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ Dropped you 😄
packages/turf-clusters/index.js
Outdated
getCluster: getCluster, | ||
clusterEach: clusterEach, | ||
clusterReduce: clusterReduce, | ||
createBins: createBins, // Only exposed for testing purposes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DenisCarriere out of curiosity, how do you exclude these functions from the official release (if that's the plan) if they are exported here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to comment them out, or... we leave them there and no document their usage.
At the moment all of these utils
functions have tests cases for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Come to think about it, these are ONLY going to be exposed in the @turf/clusters
module, however they will not be exposed to @turf/turf
(which is the main repo that most users will be using).
The only people who will actually know that these methods are exposed will be the ones that dive into the source code or require('@turf/clusters
)` and see which methods the module exposes.
Since there's test cases for each of these functions already I'm going to keep them exposed, I don't see any benefit "commented" them out for the sake of it.
- Add more `@example` - Update Typescript - Dropped @stebogit from contributors :) - Add notes about exposed methods
Incredible work! 🎉 |
Thanks @morganherlocker! |
New
@turf/clusters
moduleRef: #845
Methods
getCluster
clusterEach
clusterReduce
To-Do
@turf/clusters-kmeans
for@example
(improve documentation)@turf/helpers
&@turf/meta
for GeoJSON compatibilityJSDocs -
getCluster
JSDocs -
clusterEach
JSDocs -
clusterReduce
Benchmark Results