-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RANSAC Outlier Detection #51
Conversation
e337fa6
to
7359715
Compare
1f26ada
to
befd2ae
Compare
which removes outliers before fitting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! looks great
albatross/models/gp.h
Outdated
|
||
template <typename FeatureType> | ||
inline GaussianProcessFit<FeatureType> | ||
fit_from_covariance(const std::vector<FeatureType> &features, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this a constructor of GaussianProcessFit<T>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yeah, good idea.
albatross/models/ransac_gp.h
Outdated
} | ||
|
||
protected: | ||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah definitely. sometimes comments are helpful not for what they say, but what they don't say.
} | ||
|
||
double threshold; | ||
std::size_t min_inliers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add these to a RansacConfig struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah. I'll likely end up throwing out this entire class :) but a config struct would probably be useful in that case as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it's not clear to me where to put the config struct. Assuming the config looked like this ...
struct RansacConfig {
double inlier_threshold;
std::size_t min_inliers;
std::size_t min_features;
std::size_t max_iterations;
};
Do you think it should
- Live in
outlier.h
(so at the low level interface to ransac). In this case I'd either want to change the function signatures on all the ransac functions to look like:
ransac(const RegressionDataset<FeatureType> &dataset,
const FoldIndexer &fold_indexer, RegressionModel<FeatureType> *model,
EvaluationMetric<PredictType> &metric, RansacConfig &config)
or maintain multiple function signatures one that takes a config one that takes raw arguments but that both do the same thing.
-
live in
ransac_gp.h
In which case we'd have the same decision about whether there should be a single constructor ofRansacGaussianProcess
or multiple that all end up doing the same thing. -
live inside
RansacGaussianProcess
which would mostly just add as a conceptual aid which would tell people looking at the code "these class members deal with configuration of the ransac algorithm not the gaussian process".
Personally I'd vote for 3 mostly because it's the easiest :) but also because I think I prefer flat arguments,
ransac(fitter, outlier_metric, model_metric, inlier_threshold, min_inliers, min_features, max_iterations);
over nested arguments
RansacConfig ransac_config(inlier_threshold, min_inliers, min_features, max_iterations);
ransac(fitter, outlier_metric, model_metric, ransac_config);
but I'm open to any of those three options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decided offline to just leave any of the RansacConfig
logic to anything that wants to use the ransac
function. Ie, that logic will not live in albatross
.
albatross/outlier.h
Outdated
const GroupIndexer &indexer) { | ||
|
||
Indexer output; | ||
for (const auto i : subset_indices) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
&i
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch
std::default_random_engine gen; | ||
|
||
Indexer reference; | ||
double best_metric = HUGE_VAL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use an optional here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't included optional
in albatross
yet. There have been a few use cases ... but I'm not sure enough that it seems worth the added dependency.
ffd5386
to
58df994
Compare
Adds a method
ransac
which performs random sample consensus outlier detection.The core implementation takes functions which provide a way of fitting a model, identifying inliers and evaluating a consensus set. These functions are then used to run though the process of creating a model using a small random subset of data, finding any other data points that agree with that model and repeating keeping track of the set of "inliers" that perform best.