Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RANSAC Outlier Detection #51

Merged
merged 2 commits into from
Oct 4, 2018
Merged

RANSAC Outlier Detection #51

merged 2 commits into from
Oct 4, 2018

Conversation

akleeman
Copy link
Collaborator

@akleeman akleeman commented Aug 24, 2018

Adds a method ransac which performs random sample consensus outlier detection.

The core implementation takes functions which provide a way of fitting a model, identifying inliers and evaluating a consensus set. These functions are then used to run though the process of creating a model using a small random subset of data, finding any other data points that agree with that model and repeating keeping track of the set of "inliers" that perform best.

@akleeman akleeman changed the title RANSAC Outlier Detection WIP RANSAC Outlier Detection Aug 24, 2018
@akleeman akleeman force-pushed the ransac branch 2 times, most recently from e337fa6 to 7359715 Compare August 25, 2018 15:08
@akleeman akleeman force-pushed the ransac branch 8 times, most recently from 1f26ada to befd2ae Compare September 19, 2018 18:28
@akleeman akleeman changed the title WIP RANSAC Outlier Detection RANSAC Outlier Detection Sep 19, 2018
Copy link
Contributor

@benjamin0 benjamin0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! looks great


template <typename FeatureType>
inline GaussianProcessFit<FeatureType>
fit_from_covariance(const std::vector<FeatureType> &features,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this a constructor of GaussianProcessFit<T>?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah, good idea.

}

protected:
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah definitely. sometimes comments are helpful not for what they say, but what they don't say.

}

double threshold;
std::size_t min_inliers;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add these to a RansacConfig struct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. I'll likely end up throwing out this entire class :) but a config struct would probably be useful in that case as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it's not clear to me where to put the config struct. Assuming the config looked like this ...

struct RansacConfig {
  double inlier_threshold;
  std::size_t min_inliers;
  std::size_t min_features;
  std::size_t max_iterations;
};

Do you think it should

  1. Live in outlier.h (so at the low level interface to ransac). In this case I'd either want to change the function signatures on all the ransac functions to look like:
ransac(const RegressionDataset<FeatureType> &dataset,
       const FoldIndexer &fold_indexer, RegressionModel<FeatureType> *model,
       EvaluationMetric<PredictType> &metric, RansacConfig &config)

or maintain multiple function signatures one that takes a config one that takes raw arguments but that both do the same thing.

  1. live in ransac_gp.h In which case we'd have the same decision about whether there should be a single constructor of RansacGaussianProcess or multiple that all end up doing the same thing.

  2. live inside RansacGaussianProcess which would mostly just add as a conceptual aid which would tell people looking at the code "these class members deal with configuration of the ransac algorithm not the gaussian process".

Personally I'd vote for 3 mostly because it's the easiest :) but also because I think I prefer flat arguments,

ransac(fitter, outlier_metric, model_metric, inlier_threshold, min_inliers, min_features, max_iterations);

over nested arguments

RansacConfig ransac_config(inlier_threshold, min_inliers, min_features, max_iterations);
ransac(fitter, outlier_metric, model_metric, ransac_config);

but I'm open to any of those three options.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decided offline to just leave any of the RansacConfig logic to anything that wants to use the ransac function. Ie, that logic will not live in albatross.

const GroupIndexer &indexer) {

Indexer output;
for (const auto i : subset_indices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&i?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch

std::default_random_engine gen;

Indexer reference;
double best_metric = HUGE_VAL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use an optional here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't included optional in albatross yet. There have been a few use cases ... but I'm not sure enough that it seems worth the added dependency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants