You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In RCF3, we plan to drop the DynamicScoringRandomCutForest class. This class added API methods to the base RandomCutForest class that allowed users to customize the anomaly scoring algorithms by directly passing lambdas to the method. The intention of these methods was to encourage experimentation with the anomaly score algorithm, but I believe these methods are too specialized and not used frequently enough to appear in the API of RandomCutForest. Users have always had the capability to create specialized scoring routines through the traverseForest method, but the signature on this method may be a little daunting to new users. I propose to split the difference and create a new API method that takes a VisitorFactory<Double> or Supplier<Visitor<Double>> and averages the results of visitors across trees.
Current method in DynamicScoringRandomCutForest and example usage:
public double getDynamicScore(double[] point, int ignoreLeafMassThreshold, BiFunction<Double, Double, Double> seen,
BiFunction<Double, Double, Double> unseen, BiFunction<Double, Double, Double> damp);
public double getDynamicSimulatedScore(double[] point, BiFunction<Double, Double, Double> seen,
BiFunction<Double, Double, Double> unseen, BiFunction<Double, Double, Double> damp,
Function<IBoundingBoxView, double[]> vecSep);
double result = forest.getDynamicScore(point, 0, (x, y) -> 1.0 * (x + Math.log(y)), (x, y) -> 1.0 * x, (x, y) -> 1.0);
Proposed methods and example usage:
public double getAverageScalarResult(double[] point, VisitorFactory<Double> visitorFactory);
public double getAverageScalarResult(double[] point, Supplier<Visitor<Double>> supplier);
double result = forest.getAverageScalarResult(point, (tree, y) -> new DynamicScoreVisitor(
tree.projectToTree(y), tree.getMass(), forest.getIgnoreLeafMassThreshold(), (x, y) -> 1.0 * (x + Math.log(y)), (x, y) -> 1.0 * x, (x, y) -> 1.0);
One issue, seen in the example, is that with the current implementation we have access to forest configuration fields like ignoreLeafMassThreshold. In the proposed implementation, this has to be passed into the closure, which fields a little awkward. We may also want to rethink VisitorFactory interface.
The text was updated successfully, but these errors were encountered:
In RCF3, we plan to drop the DynamicScoringRandomCutForest class. This class added API methods to the base RandomCutForest class that allowed users to customize the anomaly scoring algorithms by directly passing lambdas to the method. The intention of these methods was to encourage experimentation with the anomaly score algorithm, but I believe these methods are too specialized and not used frequently enough to appear in the API of RandomCutForest. Users have always had the capability to create specialized scoring routines through the
traverseForest
method, but the signature on this method may be a little daunting to new users. I propose to split the difference and create a new API method that takes aVisitorFactory<Double>
orSupplier<Visitor<Double>>
and averages the results of visitors across trees.Current method in DynamicScoringRandomCutForest and example usage:
Proposed methods and example usage:
One issue, seen in the example, is that with the current implementation we have access to forest configuration fields like
ignoreLeafMassThreshold
. In the proposed implementation, this has to be passed into the closure, which fields a little awkward. We may also want to rethinkVisitorFactory
interface.The text was updated successfully, but these errors were encountered: