Evaluation procedure

Andrey Ustyuzhanin edited this page Jul 20, 2015 · 1 revision

The evaluation procedure in this challenge has several parts (for details, see Flavours of Physics: evaluation):

  • the classifier must not distinguish between simulated data and real data collected at the LHC (agreement)
  • the classifier's predictions must not correlate with the reconstructed mass of the particle (correlation)
  • the quality is measured by the weighted ROC AUC

Agreement and correlation can be checked on the additional data sets (check_agreement.csv and check_correlation.csv). The file evaluation.py provides metric functions for all evaluation stages.

evaluation.py

This module contains functions to compute the correlation metric (Cramer-von Mises test, or CvM), the agreement metric (Kolmogorov-Smirnov test, or KS) and the weighted ROC AUC:

  • compute_ks: computes the KS metric
def compute_ks(data_prediction, mc_prediction, weights_data, weights_mc)

where data_prediction and mc_prediction are the vectors of the classifier's output for real data (from the LHC) and simulated data (Monte Carlo), respectively, and weights_data and weights_mc are the vectors of their weights.
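To illustrate what a KS-type agreement metric measures, here is a minimal pure-Python sketch of a weighted Kolmogorov-Smirnov distance: the largest gap between the weighted empirical distribution functions of the two prediction samples. The name weighted_ks_distance is hypothetical, and this is not the code from evaluation.py; it only assumes the same four inputs as compute_ks.

```python
def weighted_ks_distance(data_prediction, mc_prediction, weights_data, weights_mc):
    """Illustrative sketch (not evaluation.py): maximum gap between the
    weighted empirical CDFs of two samples of classifier outputs."""
    if len(data_prediction) != len(weights_data):
        raise ValueError("data predictions and weights differ in length")
    if len(mc_prediction) != len(weights_mc):
        raise ValueError("MC predictions and weights differ in length")

    total_data = float(sum(weights_data))
    total_mc = float(sum(weights_mc))

    # Tag each event with its sample (0 = real data, 1 = simulation) and
    # its normalized weight, then walk through all events in score order.
    events = [(p, 0, w / total_data) for p, w in zip(data_prediction, weights_data)]
    events += [(p, 1, w / total_mc) for p, w in zip(mc_prediction, weights_mc)]
    events.sort(key=lambda e: e[0])

    cdf = [0.0, 0.0]
    max_gap = 0.0
    i = 0
    while i < len(events):
        score = events[i][0]
        # Consume every event sharing this score before comparing the CDFs,
        # so that tied scores do not create an artificial gap.
        while i < len(events) and events[i][0] == score:
            _, sample, weight = events[i]
            cdf[sample] += weight
            i += 1
        max_gap = max(max_gap, abs(cdf[0] - cdf[1]))
    return max_gap
```

In this sketch, a value near 0 means the classifier output is distributed alike on real and simulated events, while a value near 1 means the two samples are almost fully separated.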

  • compute_cvm: computes the CvM metric
def compute_cvm(predictions, masses, n_neighbours=200, step=50)

where predictions is a vector of probabilities (the classifier's output) and masses is a vector of the corresponding masses. Leave the other parameters at their default values (these are the values used for the check on the Kaggle server).
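The idea behind a CvM-type correlation check can be sketched as follows, assuming a windowed Cramér-von Mises comparison: events are ordered by mass, and in each window of n_neighbours consecutive events (taken every step events) the distribution of prediction ranks is compared with the distribution over all events. The names cvm_flatness and _window_cvm are hypothetical, and the details of evaluation.py may differ.

```python
def _window_cvm(window_ranks, n_total):
    """Mean squared gap between the CDF of ranks inside one mass window
    and the uniform CDF of ranks over all events (hypothetical helper)."""
    counts = [0] * n_total
    for r in window_ranks:
        counts[r] += 1
    running = 0
    total = 0.0
    for k in range(n_total):
        running += counts[k]
        local_cdf = running / float(len(window_ranks))
        global_cdf = (k + 1) / float(n_total)
        total += (global_cdf - local_cdf) ** 2
    return total / n_total


def cvm_flatness(predictions, masses, n_neighbours=200, step=50):
    """Illustrative sketch (not evaluation.py): average windowed CvM
    statistic; small values mean predictions do not depend on mass."""
    if len(predictions) != len(masses):
        raise ValueError("predictions and masses differ in length")
    n = len(predictions)
    if n < n_neighbours:
        raise ValueError("need at least n_neighbours events")

    # Replace each prediction by its rank (0 .. n-1) among all events.
    rank = [0] * n
    for r, i in enumerate(sorted(range(n), key=lambda i: predictions[i])):
        rank[i] = r

    # Reorder the ranks by increasing mass.
    ranks_by_mass = [rank[i] for i in sorted(range(n), key=lambda i: masses[i])]

    # Slide a window over the mass-ordered events and average the scores.
    scores = []
    for start in range(0, n - n_neighbours + 1, step):
        window = ranks_by_mass[start:start + n_neighbours]
        scores.append(_window_cvm(window, n))
    return sum(scores) / len(scores)
```

In this sketch, predictions that grow with mass give a noticeably larger value than predictions spread evenly across the mass range.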

  • roc_auc_truncated: computes the weighted ROC AUC (the final quality metric)
def roc_auc_truncated(labels, predictions, tpr_thresholds=(0.2, 0.4, 0.6, 0.8),
                      roc_weights=(4, 3, 2, 1, 0)):

where labels is the vector of true labels and predictions is the vector of the classifier's output. Use the default values for the other parameters: they define the TPR bins (tpr_thresholds) and their weights (roc_weights).

Note that the weights (4, 3, 2, 1, 0) correspond to the competition weights: they are the weights of the unnormalized weighted ROC AUC, and the function normalizes the AUC so that an ideal classifier scores 1.
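The binning and normalization can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the code from evaluation.py: the names roc_points, trapezoid_area and truncated_auc_sketch are hypothetical, labels are assumed to be 0 or 1, and the curve is clipped at its vertices only. With the default bins, the normalization constant is 0.2 × (4 + 3 + 2 + 1 + 0) = 2.

```python
def roc_points(labels, predictions):
    """ROC curve vertices (fpr, tpr), sweeping the threshold downwards."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    order = sorted(range(len(labels)), key=lambda i: -predictions[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(order):
        score = predictions[order[i]]
        # Events with equal scores enter the curve together.
        while i < len(order) and predictions[order[i]] == score:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / float(n_neg))
        tpr.append(tp / float(n_pos))
    return fpr, tpr


def trapezoid_area(x, y):
    """Area under a piecewise-linear curve by the trapezoid rule."""
    return sum((x[k] - x[k - 1]) * (y[k] + y[k - 1]) / 2.0
               for k in range(1, len(x)))


def truncated_auc_sketch(labels, predictions,
                         tpr_thresholds=(0.2, 0.4, 0.6, 0.8),
                         roc_weights=(4, 3, 2, 1, 0)):
    """Illustrative sketch: weighted sum of ROC area per TPR bin,
    normalized so that an ideal classifier scores 1."""
    if len(tpr_thresholds) + 1 != len(roc_weights):
        raise ValueError("need one weight per TPR bin")
    fpr, tpr = roc_points(labels, predictions)
    edges = [0.0] + list(tpr_thresholds) + [1.0]
    area = 0.0
    norm = 0.0
    for weight, lo, hi in zip(roc_weights, edges[:-1], edges[1:]):
        # Area of the curve that falls inside the TPR band [lo, hi].
        upper = trapezoid_area(fpr, [min(t, hi) for t in tpr])
        lower = trapezoid_area(fpr, [min(t, lo) for t in tpr])
        area += weight * (upper - lower)
        # An ideal classifier fills the whole band, contributing hi - lo.
        norm += weight * (hi - lo)
    return area / norm
```

Because the last weight is 0, the part of the ROC curve above a TPR of 0.8 does not contribute to the score in this sketch.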