Kernel-based Online Anomaly Detection
Online Sequential Diagnosis of Network Anomalies
Student: Tarem Ahmed, Ph.D. Student
Supervisor: Prof. Mark Coates
Description: High-speed backbones are regularly affected by network anomalies generated by a wide range of sources, from malicious denial-of-service attacks and viruses to harmless large data transfers and accidental equipment failures. Different types of anomalies affect the network in different ways, and it is difficult to know a priori how a potential anomaly will manifest itself in traffic statistics.
The goal of this project is to develop sequential learning algorithms for anomaly detection that are suitable for online use with multivariate data. Most prior work in anomaly detection has used block-based methods such as Principal Component Analysis (PCA). These methods are suitable only for offline applications, because an alert may not be raised until up to hours after the anomaly occurs.
As an alternative, we propose an online, recursive algorithm that detects anomalies in multivariate network-wide data within minutes. Our Kernel-based Online Anomaly Detection (KOAD) algorithm assumes no prior model for normal or anomalous network traffic. Instead, it sequentially builds up a dictionary of features that approximately spans the subspace of normal network behaviour. The dictionary is dynamic: it adapts to variations in the structure of normal traffic itself. The algorithm raises an alarm immediately upon encountering a deviation from the norm.
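The dictionary idea can be illustrated with a minimal Python sketch. It is not the code in KOAD.m, and every name in it (OnlineKernelDetector, nu_add, nu_alarm, sigma) is hypothetical. It assumes a Gaussian kernel and measures how poorly a new sample is approximated by the span of the dictionary in feature space: a small error is treated as normal, a moderate error adds the sample to the dictionary, and a large error raises an alarm. The two thresholds are illustrative values, and this simplified version only grows the dictionary; it never removes elements.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; sigma is an illustrative bandwidth."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

class OnlineKernelDetector:
    """Hypothetical, simplified dictionary-based detector (an assumption,
    not the published KOAD implementation)."""

    def __init__(self, nu_add=0.1, nu_alarm=0.5, sigma=1.0):
        self.nu_add = nu_add      # assumed threshold for growing the dictionary
        self.nu_alarm = nu_alarm  # assumed threshold for raising an alarm
        self.sigma = sigma
        self.dictionary = []

    def projection_error(self, x):
        """Squared feature-space distance from x to the dictionary's span."""
        D, s = self.dictionary, self.sigma
        K = [[gaussian_kernel(u, v, s) for v in D] for u in D]
        k = [gaussian_kernel(u, x, s) for u in D]
        a = solve(K, k)  # best linear combination of dictionary elements
        return gaussian_kernel(x, x, s) - sum(ki * ai for ki, ai in zip(k, a))

    def update(self, x):
        """Process one sample and return 'added', 'normal' or 'alarm'."""
        if not self.dictionary:  # the first sample seeds the dictionary
            self.dictionary.append(list(x))
            return "added"
        delta = self.projection_error(x)
        if delta > self.nu_alarm:
            return "alarm"
        if delta > self.nu_add:
            self.dictionary.append(list(x))
            return "added"
        return "normal"
```

With the default parameters, feeding the samples [0.0, 0.0], [0.1, 0.0], [0.5, 0.5] and [5.0, 5.0] to update() in that order seeds the dictionary, classifies the second sample as normal, adds the third to the dictionary, and raises an alarm on the fourth. Each sample is handled as it arrives, which is the property that separates this style of detector from block-based methods.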
The data and a zip file containing the Matlab source code for our algorithms are available here: Abilene.mat & KOADcode.zip. The code package includes the following files: "KOAD.m", which is to be used with the "kernel.m" function; "PCA.m"; and "OCNM.m", which is to be used with the "M1.m" sparsity function. Another zip file containing the code used to generate the figures can be found here: Figs.zip. To obtain the figures, we run KOAD, PCA and OCNM on Abilene.mat to produce the following files, which store the experimental results: "Fig1.mat"; "Fig2.mat"; "Fig3a.mat"; "Fig3b.mat". The Matlab scripts that subsequently plot the figures from the relevant results files are: "Fig1.m"; "Fig2.m"; "Fig3a.m"; "Fig3b.m".