Kolmogorov-Arnold Networks (KANs) are proposed as alternatives to MLPs, with learnable activation functions on edges instead of fixed ones on nodes. The original paper reports that KANs can outperform MLPs in both accuracy and interpretability. Because the learned edge functions can be visualized directly, KANs are visually intuitive and lend themselves to human interaction. They can serve as valuable collaborators in discovering mathematical and physical laws, suggesting promising improvements over MLP-based deep learning models.
The primary dataset is a heart disease dataset, assembled from multiple sources into a single collection to support research on CAD-related machine learning and data mining algorithms and, ultimately, to help advance clinical diagnosis and early treatment.
Additionally, I've added implementations using the IRIS dataset and the MNIST dataset for image classification.
(Note: MNIST images are resized to 8x8 for ease of computation.)
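As a minimal sketch of how the tabular and image experiments might be set up, scikit-learn ships both Iris and a digits dataset whose images are already 8x8, matching the resized setting described above (this is an illustrative substitute, not necessarily the exact loading code used in the repo):

```python
from sklearn.datasets import load_iris, load_digits

# Iris: 150 samples, 4 tabular features, 3 classes
iris = load_iris()
print(iris.data.shape)      # (150, 4)

# Digits: handwritten-digit images already provided at 8x8 resolution,
# analogous to MNIST resized to 8x8 for cheaper computation
digits = load_digits()
print(digits.images.shape)  # (1797, 8, 8)
```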
While MLPs have fixed activation functions on their nodes, KANs have learnable activation functions on their edges. Instead of a scalar weight on each connection, every edge carries a learnable univariate function, typically parameterized as a spline, that determines how the signal is transformed along that connection.
As a result, KANs have no linear weight matrices at all. The nodes in a KAN simply sum the incoming edge outputs without applying any further non-linearity. This difference can lead to significant improvements in performance and makes the network more interpretable.
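The edge-versus-node distinction can be made concrete with a toy sketch. Here a single KAN "edge" is modeled as a learnable piecewise-linear function on a fixed knot grid (a simplified stand-in for a spline; real KAN implementations use B-splines), and a node just sums its incoming edges. All names below are hypothetical:

```python
import numpy as np

grid = np.linspace(-1.0, 1.0, 8)      # fixed knot positions
coeffs = np.linspace(0.0, 1.0, 8)     # learnable values at the knots
                                      # (initialized here; training would adjust them)

def edge_activation(x):
    # Evaluate the piecewise-linear function defined by (grid, coeffs) at x.
    # This plays the role of one learnable edge function in a KAN.
    return np.interp(x, grid, coeffs)

# A KAN node simply sums incoming edge outputs -- no extra non-linearity:
x = np.array([0.3, -0.5])             # two inputs feeding the node
node_output = edge_activation(x[0]) + edge_activation(x[1])
# With this linear initialization, node_output == 0.65 + 0.25 == 0.9
```

In an MLP the same connection would contribute only `w * x` with a scalar `w`; here the whole shape of the transformation along the edge is learnable.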
Kolmogorov-Arnold Networks (KANs) work by learning both the structure of a problem and the functions within it.
- Structure Learning (External Degrees of Freedom): KANs, like MLPs, learn how different input features relate to each other and contribute to the output. They do this through layers of nodes connected by edges.
- Univariate Function Optimization (Internal Degrees of Freedom): Each edge in a KAN holds a learnable activation function, typically a spline. Splines are flexible, piecewise polynomial functions that can closely match complex univariate functions. During training, KANs adjust these spline activation functions to best fit the target function.
- Combining Strengths of Splines and MLPs: Splines excel in accuracy for low-dimensional functions and local adaptability but struggle with high-dimensional problems due to the curse of dimensionality. MLPs, on the other hand, are better suited to high-dimensional problems but are poor at optimizing individual univariate functions. KANs aim to get the best of both: the MLP-like outer structure handles high-dimensional composition, while the spline-based edges accurately fit each univariate piece.
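The points above can be sketched as a single KAN-style layer: every edge (i, j) carries its own univariate function (the internal degrees of freedom), and each output node sums its incoming edges (the external, MLP-like structure). This is a hypothetical illustration using piecewise-linear edge functions in place of B-splines, not the pykan API:

```python
import numpy as np

class KANLayerSketch:
    """Toy KAN layer: one learnable univariate function per edge."""

    def __init__(self, in_dim, out_dim, n_knots=8, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(-1.0, 1.0, n_knots)
        # One coefficient vector per edge: shape (out_dim, in_dim, n_knots).
        # Training would optimize these internal degrees of freedom.
        self.coeffs = rng.normal(size=(out_dim, in_dim, n_knots))

    def forward(self, x):
        # x: shape (in_dim,). Each output node sums the outputs of its
        # incoming edge functions -- no weight matrix, no node activation.
        out = np.zeros(self.coeffs.shape[0])
        for j in range(self.coeffs.shape[0]):       # output nodes
            for i in range(x.shape[0]):             # incoming edges
                out[j] += np.interp(x[i], self.grid, self.coeffs[j, i])
        return out

layer = KANLayerSketch(in_dim=4, out_dim=3)
y = layer.forward(np.array([0.1, -0.2, 0.5, 0.9]))
print(y.shape)  # (3,)
```

Stacking such layers gives the composition structure that lets KANs scale beyond the low-dimensional regime where a single spline would suffice.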