Code for NeurIPS 2023 paper - FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning
This work studies Class-Incremental Learning (CIL) in both normal supervised settings using sufficient training samples (we call it Many-Shot CIL - MSCIL) as well as in few-shot settings using only 5 samples per class for all new classes added after the first step (Few-Shot CIL - FSCIL). We also study the CIL settings using pretrained ViTs.
Exemplar-free class-incremental learning (CIL) poses several challenges since it prohibits the rehearsal of data from previous tasks and thus suffers from catastrophic forgetting. Recent approaches to incrementally learning the classifier by freezing the feature extractor after the first task have gained much attention. In this paper, we explore prototypical networks for CIL, which generate new class prototypes using the frozen feature extractor and classify the features based on the Euclidean distance to the prototypes. In an analysis of the feature distributions of classes, we show that classification based on Euclidean metrics is successful for jointly trained features. However, when learning from non-stationary data, we observe that the Euclidean metric is suboptimal and that feature distributions are heterogeneous. To address this challenge, we revisit the anisotropic Mahalanobis distance for CIL. In addition, we empirically show that modeling the feature covariance relations is better than previous attempts at sampling features from normal distributions and training a linear classifier. Unlike existing methods, our approach generalizes to both many- and few-shot CIL settings, as well as to domain-incremental settings. Interestingly, without updating the backbone network, our method obtains state-of-the-art results on several standard continual learning benchmarks.
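To make the contrast between the Euclidean and Mahalanobis distances concrete, here is a minimal NumPy sketch on toy data. It is not the repository's implementation: the function names, the two toy classes, and the query point are all assumptions chosen for illustration.

```python
import numpy as np

def euclidean_sq(x, mean):
    """Squared Euclidean distance from feature x to a class prototype."""
    diff = x - mean
    return float(diff @ diff)

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance using the class covariance matrix."""
    diff = x - mean
    # Solve cov @ z = diff rather than forming the explicit inverse.
    return float(diff @ np.linalg.solve(cov, diff))

# Two hypothetical classes: class 0 is stretched along the first axis,
# class 1 is tight and isotropic.
means = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]
covs = [
    np.array([[9.0, 0.0], [0.0, 0.1]]),
    np.array([[0.1, 0.0], [0.0, 0.1]]),
]

x = np.array([2.5, 0.0])  # query feature

pred_euclidean = int(np.argmin([euclidean_sq(x, m) for m in means]))
pred_mahalanobis = int(
    np.argmin([mahalanobis_sq(x, m, c) for m, c in zip(means, covs)])
)
```

In this toy case the query is closer to the second prototype in Euclidean terms, but once each class's spread is taken into account it is assigned to the first class.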
FeCAM Implementation using pre-trained models is now available in PILOT.
Refer to for the method and fecam.json for setting the configurations.
FeCAM Implementation is also available in Avalanche
Refer to for the FeCAM classifier code and for the utils with additional settings to explore using the FeCAM classifier with a memory buffer, as well as the oracle setting (an upper bound obtained by computing the mean and covariance matrix from all old data seen so far).
Note - If you are using the FeCAM classifier with a pre-trained ViT, make sure not to use Tukey's transformation (see supplementary materials for more details).
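For reference, Tukey's Ladder of Powers raises each feature to a power beta, with the logarithm as the limiting case beta = 0. The sketch below assumes NumPy and nonnegative features (for example, post-ReLU activations). The function name `tukey_transform`, the `eps` guard, and the sample values are illustrative and not taken from this repository. A fractional power is undefined for negative real inputs, so this sketch rejects them.

```python
import numpy as np

def tukey_transform(features, beta=0.5, eps=1e-12):
    """Apply Tukey's Ladder of Powers element-wise to nonnegative features."""
    features = np.asarray(features, dtype=np.float64)
    if np.any(features < 0):
        raise ValueError("features must be nonnegative")
    if beta == 0:
        # Limiting case of the ladder: the logarithm (eps avoids log(0)).
        return np.log(features + eps)
    return np.power(features, beta)

# Hypothetical feature values for illustration.
feats = np.array([0.0, 0.25, 1.0, 4.0])
transformed = tukey_transform(feats, beta=0.5)
```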
The framework for many-shot CIL setting is taken from PyCIL.
We performed experiments for CIFAR100, ImageNet100, and TinyImageNet. When training on CIFAR100, this framework will automatically download it. When training on ImageNet100 or TinyImageNet, you should specify the folder of your dataset in utils/
def download_data(self):
    train_dir = '[DATA-PATH]/train/'
    test_dir = '[DATA-PATH]/val/'
To download ImageNet-Subset dataset: Link
Edit the exps/[Model name].json to change the experiment settings.
Run the following command for FeCAM
python --config=exps/FeCAM_{dataset}.json
- memory-size: The total number of exemplars stored during incremental learning. FeCAM does not need to store exemplars.
- init-cls: The number of classes in the first incremental stage.
- increment: The number of classes in each incremental stage.
- convnet-type: The backbone network for the incremental model. We use for all the experiments.
- seed: The random seed adopted for shuffling the class order. According to the benchmark setting of PyCIL, it is set to 1993 by default.
- beta: The degree of feature transformation using Tukey’s Ladder of Powers Transformation.
- alpha1, alpha2: The hyperparameters for covariance shrinkage.
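As a rough illustration of why covariance shrinkage matters in the few-shot setting, the NumPy sketch below estimates a covariance matrix from 5 samples in a 16-dimensional feature space. Such a matrix is rank-deficient and cannot be inverted, so the sketch then regularizes it. The shrinkage form is an assumption for illustration: it adds alpha1 times the mean diagonal value to the diagonal, and alpha2 times the mean off-diagonal value to the off-diagonal entries. Consult the repository code for the exact formulation. The function name `shrink_cov`, the default values, and the random features are likewise hypothetical.

```python
import numpy as np

def shrink_cov(cov, alpha1=1.0, alpha2=1.0):
    """Regularize a covariance matrix (assumed shrinkage form, see lead-in)."""
    d = cov.shape[0]
    eye = np.eye(d)
    diag_mean = np.mean(np.diag(cov))
    # Mean over the d * (d - 1) off-diagonal entries.
    off_diag_mean = np.sum(cov * (1 - eye)) / (d * (d - 1))
    return cov + alpha1 * diag_mean * eye + alpha2 * off_diag_mean * (1 - eye)

rng = np.random.default_rng(1993)
few_shot_feats = rng.normal(size=(5, 16))   # 5 samples, 16-dim features
cov = np.cov(few_shot_feats, rowvar=False)  # 16 x 16, rank at most 4

rank_before = np.linalg.matrix_rank(cov)
shrunk = shrink_cov(cov)
rank_after = np.linalg.matrix_rank(shrunk)
```

With fewer samples than feature dimensions the raw estimate is singular. After shrinkage the matrix becomes full rank, so a Mahalanobis distance can be computed.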
Other algorithm-specific hyperparameters can be modified in the corresponding json files. There are options to use NCM Classifier instead of FeCAM.
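As a sketch of how the init-cls and increment settings interact, the snippet below parses a hypothetical configuration and derives how many classes have been seen after each incremental stage. The key names mirror the option names listed in this README and the values are placeholders. The actual files in exps/ may use different keys and values, and `classes_per_stage` is an illustrative helper, not a function from this codebase.

```python
import json

# Hypothetical settings; not copied from any file in exps/.
config_text = """
{
  "memory-size": 0,
  "init-cls": 50,
  "increment": 10,
  "seed": 1993
}
"""

def classes_per_stage(init_cls, increment, total_classes):
    """Return the cumulative number of classes seen after each stage."""
    if init_cls <= 0 or increment <= 0:
        raise ValueError("init_cls and increment must be positive")
    seen = [min(init_cls, total_classes)]
    while seen[-1] < total_classes:
        seen.append(min(seen[-1] + increment, total_classes))
    return seen

config = json.loads(config_text)
schedule = classes_per_stage(
    config["init-cls"], config["increment"], total_classes=100
)
```

With 100 classes, an initial stage of 50 classes, and increments of 10, this yields six stages in total.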
The trained weights for the first task used in our experiments can be found here
Download the ImageNet-R and CoRe50 datasets.
- Run the following command:
python FeCAM_vit_{dataset}.py
- The hyperparameters can be modified in the corresponding python files. To try the NCM classifier, run the following command:
python NCM_vit_{dataset}.py
FeCAM can be used in combination with different few-shot learning approaches. In our paper, we use FeCAM with two recent works, ALICE and FACT.
The code can be used as a plug-in in different codebases by adding the two components: the FeCAM classifier from models/ and a utils function which performs the transformations and computes the covariance matrices like in utils/
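The plug-in idea can be sketched as a small class that keeps one mean and one covariance matrix per class and is updated task by task on features from a frozen backbone. This is a hypothetical NumPy outline, not the classifier shipped in this repository. The class name, the method names, and the simple ridge term (used here in place of the paper's shrinkage and feature transformations) are all assumptions.

```python
import numpy as np

class PrototypeCovClassifier:
    """Hypothetical plug-in storing a mean and covariance per class."""

    def __init__(self, ridge=1e-3):
        self.ridge = ridge  # placeholder regularizer, not the paper's shrinkage
        self.means = {}
        self.inv_covs = {}

    def add_classes(self, features, labels):
        """Register statistics for new classes; old classes are untouched."""
        for c in np.unique(labels):
            x = features[labels == c]
            cov = np.cov(x, rowvar=False) + self.ridge * np.eye(x.shape[1])
            self.means[int(c)] = x.mean(axis=0)
            self.inv_covs[int(c)] = np.linalg.inv(cov)

    def predict(self, features):
        """Assign each feature to the class with the smallest Mahalanobis distance."""
        classes = sorted(self.means)
        dists = []
        for c in classes:
            diff = features - self.means[c]
            dists.append(np.einsum("nd,de,ne->n", diff, self.inv_covs[c], diff))
        return np.array(classes)[np.argmin(np.stack(dists, axis=1), axis=1)]

# Toy features standing in for frozen-backbone outputs (assumed data).
rng = np.random.default_rng(0)
task1_x = np.concatenate(
    [rng.normal(0.0, 1.0, (50, 4)), rng.normal(8.0, 1.0, (50, 4))]
)
task1_y = np.repeat([0, 1], 50)
task2_x = rng.normal(-8.0, 1.0, (50, 4))
task2_y = np.full(50, 2)

clf = PrototypeCovClassifier()
clf.add_classes(task1_x, task1_y)  # first task
clf.add_classes(task2_x, task2_y)  # later task, no access to old data
preds = clf.predict(np.array([[0.0] * 4, [8.0] * 4, [-8.0] * 4]))
```

Because only per-class statistics are stored, adding a new task never requires revisiting data from earlier tasks.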