Transfer-Learning-CNN-Tensorflow

This repository contains a TensorFlow-based implementation of transfer learning with EfficientNetV2 on a Food Vision dataset (10 classes). The primary focus is fine-tuning EfficientNetV2B0 for image classification, using a custom callback to stop training when overfitting persists and an exponential learning-rate schedule with epoch warm-up.

Dataset

The dataset is a subset of the Food101 dataset: instead of the full 101 classes, only 10 food classes are used. Each image belongs to one of these categories, and the data is split into training and testing sets (organized as sketched below).

  • Training Set: Contains labeled images for model training.
  • Validation Set: A separate portion of the data for evaluating model performance during training.
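
Because labels are inferred from the directory names (see Preprocessing below), the data is assumed to be organized with one sub-folder per class; the class names shown here are illustrative examples from the Food101 subset:

10_food_classes/
  train/
    chicken_curry/
    hamburger/
    ...
  test/
    chicken_curry/
    hamburger/
    ...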

Preprocessing

Images are loaded with TensorFlow's image_dataset_from_directory, which automatically resizes them to the model's expected input shape (224x224) and batches them for efficient processing.

train_data = tf.keras.preprocessing.image_dataset_from_directory(
    directory=_TRAINING_SET,
    image_size=_IMG_SIZE,
    label_mode="categorical",
    labels="inferred",
    batch_size=_BATCH_SIZE,
)

test_data = tf.keras.preprocessing.image_dataset_from_directory(
    directory=_VALIDATION_SET,
    image_size=_IMG_SIZE,
    label_mode="categorical",
    labels="inferred",
    batch_size=_BATCH_SIZE,
)
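
The snippet above assumes a few module-level constants. A minimal sketch of what they might look like (the paths and batch size are illustrative placeholders; the image size and class count follow from the model and dataset described above):

_TRAINING_SET = "data/10_food_classes/train"   # hypothetical path to the training split
_VALIDATION_SET = "data/10_food_classes/test"  # hypothetical path to the test split
_IMG_SIZE = (224, 224)                         # EfficientNetV2B0 input resolution
_BATCH_SIZE = 32                               # assumed batch size
_OUTPUT_SIZE = 10                              # number of food classes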

Training

Transfer Learning With EfficientNetV2

The EfficientNetV2B0 model is used as the base model for transfer learning. The pre-trained ImageNet weights are used, with the top classification layers removed. A global average pooling layer and a dense softmax layer (with L1 kernel regularization) are added to predict the 10 food categories. The base model is initially frozen, and only the newly added layers are trained.

def train_model_0() -> tf.keras.Model:
    # Pre-trained EfficientNetV2B0 backbone (ImageNet weights, no top), kept frozen.
    EfficientNetV2B0_model = tf.keras.applications.efficientnet_v2.EfficientNetV2B0(
        include_top=False
    )
    EfficientNetV2B0_model.trainable = False

    input_layer = tf.keras.layers.Input(
        shape=(224, 224, 3),
        name="Input_Layer"
    )
    x = EfficientNetV2B0_model(input_layer)
    x = tf.keras.layers.GlobalAveragePooling2D(
        name="GAP2D_1"
    )(x)

    # New classification head: one dense softmax layer over the 10 food classes.
    output_layer = tf.keras.layers.Dense(
        units=_OUTPUT_SIZE,
        activation="softmax",
        name="Output_Layer",
        kernel_regularizer=tf.keras.regularizers.L1()
    )(x)

    model_0 = tf.keras.Model(input_layer, output_layer)

    return model_0
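
A minimal sketch of how this model might be compiled before training; the Adam optimizer and initial learning rate are assumptions, while categorical cross-entropy matches the label_mode="categorical" used when loading the data:

model_0 = train_model_0()
model_0.compile(
    loss="categorical_crossentropy",                          # one-hot labels from label_mode="categorical"
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),   # assumed optimizer and initial LR
    metrics=["accuracy"],
)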

Overfit Callback

The custom callback TrainingCheckPoint monitors training for signs of overfitting. At the end of each epoch it compares validation loss with training loss and stops training once their ratio stays above a threshold for more than a defined number of consecutive epochs (the patience).

from typing import Optional
from tensorflow.keras.callbacks import Callback

class TrainingCheckPoint(Callback):
  def __init__(self, threshold: Optional[float] = 1.0, patience: Optional[int] = 5):
    super(TrainingCheckPoint, self).__init__()
    self.threshold = threshold     # maximum tolerated val_loss / loss ratio
    self.patience = patience       # consecutive overfitting epochs before stopping
    self.overfit_patience = 0      # running count of consecutive overfitting epochs

  def on_epoch_end(self, epoch, logs=None):
    overfit_ratio = logs["val_loss"] / logs["loss"]

    if self.threshold >= overfit_ratio:
      # Validation loss is still in line with training loss: reset the counter.
      self.overfit_patience = 0
      # self.model.save(f"model_{epoch}_{logs['val_accuracy']}.h5", overwrite=False)
      print(f"\ncurrent loss: {logs['loss']}\ncurrent validation loss: {logs['val_loss']}\n Epoch {epoch} was saved with {logs['val_accuracy']} accuracy")
    else:
      self.overfit_patience += 1
      print(f"Current overfitting epoch count {self.overfit_patience}")
      if self.overfit_patience >= self.patience:
        self.model.stop_training = True

Exponential Learning Rate with Epoch Warm-up

The learning rate is scheduled using exponential decay with an initial warm-up phase, allowing a gradual increase in the learning rate for the first few epochs and a slow decay afterward to stabilize training.

import math

class ExpLRScheduler(Callback):
  def __init__(self, k: Optional[float] = 0.1):
    super(ExpLRScheduler, self).__init__()
    self.k = k  # warm-up / decay rate

  def schedule_lr(self, epoch, lr):
    # Learning-rate warm-up over the first few epochs
    if epoch <= 8:
        return lr * math.exp((self.k * 0.125) * epoch)
    # Exponential decay governed by k afterwards
    else:
        return lr * math.exp(-self.k * (epoch / 512))

  def on_epoch_end(self, epoch, logs=None):
    updated_lr = self.schedule_lr(epoch, self.model.optimizer.lr.numpy())
    self.model.optimizer.lr.assign(updated_lr)
    print(f"*** Updated Learning Rate: {updated_lr} for epoch: {epoch + 1}")

Evaluation

Full Results

Training metrics were tracked with Weights & Biases; the exported charts (9/30/2024) cover:

  • Validation Accuracy
  • Validation Loss
  • Training Loss
  • Learning Rate
  • Hyperparameter Importance