3 changes: 3 additions & 0 deletions article-face_verification/.gitignore
@@ -0,0 +1,3 @@
venv
__pycache__
resnet_triplet.pth
50 changes: 50 additions & 0 deletions article-face_verification/README.md
@@ -0,0 +1,50 @@
# Face Verification from Scratch using ResNet-18

Welcome to an article dedicated to Face Verification from Scratch using ResNet-18.
Buckle up, prepare your coffee, and get ready to dive into one of the most exciting areas of modern computer vision: teaching machines to decide whether two faces belong to the same person.

## Article Summary

The proposed article focuses on building a face verification pipeline using the Labeled Faces in the Wild (LFW) dataset and ResNet-18 as the backbone for feature extraction. By combining triplet sampling with triplet loss, we train the network to generate discriminative embeddings, enabling robust comparison of facial identities.

## Getting Started

Follow the steps below to see how the face verification pipeline works on the LFW dataset!


1. **Create a New Project**: Create a new, empty Python project and navigate to the
project directory.

```sh
cd face_verification_project
```

2. **Prepare Your Environment**: Before you begin, make sure you have a virtual
environment set up for your project. If not, create and activate one.

For Linux/Mac:
```bash
python -m venv venv
source venv/bin/activate
```

For Windows:

```bash
python -m venv venv
venv\Scripts\activate
```

3. **Copy the source code**: Inside the project directory, add the provided source
file (`face_verification.py`) and the requirements file (`requirements.txt`).

4. **Install Requirements**: Inside your virtual environment, install the required packages from the `requirements.txt` file:

```sh
pip install -r requirements.txt
```

5. **Run the source code**: To run the provided face verification code, use the command
below. The first run can take some time, as the LFW dataset is downloaded.

```sh
python face_verification.py
```
219 changes: 219 additions & 0 deletions article-face_verification/article.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,219 @@
<p align="center">
<img src="images/introduction_image.png" alt="Fast_Learning" style="width:900px;"/>
<br><em>
</p>
<h1>Face Verification from Scratch</h1>
<h2>Face Verification is technique in the computer vision branch used most of the times in biometric authentication that uses unique facial features to confirm an individual's identity by comparing a live image of a face to a stored template or document<h2>

In this article, we build a face verification system from scratch, using:<br>
* LFW dataset (Labeled Faces in the Wild)
* Triplet sampling (anchor, positive, negative)
* ResNet-18 backbone for embeddings
* Triplet Loss for metric learning
* Evaluation with ROC, AUC, and EER

# Difference between Face Verification and Face Recognition

Face verification is a 1:1 comparison that authenticates a person's identity by matching their live face against a pre-stored image of themselves, typically with their consent for a specific transaction. Face recognition is a broader 1:many comparison that identifies an individual by searching for a match within a large database of faces.

<p align="center">
<img src="images/FVerificationVSFRecognition.jpg" alt="Fast_Learning" style="width:900px;"/>
<br><em>
Source:https://learnopencv.com/face-recognition-an-introduction-for-beginners/
</p>

## Setting up the environment

```python
import argparse, random
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
import torchvision.models as models
from sklearn.datasets import fetch_lfw_people
from sklearn.metrics import roc_curve, auc
```
## LFW dataset

The **LFW dataset** (Labeled Faces in the Wild) is an open-source dataset of face photographs, each labeled with the name of the person pictured. We use this dataset to sample triplets of same and different identities for training. **1,680 people** in the set have two or more distinct photos. According to the Financial Times, LFW was the **most widely** used facial recognition benchmark in the world.

```python
class LFWTriplet(Dataset):
    def __init__(self, resize=0.5, color=True):
        data = fetch_lfw_people(resize=resize, color=color, funneled=True)
        self.images = data.images.astype(np.float32)
        self.targets = data.target
        self.names = data.target_names

        if not color:
            # shape (N, H, W) -> (N, 1, H, W), pixel values scaled to [0, 1]
            self.images = self.images[:, None, :, :] / 255.0
        else:
            # shape (N, H, W, 3) -> (N, 3, H, W), pixel values scaled to [0, 1]
            self.images = np.transpose(self.images, (0, 3, 1, 2)) / 255.0

        # map each identity to the indices of its images
        self.class_to_indices = {}
        for idx, label in enumerate(self.targets):
            self.class_to_indices.setdefault(label, []).append(idx)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        anchor_img = torch.from_numpy(self.images[idx])
        anchor_label = self.targets[idx]

        # choose a positive: same identity, a different image when one exists
        candidates = [i for i in self.class_to_indices[anchor_label] if i != idx]
        pos_idx = random.choice(candidates) if candidates else idx
        positive_img = torch.from_numpy(self.images[pos_idx])

        # choose a negative: any image of a different identity
        neg_label = random.choice([l for l in self.class_to_indices if l != anchor_label])
        neg_idx = random.choice(self.class_to_indices[neg_label])
        negative_img = torch.from_numpy(self.images[neg_idx])

        return anchor_img, positive_img, negative_img
```
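
As a quick sanity check (a minimal sketch, not part of the original script), we can instantiate the dataset and pull one batch of triplets; with `resize=0.5`, LFW images come out at roughly 62×47 pixels:

```python
# Smoke test: the first call downloads LFW, which can take a while.
dataset = LFWTriplet(resize=0.5, color=True)
loader = DataLoader(dataset, batch_size=32, shuffle=True)
anchor, positive, negative = next(iter(loader))
print(anchor.shape)  # expected: torch.Size([32, 3, 62, 47])
```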

## Triplet sampling

Triplet sampling is how we pick (anchor, positive, negative) examples $(A, P, N)$ for metric learning with the triplet loss:
* $A$ is an "Anchor" image: a picture of a person.
* $P$ is a "Positive" image: a picture of the same person as the anchor.
* $N$ is a "Negative" image: a picture of a different person than the anchor.

**The goal** of triplet training is to make the embedding of the anchor closer to the positive than to the negative, by at least a margin $\alpha$:

$$
\| f(A^{(i)}) - f(P^{(i)}) \|_2^2 + \alpha < \| f(A^{(i)}) - f(N^{(i)}) \|_2^2
$$

We would thus like to minimize the following triplet cost:

$$
\mathcal{J} = \sum_{i=1}^{m} \Big[ \underbrace{\| f(A^{(i)}) - f(P^{(i)}) \|_2^2}_{(1)} - \underbrace{\| f(A^{(i)}) - f(N^{(i)}) \|_2^2}_{(2)} + \alpha \Big]_+
$$

Here, the notation $[z]_+$ denotes $\max(z, 0)$.

```python
class TripletLoss(nn.Module):
    def __init__(self, margin=1.0):
        super().__init__()
        self.margin = margin

    def forward(self, anchor, positive, negative):
        # plain (non-squared) L2 distances, a common variant of the cost above
        pos_dist = torch.norm(anchor - positive, dim=1)
        neg_dist = torch.norm(anchor - negative, dim=1)
        losses = F.relu(pos_dist - neg_dist + self.margin)  # [.]_+ = max(., 0)
        return losses.mean()
```
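
A quick check on synthetic embeddings (hypothetical, not in the original script) shows the loss behaves as intended: an easy triplet costs nearly nothing, while a violated one costs roughly the anchor-negative gap plus the margin. PyTorch also ships an equivalent `nn.TripletMarginLoss` that could be used instead.

```python
criterion = TripletLoss(margin=1.0)
anchor = F.normalize(torch.randn(8, 128), dim=1)                     # unit-norm embeddings
positive = F.normalize(anchor + 0.05 * torch.randn(8, 128), dim=1)   # close to the anchor
negative = -anchor                                                   # as far as possible on the sphere
print(criterion(anchor, positive, negative))  # ~0: constraint already satisfied
print(criterion(anchor, negative, positive))  # ~3: distance gap of ~2 plus the margin
```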

## ResNet-18

Ok, the math part is done; now we move on to the pretrained model. ResNet-18 is a convolutional neural network that is 18 layers deep. You can load a version of the network pretrained on more than a million images from the ImageNet database. Its authors, Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, introduced the idea of the **residual block**.

What is a **residual block**?
```
Input ──► Conv ─► BN ─► ReLU ─► Conv ─► BN ──(+)─► ReLU
  └──────────────────────────────────────────┘
                skip connection
```

The **skip connection** (bottom arrow) adds the input directly to the block’s output.
$$
a^{[l+1]} = \text{ReLU}\big(W^{[l+1]} a^{[l]} + b^{[l+1]}\big)
$$

$$
a^{[l+2]} = \text{ReLU}\big(W^{[l+2]} a^{[l+1]} + b^{[l+2]} + a^{[l]}\big)
$$
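
As an illustration only (not the exact block torchvision uses, which also includes batch norm and a projection on the skip path when shapes change), these two equations translate to a few lines of PyTorch:

```python
class ResidualBlockSketch(nn.Module):
    """Minimal sketch of a residual block: two convolutions plus an identity skip."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))   # a[l+1] = ReLU(W[l+1] a[l] + b[l+1])
        out = self.conv2(out) + x     # the skip connection adds the input back
        return F.relu(out)            # a[l+2] = ReLU(W[l+2] a[l+1] + b[l+2] + a[l])
```

In our embedding network we do not build blocks by hand; we reuse torchvision's pretrained ResNet-18 and only replace its input stem (for grayscale input) and its classification head: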

```python
class ResNetEmbedding(nn.Module):
    def __init__(self, out_dim=128, in_ch=3):
        super().__init__()
        base = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        if in_ch == 1:
            # replace the stem so the network accepts grayscale input
            base.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        base.fc = nn.Identity()  # drop the 1000-class ImageNet head
        self.base = base
        self.fc = nn.Linear(512, out_dim)  # project to the embedding space

    def forward(self, x):
        x = self.base(x)
        x = self.fc(x)
        return F.normalize(x, p=2, dim=1)  # L2-normalize embeddings onto the unit sphere
```

## Training and Evaluation

We train with the Adam optimizer and evaluate using the ROC curve, AUC (Area Under the Curve), EER (Equal Error Rate), and an optimal threshold for the verification decision. The EER is the operating point where the false positive rate equals the false negative rate; the threshold that achieves it is a natural default for deciding "same person or not".

### Train
```python
def train(model, loader, optimizer, criterion, device):
    model.train()
    running = 0.0
    for a, p, n in loader:
        a, p, n = a.to(device), p.to(device), n.to(device)
        optimizer.zero_grad()
        za, zp, zn = model(a), model(p), model(n)  # embed all three images
        loss = criterion(za, zp, zn)
        loss.backward()
        optimizer.step()
        running += loss.item() * a.size(0)
    return running / len(loader.dataset)  # mean loss per sample
```
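
A sketch of the full training driver, with illustrative hyperparameters (the batch size, learning rate, and epoch count here are assumptions, not the article's exact settings):

```python
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dataset = LFWTriplet(resize=0.5, color=True)
loader = DataLoader(dataset, batch_size=64, shuffle=True)

model = ResNetEmbedding(out_dim=128, in_ch=3).to(device)
criterion = TripletLoss(margin=1.0)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(10):
    epoch_loss = train(model, loader, optimizer, criterion, device)
    print(f"epoch {epoch + 1}: triplet loss = {epoch_loss:.4f}")
```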

### Evaluate
```python
@torch.no_grad()
def evaluate(model, loader, device):
    model.eval()
    sims, labels = [], []
    for a, p, n in loader:
        a, p, n = a.to(device), p.to(device), n.to(device)
        za, zp, zn = model(a), model(p), model(n)
        # anchor-positive pairs are genuine matches (label 1)
        sims.extend(F.cosine_similarity(za, zp).cpu().numpy())
        labels.extend([1] * len(za))
        # anchor-negative pairs are impostors (label 0)
        sims.extend(F.cosine_similarity(za, zn).cpu().numpy())
        labels.extend([0] * len(za))
    sims = np.array(sims)
    labels = np.array(labels)
    fpr, tpr, thresh = roc_curve(labels, sims)
    roc_auc = auc(fpr, tpr)

    # EER: find the threshold where FPR ~ FNR
    fnr = 1 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))
    eer = (fpr[idx] + fnr[idx]) / 2.0
    thr = thresh[idx]
    return float(roc_auc), float(eer), float(thr)
```
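
Once `evaluate` has produced a threshold, verifying a single pair of faces reduces to one cosine similarity. Here is a sketch (`verify_pair` is a hypothetical helper, not part of the original script):

```python
@torch.no_grad()
def verify_pair(model, img1, img2, threshold, device):
    """img1, img2: (C, H, W) tensors preprocessed like the training images."""
    model.eval()
    z = model(torch.stack([img1, img2]).to(device))    # two embeddings
    similarity = F.cosine_similarity(z[0:1], z[1:2]).item()
    return similarity >= threshold                     # True -> same person

auc_score, eer, thr = evaluate(model, loader, device)
print(f"AUC={auc_score:.3f}  EER={eer:.3f}  threshold={thr:.3f}")
```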

<p align="center">
<img src="images/plot_loss.png" alt="Fast_Learning" style="width:800px;"/>
<br><em>
</p>

## Run test cases

```sh
python test_case.py
```

<p align="center">
<img src="images/test_case_results.png" alt="Fast_Learning" style="width:800px;"/>
<br><em>
</p>

# References

* He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. https://arxiv.org/pdf/1512.03385
* LFW dataset official page: http://vis-www.cs.umass.edu/lfw/
Binary file added article-face_verification/images/plot_loss.png
Binary file added article-face_verification/images/test_image1.jpg
Binary file added article-face_verification/images/test_image2.jpg
Binary file added article-face_verification/images/test_image3.jpg