This repository collects projects and theoretical material that I used to get into Computer Vision (CV) topics in a practical and efficient way.
- Image Classification
- Object Detection/Localization
- Image Segmentation
- Transfer Learning
- Visualization and Interpretability
- Vision Transformers & Vision Language Models (VLMs)
- Generative Models & Synthetic Data
- 3D Computer Vision
- Zero-shot Computer Vision
Find my notes: Part I, Part II
AI for Medical Diagnosis (DeepLearning.AI)
AI is transforming the practice of medicine. It’s helping doctors diagnose patients more accurately, make predictions about patients’ future health, and recommend better treatments. But how can AI be applied to medical imaging to diagnose diseases?
This course offered:
- Nuances of working with both 2D and 3D medical image data, for multi-class classification and image segmentation.
- Practical/theoretical material on how to classify diseases in X-ray images and segment tumors in 3D MRI brain images.
- How to properly evaluate the performance of your models.
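
As a rough illustration of the evaluation point above, here is a minimal sketch of three metrics commonly used for medical classifiers and segmentation models: sensitivity, specificity, and the Dice coefficient. The function names and the toy labels are my own assumptions for illustration, not code from the course.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels in {0, 1}."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of actual positives that the model flags (true positive rate)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of actual negatives that the model clears (true negative rate)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp) if (tn + fp) else 0.0


def dice_coefficient(mask_true: Sequence[int], mask_pred: Sequence[int], eps: float = 1e-7) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|) for flattened binary masks."""
    intersection = sum(t * p for t, p in zip(mask_true, mask_pred))
    total = sum(mask_true) + sum(mask_pred)
    # eps avoids division by zero when both masks are empty
    return (2.0 * intersection + eps) / (total + eps)


if __name__ == "__main__":
    # Toy labels (hypothetical): 3 diseased cases, 5 healthy cases
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    print("sensitivity:", sensitivity(y_true, y_pred))
    print("specificity:", specificity(y_true, y_pred))
    print("dice:", dice_coefficient([1, 1, 1, 0], [1, 1, 0, 1]))
```

Accuracy alone can be misleading on imbalanced medical datasets, which is why I find it useful to look at sensitivity and specificity separately.
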
Advanced Computer Vision with TensorFlow (DeepLearning.AI & TensorFlow)
- Explore image classification, image segmentation, object localization, and object detection. Apply transfer learning to object localization and detection.
- Apply object detection models such as region-based CNN (R-CNN) and ResNet-50, customize existing models, and build your own models to detect, localize, and label images.
- Implement image segmentation using variations of the fully convolutional network (FCN), including U-Net and Mask R-CNN, to identify and detect numbers, pets, zombies, and more.
- Identify which parts of an image your model uses to make its predictions with class activation maps and saliency maps, and apply these ML interpretation methods to inspect and improve the design of a famous network, AlexNet.
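
Object localization and detection results are usually scored with intersection over union (IoU) between a predicted and a ground-truth box. Below is a minimal sketch in plain Python; the `(x_min, y_min, x_max, y_max)` box layout and the `iou` name are assumptions I chose for illustration, not the course's code.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; clamp to zero when boxes are disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (1.0, 1.0, 3.0, 3.0)
    ground_truth = (0.0, 0.0, 2.0, 2.0)
    print("IoU:", iou(predicted, ground_truth))
```

A detection is often counted as correct when its IoU with a ground-truth box exceeds a chosen threshold such as 0.5, though the threshold depends on the benchmark.
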
Prompt Vision Models (DeepLearning.AI & Comet)
- Image Generation: Generate images from text prompts using Stable Diffusion, adjusting hyperparameters (strength & guidance scale) for precise control over the outputs.
- Image Segmentation: Use Meta’s SAM (Segment Anything Model), prompting with point coordinates and bounding boxes to accurately identify and separate objects within images.
- Object Detection: Use OWL-ViT for zero-shot object detection, prompting with natural language to detect specific objects and generate bounding boxes for precise isolation.
- In-painting: Combine image generation, segmentation, and detection techniques to replace or add objects within images seamlessly, ensuring smooth integration with existing content.
- Personalization (with fine-tuning): Use DreamBooth to fine-tune diffusion models, associating text labels with specific objects to generate custom images based on provided pictures for personalized outputs.
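
To make the two Stable Diffusion hyperparameters mentioned above more concrete, here is a small sketch of what they do conceptually. The guidance scale follows the classifier-free guidance formula, which pushes the noise prediction away from the unconditional estimate and toward the text-conditioned one. For strength, I assume the common image-to-image behavior where only a fraction of the denoising steps is run. Both function names are hypothetical, and this is not the course's code or any library's API.

```python
from typing import List


def apply_guidance(noise_uncond: List[float], noise_text: List[float], guidance_scale: float) -> List[float]:
    """Classifier-free guidance: uncond + scale * (text - uncond), element-wise."""
    if len(noise_uncond) != len(noise_text):
        raise ValueError("noise predictions must have the same length")
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]


def denoising_steps_for_strength(num_inference_steps: int, strength: float) -> int:
    """Assumed img2img behavior: strength in [0, 1] selects how many steps are run."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


if __name__ == "__main__":
    # Toy noise predictions (hypothetical values, two elements only)
    uncond = [0.0, 1.0]
    text = [1.0, 3.0]
    print(apply_guidance(uncond, text, guidance_scale=7.5))
    print(denoising_steps_for_strength(num_inference_steps=50, strength=0.5))
```

With a guidance scale of 1 the result equals the text-conditioned prediction, and larger values follow the prompt more strictly at the cost of diversity. A strength near 0 keeps the input image almost unchanged, while a strength of 1 effectively ignores it.
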
Computer Vision Course (by Hugging Face)
- This course delves into the fundamentals of computer vision, covering essential topics such as image processing, convolutional neural networks, and vision transformers.
- It explores advanced concepts like multimodal models, vision-language models, and generative models, with a focus on both 2D and 3D computer vision tasks.
- It addresses emerging topics such as model optimization, synthetic data, and zero-shot computer vision.
Course: More Info
Copyright of all materials in these courses belongs to DeepLearning.AI, TensorFlow and Hugging Face, and they can only be used or distributed for educational purposes. You may not use or distribute them for commercial purposes.
Here are links to my Computer Vision & Image Processing projects:
- ML/DL model for drowsiness detection based on facial images/video.
- Automate cell counting in microscopy images.
- (Private repo) Automate reading the total value of receipts with OCR: automatically select and extract the total-value region among all candidate numbers/regions, and return the result in the correct format.
- Brain tumor diagnostic app developed with Gradio. ViT fine-tuned for binary classification of brain scans.
To deepen my knowledge of the topics that interest me most, which are more complex and take more effort to understand and master, I created additional repositories with notes and projects.