
Developed at least 8 models based on at least 7 different datasets to address the problem of depression.


ziqinyeow/Depression-ML-Problem


Abstract and Introduction

There are eight notebooks here, and each one addresses a different type of problem based on a different dataset. Although some datasets are not entirely reliable, they are worth exploring and researching. Ultimately, we approach the depression detection problem with an ensemble technique, combining the best models trained in each notebook.
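As a rough illustration of the ensemble idea, per-model class probabilities can be combined by soft voting. This is a simplified NumPy sketch, not the exact combination used in the notebooks; `soft_vote` is a hypothetical helper.

```python
import numpy as np

def soft_vote(prob_lists, weights=None):
    """Average per-model class-probability arrays and pick the argmax class.

    prob_lists: list of (n_samples, n_classes) arrays, one per model.
    weights: optional per-model weights (e.g. each model's validation accuracy).
    """
    probs = np.average(np.stack(prob_lists), axis=0, weights=weights)
    return probs.argmax(axis=1)

# Two toy "models" voting on 3 samples over 2 classes (class 0 / class 1).
m1 = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
m2 = np.array([[0.7, 0.3], [0.3, 0.7], [0.1, 0.9]])
print(soft_vote([m1, m2]))  # -> [0 1 1]
```

Weighting each model by its validation accuracy lets stronger models (e.g. the tabular neural network) dominate the vote.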

Notebook 1 - ML/Structured/Tabular/CSV
Notebook 2 - ML/Structured/Tabular/CSV
Notebook 3 - DL/Unstructured/Image/TIFF
Notebook 4 - DL/Unstructured/Image/PNG
Notebook 5 - DL/Unstructured/Image/PNG
Notebook 6 - DL/Unstructured/Text/CSV
Notebook 7 - DL/Unstructured/Text/CSV
Notebook 8 - DL/Unstructured/Text/CSV

Production Deliverables

Web App
Web GitHub

Notebook 1 - Link

Tabular 5-class classification problem - 10 questions to classify whether a person is normal or experiencing mild, moderate, severe, or extremely severe depression.

Steps include:

  1. Data analysis
  2. Feature Engineering
  3. Feature Selection
  4. Data Preparation
  5. Model Experiment
  6. Model Evaluation
  7. Model Export
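Steps 3 through 7 can be sketched as a single scikit-learn pipeline. This is an illustrative sketch on synthetic stand-in data, not the notebook's actual code; the real features are the 10 questionnaire answers, and the file name is arbitrary.

```python
# Sketch of the workflow: feature selection -> data preparation -> model ->
# evaluation -> export, using synthetic data in place of the questionnaire.
import joblib
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))        # 10 question scores (synthetic stand-in)
y = rng.integers(0, 5, size=500)      # 5 severity classes

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),               # data preparation
    ("select", SelectKBest(f_classif, k=8)),   # feature selection
    ("model", SVC()),                          # model experiment
])
pipe.fit(X_train, y_train)
acc = accuracy_score(y_test, pipe.predict(X_test))   # model evaluation
print("accuracy:", acc)
joblib.dump(pipe, "depression_model.joblib")         # model export
```

Bundling scaling and selection into the pipeline keeps the exported artifact self-contained, so the web app only needs one `joblib.load`.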

Six models were built:

  1. Naive Bayes (acc: 0.8722)
  2. K Nearest Neighbour (acc: 0.8983)
  3. Support Vector Machine (acc: 0.9576)
  4. Decision Tree (acc: 0.8066)
  5. Random Forest (acc: 0.9017)
  6. Neural Network (acc: 0.9636)

The accuracy and time of each model were compared.
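That accuracy-and-time comparison can be sketched as follows. This runs on synthetic stand-in data, so the scores will not match the ones above; the model list mirrors the six classifiers named in the notebook.

```python
# Fit each of the six model types and record test accuracy and fit time.
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = rng.integers(0, 5, size=500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "Naive Bayes": GaussianNB(),
    "K Nearest Neighbour": KNeighborsClassifier(),
    "Support Vector Machine": SVC(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "Neural Network": MLPClassifier(max_iter=500),
}
results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_tr, y_tr)
    results[name] = (model.score(X_te, y_te), time.perf_counter() - start)
    print(f"{name}: acc={results[name][0]:.4f}, fit time={results[name][1]:.3f}s")
```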

Data source: GitHub

Notebook 2 - Link

Tabular 2-class classification problem - 30 questions to classify whether a person is depressed or not.

One model was built:

  1. Neural Network (acc: 0.893)

Data source: GitHub

Notebook 3 - Link

No implementation yet.

The JAFFE dataset consists of 213 images of different facial expressions from 10 different Japanese female subjects. Each subject was asked to do 7 facial expressions (6 basic facial expressions and neutral) and the images were annotated with average semantic ratings on each facial expression by 60 annotators.

Data source: JAFFE

Notebook 4 - Link

Image binary (2-class) classification problem - Images with 3 color channels (RGB) were used to train a binary classifier over depression and non-depression classes.

Models built - Notebook 4

  1. Self-defined CNN (acc: 0.505)
  2. Fine-tuned EfficientNet (acc: 0.504)

Data source: Kaggle

Note: The dataset was relabeled based on self-experience (e.g., Happy -> Non-depression), so it is not fully reliable.
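The "self-defined CNN" side can be sketched as a small PyTorch module. This is an illustrative architecture only; the notebook's actual layer counts and sizes may differ, and `SmallCNN` is a hypothetical name.

```python
# Minimal self-defined CNN for binary classification of RGB images.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),   # global average pooling
            nn.Flatten(),
            nn.Linear(32, n_classes),  # depression / non-depression logits
        )

    def forward(self, x):
        return self.head(self.features(x))

model = SmallCNN()
logits = model(torch.randn(4, 3, 64, 64))  # batch of 4 RGB 64x64 images
print(logits.shape)  # torch.Size([4, 2])
```

The near-chance accuracies above (~0.50 on two classes) are consistent with the unreliable relabeling described in the note: the labels carry little learnable signal.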

Notebook 5 - Link

The technique and data are the same as in Notebook 4. The only difference is that the dataset is not relabeled based on self-experience (e.g., Happy -> Happy), so this notebook is geared more toward emotion classification from images.

Models built - Notebook 5

  1. Self-defined CNN (acc: 0.3435)
  2. Fine-tuned EfficientNet (acc: 0.4648)

Notebook 6 - Link

Text classification (28-class) problem - Text loaded from the Go Emotions HuggingFace dataset is fed into a pretrained BERT tokenizer and model that classifies the text's emotion (e.g., fear, embarrassment, happiness...).

This model is a fine-tuned version of microsoft/xtremedistil-l6-h384-uncased on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.1234

Data source: HuggingFace

Notebook 7 - Link

Text classification (2-class) problem - Text loaded from a HuggingFace dataset (self-pushed from Kaggle) is fed into a pretrained BERT tokenizer and model that classifies whether the text indicates depression or not.

This model is a fine-tuned version of microsoft/xtremedistil-l6-h384-uncased on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.1606

Accuracy: 0.9565

I have pushed the model to HuggingFace

Data Source:

  1. Data 1
  2. Data 2
  3. Data 3

Data has been preprocessed (see the preprocessing notebook) and pushed to HuggingFace.
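As a lightweight stand-in for the fine-tuning flow above, the same text -> vectorize -> binary-classify shape can be shown with TF-IDF and logistic regression. This deliberately swaps out BERT (the notebook's actual approach, via HuggingFace) for a classical baseline on tiny toy data.

```python
# Toy text -> vectorize -> classify sketch; not the notebook's BERT pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I feel hopeless and tired all the time",
    "Nothing matters anymore, I can't get out of bed",
    "Had a great day hiking with friends",
    "Excited about my new project at work",
]
labels = [1, 1, 0, 0]  # 1 = depression, 0 = non-depression (toy labels)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["I am so tired and hopeless"]))
```

A baseline like this is useful for sanity-checking the data split before spending GPU time on BERT fine-tuning.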

Notebook 8 - Link

Unsupervised representational text generation problem - Approximately 100 rows of text were collected from multiple data sources and used to fine-tune a pretrained distilled GPT-2 model from HuggingFace.

This model is a fine-tuned version of distilgpt2. It achieves the following results on the evaluation set:

Loss: 3.3740

This model couldn't be exported after several attempts, so we instead ran the model pipeline to generate 1000 suggestions and saved them to a CSV file.
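The generate-and-save step can be sketched as below. `generate_one` is a hypothetical placeholder for the fine-tuned distilgpt2 pipeline; the real notebook would call the HuggingFace text-generation pipeline there instead.

```python
# Sketch: generate 1000 suggestions and persist them to CSV so the web app
# can serve them without loading the (unexportable) model.
import csv

def generate_one(i):
    # Placeholder for the distilgpt2 pipeline call, e.g. something like
    # generator(prompt, max_length=...)[0]["generated_text"].
    return f"Suggestion {i}: take a short walk and breathe deeply."

with open("generated_suggestions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["suggestion"])
    for i in range(1000):
        writer.writerow([generate_one(i)])
```

Pre-generating to CSV trades freshness for reliability: the app only reads a static file, sidestepping the export problem entirely.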

Generated Suggestions: See Generated Text

The model was pushed to HuggingFace Hub

Data source: GitHub

Data has been preprocessed (see the preprocessing notebook) and pushed to HuggingFace Hub.

Conclusion

Based on the work done:

  1. Tabular - Notebook 1, Notebook 2
  2. Computer Vision - Notebook 3, Notebook 4, Notebook 5
  3. Natural Language Processing - Notebook 6, Notebook 7, Notebook 8

We chose:

  1. Notebook 1 - model 10
  2. Notebook 2 - model 1
  3. Notebook 3 - ❌
  4. Notebook 4 - ❌
  5. Notebook 5 - ❌
  6. Notebook 6 - model 1
  7. Notebook 7 - model 1
  8. Notebook 8 - generated_suggestions (1000 records)

to build our web application.

Some models perform inaccurately, especially the vision models, so we will not use them in the web application. Only the tabular and language models will be used.

We will be using Next.js and related tooling to build the web application.

Web App
Web GitHub

