Text classification (sentiment analysis): fine-tuning GPT2 using TensorFlow

Text classification (sentiment analysis) on tweets using GPT2 and transfer learning

In this project we classify tweets by sentiment, as we did in our earlier project COVID Tweets Analysis (Notebook 3), where we used logistic regression and random forest.

The dataset can be found in the same repository.


🤖 GPT2 Model

GPT2 documentation can be found on the official Hugging Face page.

In this exercise we are going to use TFGPT2Model: the bare GPT2 transformer outputting raw hidden states, without any specific head on top.

The idea is to take the raw hidden states from the model (the output of the blue box in the image, before the two heads) and build a classifier on top of them.

We will use TensorFlow because it makes the coding easier and faster.

Below is a quick example of how to use it, adapted from the official documentation:

```python
from transformers import GPT2Tokenizer, TFGPT2Model
import tensorflow as tf

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TFGPT2Model.from_pretrained("gpt2")

# Tokenize a sentence and return TensorFlow tensors
inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
outputs = model(inputs)

# Raw hidden states of the last layer: shape (batch, seq_len, 768)
last_hidden_states = outputs.last_hidden_state
```
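The classifier built on top of those hidden states can be sketched as follows. To keep the example self-contained, a random tensor stands in for `last_hidden_state` (same shape, `(batch, seq_len, 768)`); in the notebook this tensor comes from TFGPT2Model itself. The pooling strategy and head sizes here are illustrative assumptions, not the notebook's exact architecture:

```python
import tensorflow as tf

# Stand-in for TFGPT2Model's last_hidden_state: (batch, seq_len, hidden_size)
batch, seq_len, hidden_size = 4, 16, 768
hidden_states = tf.random.normal((batch, seq_len, hidden_size))
attention_mask = tf.ones((batch, seq_len))  # 1 = real token, 0 = padding

# Mean-pool the hidden states over real (non-padded) tokens
mask = tf.expand_dims(attention_mask, -1)                     # (batch, seq_len, 1)
pooled = tf.reduce_sum(hidden_states * mask, axis=1) / tf.reduce_sum(mask, axis=1)

# A small dense head for binary sentiment on top of the pooled vector
head = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
probs = head(pooled)  # (batch, 1), probabilities in [0, 1]
```

Mean pooling over the attention mask is one common choice for GPT2, whose left-to-right attention also makes the last non-padded token's hidden state a reasonable alternative summary of the sequence.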


Conclusions

We fine-tuned GPT2 and compared its accuracy with our previous logistic regression and random forest models. Overall, it does not outperform them. It is worth noting that the model overfits, so it would be worth re-training it while tuning the learning rate, weight decay, and dropout rate.
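The three knobs mentioned above could be exposed in a Keras setup roughly like this. This is a hedged sketch with illustrative values, not the configuration used in the notebook; `AdamW` is available under `tf.keras.optimizers` in recent TensorFlow versions (older versions keep it under `experimental` or in TensorFlow Addons):

```python
import tensorflow as tf

# Illustrative hyperparameters to tune against overfitting
dropout_rate = 0.2
learning_rate = 2e-5
weight_decay = 0.01

# Dropout in the classification head regularizes the dense layers
head = tf.keras.Sequential([
    tf.keras.layers.Dropout(dropout_rate),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
out = head(tf.random.normal((2, 8)), training=False)  # (2, 1)

# AdamW applies weight decay decoupled from the gradient update
optimizer = tf.keras.optimizers.AdamW(
    learning_rate=learning_rate, weight_decay=weight_decay
)
```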
