Text classification (sentiment analysis) on tweets using GPT2 and transfer learning
In this project we perform sentiment analysis on tweets, as we did in the project COVID Tweets Analysis (Notebook 3), where we used logistic regression and random forest.
The dataset can be found in the same repository.
GPT2 documentation can be found on the official Hugging Face page.
In this exercise we are going to use TFGPT2Model: the bare GPT2 Model transformer outputting raw hidden-states without any specific head on top.
The idea is to take the raw hidden states from the model (the output of the blue box in the image, before the two heads) and build a classifier on top of them.
We will use TensorFlow because it makes the coding easier and faster.
Below is a quick example of how to use it, from the official documentation:
from transformers import GPT2Tokenizer, TFGPT2Model
import tensorflow as tf
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TFGPT2Model.from_pretrained("gpt2")
inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
outputs = model(inputs)
last_hidden_states = outputs.last_hidden_state
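One way to build a classifier on top of those hidden states is to mean-pool them over the sequence dimension and feed the pooled vector to a small dense head. The sketch below is illustrative: the pooling strategy, dropout rate and layer sizes are assumptions, not necessarily the exact architecture used in this project.

```python
import tensorflow as tf

# Hidden size of the base "gpt2" checkpoint
HIDDEN_SIZE = 768

def build_head(hidden_size=HIDDEN_SIZE, dropout_rate=0.2):
    # Input: a last_hidden_state tensor of shape (batch, seq_len, hidden_size)
    hidden_states = tf.keras.Input(shape=(None, hidden_size))
    # Mean-pool over the sequence dimension to get one vector per tweet
    pooled = tf.keras.layers.GlobalAveragePooling1D()(hidden_states)
    x = tf.keras.layers.Dropout(dropout_rate)(pooled)
    # Single sigmoid unit for binary sentiment (positive vs negative)
    output = tf.keras.layers.Dense(1, activation="sigmoid")(x)
    return tf.keras.Model(hidden_states, output)

head = build_head()
# Stand-in for GPT2 hidden states: a batch of 4 "tweets", 16 tokens each
fake_states = tf.random.normal((4, 16, HIDDEN_SIZE))
probs = head(fake_states)
print(probs.shape)  # (4, 1)
```

In practice `fake_states` would be replaced by `last_hidden_states` from the snippet above, and the head trained on the labeled tweets.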
We fine-tuned GPT2 and compared its accuracy with our previous logistic regression and random forest models. Overall it does not outperform them. It is worth noting that the model overfits, so it would be worth re-training it while tuning the learning rate, weight decay and dropout rate.
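Such a re-training sweep could be sketched as below. The value grids are hypothetical, and the use of `AdamW` (which applies decoupled weight decay) is an assumption; it requires TensorFlow 2.11 or later.

```python
import tensorflow as tf

# Hypothetical grids for the hyperparameters mentioned above
learning_rates = [1e-5, 3e-5, 5e-5]
weight_decays = [0.0, 0.01, 0.1]
dropout_rates = [0.1, 0.2, 0.3]

# Every combination of the three hyperparameters
configs = [
    {"learning_rate": lr, "weight_decay": wd, "dropout": dr}
    for lr in learning_rates
    for wd in weight_decays
    for dr in dropout_rates
]

def make_optimizer(cfg):
    # Decoupled weight decay penalizes large weights, which helps curb overfitting
    return tf.keras.optimizers.AdamW(
        learning_rate=cfg["learning_rate"],
        weight_decay=cfg["weight_decay"],
    )

print(len(configs))  # 27
```

Each config would then be used to rebuild the classifier (with its dropout rate), compile it with `make_optimizer(cfg)`, and train with early stopping, keeping the config with the best validation accuracy.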