G2 Review clustering using LLMs

This project is a proof of concept demonstrating how you can use LLMs to perform competitive intelligence on customer reviews and feedback.

In this scenario, we take G2 reviews and perform topic modelling on them in a simple Streamlit app.

The overall (processing) pipeline is as follows:

1. Get the G2 company reviews for your target companies (manual step, instructions below).
2. Reshape the resulting JSON (preprocess.py).
3. Split reviews into sentences.
4. Embed the sentences.
5. Reduce dimensionality (slightly) and cluster the sentences.
6. Find the N points closest to the center of each cluster and pass them to the LLM to extract the topics.
7. Reduce dimensions to 2D for visualization.
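The sentence-splitting and representative-selection steps above can be sketched as follows. This is a minimal illustration, not the project's code: the function names `split_sentences` and `representatives` are hypothetical, the splitter is a naive regex, and the embeddings are toy 2D vectors standing in for whatever embedding model and clustering algorithm the pipeline actually uses.

```python
import math
import re
from collections import defaultdict


def split_sentences(review: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [p for p in parts if p]


def representatives(embeddings, labels, sentences, n=2):
    """Return, per cluster label, the n sentences closest to the cluster centroid."""
    # Group sentence indices by their cluster label.
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)

    result = {}
    for label, idxs in groups.items():
        dim = len(embeddings[idxs[0]])
        # Centroid = per-dimension mean of the cluster's vectors.
        centroid = [sum(embeddings[i][d] for i in idxs) / len(idxs) for d in range(dim)]
        # Rank members by Euclidean distance to the centroid, nearest first.
        ranked = sorted(idxs, key=lambda i: math.dist(embeddings[i], centroid))
        result[label] = [sentences[i] for i in ranked[:n]]
    return result
```

The sentences returned for each cluster are what would be placed in the LLM prompt to name that cluster's topic. Sampling near the centroid keeps the prompt short while favouring the most typical sentences of the cluster.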

Getting Started

Prerequisites

Rename `.env.example` to `.env`.

Setting OPENAI_API_KEY is mandatory.
If you want to fetch reviews for a new set of companies, you also need to set APIFY_API_TOKEN. Otherwise, the app uses the sample G2 reviews in the repo.
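A quick way to verify these settings before running anything is sketched below. It assumes the values from `.env` have already been loaded into the process environment (how the project loads them is not shown here), and the helper names `missing_settings` and `can_fetch_reviews` are hypothetical.

```python
import os

# OPENAI_API_KEY is always required; APIFY_API_TOKEN only for fetching new reviews.
REQUIRED = ("OPENAI_API_KEY",)


def missing_settings(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


def can_fetch_reviews(env=None):
    """True when an Apify token is available, so new reviews can be fetched."""
    env = os.environ if env is None else env
    return bool(env.get("APIFY_API_TOKEN"))


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing required settings:", ", ".join(missing))
    elif not can_fetch_reviews():
        print("No APIFY_API_TOKEN set: only the sample reviews can be used.")
    else:
        print("All settings present.")
```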

Installation

1. Clone the repository:

   git clone https://github.com/balmasi/g2_reviews_llm_topic_modeling

2. Create a virtual environment

The easiest way to do this is to use Conda.

# Create the g2_reviews_topic_modeling_llm virtual environment
conda create -n g2_reviews_topic_modeling_llm python=3.10
# Activate the virtual environment
conda activate g2_reviews_topic_modeling_llm

3. Install the required dependencies:

pip install -r requirements.txt

Getting the G2 Company reviews

1. Browse to each target G2 profile and grab the slug from the URL. For example, the slug for https://www.g2.com/products/vena/reviews is vena.
2. Place each target company's slug on its own line in the data/slugs-to-fetch.txt file.
3. Set APIFY_API_TOKEN in the .env file to your Apify API token.
4. Run the dataset script: python data/create_dataset.py
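If you have many profile URLs, the slug can be extracted programmatically. The helper below is a hypothetical convenience, not part of the repository, and it assumes URLs follow the `/products/<slug>/...` shape shown in the example above.

```python
from urllib.parse import urlparse


def slug_from_url(url: str) -> str:
    """Extract the product slug from a G2 URL of the form /products/<slug>/..."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2 or parts[0] != "products":
        raise ValueError(f"Not a G2 product URL: {url}")
    return parts[1]


if __name__ == "__main__":
    urls = ["https://www.g2.com/products/vena/reviews"]
    # One slug per line, matching the format expected in data/slugs-to-fetch.txt.
    print("\n".join(slug_from_url(u) for u in urls))
```

The printed lines can be pasted into data/slugs-to-fetch.txt as they are.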

Running the App

To run the app, execute the following command:

streamlit run streamlit_app.py

This will start the Streamlit server and launch the app in your default web browser.
