A demo application showcasing how to run a local LLM on your own hardware. It includes samples that leverage open-source libraries (llama.cpp) and models (Llama), as well as documentation from Nexus Dashboard.
First, clone the project and navigate into the project directory:
git clone https://github.com/ndavidson19/ciscolive.git
cd ciscolive/ciscolive-demo/documentation-llm
Next, download the modelfile. Hugging Face has many models to choose from, most with very elaborate names. We will be using a DPO fine-tuned version of StableLM; this small 3B model punches above its weight in RAG applications.
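If you do not already have the file locally, a minimal download sketch using the huggingface_hub package is shown below. The repo ID is an assumption; adjust it and the filename to whichever model you actually pick.

```python
from huggingface_hub import hf_hub_download

# Assumed Hugging Face repo for the GGUF quantization referenced later in this
# README -- substitute the repo/filename for the model you choose.
model_path = hf_hub_download(
    repo_id="TheBloke/dolphin-2.6-mistral-7B-dpo-laser-GGUF",
    filename="dolphin-2.6-mistral-7b-dpo-laser.Q4_K_M.gguf",
    local_dir=".",
)
print(f"Model downloaded to {model_path}")
```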
Next, create a directory called llm in the backend folder:
cd ciscolive/ciscolive-demo/documentation-llm/backend
mkdir llm
Then move the modelfile to the correct location: ciscolive/ciscolive-demo/documentation-llm/backend/llm/dolphin-2.6-mistral-7b-dpo-laser.Q4_K_M.gguf
This entire application has been dockerized and can be run with just
docker-compose up --build
This starts three different services.
- The Vector Datastore (pgvector)
  - Pulls a Postgres image from ankane/pgvector, which installs the extension needed to store vectors within Postgres.
- The Flask serving API and VectorDB insertion
  - Starts a Flask API endpoint (/get_message) on port 5000 that lets a user send queries to the LLM being served with llama-cpp-python (https://github.com/abetlen/llama-cpp-python) via the script at /backend/main.py (see the sketch after this list).
  - Also parses the PDF living in /training/pdfs/ using /training/pdf.py and then inserts it into the database using /training/db-embeddings.py.
- The UI service
  - Uses nginx to serve the basic index.html file.
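Once the stack is up, you can exercise the Flask endpoint directly. The sketch below assumes the route accepts a JSON POST with a "message" field; check backend/main.py for the exact method and payload shape.

```python
import requests

# Hypothetical payload shape -- verify the expected JSON keys in backend/main.py.
resp = requests.post(
    "http://localhost:5000/get_message",
    json={"message": "How do I onboard a site in Nexus Dashboard?"},
)
print(resp.json())
```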
Note: This is a simplified, scaled-down version of the full architecture we run in production and should be treated as a starting point. Look into the llama-cpp-python OpenAI-compatible webserver if you are going to build your own application.
It is recommended to create a virtual environment before installing dependencies, or to use a dependency manager such as Anaconda. For example:
python3 -m venv venv_name
source venv_name/bin/activate
pip install -r requirements.txt
Next, download the modelfile (Rocket 3B). Create the llm directory in the backend folder:
cd ciscolive/ciscolive-demo/documentation-llm/backend
mkdir llm
Then move the modelfile to the correct location: ciscolive/ciscolive-demo/documentation-llm/backend/llm/llama-2-7b-chat.Q4_K_M.gguf
Database Setup:
- Run the PostgreSQL vector extension for embeddings:
docker pull ankane/pgvector
docker run -p 5432:5432 -e POSTGRES_PASSWORD=secret -e POSTGRES_USER=postgres ankane/pgvector
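If you want to confirm the extension is available before running the training scripts, a minimal check with psycopg2 looks roughly like this (the table name and vector dimension are placeholders, not the schema the scripts create):

```python
import psycopg2

# Credentials match the `docker run` command above.
conn = psycopg2.connect(host="localhost", port=5432, user="postgres",
                        password="secret", dbname="postgres")
conn.autocommit = True
cur = conn.cursor()

# Enable pgvector and create an example table with a vector column.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("CREATE TABLE IF NOT EXISTS demo_embeddings "
            "(id serial PRIMARY KEY, content text, embedding vector(384));")
print("pgvector is ready")
```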
Training Pipeline:
- Navigate to the training directory.
- Run pdf.py to parse PDFs and db-embeddings.py to store embeddings:
python pdf.py
python db-embeddings.py
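For orientation, db-embeddings.py follows roughly the pattern sketched below: embed each text chunk produced by pdf.py and insert it into the vector table. The embedding model and table schema here are assumptions for illustration; see the script itself for the real details.

```python
import psycopg2
from sentence_transformers import SentenceTransformer

# Hypothetical embedding model (384 dimensions) -- the actual script may differ.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = ["Nexus Dashboard provides ...", "To onboard a site ..."]  # chunks from pdf.py
embeddings = model.encode(chunks)

conn = psycopg2.connect(host="localhost", port=5432, user="postgres",
                        password="secret", dbname="postgres")
cur = conn.cursor()
for chunk, emb in zip(chunks, embeddings):
    # pgvector accepts a '[x, y, ...]' literal cast to the vector type.
    cur.execute(
        "INSERT INTO demo_embeddings (content, embedding) VALUES (%s, %s::vector)",
        (chunk, str(emb.tolist())),
    )
conn.commit()
```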
Start the Backend:
- Use the llama-cpp-python OpenAI-compatible webserver to manage model serving:
python3 -m llama_cpp.server --config_file /<USER_PATH>/documentation-llm/backend/llm/config.json
- Start the backend services located in backend/inference:
python main.py
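Once the llama_cpp.server process is up, any OpenAI-compatible client can talk to it. The port and model alias below are assumptions; they come from your config.json.

```python
from openai import OpenAI

# Point the OpenAI client (v1+) at the local llama-cpp-python server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",  # must match a model_alias defined in config.json
    messages=[{"role": "user", "content": "What is Nexus Dashboard?"}],
)
print(response.choices[0].message.content)
```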
Run the command below in the root directory of the project:
python -m http.server
Navigate to http://localhost:8000/ in your browser. Alternatively, to load the UI you can simply open the index.html file that lives in the ciscolive/ciscolive-demo/documentation-llm/ui directory.
You should be all set to start asking questions!
A license is required for others to be able to use your code. An open-source license is more than just a usage license; it is a license to contribute and collaborate on code. Open-sourcing code and contributing it to Code Exchange requires a commitment to maintain the code and help the community use and contribute to it. More about open-source licenses