feat: refactored tracker db folder structure. added alembic migrations #219


Merged
merged 21 commits into from
May 8, 2025
f1cd9ec
feat: refactored tracker db folder structure. added alembic migrations
MiNeves00 May 5, 2025
88b7231
added tracker logs integration test. added alembic check of migration…
MiNeves00 May 5, 2025
196e868
fix: fixed the test actions for alembic
MiNeves00 May 5, 2025
630871f
fix: fixed path for script on the test actions for alembic
MiNeves00 May 5, 2025
d1817a8
fix: added tracker dependencies to integration test action
MiNeves00 May 5, 2025
c46ab31
fix: integration test action installation of tracker
MiNeves00 May 5, 2025
5439806
fix: integration test action cache poetry of tracker
MiNeves00 May 5, 2025
ef5dd14
chore: updated llmstudio lib poetry lock
MiNeves00 May 5, 2025
1292367
fix: integration test action changed working dir for alembic script
MiNeves00 May 5, 2025
b353e6d
fix: integration test action path for alembic.ini script
MiNeves00 May 5, 2025
890ba93
chore: changed the migration bash to python
MiNeves00 May 5, 2025
232ff13
chore: changed the migration bash to python and corrected path
MiNeves00 May 5, 2025
f968089
fix: action
MiNeves00 May 5, 2025
22ca8de
test: adding columns to check for integration tests for migrations
MiNeves00 May 5, 2025
c7dcfd5
fix: alembic was not recognizing changes
MiNeves00 May 5, 2025
948c00a
fix: alembic
MiNeves00 May 5, 2025
5e09f62
chore: removed testing columns for logs
MiNeves00 May 5, 2025
174b1d7
chore: added readme.md
MiNeves00 May 5, 2025
3a0f51f
feat: added extras to logs schema. alembic upgrades on tracker start …
MiNeves00 May 6, 2025
6d1484a
chore: added .env.template; changed llmstudio alembic default name
MiNeves00 May 6, 2025
37672df
chore: moved server alembic upgrade to utils; reverted poetry.lock of…
MiNeves00 May 8, 2025
25 changes: 25 additions & 0 deletions .env.template
@@ -0,0 +1,25 @@
OPENAI_API_KEY="sk-proj-XXXXX"
ANTHROPIC_API_KEY="sk-XXXXX"
COHERE_API_KEY="XXXX"
GOOGLE_API_KEY="XXXX"
DECART_API_KEY="XXXX"
AI71_API_KEY="XXXX"
AI21_API_KEY="XXXX"
BEDROCK_ACCESS_KEY="XXXX"
BEDROCK_SECRET_KEY="XXXX"
BEDROCK_REGION="us-west-2"
HUGGING_FACE_API_KEY="hf_XXXX"
AZURE_API_KEY="XXXX"
AZURE_API_ENDPOINT="https://XXXXX.openai.azure.com/"
AZURE_API_VERSION="2023-07-01-preview"
ENGINE_HOST="localhost"
ENGINE_PORT=8000
UI_HOST="localhost"
UI_PORT=3000
LOG_LEVEL="info"

#LLMSTUDIO_TRACKING_URI="postgresql://postgres:postgres@localhost:5433/tracker_db"
LLMSTUDIO_TRACKING_URI="sqlite:///./llmstudio_mgmt.db"
LLMSTUDIO_TRACKING_HOST="127.0.0.1"
LLMSTUDIO_TRACKING_PORT="50002"
LLMSTUDIO_ALEMBIC_TABLE_NAME="llmstudio_alembic_version"
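The tracker reads these settings from the environment at start-up. A minimal sketch of how an application might collect them, with fallbacks mirroring the template values above (the helper name and dict shape are illustrative, not the actual llmstudio API):

```python
import os


def tracking_config_from_env() -> dict:
    """Collect tracker settings, falling back to the .env.template defaults."""
    return {
        "uri": os.getenv("LLMSTUDIO_TRACKING_URI", "sqlite:///./llmstudio_mgmt.db"),
        "host": os.getenv("LLMSTUDIO_TRACKING_HOST", "127.0.0.1"),
        "port": os.getenv("LLMSTUDIO_TRACKING_PORT", "50002"),
        "alembic_table": os.getenv(
            "LLMSTUDIO_ALEMBIC_TABLE_NAME", "llmstudio_alembic_version"
        ),
    }


config = tracking_config_from_env()
```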
17 changes: 15 additions & 2 deletions .github/workflows/tests.yml
@@ -82,19 +82,32 @@ jobs:
uses: actions/cache@v3
with:
path: ~/.cache/pypoetry
key: poetry-integration-${{ runner.os }}-${{ hashFiles('libs/llmstudio/poetry.lock') }}
key: poetry-integration-${{ runner.os }}-${{ hashFiles('libs/llmstudio/poetry.lock', 'libs/llmstudio/pyproject.toml') }}
restore-keys: |
poetry-integration-${{ runner.os }}-

# Install llmstudio
- name: Install llmstudio
working-directory: ./libs/llmstudio
run: |
poetry install
poetry install --extras tracker
INTEGRATION_ENV=$(poetry env info --path)
echo $INTEGRATION_ENV
echo "INTEGRATION_ENV=$INTEGRATION_ENV" >> $GITHUB_ENV

# Set Env vars for sqlite db
- name: Set hardcoded DB URI, HOST and PORT
run: |
echo "LLMSTUDIO_TRACKING_URI=sqlite:///./test_tracker.db" >> $GITHUB_ENV
echo "LLMSTUDIO_TRACKING_HOST=127.0.0.1" >> $GITHUB_ENV
echo "LLMSTUDIO_TRACKING_PORT=50002" >> $GITHUB_ENV

# Run Alembic migrations
- name: Run Alembic migrations
run: |
source ${{ env.INTEGRATION_ENV }}/bin/activate
poetry run alembic upgrade head

# Run Integration Tests
- name: Run Integration Tests
run: |
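The new CI step runs `alembic upgrade head` before the integration tests. One way a test or health check could confirm the migrations actually ran is to read the revision alembic records in its version table — named here via `LLMSTUDIO_ALEMBIC_TABLE_NAME`. A stdlib sketch of that bookkeeping read (this is not the alembic API itself, just the underlying table lookup):

```python
import sqlite3
from typing import Optional

VERSION_TABLE = "llmstudio_alembic_version"  # name configured in .env.template


def current_revision(db_path: str) -> Optional[str]:
    """Return the revision alembic recorded, or None if migrations never ran."""
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(f"SELECT version_num FROM {VERSION_TABLE}").fetchone()
        return row[0] if row else None
    except sqlite3.OperationalError:
        # Version table absent: alembic has not stamped this database.
        return None
    finally:
        con.close()
```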
2 changes: 1 addition & 1 deletion .gitignore
@@ -78,4 +78,4 @@ bun.lockb
llmstudio/llm_engine/logs/execution_logs.jsonl
*.db
.prettierignore
db

2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -30,7 +30,7 @@ repos:
hooks:
- id: autoflake
files: libs/
exclude: 'libs/core/llmstudio_core/providers/__init__.py|libs/llmstudio/llmstudio/providers/__init__.py'
exclude: 'libs/core/llmstudio_core/providers/__init__.py|libs/llmstudio/llmstudio/providers/__init__.py|libs/tracker/llmstudio_tracker/db/migrations/env.py|libs/tracker/llmstudio_tracker/base.py'
args:
- --remove-all-unused-imports
- --recursive
118 changes: 118 additions & 0 deletions alembic.ini
@@ -0,0 +1,118 @@
# A generic, single database configuration.

[alembic]
# path to migration scripts
# Use forward slashes (/) also on windows to provide an os agnostic path
script_location = libs/tracker/llmstudio_tracker/db/migrations

# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
# Uncomment the line below if you want the files to be prepended with date and time
# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
# for all available tokens
# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s

# sys.path path, will be prepended to sys.path if present.
# defaults to the current working directory.
prepend_sys_path = .

# timezone to use when rendering the date within the migration file
# as well as the filename.
# If specified, requires the python>=3.9 or backports.zoneinfo library and tzdata library.
# Any required deps can installed by adding `alembic[tz]` to the pip requirements
# string value is passed to ZoneInfo()
# leave blank for localtime
# timezone =

# max length of characters to apply to the "slug" field
# truncate_slug_length = 40

# set to 'true' to run the environment during
# the 'revision' command, regardless of autogenerate
# revision_environment = false

# set to 'true' to allow .pyc and .pyo files without
# a source .py file to be detected as revisions in the
# versions/ directory
# sourceless = false

# version location specification; This defaults
# to alembic/versions. When using multiple version
# directories, initial revisions must be specified with --version-path.
# The path separator used here should be the separator specified by "version_path_separator" below.
# version_locations = %(here)s/bar:%(here)s/bat:alembic/versions

# version path separator; As mentioned above, this is the character used to split
# version_locations. The default within new alembic.ini files is "os", which uses os.pathsep.
# If this key is omitted entirely, it falls back to the legacy behavior of splitting on spaces and/or commas.
# Valid values for version_path_separator are:
#
# version_path_separator = :
# version_path_separator = ;
# version_path_separator = space
# version_path_separator = newline
#
# Use os.pathsep. Default configuration used for new projects.
version_path_separator = os

# set to 'true' to search source files recursively
# in each "version_locations" directory
# new in Alembic version 1.10
# recursive_version_locations = false

# the output encoding used when revision files
# are written from script.py.mako
# output_encoding = utf-8

sqlalchemy.url = placeholder

[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts. See the documentation for further
# detail and examples

# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME

# lint with attempts to fix using "ruff" - use the exec runner, execute a binary
# hooks = ruff
# ruff.type = exec
# ruff.executable = %(here)s/.venv/bin/ruff
# ruff.options = check --fix REVISION_SCRIPT_FILENAME

# Logging configuration
[loggers]
keys = root,sqlalchemy,alembic

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARNING
handlers = console
qualname =

[logger_sqlalchemy]
level = WARNING
handlers =
qualname = sqlalchemy.engine

[logger_alembic]
level = INFO
handlers =
qualname = alembic

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S
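`sqlalchemy.url` is set to `placeholder`, which suggests the real connection string is injected at runtime — presumably from `LLMSTUDIO_TRACKING_URI` inside the migrations' `env.py`, though that file isn't shown in this diff. The override idea, sketched with stdlib `configparser` on a trimmed-down ini (real code would use alembic's `Config.set_main_option` instead):

```python
import configparser

# Trimmed-down stand-in for the alembic.ini above.
INI_TEXT = """\
[alembic]
script_location = libs/tracker/llmstudio_tracker/db/migrations
sqlalchemy.url = placeholder
"""


def resolve_alembic_url(ini_text: str, tracking_uri: str) -> str:
    """Swap the placeholder URL for the runtime tracking URI."""
    cfg = configparser.ConfigParser()
    cfg.read_string(ini_text)
    assert cfg.get("alembic", "sqlalchemy.url") == "placeholder"
    cfg.set("alembic", "sqlalchemy.url", tracking_uri)
    return cfg.get("alembic", "sqlalchemy.url")


url = resolve_alembic_url(INI_TEXT, "sqlite:///./llmstudio_mgmt.db")
```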
23 changes: 21 additions & 2 deletions examples/core.py
@@ -1,16 +1,35 @@

from llmstudio_core.providers import LLMCore

from llmstudio.providers import LLM
from llmstudio_tracker.tracker import TrackingConfig
from llmstudio.server import start_servers

from pprint import pprint
import os
import asyncio
from dotenv import load_dotenv
import uuid
load_dotenv()

start_servers(proxy=False, tracker=True)

tracking_config = TrackingConfig(
host=os.environ["LLMSTUDIO_TRACKING_HOST"],
port=os.environ["LLMSTUDIO_TRACKING_PORT"]
)

session_id = str(uuid.uuid4())

use_logging = True


def run_provider(provider, model, api_key=None, **kwargs):
print(f"\n\n###RUNNING for <{provider}>, <{model}> ###")
llm = LLMCore(provider=provider, api_key=api_key, **kwargs)

if use_logging:
llm = LLM(provider=provider, api_key=api_key, session_id=session_id, tracking_config=tracking_config, **kwargs)
else:
llm = LLMCore(provider=provider, api_key=api_key, **kwargs)

latencies = {}
print("\nAsync Non-Stream")
78 changes: 78 additions & 0 deletions libs/llmstudio/tests/integration_tests/test_tracking_logs.py
@@ -0,0 +1,78 @@
import os
import uuid

import pytest

# Load .env
from dotenv import load_dotenv
from llmstudio.providers import LLM
from llmstudio.server import start_servers
from llmstudio_tracker.db.models.logs import LogDefault
from llmstudio_tracker.tracker import TrackingConfig
from sqlalchemy import create_engine, select
from sqlalchemy.orm import sessionmaker

load_dotenv()


DATABASE_URL = os.environ["LLMSTUDIO_TRACKING_URI"]
LLMSTUDIO_TRACKING_HOST = os.environ["LLMSTUDIO_TRACKING_HOST"]
LLMSTUDIO_TRACKING_PORT = os.environ["LLMSTUDIO_TRACKING_PORT"]

engine = create_engine(DATABASE_URL)
Session = sessionmaker(bind=engine)


@pytest.mark.parametrize(
"provider, model, api_key_name",
[
("openai", "gpt-4o-mini", "OPENAI_API_KEY"),
],
)
def test_llm_tracking_logs(provider, model, api_key_name):
session_id = str(uuid.uuid4())

start_servers(proxy=False, tracker=True)

tracking_config = TrackingConfig(
host=LLMSTUDIO_TRACKING_HOST, port=LLMSTUDIO_TRACKING_PORT
)

llm = LLM(
provider=provider,
api_key=os.environ[api_key_name],
session_id=session_id,
tracking_config=tracking_config,
)

chat_request = {
"chat_input": f"Hello, my name is Alice - session {session_id}",
"model": model,
"is_stream": False,
"retries": 0,
"parameters": {"temperature": 0, "max_tokens": 1000},
}

response = llm.chat(**chat_request)
print(response)

assert hasattr(response, "chat_output"), "Missing 'chat_output'"
assert response.chat_output is not None, "'chat_output' is None"

# DB: Check if row was logged
db = Session()
logs = (
db.execute(select(LogDefault).where(LogDefault.session_id == session_id))
.scalars()
.all()
)

assert len(logs) == 1, "Expected exactly one log entry for session"
log = logs[0]

assert log.chat_input == f"Hello, my name is Alice - session {session_id}"
assert log.model == "gpt-4o-mini"
assert log.session_id == session_id
assert log.chat_output is not None
assert isinstance(log.parameters, dict)
db.close()
3 changes: 3 additions & 0 deletions libs/tracker/llmstudio_tracker/base.py
@@ -0,0 +1,3 @@
from llmstudio_tracker.db.models.logs import LogDefault
from llmstudio_tracker.db.models.prompt_manager import PromptDefault
from llmstudio_tracker.db.models.session import SessionDefault
5 changes: 5 additions & 0 deletions libs/tracker/llmstudio_tracker/base_class.py
@@ -0,0 +1,5 @@
from sqlalchemy.orm import DeclarativeBase


class Base(DeclarativeBase):
pass
4 changes: 1 addition & 3 deletions libs/tracker/llmstudio_tracker/database.py
@@ -1,6 +1,6 @@
from llmstudio_tracker.config import DB_TYPE, TRACKING_URI
from sqlalchemy import create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.orm import sessionmaker


def create_tracking_engine(uri: str):
@@ -13,8 +13,6 @@ def create_tracking_engine(uri: str):

SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

Base = declarative_base()


def get_db():
db = SessionLocal()
30 changes: 30 additions & 0 deletions libs/tracker/llmstudio_tracker/db/crud/logs.py
@@ -0,0 +1,30 @@
from llmstudio_tracker.db.models.logs import LogDefault
from llmstudio_tracker.db.schemas.logs import LogDefaultCreate
from sqlalchemy.orm import Session


def get_project_by_name(db: Session, name: str):
return db.query(LogDefault).filter(LogDefault.name == name).first()


def get_logs_by_session(db: Session, session_id: str, skip: int = 0, limit: int = 100):
return (
db.query(LogDefault)
.filter(LogDefault.session_id == session_id)
.offset(skip)
.limit(limit)
.all()
)


def add_log(db: Session, log: LogDefaultCreate):
db_log = LogDefault(**log.model_dump())
db.add(db_log)
db.commit()
db.refresh(db_log)

return db_log


def get_logs(db: Session, skip: int = 0, limit: int = 100):
return db.query(LogDefault).offset(skip).limit(limit).all()