-
Notifications
You must be signed in to change notification settings - Fork 67
LCORE-1137: token history table #939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LCORE-1137: token history table #939
Conversation
WalkthroughAdds persistent token usage history: a new TokenUsageHistory class manages per-user token consumption using PostgreSQL or SQLite. AppConfig gains a lazy-instantiated token_usage_history property guarded by a configuration flag. Minor docstring clarification in QuotaLimiter.connected. Changes
Sequence DiagramsequenceDiagram
participant App as Application
participant AC as AppConfig
participant TUH as TokenUsageHistory
participant DB as Database (SQLite/Postgres)
App->>AC: access token_usage_history
activate AC
AC->>AC: verify config loaded
AC->>AC: check quota_handlers.enable_token_history
alt enabled
AC->>TUH: instantiate TokenUsageHistory(config)
activate TUH
TUH->>DB: connect()
TUH->>DB: CREATE_TOKEN_USAGE_TABLE (if needed)
TUH-->>AC: return instance
deactivate TUH
else disabled
AC-->>App: return None
end
deactivate AC
App->>TUH: consume_tokens(user, provider, model, in, out)
activate TUH
alt Postgres backend
TUH->>DB: EXECUTE CONSUME_TOKENS_FOR_USER_PG
else SQLite backend
TUH->>DB: EXECUTE CONSUME_TOKENS_FOR_USER_SQLITE
end
TUH-->>App: record acknowledged
deactivate TUH
App->>TUH: connected()?
activate TUH
TUH->>DB: simple health query
TUH-->>App: true/false
deactivate TUH
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/configuration.py (1)
66-72: Missing reset of_token_usage_historywhen configuration changes.When
init_from_dictis called to reload configuration,_conversation_cacheand_quota_limitersare reset to their initial values, but_token_usage_historyis not. This could lead to stale database connections if the configuration changes (e.g., switching from PostgreSQL to SQLite).🔎 Proposed fix
def init_from_dict(self, config_dict: dict[Any, Any]) -> None: """Initialize configuration from a dictionary.""" # clear cached values when configuration changes self._conversation_cache = None self._quota_limiters = [] + self._token_usage_history = None # now it is possible to re-read configuration self._configuration = Configuration(**config_dict)
🧹 Nitpick comments (3)
src/quota/sql.py (2)
126-134: Redundant WHERE clause in ON CONFLICT DO UPDATE.The
WHEREclause in theDO UPDATEsection is unnecessary. WhenON CONFLICT (user_id, provider, model)triggers, the conflicting row is already uniquely identified by the composite primary key. TheWHEREclause re-specifies the same conditions, adding verbosity without functional benefit.🔎 Proposed simplification for SQLite
ON CONFLICT (user_id, provider, model) DO UPDATE SET input_tokens=token_usage.input_tokens+:input_tokens, output_tokens=token_usage.output_tokens+:output_tokens, updated_at=:updated_at - WHERE token_usage.user_id=:user_id - AND token_usage.provider=:provider - AND token_usage.model=:modelApply the same simplification for the PostgreSQL variant.
Also applies to: 139-147
118-121: Remove unusedINIT_TOKEN_USAGE_FOR_USERor add SQLite variant for consistency.
INIT_TOKEN_USAGE_FOR_USERis not imported or used anywhere in the codebase. If this constant is intended for future use, add a correspondingINIT_TOKEN_USAGE_FOR_USER_SQLITEvariant with?placeholders to match the pattern used byCONSUME_TOKENS_FOR_USER_PGandCONSUME_TOKENS_FOR_USER_SQLITE. Otherwise, remove the unused constant.src/quota/token_usage_history.py (1)
28-52: Add type annotations for class attributes.Per coding guidelines, class attributes should have complete type annotations. Consider adding them to improve type safety and documentation.
🔎 Proposed fix
+from typing import Any, Optional +from models.config import SQLiteDatabaseConfiguration, PostgreSQLDatabaseConfiguration + class TokenUsageHistory: """Class with implementation of storage for token usage history.""" + sqlite_connection_config: Optional[SQLiteDatabaseConfiguration] + postgres_connection_config: Optional[PostgreSQLDatabaseConfiguration] + connection: Optional[Any] + def __init__(self, configuration: QuotaHandlersConfiguration) -> None:
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/configuration.py(3 hunks)src/quota/quota_limiter.py(1 hunks)src/quota/sql.py(1 hunks)src/quota/token_usage_history.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use absolute imports for internal modules in LCS project (e.g.,from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Uselogger = logging.getLogger(__name__)pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) andOptional[Type]orType | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Useasync deffor I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes:Configurationfor config classes,Error/Exceptionfor exceptions,Resolverfor strategy patterns,Interfacefor abstract base classes
Abstract classes must use ABC with@abstractmethoddecorators
Include complete type annotations for all class attributes in Python classes
Useimport loggingand module logger pattern with standard log levels: debug, info, warning, error
Files:
src/quota/sql.pysrc/quota/token_usage_history.pysrc/configuration.pysrc/quota/quota_limiter.py
🧬 Code graph analysis (2)
src/quota/token_usage_history.py (6)
src/log.py (1)
get_logger(7-25)src/quota/connect_pg.py (1)
connect_pg(12-49)src/quota/connect_sqlite.py (1)
connect_sqlite(12-37)src/models/config.py (1)
QuotaHandlersConfiguration(1343-1382)src/utils/connection_decorator.py (1)
connection(6-48)src/configuration.py (1)
configuration(75-79)
src/configuration.py (1)
src/quota/token_usage_history.py (1)
TokenUsageHistory(28-191)
🪛 GitHub Check: Bandit
src/quota/sql.py
[notice] 106-107:
Possible hardcoded password: '
CREATE TABLE IF NOT EXISTS token_usage (
user_id text NOT NULL,
provider text NOT NULL,
model text NOT NULL,
input_tokens int,
output_tokens int,
updated_at timestamp with time zone,
PRIMARY KEY(user_id, provider, model)
);
'
[notice] 118-119:
Possible hardcoded password: '
INSERT INTO token_usage (user_id, provider, model, input_tokens, output_tokens, updated_at)
VALUES (%s, %s, %s, 0, 0, %s)
'
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / ci
- GitHub Check: E2E: library mode / vertexai
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: server mode / azure
- GitHub Check: E2E: server mode / vertexai
- GitHub Check: E2E: library mode / azure
🔇 Additional comments (4)
src/quota/quota_limiter.py (1)
104-105: LGTM!The docstring update accurately describes the method's purpose.
src/configuration.py (1)
185-210: LGTM!The
token_usage_historyproperty follows the established lazy-initialization pattern used byquota_limiters. The docstring is well-written and the type annotations are complete.src/quota/token_usage_history.py (2)
146-172: LGTM!The
connected()method properly handles connection verification with appropriate exception handling and cursor cleanup. The implementation mirrors the pattern inQuotaLimiter.connected().
174-191: LGTM!The
_initialize_tables()method correctly creates the token_usage table and commits the transaction. The null-check guard and logging are appropriate.
| if self.postgres_connection_config is not None: | ||
| self.connection = connect_pg(self.postgres_connection_config) | ||
| if self.sqlite_connection_config is not None: | ||
| self.connection = connect_sqlite(self.sqlite_connection_config) | ||
| if self.connection is None: | ||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQLite configuration silently overrides PostgreSQL when both are present.
If both postgres_connection_config and sqlite_connection_config are set, the SQLite connection will overwrite the PostgreSQL connection without any warning. Consider using elif to enforce mutual exclusivity or logging a warning when both are configured.
🔎 Proposed fix using elif
logger.info("Initializing connection to quota usage history database")
if self.postgres_connection_config is not None:
self.connection = connect_pg(self.postgres_connection_config)
- if self.sqlite_connection_config is not None:
+ elif self.sqlite_connection_config is not None:
self.connection = connect_sqlite(self.sqlite_connection_config)
if self.connection is None:
returnNote: The same pattern exists in QuotaLimiter.connect() (lines 90-93 in quota_limiter.py). Consider addressing both for consistency.
🤖 Prompt for AI Agents
In src/quota/token_usage_history.py around lines 69 to 74 the current logic sets
self.connection to Postgres then immediately to SQLite if both configs exist,
silently overwriting Postgres; change the second conditional to an elif so only
one connection is chosen (or explicitly detect both and log/raise) — update to
use elif to enforce mutual exclusivity and add a warning log if both configs are
provided; apply the same fix in src/quota/quota_limiter.py at lines ~90-93 for
consistency.
| logger.info( | ||
| "Token usage for user %s, provider %s and mode %s changed by %d, %d tokens", | ||
| user_id, | ||
| provider, | ||
| model, | ||
| input_tokens, | ||
| output_tokens, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in log message and potential PII logging concern.
- Line 109 has a typo:
"mode"should be"model". - Logging
user_idat INFO level may expose user identifiers in logs. Consider using DEBUG level or anonymizing/hashing the user ID per compliance guidelines.
🔎 Proposed fix
- logger.info(
+ logger.debug(
- "Token usage for user %s, provider %s and mode %s changed by %d, %d tokens",
+ "Token usage for user %s, provider %s and model %s changed by %d, %d tokens",
user_id,
provider,
model,
input_tokens,
output_tokens,
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| logger.info( | |
| "Token usage for user %s, provider %s and mode %s changed by %d, %d tokens", | |
| user_id, | |
| provider, | |
| model, | |
| input_tokens, | |
| output_tokens, | |
| ) | |
| logger.debug( | |
| "Token usage for user %s, provider %s and model %s changed by %d, %d tokens", | |
| user_id, | |
| provider, | |
| model, | |
| input_tokens, | |
| output_tokens, | |
| ) |
🤖 Prompt for AI Agents
In src/quota/token_usage_history.py around lines 108 to 115, fix the typo in the
log message ("mode" -> "model") and avoid logging raw user identifiers at INFO
level: either change the logger call to logger.debug(...) or replace user_id
with an anonymized value (e.g. hash or scrub helper) before logging; also ensure
the log format and placeholder order still match the supplied variables
(provider, model, input_tokens, output_tokens) after the change.
| if self.postgres_connection_config is not None: | ||
| query_statement = CONSUME_TOKENS_FOR_USER_PG | ||
| if self.sqlite_connection_config is not None: | ||
| query_statement = CONSUME_TOKENS_FOR_USER_SQLITE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same precedence issue as in connect(): SQLite overrides PostgreSQL.
The query statement selection has the same issue where SQLite config overwrites PostgreSQL config if both are present. Use elif for consistency with the recommended fix in connect().
🔎 Proposed fix
query_statement: str = ""
if self.postgres_connection_config is not None:
query_statement = CONSUME_TOKENS_FOR_USER_PG
- if self.sqlite_connection_config is not None:
+ elif self.sqlite_connection_config is not None:
query_statement = CONSUME_TOKENS_FOR_USER_SQLITE🤖 Prompt for AI Agents
In src/quota/token_usage_history.py around lines 117 to 120, the selection of
query_statement uses two separate ifs so the SQLite branch can overwrite the
PostgreSQL branch when both configs are present; change the second conditional
to elif so PostgreSQL takes precedence (matching the connect() fix) and ensure
query_statement is set only once based on that precedence.
d49a907 to
2e76698
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/quota/token_usage_history.py (3)
69-72: SQLite configuration silently overrides PostgreSQL when both are present.This issue was already identified in a previous review. If both configurations are set, SQLite will overwrite PostgreSQL without warning. Use
eliffor mutual exclusivity.
108-115: Typo in log message and potential PII logging concern.This issue was already identified in a previous review:
- Line 109: typo
"mode"should be"model"- Logging
user_idat INFO level may expose user identifiers
117-120: SQLite configuration silently overrides PostgreSQL when both are present.This issue was already identified in a previous review. Use
elifto match the fix recommended for theconnect()method.
🧹 Nitpick comments (4)
src/quota/token_usage_history.py (4)
83-83: Redundant autocommit assignment.Both
connect_pgandconnect_sqlitealready setconnection.autocommit = True(see relevant code snippets). This assignment is redundant and can be removed.🔎 Proposed fix
try: self._initialize_tables() except Exception as e: self.connection.close() logger.exception("Error initializing quota usage history database:\n%s", e) raise - - self.connection.autocommit = True
128-128: Use timezone-aware datetime for cross-database compatibility.
datetime.now()returns a timezone-naive datetime, which can cause ambiguity when the database and application are in different timezones. Usedatetime.now(timezone.utc)or store timestamps as UTC explicitly.🔎 Proposed fix
+from datetime import datetime, timezone + # ... in consume_tokens method: - updated_at = datetime.now() + updated_at = datetime.now(timezone.utc)
146-173: LGTM with minor docstring suggestion.The connection check logic is robust, properly handling exceptions and ensuring cursor cleanup. The use of broad exception catching in the finally block is appropriate.
Minor suggestion: Update the docstring to use Python's boolean literals
TrueandFalseinstead of lowercasetrueandfalse.🔎 Proposed docstring fix
def connected(self) -> bool: """Check if connection to quota usage history database is alive. Verify that the storage connection for token usage history is alive. Returns: - `true` if the database connection is present and responds to a - simple query, `false` otherwise. + `True` if the database connection is present and responds to a + simple query, `False` otherwise. """
189-191: Explicit commit is redundant with autocommit enabled.Both
connect_pgandconnect_sqliteenable autocommit on the connection (see relevant code snippets). The explicitcommit()call is unnecessary and can be removed.🔎 Proposed fix
logger.info("Initializing tables for token usage history") cursor = self.connection.cursor() cursor.execute(CREATE_TOKEN_USAGE_TABLE) cursor.close() - self.connection.commit()
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/configuration.py(3 hunks)src/quota/quota_limiter.py(1 hunks)src/quota/token_usage_history.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/configuration.py
- src/quota/quota_limiter.py
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use absolute imports for internal modules in LCS project (e.g.,from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Uselogger = logging.getLogger(__name__)pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) andOptional[Type]orType | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Useasync deffor I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes:Configurationfor config classes,Error/Exceptionfor exceptions,Resolverfor strategy patterns,Interfacefor abstract base classes
Abstract classes must use ABC with@abstractmethoddecorators
Include complete type annotations for all class attributes in Python classes
Useimport loggingand module logger pattern with standard log levels: debug, info, warning, error
Files:
src/quota/token_usage_history.py
🧬 Code graph analysis (1)
src/quota/token_usage_history.py (6)
src/log.py (1)
get_logger(7-25)src/quota/connect_pg.py (1)
connect_pg(12-49)src/quota/connect_sqlite.py (1)
connect_sqlite(12-37)src/models/config.py (1)
QuotaHandlersConfiguration(1343-1382)src/utils/connection_decorator.py (1)
connection(6-48)src/configuration.py (1)
configuration(75-79)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / azure
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: server mode / vertexai
- GitHub Check: E2E: library mode / vertexai
- GitHub Check: E2E: server mode / azure
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (1)
src/quota/token_usage_history.py (1)
12-23: No action required. The imports already follow the project's absolute import convention and require no changes.Likely an incorrect or invalid review comment.
| self.sqlite_connection_config = configuration.sqlite | ||
| self.postgres_connection_config = configuration.postgres | ||
| self.connection = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Add type annotations for class attributes.
The coding guidelines require complete type annotations for all class attributes. Add type hints for sqlite_connection_config, postgres_connection_config, and connection.
As per coding guidelines, all class attributes must include type annotations.
🔎 Proposed fix
+from typing import Any
+from models.config import SQLiteDatabaseConfiguration, PostgreSQLDatabaseConfiguration
+
class TokenUsageHistory:
"""Class with implementation of storage for token usage history."""
def __init__(self, configuration: QuotaHandlersConfiguration) -> None:
"""Initialize token usage history storage.
Initialize TokenUsageHistory with the provided configuration and
establish a database connection.
Stores SQLite and PostgreSQL connection settings for later reconnection
attempts, initializes the internal connection state, and opens the
database connection.
Parameters:
configuration (QuotaHandlersConfiguration): Configuration
containing `sqlite` and `postgres` connection settings.
"""
# store the configuration, it will be used
# by reconnection logic later, if needed
- self.sqlite_connection_config = configuration.sqlite
- self.postgres_connection_config = configuration.postgres
- self.connection = None
+ self.sqlite_connection_config: SQLiteDatabaseConfiguration | None = configuration.sqlite
+ self.postgres_connection_config: PostgreSQLDatabaseConfiguration | None = configuration.postgres
+ self.connection: Any | None = NoneCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/quota/token_usage_history.py around lines 47-49, the three class
attributes lack type annotations; add explicit type hints for
sqlite_connection_config, postgres_connection_config, and connection. Import
typing (e.g., from typing import Optional, Any) and annotate
sqlite_connection_config and postgres_connection_config with the concrete config
types from your configuration module (e.g., configuration.SQLiteConfig /
configuration.PostgresConfig) or use Any if those types are not exported, and
annotate connection as Optional[...] with the appropriate DB connection type
(e.g., sqlite3.Connection / asyncpg.Connection) or Optional[Any] if unknown;
ensure the annotations are added at the class attribute level (not just in
__init__) so static type checkers pick them up.
| Raises: | ||
| ValueError: If neither PostgreSQL nor SQLite configuration is provided. | ||
| """ | ||
| logger.info("Initializing connection to quota usage history database") | ||
| if self.postgres_connection_config is not None: | ||
| self.connection = connect_pg(self.postgres_connection_config) | ||
| if self.sqlite_connection_config is not None: | ||
| self.connection = connect_sqlite(self.sqlite_connection_config) | ||
| if self.connection is None: | ||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring claims ValueError is raised, but implementation returns silently.
The docstring states "Raises: ValueError: If neither PostgreSQL nor SQLite configuration is provided," but the code returns early without raising any exception when self.connection is None. Either update the docstring or add the ValueError.
🔎 Proposed fix to match docstring
if self.postgres_connection_config is not None:
self.connection = connect_pg(self.postgres_connection_config)
if self.sqlite_connection_config is not None:
self.connection = connect_sqlite(self.sqlite_connection_config)
if self.connection is None:
- return
+ raise ValueError("Neither PostgreSQL nor SQLite configuration is provided")Or update the docstring to reflect the current behavior:
- Raises:
- ValueError: If neither PostgreSQL nor SQLite configuration is provided.
+ Note:
+ Returns early without establishing a connection if neither PostgreSQL
+ nor SQLite configuration is provided.🤖 Prompt for AI Agents
In src/quota/token_usage_history.py around lines 65-74, the docstring says a
ValueError is raised when neither PostgreSQL nor SQLite config is provided, but
the code currently just returns when self.connection is None; update the
implementation to raise a ValueError with a clear message (e.g., "No database
configuration provided for quota usage history") instead of returning, so
behavior matches the docstring.
Description
LCORE-1137: token history table
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Summary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.