refactor: clickhouse migration #2 #2135
Conversation
- …atest verifications from Clickhouse
- feat(tinybird): remove unused import 'time' from node:console
- feat(tinybird): remove unused functions and pipes related to verifications and active keys
- feat(tinybird): add comment indicating completion of changes by @andreas
The latest updates on your projects. Learn more about Vercel for Git ↗︎
2 Skipped Deployments
Caution: Review failed. The pull request is closed.

📝 Walkthrough
The pull request introduces significant changes primarily focused on migrating functionality from Tinybird to ClickHouse. Key modifications include updates to SQL schemas, the introduction of new tables and materialized views, and the restructuring of data retrieval methods across various components.
Thank you for following the naming conventions for pull request titles! 🙏
@coderabbitai you good?
@chronark: Yes, I'm here to assist. I will review the changes.
✅ Actions performed: Review triggered.
@coderabbitai let's try again, now that the ratelimit is over
@chronark: Understood. I'll re-review the pull request now that the rate limit is over.
✅ Actions performed: Full review triggered.
Actionable comments posted: 46
🧹 Outside diff range and nitpick comments (111)
internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql (1)
2-9
: Consider adding performance optimizations. A few suggestions to optimize the table:
- Add PARTITION BY clause to improve query performance and data management
- Include workspace_id in ORDER BY for better data locality
- Consider compression settings for large datasets
- Add a TTL policy if historical data isn't needed indefinitely
Here's an enhanced version with these optimizations:
CREATE TABLE business.active_workspaces_per_month_v1 (
  time Date,
  workspace_id String,
  count UInt32 default 1
)
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(time)
ORDER BY (time, workspace_id)
TTL time + INTERVAL 2 YEAR
SETTINGS index_granularity = 8192, enable_mixed_granularity_parts = 1, min_bytes_for_wide_part = 0;
internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql (2)
2-8
: Consider adding constraints and documentation. The table structure looks good for basic billing aggregation, but consider these improvements:
- Add constraints to ensure year and month are within valid ranges
- Add table/column comments to document the purpose and expected values
CREATE TABLE billing.billable_verifications_per_month_v1 ( - year Int, - month Int, + year Int COMMENT 'Calendar year', + month Int COMMENT 'Month number (1-12)', + workspace_id String COMMENT 'Unique identifier for the workspace', + count Int64 COMMENT 'Number of billable verifications', - workspace_id String, - count Int64 ) +COMMENT 'Monthly aggregation of billable verifications per workspace'
1-15
: Consider adding data backfill mechanism. Since this table will be used for billing purposes, consider adding a companion migration script to backfill historical data from existing verification records.
Would you like me to help create a data backfill script or open an issue to track this task?
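For illustration, a minimal backfill sketch (not part of this PR) could aggregate the existing monthly rollup into the billing table. The column names and the VALID-only filter below are assumptions taken from the schemas in this migration and should be checked against the materialized view definition before use:

-- Hypothetical one-off backfill; column names and the VALID-only filter are assumptions.
INSERT INTO billing.billable_verifications_per_month_v1 (year, month, workspace_id, count)
SELECT
    toYear(time)  AS year,
    toMonth(time) AS month,
    workspace_id,
    sum(count)    AS count
FROM verifications.key_verifications_per_month_v1
WHERE outcome = 'VALID'
GROUP BY workspace_id, year, month;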
internal/clickhouse/schema/001_create_databases.sql (3)
3-8
: Add IF NOT EXISTS guards to CREATE DATABASE statements. While the current implementation works, adding IF NOT EXISTS guards would make the migration more robust against partial runs or retries.
-CREATE DATABASE verifications; -CREATE DATABASE telemetry; -CREATE DATABASE metrics; -CREATE DATABASE ratelimits; -CREATE DATABASE business; -CREATE DATABASE billing; +CREATE DATABASE IF NOT EXISTS verifications; +CREATE DATABASE IF NOT EXISTS telemetry; +CREATE DATABASE IF NOT EXISTS metrics; +CREATE DATABASE IF NOT EXISTS ratelimits; +CREATE DATABASE IF NOT EXISTS business; +CREATE DATABASE IF NOT EXISTS billing;
12-17
: Add IF EXISTS guards to DROP DATABASE statements. Similar to the CREATE statements, adding IF EXISTS guards would make the down migration more resilient.
-DROP DATABASE verifications; -DROP DATABASE telemetry; -DROP DATABASE metrics; -DROP DATABASE ratelimits; -DROP DATABASE business; -DROP DATABASE billing; +DROP DATABASE IF EXISTS verifications; +DROP DATABASE IF EXISTS telemetry; +DROP DATABASE IF EXISTS metrics; +DROP DATABASE IF EXISTS ratelimits; +DROP DATABASE IF EXISTS business; +DROP DATABASE IF EXISTS billing;
1-17
: Consider adding database settings and documentation. The migration successfully creates the basic database structure, but consider:
- Adding relevant ClickHouse database settings (e.g., engine settings, replication settings)
- Including comments to document the purpose of each database
Example enhancement:
-- Database for storing API key verification events and related metrics
CREATE DATABASE IF NOT EXISTS verifications
COMMENT 'Stores API key verification events and related metrics';

-- Database for collecting application telemetry data
CREATE DATABASE IF NOT EXISTS telemetry
COMMENT 'Stores application telemetry and usage data';
internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql (1)
1-14
: Consider optimizing the table schema for better performance. The schema looks good overall, but there are a few suggestions to optimize it:
- Consider adding a TTL policy to automatically clean up old data:
ENGINE = SummingMergeTree() ORDER BY (workspace_id, namespace_id, time, identifier) +TTL time + INTERVAL 90 DAY DELETE
- Consider using LowCardinality for string columns that have a limited set of possible values:
- workspace_id String, - namespace_id String, + workspace_id LowCardinality(String), + namespace_id LowCardinality(String),
- Consider adding compression codecs for better storage efficiency:
- time DateTime,
+ time DateTime CODEC(DoubleDelta, LZ4),
internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql (2)
1-11
: Consider adding data constraints and TTL. The table structure could benefit from the following improvements:
- Use DateTime64 with timezone for better timestamp precision and timezone handling
- Add NOT NULL constraints where appropriate
- Consider adding a TTL to automatically clean up old data
CREATE TABLE ratelimits.ratelimits_per_day_v1 ( - time DateTime, + time DateTime64(3) NOT NULL, - workspace_id String, + workspace_id String NOT NULL, - namespace_id String, + namespace_id String NOT NULL, - identifier String, + identifier String NOT NULL, - passed Int64, + passed Int64 NOT NULL, - total Int64 + total Int64 NOT NULL )
12-14
: LGTM! Optimal engine and ordering choice. The SummingMergeTree engine is perfect for this use case, as it will automatically aggregate the passed and total columns. The ordering is well-designed for efficient querying by workspace and namespace with time-based filtering. Consider these performance tips:
- Queries should include the ORDER BY columns in WHERE clauses for optimal performance
- For high-throughput scenarios, consider setting up a buffer table (see the sketch below)
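As a sketch of the buffer-table idea, ClickHouse's Buffer engine can sit in front of the per-day table and flush writes in batches. The buffer table name and the flush thresholds below are illustrative assumptions, not part of this PR:

-- Hypothetical write buffer in front of the per-day table; flushes on time/row/byte thresholds.
CREATE TABLE ratelimits.ratelimits_per_day_buffer_v1
AS ratelimits.ratelimits_per_day_v1
ENGINE = Buffer(ratelimits, ratelimits_per_day_v1, 16, 10, 100, 10000, 1000000, 10000000, 100000000);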
internal/clickhouse/schema/024_create_ratelimits_last_used_mv_v1.sql (2)
2-15
: Consider adding a TTL policy for data retention. Since this view tracks the "last used" timestamps, consider adding a TTL policy to automatically clean up old data and maintain optimal performance.
Example addition:
-- Add TTL policy to automatically remove data older than X days ALTER TABLE ratelimits.ratelimits_last_used_v1 MODIFY TTL time + INTERVAL 90 DAY;
5-14
: Ensure consistent ordering of GROUP BY columns. While the logic is correct, consider maintaining a consistent order between the SELECT and GROUP BY clauses for better readability:
SELECT
  workspace_id,
  namespace_id,
  identifier,
  maxSimpleState(time) as time
FROM ratelimits.raw_ratelimits_v1
GROUP BY
-  workspace_id,
-  namespace_id,
-  identifier
+  workspace_id, -- Aligned with SELECT
+  namespace_id, -- Aligned with SELECT
+  identifier    -- Aligned with SELECT
internal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql (2)
2-14
: Consider adding PARTITION BY for better query performance. Since this table stores time-series data, consider partitioning by month to improve query performance and data management. This is particularly important for rate limit data which is typically queried within specific time ranges.
CREATE TABLE ratelimits.ratelimits_per_month_v1 ( time DateTime, workspace_id String, namespace_id String, identifier String, passed Int64, total Int64 ) ENGINE = SummingMergeTree() +PARTITION BY toYYYYMM(time) ORDER BY (workspace_id, namespace_id, time, identifier)
2-14
: Add TTL for data retention management. Consider adding a TTL policy to automatically manage data retention. This helps in maintaining database performance and compliance with data retention policies.
CREATE TABLE ratelimits.ratelimits_per_month_v1 ( time DateTime, workspace_id String, namespace_id String, identifier String, passed Int64, total Int64 ) ENGINE = SummingMergeTree() ORDER BY (workspace_id, namespace_id, time, identifier) +TTL time + INTERVAL 12 MONTH DELETE
Note: Adjust the retention period (12 MONTH in this example) according to your data retention requirements.
internal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql (2)
4-4
: Consider using DateTime64 with timezone information. Using DateTime64 with timezone information would provide more precise timestamp tracking and avoid potential timezone-related issues.
- time DateTime,
+ time DateTime64(3, 'UTC'),
1-14
: Consider adding skip indexes for improved query performance. Adding skip indexes on frequently filtered columns could improve query performance, especially for workspace and namespace filtering.
CREATE TABLE ratelimits.ratelimits_per_minute_v1 (
  time DateTime,
  workspace_id String,
  namespace_id String,
  identifier String,
  passed Int64,
  total Int64
)
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(time)
-ORDER BY (workspace_id, namespace_id, time, identifier)
+ORDER BY (workspace_id, namespace_id, time, identifier)
+INDEX idx_workspace workspace_id TYPE bloom_filter GRANULARITY 1
+INDEX idx_namespace namespace_id TYPE bloom_filter GRANULARITY 1
internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql (2)
2-14
: Consider adding TTL for data retention. Rate limit data typically has a limited useful lifespan. Consider adding a TTL clause to automatically remove old data and manage storage growth:
ENGINE = MergeTree() -ORDER BY (workspace_id, namespace_id, time, identifier) +ORDER BY (workspace_id, namespace_id, time, identifier) +TTL time + INTERVAL 30 DAY DELETE
12-14
: Consider additional performance optimizations. For improved query performance, consider:
- Adding compression settings for the high-cardinality string columns
- Creating additional indices for common query patterns
ENGINE = MergeTree()
ORDER BY (workspace_id, namespace_id, time, identifier)
+SETTINGS
+  index_granularity = 8192,
+  enable_mixed_granularity_parts = 1
+COMPRESSION(
+  workspace_id LZ4,
+  namespace_id LZ4,
+  identifier LZ4
+)
internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql (1)
2-11
: Review table structure and data types. The table structure looks well-designed with appropriate data types:
- DateTime for temporal data
- String for IDs (common in ClickHouse)
- LowCardinality(String) for the outcome column, which likely has few distinct values
- Int64 for count is appropriate for summation
Consider adding a TTL (Time To Live) policy if this aggregated data doesn't need to be kept indefinitely. This can help manage storage costs and performance.
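A minimal sketch of such a TTL; the 13-month window is an assumption, not something specified in this PR, and should be aligned with billing and analytics retention requirements:

-- Hypothetical retention policy for the daily rollup; adjust the interval as needed.
ALTER TABLE verifications.key_verifications_per_day_v1
    MODIFY TTL time + INTERVAL 13 MONTH DELETE;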
internal/clickhouse/schema/011_create_telemetry_raw_sdks_v1.sql (2)
Line range hint
2-19
: Consider adding compression and sampling settings for better performance. The table structure is well-defined, but since this is a telemetry table that could grow large, consider the following optimizations:
- Add compression settings for the String columns
- Add sampling expression for efficient querying of large datasets
CREATE TABLE telemetry.raw_sdks_v1( request_id String, time Int64, runtime String, platform String, versions Array(String) ) ENGINE = MergeTree() +SAMPLE BY cityHash64(request_id) ORDER BY (request_id, time) +SETTINGS + min_bytes_for_wide_part = 10485760, + index_granularity = 8192, + enable_mixed_granularity_parts = 1 ;
Line range hint
2-19
: Consider adding TTL for data retention. Since this is telemetry data, it's recommended to implement a data retention policy using TTL to automatically manage storage and comply with data retention requirements.
CREATE TABLE telemetry.raw_sdks_v1( request_id String, time Int64, runtime String, platform String, versions Array(String) ) ENGINE = MergeTree() ORDER BY (request_id, time) +TTL time + INTERVAL 90 DAY DELETE ;
internal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql (1)
2-14
: LGTM! Consider adding performance optimizations. The table structure and engine choice are well-designed for aggregating verification metrics. The SummingMergeTree engine will automatically sum the count column during merges, which is perfect for this use case. Consider the following optimizations for production workloads:
- Add partitioning by time to improve query performance and enable efficient data retention management:
ENGINE = SummingMergeTree() +PARTITION BY toYYYYMM(time) ORDER BY (workspace_id, key_space_id, time, identity_id, key_id)
- Add a TTL for automated data retention:
ORDER BY (workspace_id, key_space_id, time, identity_id, key_id) +TTL time + INTERVAL 12 MONTH DELETE
- Consider compression settings for the high-cardinality String columns:
ENGINE = SummingMergeTree()
+SETTINGS index_granularity = 8192,
+  compress_min_count = 1000,
+  compress_min_size = 1024
internal/clickhouse/schema/004_create_verifications_key_verifications_per_hour_v1.sql (2)
2-14
: Consider adding TTL and partitioning for better performance. The table structure and SummingMergeTree engine choice are appropriate for aggregating verification metrics. However, consider these performance optimizations:
- Add TTL to automatically clean old data:
ENGINE = SummingMergeTree() +TTL time + INTERVAL 90 DAY DELETE ORDER BY (workspace_id, key_space_id, time, identity_id, key_id)
- Add partitioning by time for better query performance:
ENGINE = SummingMergeTree() +PARTITION BY toYYYYMM(time) ORDER BY (workspace_id, key_space_id, time, identity_id, key_id)
2-11
: Add documentation comments to explain column purposes. Consider adding comments to document the purpose of each column and any constraints. This will help future maintainers understand the schema design decisions.
Example:
CREATE TABLE verifications.key_verifications_per_hour_v1 (
+  -- Timestamp of the verification attempt
  time DateTime,
+  -- ID of the workspace that owns the key
  workspace_id String,
+  -- ID of the key space containing the key
  key_space_id String,
+  -- ID of the identity that attempted verification
  identity_id String,
+  -- ID of the key that was verified
  key_id String,
+  -- Result of the verification (success/failure)
  outcome LowCardinality(String),
+  -- Number of verification attempts in this hour
  count Int64
)
internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql (3)
1-4
: LGTM! Well-structured materialized view creation. The schema organization and naming convention follow best practices. The materialized view will improve billing query performance by pre-computing monthly aggregates.
Consider adding appropriate indexes on the target table to optimize query performance, especially on workspace_id and the time-based columns (see the sketch below).
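For example, a data-skipping index on the materialized view's target table could look like the sketch below; the index name, type, and granularity are assumptions, and the table's primary ORDER BY key may already cover these columns:

-- Hypothetical skipping index on the MV target table to speed up workspace lookups.
ALTER TABLE billing.billable_verifications_per_month_v1
    ADD INDEX idx_workspace_id workspace_id TYPE bloom_filter GRANULARITY 4;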
5-16
: Consider adding ORDER BY clause for consistent results. The aggregation logic is correct, but adding an ORDER BY clause would ensure consistent result ordering, which is important for billing reports.
Consider adding:
GROUP BY workspace_id, year, month +ORDER BY + workspace_id, + year DESC, + month DESC
1-21
: Add documentation for better maintainability. Consider adding comments to describe:
- The purpose of this materialized view
- The expected refresh frequency
- Any dependencies or assumptions
Add at the top of the file:
-- +goose up
+-- This materialized view aggregates billable verifications per month for workspace billing
+-- It is used by the billing system to calculate monthly usage charges
+-- The view is populated automatically when the source table is updated
internal/clickhouse/schema/019_create_ratelimits_ratelimits_per_hour_mv_v1.sql (2)
5-11
: Consider adding SETTINGS for materialized view optimization. The aggregation logic looks correct, but for better performance, consider adding materialized view settings like:
- allow_async_update_on_insertion: for asynchronous updates
- storage_policy: if specific storage requirements exist
GROUP BY workspace_id, namespace_id, identifier, time
+SETTINGS
+  allow_async_update_on_insertion=1
11-11
: Consider adding time range constraints. The time transformation looks correct, but consider if you need to limit the historical data range to prevent processing unnecessary old records.
toStartOfHour(fromUnixTimestamp64Milli(time)) AS time +WHERE time >= toUnixTimestamp64Milli(now() - INTERVAL 90 DAY)
internal/clickhouse/schema/020_create_ratelimits_ratelimits_per_day_mv_v1.sql (1)
5-12
: Consider adding ORDER BY hint for better performance. The query looks correct, but for better performance, consider adding an ORDER BY hint to help ClickHouse optimize the aggregation.
SELECT
  workspace_id,
  namespace_id,
  identifier,
  count(*) as total,
  countIf(passed > 0) as passed,
  toStartOfDay(fromUnixTimestamp64Milli(time)) AS time
-FROM ratelimits.raw_ratelimits_v1
+FROM ratelimits.raw_ratelimits_v1
+/* Order hint for better aggregation performance */
+ORDER BY workspace_id, namespace_id, identifier, time
internal/clickhouse/schema/021_create_ratelimits_ratelimits_per_month_mv_v1.sql (2)
2-4
: Consider adding IF NOT EXISTS clause. The materialized view creation could be more defensive to prevent errors if the view already exists.
-CREATE MATERIALIZED VIEW ratelimits.ratelimits_per_month_mv_v1 +CREATE MATERIALIZED VIEW IF NOT EXISTS ratelimits.ratelimits_per_month_mv_v1
1-22
: Consider adding documentation comments. The materialized view would benefit from documentation explaining:
- The purpose and business context
- The meaning of passed vs total metrics
- Any assumptions about the data
- Expected query patterns
-- +goose up
+-- This materialized view aggregates rate limit data on a monthly basis per workspace,
+-- namespace, and identifier. It tracks both successful (passed > 0) and total rate
+-- limit checks to help monitor usage patterns and compliance.
CREATE MATERIALIZED VIEW ratelimits.ratelimits_per_month_mv_v1
internal/clickhouse/schema/018_create_ratelimits_ratelimits_per_minute_mv_v1.sql (1)
5-11
: Consider adding indexes for query optimization.The view calculates two metrics (
passed
andtotal
) and performs time-based aggregation. To optimize query performance, consider adding indexes on frequently queried columns.Suggested improvements:
- Add an index on the
time
column for time-range queries- Consider a composite index on
(workspace_id, time)
if these are commonly queried togetherdeployment/grafana/grafana.yaml (1)
19-26
: Consider adding additional configuration parametersThe current configuration misses some important parameters that could improve reliability and security:
- Connection timeouts
- Max open connections
- Query timeouts
- Secure connection settings (SSL/TLS)
Consider adding these parameters to the jsonData section:
jsonData: defaultDatabase: database port: 9000 host: clickhouse username: ${CLICKHOUSE_USER} isDefault: true editable: true tlsSkipVerify: false + timeout: 30 + maxOpenConnections: 10 + queryTimeout: 60 + secure: true + validateCertificate: trueinternal/clickhouse/schema/008_create_verifications_key_verifications_per_day_mv_v1.sql (3)
5-12
: Consider adding SETTINGS for materialized view optimization.While the column selection and aggregation logic look correct, consider adding SETTINGS clause to optimize the materialized view:
GROUP BY workspace_id, key_space_id, identity_id, key_id, outcome, time +SETTINGS + allow_async_insert = 1, + async_insert = 1This can improve insert performance for high-throughput scenarios.
24-25
: Consider using CASCADE in DROP VIEW.For safer cleanup during rollback, consider adding CASCADE to handle any dependent objects.
-DROP VIEW verifications.key_verifications_per_day_mv_v1; +DROP VIEW IF EXISTS verifications.key_verifications_per_day_mv_v1 CASCADE;
1-25
: Add documentation comments for the materialized view.Consider adding comments to explain:
- The purpose of this materialized view
- The expected refresh frequency
- Any dependencies or assumptions
-- +goose up
+-- This materialized view aggregates daily key verification metrics
+-- It's used for analytics and reporting purposes
+-- The view is updated automatically as new data arrives in raw_key_verifications_v1
CREATE MATERIALIZED VIEW verifications.key_verifications_per_day_mv_v1
internal/clickhouse/schema/007_create_verifications_key_verifications_per_hour_mv_v1.sql (2)
12-12
: Consider adding a TTL for historical data.The materialized view aggregates data by hour, but there's no data retention policy. Consider adding a TTL to automatically remove old data and manage storage costs.
Example addition:
-- Add to target table definition TTL time + INTERVAL 90 DAY DELETE
1-25
: Consider adding documentation comments.The materialized view would benefit from documentation comments explaining:
- The purpose of the hourly aggregation
- The meaning of the
outcome
column- Any assumptions about the data
Add comments like:
-- Materialized view for hourly key verification metrics
-- Aggregates raw verification events into hourly buckets for performance optimization
-- outcome: represents the verification result (success/failure)
internal/clickhouse/schema/009_create_verifications_key_verifications_per_month_mv_v1.sql (2)
2-4
: Consider adding POPULATE clause for initial data.The materialized view creation looks good, but consider whether you need to backfill historical data.
If historical data needs to be included, modify the creation statement:
CREATE MATERIALIZED VIEW verifications.key_verifications_per_month_mv_v1 TO verifications.key_verifications_per_month_v1 +POPULATE AS
1-25
: Consider adding documentation comments.While the SQL is well-structured, adding documentation comments explaining the purpose of this materialized view and its refresh strategy would improve maintainability.
Add documentation at the top of the file:
-- +goose up
+-- This materialized view aggregates key verifications data on a monthly basis
+-- It's used for [describe specific use case]
+-- Note: The view is [describe refresh strategy: real-time/periodic]
CREATE MATERIALIZED VIEW verifications.key_verifications_per_month_mv_v1
apps/api/src/pkg/ratelimit/noop.ts (1)
10-10
: Consider extracting the repeated response object.Both
limit
andmultiLimit
return identical response objects. Consider extracting this into a constant or private method to improve maintainability.export class NoopRateLimiter implements RateLimiter { + private static readonly DEFAULT_RESPONSE: RatelimitResponse = { + current: 0, + passed: true, + reset: 0, + remaining: 0, + triggered: null, + }; + public async limit( _c: Context, _req: RatelimitRequest, ): Promise<Result<RatelimitResponse, RatelimitError>> { - return Ok({ current: 0, passed: true, reset: 0, remaining: 0, triggered: null }); + return Ok(NoopRateLimiter.DEFAULT_RESPONSE); } public async multiLimit( _c: Context, _req: Array<RatelimitRequest>, ): Promise<Result<RatelimitResponse, RatelimitError>> { - return Ok({ current: 0, passed: true, reset: 0, remaining: 0, triggered: null }); + return Ok(NoopRateLimiter.DEFAULT_RESPONSE); } }Also applies to: 16-16
internal/clickhouse/src/success.ts (2)
5-6
: Update misleading commentThe comment mentions "billable verifications" but the function actually retrieves active workspace counts.
-// get the billable verifications for a workspace in a specific month. +// get the count of active workspaces per month across all time.
7-26
: Consider adding error handling and time boundariesThe implementation could be improved in several ways:
- Add error handling for query failures
- Consider adding time boundaries to prevent retrieving unnecessary historical data
- Add input parameters for time range filtering
Here's a suggested improvement:
-export function getActiveWorkspacesPerMonth(ch: Querier) { - return async () => { +export function getActiveWorkspacesPerMonth(ch: Querier) { + return async ({ startTime, endTime }: { startTime?: Date; endTime?: Date } = {}) => { + try { const query = ch.query({ query: ` - SELECT - count(DISTINCT workspace_id) as workspaces, - time - FROM business.active_workspaces_per_month_v1 - GROUP BY time - ORDER BY time ASC - ;`, + SELECT + count(DISTINCT workspace_id) as workspaces, + time + FROM business.active_workspaces_per_month_v1 + WHERE 1=1 + ${startTime ? 'AND time >= {startTime: DateTime}' : ''} + ${endTime ? 'AND time <= {endTime: DateTime}' : ''} + GROUP BY time + ORDER BY time ASC + ;`, schema: z.object({ time: dateTimeToUnix, workspaces: z.number().int(), }), }); - return await query({}); + return await query({ + startTime, + endTime, + }); + } catch (error) { + throw new Error(`Failed to get active workspaces: ${error.message}`); + } }; }apps/dashboard/app/(app)/logs/page.tsx (1)
19-19
: Consider adding error handling and making the limit configurable.While the migration to
clickhouse.api.logs
is good, there are a few improvements to consider:
- Add error handling for the Clickhouse API call
- Consider making the limit configurable, possibly through URL parameters
Here's a suggested implementation:
- const logs = await clickhouse.api.logs({ workspaceId: workspace.id, limit: 10 }); + try { + const limit = Number(searchParams.get('limit') ?? '10'); + const logs = await clickhouse.api.logs({ + workspaceId: workspace.id, + limit: Math.min(Math.max(1, limit), 100) // Ensure limit is between 1 and 100 + }); + } catch (error) { + console.error('Failed to fetch logs:', error); + return <div>Failed to fetch logs. Please try again later.</div>; + }internal/clickhouse/src/client/interface.ts (2)
Line range hint
3-16
: Add JSDoc comments for better IDE integration.The
Querier
interface has a solid design with strong type safety using Zod schemas. Consider adding JSDoc comments to improve IDE integration and documentation:+/** + * Interface for executing typed queries against Clickhouse. + */ export interface Querier { + /** + * Executes a parameterized SQL query with type-safe inputs and outputs. + * @example + * const query = client.query({ + * query: "SELECT * FROM users WHERE id = {id: String}", + * params: z.object({ id: z.string() }), + * schema: z.object({ id: z.string(), name: z.string() }) + * }); + * const results = await query({ id: "123" }); + */ query<TIn extends z.ZodSchema<any>, TOut extends z.ZodSchema<any>>(req: { query: string; params?: TIn; schema: TOut; }): (params: z.input<TIn>) => Promise<z.output<TOut>[]>; }
Line range hint
18-24
: Enhance type safety and documentation for the Inserter interface.The
Inserter
interface has a good design supporting both single and batch inserts. Consider these improvements:+/** Response from an insert operation */ +export interface InsertResponse { + /** Whether the insert was executed successfully */ + executed: boolean; + /** Unique identifier for the executed query */ + query_id: string; +} +/** + * Interface for inserting typed data into Clickhouse tables. + */ export interface Inserter { + /** + * Creates an insert function for a specific table with type-safe input validation. + * @example + * const insert = client.insert({ + * table: "events", + * schema: z.object({ id: z.string(), timestamp: z.number() }) + * }); + * await insert({ id: "123", timestamp: Date.now() }); + */ insert<TSchema extends z.ZodSchema<any>>(req: { - table: string; + table: string & {}; // Makes the type more strict, preventing accidental `undefined` schema: TSchema; }): ( events: z.input<TSchema> | z.input<TSchema>[], - ) => Promise<{ executed: boolean; query_id: string }>; + ) => Promise<InsertResponse>; }internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql (3)
Line range hint
2-31
: Consider using DateTime64 for the time column and adding compression codecs.The current schema could be optimized in several ways:
- Use
DateTime64
instead ofInt64
for thetime
column to leverage ClickHouse's built-in datetime functions- Add compression codecs for better storage efficiency, especially for the String columns
CREATE TABLE verifications.raw_key_verifications_v1( request_id String, - time Int64, + time DateTime64(3) CODEC(Delta, ZSTD), - workspace_id String, - key_space_id String, - key_id String, + workspace_id String CODEC(ZSTD(1)), + key_space_id String CODEC(ZSTD(1)), + key_id String CODEC(ZSTD(1)),
32-32
: Add partitioning and TTL for better performance and data lifecycle management.The table lacks partitioning and TTL settings which are crucial for:
- Query performance optimization
- Automatic data cleanup
- Efficient data management
ENGINE = MergeTree() +PARTITION BY toYYYYMM(time) ORDER BY (workspace_id, key_space_id, key_id, time) +TTL time + INTERVAL 90 DAY +SETTINGS index_granularity = 8192
Line range hint
24-24
: Improve documentation for the identity_id column.The current comment for
identity_id
could be more descriptive about its purpose and usage.- -- Empty string if the key has no identity + -- Identifier for the authenticated entity using the key. + -- Contains an empty string if the key is not associated with any identity. identity_id String,internal/clickhouse/src/latest_verifications.ts (2)
4-8
: Consider enhancing input validation rules.While the basic string validation is good, consider adding more specific validation rules for the IDs to ensure data integrity:
const params = z.object({ - workspaceId: z.string(), - keySpaceId: z.string(), - keyId: z.string(), + workspaceId: z.string().min(1), + keySpaceId: z.string().min(1), + keyId: z.string().min(1).uuid(), });
11-29
: Improve SQL query formatting and table reference.The SQL query could be better formatted for readability. Also, consider extracting the table name to a constant.
+const TABLE_NAME = "default.raw_key_verifications_v1"; + const query = ch.query({ query: ` - SELECT - time, - outcome, - region, - FROM default.raw_key_verifications_v1 - WHERE workspace_id = {workspaceId: String} - AND key_space_id = {keySpaceId: String} - AND key_id = {keyId: String} - ORDER BY time DESC - LIMIT 1`, + SELECT time, outcome, region + FROM ${TABLE_NAME} + WHERE workspace_id = {workspaceId: String} + AND key_space_id = {keySpaceId: String} + AND key_id = {keyId: String} + ORDER BY time DESC + LIMIT 1`,internal/clickhouse/schema/002_create_metrics_raw_api_requests_v1.sql (2)
2-2
: Consider adding schema creation statement.The table is created under the
metrics
schema, but there's no explicit creation of the schema. While ClickHouse automatically creates schemas, it's better to be explicit.Add this statement before the table creation:
+CREATE DATABASE IF NOT EXISTS metrics; CREATE TABLE metrics.raw_api_requests_v1(
38-40
: Consider adding TTL for data retention.The table uses MergeTree engine with ordering by workspace_id, time, and request_id, which is good for query performance. However, consider adding a TTL policy to automatically manage data retention, as API request logs can grow significantly over time.
Example addition:
ENGINE = MergeTree() ORDER BY (workspace_id, time, request_id) +TTL time + INTERVAL 90 DAY DELETE ;
internal/clickhouse/src/last_used.ts (1)
5-11
: Consider adding environment variable validation.While the code handles the case where CLICKHOUSE_URL is undefined by using a Noop client, it might be better to fail fast in production environments where ClickHouse is required.
Consider adding this validation:
export async function getLastUsed(args: { workspaceId: string; keySpaceId: string; keyId: string; }) { const { CLICKHOUSE_URL } = env(); + if (process.env.NODE_ENV === 'production' && !CLICKHOUSE_URL) { + throw new Error('CLICKHOUSE_URL is required in production'); + }internal/clickhouse/src/billing.ts (3)
4-11
: Enhance function documentation for better clarity.While the month indexing note is important, consider expanding the documentation to include:
- Return value description
- Parameter constraints (e.g., month range 1-12)
- Example usage
-// get the billable verifications for a workspace in a specific month. -// month is not zero-indexed -> January = 1 +/** + * Get the total count of billable verifications for a workspace in a specific month. + * @param ch - ClickHouse querier instance + * @returns Async function that accepts workspace, year, and month parameters + * @example + * const getBillable = getBillableVerifications(ch); + * const count = await getBillable({ workspaceId: "w1", year: 2024, month: 1 }); + * @note Month is one-indexed (January = 1, December = 12) + */
12-30
: Consider optimizing the SQL query.The GROUP BY clause appears redundant since the WHERE conditions already ensure uniqueness of workspace_id, year, and month combination.
SELECT sum(count) as count FROM billing.billable_verifications_per_month_v1 WHERE workspace_id = {workspaceId: String} AND year = {year: Int64} AND month = {month: Int64} - GROUP BY workspace_id, year, month
32-37
: Consider adding error logging for unexpected cases.While the error handling is functional, adding logging for unexpected cases would help with debugging and monitoring.
const res = await query(args); if (!res) { + console.warn(`No results found for workspace ${args.workspaceId} in ${args.year}-${args.month}`); return 0; } + if (res.length > 1) { + console.warn(`Multiple results found for workspace ${args.workspaceId} in ${args.year}-${args.month}`); + } return res.at(0)?.count ?? 0;apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/page.tsx (2)
48-48
: Consider adding error handling for the Clickhouse query.The direct usage of
history
in the return statement assumes the Clickhouse query will always succeed. Consider adding error handling to gracefully handle potential query failures.- return <AccessTable verifications={history} />; + try { + const history = await clickhouse.verifications.logs({ + workspaceId: UNKEY_WORKSPACE_ID, + keySpaceId: key.keyAuthId, + keyId: key.id, + }); + return <AccessTable verifications={history} />; + } catch (error) { + console.error('Failed to fetch verification logs:', error); + return <div>Failed to load verification history. Please try again later.</div>; + }
Line range hint
1-48
: Consider adding data loading states.Since this is a Next.js page component fetching data, consider adding loading states to improve user experience during data fetching.
+import { Suspense } from 'react'; +import { LoadingSpinner } from '@/components/ui/loading-spinner'; +const VerificationHistory = async ({ history }: { history: typeof clickhouse.verifications.logs }) => { + return <AccessTable verifications={history} />; +} export default async function HistoryPage(props: { params: { keyId: string }; }) { // ... existing code ... - return <AccessTable verifications={history} />; + return ( + <Suspense fallback={<LoadingSpinner />}> + <VerificationHistory history={history} /> + </Suspense> + ); }apps/dashboard/app/(app)/ratelimits/sparkline.tsx (1)
Line range hint
33-38
: Consider adding ARIA labels for better accessibility.While the tooltip content is correctly updated, consider enhancing accessibility by adding ARIA labels to provide better context for screen readers.
<p className="text-content-subtle"> <strong className="text-content"> - {nFormatter(d.values.passed, { full: true })} + <span aria-label={`${d.values.passed} passed requests`}> + {nFormatter(d.values.passed, { full: true })} + </span> </strong>{" "} /{" "} - <strong className="text-content">{nFormatter(d.values.total, { full: true })}</strong>{" "} + <strong className="text-content"> + <span aria-label={`${d.values.total} total requests`}> + {nFormatter(d.values.total, { full: true })} + </span> + </strong>{" "} Passed </p>internal/clickhouse/src/logs.ts (3)
4-5
: Consider adding a type definition for the args object.While the function structure is good, we can improve type safety by extracting the args interface.
+interface GetLogsArgs { + workspaceId: string; + limit: number; +} + export function getLogs(ch: Querier) { - return async (args: { workspaceId: string; limit: number }) => { + return async (args: GetLogsArgs) => {
7-25
: Consider adding pagination support and index hints.The current query implementation has two potential performance concerns:
- It only uses LIMIT without proper pagination, which could be inefficient for large datasets
- The WHERE clause on workspace_id might benefit from index hints
Consider implementing cursor-based pagination using the
time
field and adding index hints:query: ` SELECT request_id, time, - workspace_id, + workspace_id FINAL, host, method, path, request_headers, request_body, response_status, response_headers, response_body, error, service_latency FROM default.raw_api_requests_v1 WHERE workspace_id = {workspaceId: String} + AND time < {cursor: DateTime} ORDER BY time DESC LIMIT {limit: Int}`,
37-42
: Consider optimizing data storage format for headers and body.Storing headers as array and body as string might not be the most efficient approach. Consider:
- Using a more structured format for headers (key-value pairs)
- Implementing compression for response/request bodies
- Adding size limits for the body fields
apps/www/components/stats.tsx (4)
Line range hint
6-28
: Consider refactoring repetitive database queries.The current implementation has duplicate code patterns across the three queries. Consider extracting this into a reusable function for better maintainability.
+async function getCount(table: typeof schema.workspaces | typeof schema.apis | typeof schema.keys): Promise<number> { + return db + .select({ count: sql<number>`count(*)` }) + .from(table) + .then((res) => res.at(0)?.count ?? 0) + .catch((err) => { + console.error(err); + return 0; + }); +} -const [workspaces, apis, keys] = await Promise.all([ - db.select({ count: sql<number>`count(*)` }) - .from(schema.workspaces) - // ... error handling - db.select({ count: sql<number>`count(*)` }) - .from(schema.apis) - // ... error handling - db.select({ count: sql<number>`count(*)` }) - .from(schema.keys) - // ... error handling -]); +const [workspaces, apis, keys] = await Promise.all([ + getCount(schema.workspaces), + getCount(schema.apis), + getCount(schema.keys), +]);
Line range hint
30-46
: Add loading and error states to improve user experience.The component could benefit from showing loading states while fetching data and error states when queries fail. Consider using a loading skeleton or spinner during data fetching.
Example implementation:
export function Stats() { const [isLoading, setIsLoading] = useState(true); const [error, setError] = useState<Error | null>(null); const [stats, setStats] = useState<{ workspaces: number; apis: number; keys: number; }>({ workspaces: 0, apis: 0, keys: 0 }); useEffect(() => { async function fetchStats() { try { const [workspaces, apis, keys] = await Promise.all([...]); setStats({ workspaces, apis, keys }); } catch (err) { setError(err instanceof Error ? err : new Error('Failed to fetch stats')); } finally { setIsLoading(false); } } fetchStats(); }, []); if (isLoading) return <LoadingSkeleton />; if (error) return <ErrorDisplay error={error} />; // ... rest of the component }
Line range hint
30-46
: Add JSDoc documentation for the Stats component.Consider adding component documentation to improve maintainability and help other developers understand the component's purpose and props.
+/** + * Stats component displays key metrics including the total number of workspaces, + * APIs, and keys in the system. + * + * @returns {JSX.Element} A card displaying various statistics + */ export function Stats() {
Enhance error handling in Stats component
The current error handling approach needs improvement since:
- The Stats component is rendered within a Suspense boundary but lacks proper error handling
- Silent failures returning 0 could mask database issues
- No user-facing error states are implemented
Recommendations:
- Implement error states in the StatList/StatListItem components to show when data fetching fails
- Add error logging to a monitoring service instead of just console.error
- Consider implementing a dedicated ErrorBoundary component for the Stats section
🔗 Analysis chain
Line range hint
6-28
: Enhance error handling strategy.Current error handling silently fails by returning 0. Consider:
- Adding error boundaries for React component
- Implementing proper error reporting
- Showing error states to users when queries fail
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check if error boundaries are implemented in parent components rg -l "ErrorBoundary" apps/www/Length of output: 31
Script:
#!/bin/bash # Let's check for error handling patterns in the codebase more broadly # 1. Look for error boundary implementations rg -l "class.*extends.*Error" apps/www/ # 2. Check for error state handling in components rg "error.*state|isError|hasError" apps/www/ # 3. Look for error handling utilities or hooks rg -l "useError|handleError|ErrorHandler" apps/www/ # 4. Check the parent component that uses stats.tsx fd stats.tsx apps/www/ -x rg -l "stats" apps/www/ \;Length of output: 440
Script:
#!/bin/bash # Let's examine the error handling in stat-list.tsx since it's related cat apps/www/components/stat-list.tsx # Also check how stats are rendered in the parent component rg -A 5 -B 5 "<Stats" apps/www/Length of output: 1636
apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/access-table.tsx (3)
Line range hint
38-40
: Consider using a more stable unique key.Using array indices as keys can lead to rendering issues if the array items are reordered or filtered. Consider using a combination of fields to create a unique key.
- // biome-ignore lint/suspicious/noArrayIndexKey: I got nothing better right now - <TableRow key={i}> + <TableRow key={`${verification.time}-${verification.region}-${verification.outcome}`}>
Line range hint
41-46
: Consider using Intl.DateTimeFormat for consistent date formatting.The current date formatting uses separate calls to
toDateString()
andtoTimeString()
. UsingIntl.DateTimeFormat
would provide more consistent and localized formatting.- <span className="text-content">{new Date(verification.time).toDateString()}</span> - <span className="text-xs text-content-subtle"> - {new Date(verification.time).toTimeString().split("(").at(0)} - </span> + {(() => { + const date = new Date(verification.time); + return ( + <> + <span className="text-content"> + {new Intl.DateTimeFormat('en-US', { dateStyle: 'long' }).format(date)} + </span> + <span className="text-xs text-content-subtle"> + {new Intl.DateTimeFormat('en-US', { timeStyle: 'medium' }).format(date)} + </span> + </> + ); + })()}
Line range hint
49-51
: Consider extracting outcome rendering to a separate component.The conditional rendering of the outcome could be extracted to a separate component for better maintainability and reusability.
+const VerificationOutcome: React.FC<{ outcome: string }> = ({ outcome }) => { + return outcome === "VALID" ? <Check /> : <Badge>{outcome}</Badge>; +}; // Then in the table: - {verification.outcome === "VALID" ? <Check /> : <Badge>{verification.outcome}</Badge>} + <VerificationOutcome outcome={verification.outcome} />apps/dashboard/app/(app)/ratelimits/card.tsx (2)
1-1
: LGTM! Consider adding error handling.The migration to Clickhouse looks clean and maintains parallel request efficiency. However, the Promise.all could benefit from error handling.
Consider wrapping the Promise.all with try/catch:
- const [history, lastUsed] = await Promise.all([ + const [history, lastUsed] = await Promise.all([ clickhouse.ratelimits.perMinute({ workspaceId: workspace.id, namespaceId: namespace.id, start: end - intervalMs, end, }), clickhouse.ratelimits .latest({ workspaceId: workspace.id, namespaceId: namespace.id }) - .then((res) => res.at(0)?.time), + .then((res) => res.at(0)?.time) + .catch(() => undefined), ]);Also applies to: 20-28
Line range hint
63-65
: Remove hardcoded datetime in time element.The time element contains a hardcoded datetime value "2024-03-11T19:38:06.192Z" which should be dynamic.
Apply this fix:
- <time dateTime="2024-03-11T19:38:06.192Z" title="20:38:06 CET"> + <time dateTime={new Date(lastUsed).toISOString()} title={new Date(lastUsed).toLocaleTimeString()}> {ms(Date.now() - lastUsed)} agointernal/clickhouse/src/index.ts (3)
21-23
: Add JSDoc documentation for the ClickHouseConfig typeConsider adding documentation to explain the purpose of the optional URL and its implications for the client behavior.
+/** + * Configuration for the ClickHouse client. + * @property url - Optional ClickHouse connection URL. If not provided, a Noop client will be used. + */ export type ClickHouseConfig = { url?: string; };
39-79
: Consider adding return types and result caching for gettersThe getter methods would benefit from explicit return type definitions and potential caching for frequently accessed data.
Example implementation for one getter:
interface VerificationsAPI { perHour: ReturnType<typeof getVerificationsPerHour>; perDay: ReturnType<typeof getVerificationsPerDay>; perMonth: ReturnType<typeof getVerificationsPerMonth>; latest: ReturnType<typeof getLatestVerifications>; } private verificationCache?: VerificationsAPI; public get verifications(): VerificationsAPI { if (!this.verificationCache) { this.verificationCache = { perHour: getVerificationsPerHour(this.client), perDay: getVerificationsPerDay(this.client), perMonth: getVerificationsPerMonth(this.client), latest: getLatestVerifications(this.client), }; } return this.verificationCache; }
25-80
: Consider implementing connection pooling and retry mechanismsFor production use, the ClickHouse client would benefit from:
- Connection pooling to handle multiple concurrent requests efficiently
- Retry mechanisms for transient failures
- Circuit breaker pattern for graceful degradation
This is especially important during the migration from Tinybird to ensure reliable operation.
apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/verification-table.tsx (3)
12-12
: LGTM! Consider using zod for runtime type safety.The migration from Tinybird to Clickhouse is properly reflected in both the imports and type definitions.
Consider adding zod schema validation for the verification data to ensure runtime type safety, especially since this data is coming from an external source:
import { z } from "zod"; const verificationSchema = z.object({ time: z.date(), region: z.string(), outcome: z.enum(["VALID", "RATE_LIMITED", "EXPIRED", "USAGE_EXCEEDED", "DISABLED", "FORBIDDEN", "INSUFFICIENT_PERMISSIONS"]) }); type LatestVerifications = z.infer<typeof verificationSchema>[];Also applies to: 20-20
43-54
: Great documentation! Consider adding a screenshot.The ASCII art comments effectively explain the visual grouping logic.
Consider adding a screenshot or link to a design spec in addition to the ASCII art to provide a more concrete visual reference for future developers.
65-65
: Consider using a more unique key strategy.While combining timestamp and index is better than using index alone, there could still be collisions if two verifications have the exact same timestamp.
Consider using a more unique identifier if available in the verification data, or create a more robust composite key:
-key={`${verification.time}-${i}`} +key={`${verification.time}-${verification.region}-${verification.outcome}-${i}`}apps/dashboard/components/array-input.tsx (1)
44-44
: Consider removing setSelected from dependency array.While including
setSelected
in the dependency array is technically correct, it's unnecessary as React guarantees that setState functions are stable and don't change between renders.- [selected, setSelected], + [selected],internal/clickhouse/src/active_keys.ts (2)
11-18
: Improve query readability and add error handling.Consider these improvements:
- Format the SQL query consistently with other functions
- Add error handling for potential ClickHouse query failures
const query = ch.query({ - query: ` SELECT count(DISTINCT keyId) as keys, time, FROM verifications.key_verifications_per_hour_v1 WHERE workspace_id = {workspaceId: String} AND key_space_id = {keySpaceId: String} AND time >= {start: Int64} AND time < {end: Int64} GROUP BY time - ORDER BY time ASC - WITH FILL - FROM toStartOfHour(fromUnixTimestamp64Milli({start: Int64})) - TO toStartOfHour(fromUnixTimestamp64Milli({end: Int64})) - STEP INTERVAL 1 HOUR - ;`, + query: ` + SELECT + count(DISTINCT keyId) as keys, + time + FROM verifications.key_verifications_per_hour_v1 + WHERE + workspace_id = {workspaceId: String} + AND key_space_id = {keySpaceId: String} + AND time >= {start: Int64} + AND time < {end: Int64} + GROUP BY time + ORDER BY time ASC + WITH FILL + FROM toStartOfHour(fromUnixTimestamp64Milli({start: Int64})) + TO toStartOfHour(fromUnixTimestamp64Milli({end: Int64})) + STEP INTERVAL 1 HOUR + ;`,Add try-catch for error handling:
try { return await query(args); } catch (error) { throw new Error(`Failed to fetch hourly active keys: ${error.message}`); }
4-114
: Consider refactoring to reduce code duplication.All three functions share similar structure and differ mainly in their time granularity. Consider extracting the common logic into a reusable function.
type TimeGranularity = 'hour' | 'day' | 'month'; function getActiveKeys(ch: Querier, granularity: TimeGranularity) { const tableMap: Record<TimeGranularity, string> = { hour: 'key_verifications_per_hour_v1', day: 'key_verifications_per_day_v1', month: 'key_verifications_per_month_v1', }; const timeFunction: Record<TimeGranularity, string> = { hour: 'toStartOfHour', day: 'toStartOfDay', month: 'toStartOfMonth', }; const intervalStep: Record<TimeGranularity, string> = { hour: 'HOUR', day: 'DAY', month: 'MONTH', }; return async (args: { workspaceId: string; keySpaceId: string; start: number; end: number; }) => { const query = ch.query({ query: ` SELECT count(DISTINCT keyId) as keys, time FROM verifications.${tableMap[granularity]} WHERE workspace_id = {workspaceId: String} AND key_space_id = {keySpaceId: String} AND time >= {start: Int64} AND time < {end: Int64} GROUP BY time ORDER BY time ASC WITH FILL FROM ${timeFunction[granularity]}(fromUnixTimestamp64Milli({start: Int64})) TO ${timeFunction[granularity]}(fromUnixTimestamp64Milli({end: Int64})) STEP INTERVAL 1 ${intervalStep[granularity]} ;`, params: z.object({ workspaceId: z.string(), keySpaceId: z.string(), start: z.number().int(), end: z.number().int(), }), schema: z.object({ keys: z.number().int(), time: z.number().int(), }), }); try { return await query(args); } catch (error) { throw new Error(`Failed to fetch ${granularity}ly active keys: ${error.message}`); } }; } export const getActiveKeysPerHour = (ch: Querier) => getActiveKeys(ch, 'hour'); export const getActiveKeysPerDay = (ch: Querier) => getActiveKeys(ch, 'day'); export const getActiveKeysPerMonth = (ch: Querier) => getActiveKeys(ch, 'month');apps/dashboard/app/(app)/success/page.tsx (1)
67-68
: Consider adding error handling for the Clickhouse query.While the implementation is cleaner, it would be beneficial to add error handling for the Clickhouse query to gracefully handle potential failures.
- const activeWorkspaces = await clickhouse.business.activeWorkspaces(); - const chartData = activeWorkspaces.map(({ time, workspaces }) => ({ + const activeWorkspaces = await clickhouse.business.activeWorkspaces().catch((error) => { + console.error('Failed to fetch active workspaces:', error); + return []; + }); + const chartData = activeWorkspaces.map(({ time, workspaces }) => ({apps/api/src/pkg/middleware/metrics.ts (1)
128-135
: Consider extracting geographical data parsing to a helper function.The pattern of extracting geographical data with ts-ignore comments is repeated in the code. Consider creating a helper function to reduce duplication and centralize the type handling.
+ function extractGeoData(req: Request) { + // @ts-expect-error - Cloudflare specific fields + const cf = req.raw?.cf; + return { + continent: cf?.continent, + country: cf?.country, + colo: cf?.colo, + city: cf?.city, + }; + } // Usage in both places: - // @ts-ignore - this is a bug in the types - continent: c.req.raw?.cf?.continent, - // @ts-ignore - this is a bug in the types - country: c.req.raw?.cf?.country, - // @ts-ignore - this is a bug in the types - colo: c.req.raw?.cf?.colo, - // @ts-ignore - this is a bug in the types - city: c.req.raw?.cf?.city, + ...extractGeoData(c.req)apps/play/app/page-bk.tsx (3)
Line range hint
52-54
: Fix invalid JSON format in error message.The error message for invalid curl commands has incorrect JSON syntax and is inconsistent with the error handling pattern used elsewhere in the code.
Apply this diff to maintain consistent error handling:
if (!curlString.includes("curl")) { - postNewLine('{"Error", "Invalid Curl Command"}', "text-red-500"); + postNewLine(JSON.stringify({ error: "Invalid Curl Command" }, null, 2), "text-red-500"); return; }
Line range hint
46-50
: Avoid state mutation and use functional updates.The current implementation mutates state directly before updating it, which is an anti-pattern in React. Use a functional update to ensure state updates are handled correctly.
Apply this diff to implement a proper state update:
function postNewLine(input: string, color: string) { - const temp = historyItems; - temp.push({ content: input, color: color }); - setHistoryItems([...temp]); + setHistoryItems(prev => [...prev, { content: input, color }]); }
Line range hint
1-190
: Consider splitting component responsibilities.The component currently handles multiple concerns:
- UI rendering
- Terminal command processing
- Curl command parsing
- State management
- Response handling
Consider extracting these responsibilities into:
- Custom hooks for state and command processing
- Separate service for curl command handling
- Smaller UI components for terminal and history
This would improve maintainability, testability, and reusability.
Would you like help with this refactoring?
apps/api/src/pkg/analytics.ts (4)
Line range hint
133-149
: Consider documenting the verification outcomes.The implementation looks good, but consider adding JSDoc comments to document the meaning of each outcome in the enum for better maintainability.
outcome: z.enum([ + // Key is valid and active "VALID", + // Key exceeded rate limits "RATE_LIMITED", + // Key has expired "EXPIRED", + // Key is disabled "DISABLED", + // Key is not authorized "FORBIDDEN", + // Key exceeded usage limits "USAGE_EXCEEDED", + // Key lacks required permissions "INSUFFICIENT_PERMISSIONS", ]),
Line range hint
157-179
: Consider adding length limits to string fields.The implementation looks good, but consider adding maximum length constraints to string fields like
request_body
,response_body
, anduser_agent
to prevent potential performance issues with extremely large strings.request_body: z.string(), + request_body: z.string().max(1000000), // Adjust limit as needed response_body: z.string(), + response_body: z.string().max(1000000), // Adjust limit as needed user_agent: z.string(), + user_agent: z.string().max(1000), // Adjust limit as needed
Line range hint 1-265: Consider documenting the migration strategy.
The file shows a well-structured incremental migration from Tinybird to ClickHouse. To help team members understand the migration process and timeline:
- Consider adding a top-level comment documenting the migration strategy
- Add TODO comments with ticket numbers for remaining Tinybird cleanup
- Consider creating a migration tracking issue to coordinate the removal of Tinybird components
Would you like me to help create a migration tracking issue with a checklist of remaining Tinybird components to remove?
Migration to ClickHouse telemetry is incomplete
The code shows that while the new insertSdkTelemetry method has been added, the middleware in metrics.ts is still using the old ingestSdkTelemetry method. This needs to be updated to use the new ClickHouse implementation before removing the Tinybird code.
- apps/api/src/pkg/middleware/metrics.ts: Update the telemetry middleware to use analytics.insertSdkTelemetry instead of analytics.ingestSdkTelemetry
- apps/api/src/pkg/analytics.ts: Remove the old Tinybird implementation only after updating all its usages
🔗 Analysis chain
Line range hint 44-53: Consider removing the Tinybird implementation.
The new ClickHouse implementation looks good. Since the PR objectives mention removing unused Tinybird components, consider removing the old ingestSdkTelemetry implementation in a follow-up PR.
Let me check the actual usage of ingestSdkTelemetry in these files to determine if it's still actively used.
Let me check if there's a new method being used in metrics.ts to ensure we're not removing something that hasn't been migrated yet.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Verify if the old Tinybird telemetry endpoint is still in use
rg -l "ingestSdkTelemetry" --type ts
```
Length of output: 105
Script:
```shell
#!/bin/bash
# Check the actual usage of ingestSdkTelemetry in both files
echo "=== analytics.ts ==="
rg "ingestSdkTelemetry" -A 5 -B 5 apps/api/src/pkg/analytics.ts
echo -e "\n=== metrics.ts ==="
rg "ingestSdkTelemetry" -A 5 -B 5 apps/api/src/pkg/middleware/metrics.ts
```
Length of output: 916
Script:
```shell
#!/bin/bash
# Check if the new insertSdkTelemetry is used in metrics.ts
echo "=== Checking metrics.ts for new method ==="
rg "insertSdkTelemetry" apps/api/src/pkg/middleware/metrics.ts
# Also check the full context of the telemetry middleware
echo -e "\n=== Full telemetry context in metrics.ts ==="
rg -A 10 -B 10 "analytics\.ingestSdkTelemetry" apps/api/src/pkg/middleware/metrics.ts
```
Length of output: 1043
apps/api/src/pkg/ratelimit/client.ts (3)
Line range hint 165-170: Consider adding a debug log for cache hits.
While the cache hit logic is correct, adding debug logging would help with monitoring and troubleshooting cache effectiveness.
```diff
 if (cached.current >= req.limit) {
+  this.logger.debug("rate limit cache hit", { identifier: req.identifier, current: cached.current });
   return Ok({
     passed: false,
     current: cached.current,
     reset,
     remaining: 0,
     triggered: req.name,
   });
 }
```
Line range hint 227-232: Consider consolidating response object creation.
Both response objects share similar structure. Consider extracting a helper function to create the response object to reduce duplication and ensure consistency.
```diff
+  private createResponse(passed: boolean, current: number, reset: number, limit: number, name: string | null): RatelimitResponse {
+    return {
+      passed,
+      current,
+      reset,
+      remaining: limit - current,
+      triggered: passed ? null : name,
+    };
+  }

 // Then use it like:
-  return Ok({
-    passed: false,
-    current: cached.current,
-    reset,
-    remaining: req.limit - cached.current,
-    triggered: req.name,
-  });
+  return Ok(this.createResponse(false, cached.current, reset, req.limit, req.name));
```
Also applies to: 237-242
Line range hint 1-305: Consider adding retry configuration parameters.
The retry logic uses hardcoded values (3 attempts) in multiple places. Consider making these configurable through the constructor to allow for different retry strategies in different environments.
```diff
 export class AgentRatelimiter implements RateLimiter {
   private readonly logger: Logger;
   private readonly metrics: Metrics;
   private readonly cache: Map<string, { reset: number; current: number }>;
   private readonly agent: Agent;
+  private readonly maxRetries: number;
   constructor(opts: {
     agent: { url: string; token: string };
     logger: Logger;
     metrics: Metrics;
     cache: Map<string, { reset: number; current: number }>;
+    maxRetries?: number;
   }) {
     this.logger = opts.logger;
     this.metrics = opts.metrics;
     this.cache = opts.cache;
     this.agent = new Agent(opts.agent.url, opts.agent.token, this.metrics, this.logger);
+    this.maxRetries = opts.maxRetries ?? 3;
   }
```
apps/api/src/routes/v1_ratelimit_limit.ts (2)
Line range hint 356-394: Consider abstracting context collection.
The context collection logic is duplicated across different analytics calls. Consider creating a helper function to centralize this logic:
```typescript
function getRequestContext(c: Context) {
  return {
    ipAddress: c.req.header("True-Client-IP") ?? c.req.header("CF-Connecting-IP") ?? "",
    userAgent: c.req.header("User-Agent") ?? "",
    country: c.req.raw?.cf?.country ?? "",
    continent: c.req.raw?.cf?.continent ?? "",
    city: c.req.raw?.cf?.city ?? "",
    colo: c.req.raw?.cf?.colo ?? "",
  };
}
```
Document migration strategy.
The code maintains both Clickhouse and Tinybird implementations. Consider adding a comment explaining the migration strategy and timeline for removing Tinybird.
Line range hint 395-444: Reduce duplication in audit log generation.
The audit log payload is duplicated between generic and Tinybird implementations. Consider extracting the common fields:
```typescript
function createAuditLogPayload(c: Context, namespace: RatelimitNamespace, rootKey: RootKey, ratelimitResponse: RatelimitResponse) {
  return {
    auditLogId: newId("auditLog"),
    workspaceId: rootKey.authorizedWorkspaceId,
    bucket: namespace.id,
    actor: {
      type: "key",
      id: rootKey.key.id,
    },
    description: "ratelimit",
    event: ratelimitResponse.passed ? "ratelimit.success" : "ratelimit.denied",
    meta: {
      requestId: c.get("requestId"),
      namespacId: namespace.id,
      identifier: req.identifier,
      success: ratelimitResponse.passed,
    },
    time: Date.now(),
    resources: req.resources,
    context: {
      location: c.req.header("True-Client-IP") ?? "",
      userAgent: c.req.header("User-Agent") ?? "",
    },
  };
}
```
apps/dashboard/app/(app)/settings/billing/page.tsx (2)
12-12: LGTM! Consider future migration of ratelimits to Clickhouse.
The import changes correctly reflect the migration to Clickhouse for verifications while maintaining Tinybird for ratelimits.
Consider migrating the ratelimits functionality to Clickhouse in a future PR to maintain consistency and reduce dependencies.
Also applies to: 15-15
188-192: LGTM! Consider enhancing error handling.
The migration to Clickhouse for paid tier billing is implemented correctly. However, consider adding error handling for the Promise.all to gracefully handle potential failures in either the verifications or ratelimits requests.
```diff
   const [usedVerifications, usedRatelimits] = await Promise.all([
     clickhouse.billing.billableVerifications({
       workspaceId: workspace.id,
       year,
       month,
     }),
     ratelimits({
       workspaceId: workspace.id,
       year,
       month,
     }).then((res) => res.data.at(0)?.success ?? 0),
-  ]);
+  ]).catch(error => {
+    console.error('Failed to fetch usage metrics:', error);
+    return [0, 0]; // Provide fallback values
+  });
```
internal/clickhouse/src/client/noop.ts (1)
Line range hint 14-17: Validation result in query method is not checked
In the query method, the result of safeParse is not being verified. This means that invalid parameters may pass through without raising an error, potentially causing issues later.
Apply this diff to handle validation errors consistently:
```diff
 public query<TIn extends z.ZodSchema<any>, TOut extends z.ZodSchema<any>>(req: {
   // ...
 }): (params: z.input<TIn>) => Promise<z.output<TOut>[]> {
   return async (params: z.input<TIn>): Promise<z.output<TOut>[]> => {
-    req.params?.safeParse(params);
+    const validationResult = req.params?.safeParse(params);
+    if (validationResult && !validationResult.success) {
+      throw new Error(validationResult.error.message);
+    }
     return [];
   };
 }
```
internal/clickhouse/src/verifications.ts (1)
31-47
: Simplify SQL queries by removing unnecessary 'GROUP BY' clausesThe
GROUP BY time
clause is unnecessary when the aggregation is already handled by the materialized views in ClickHouse. This could potentially cause errors if not all selected columns are included in theGROUP BY
clause.Consider removing the
GROUP BY
clause to simplify the query. Apply this diff as an example:- GROUP BY time
Also applies to: 60-76, 85-101
apps/dashboard/app/(app)/identities/[identityId]/page.tsx (1)
165-171
: Simplify the asynchronous call by removing unnecessary.then()
afterawait
.Since you're already using
await
, you can handle the response directly without chaining.then()
. This improves readability and aligns with async/await best practices.Apply this diff to simplify the code:
- const lastUsed = await clickhouse.verifications - .latest({ - workspaceId: props.workspaceId, - keySpaceId: props.keySpaceId, - keyId: props.keyId, - }) - .then((res) => res.at(0)?.time ?? null); + const res = await clickhouse.verifications.latest({ + workspaceId: props.workspaceId, + keySpaceId: props.keySpaceId, + keyId: props.keyId, + }); + const lastUsed = res.at(0)?.time ?? null;apps/dashboard/app/(app)/ratelimits/[namespaceId]/logs/page.tsx (2)
125-127
: Consider enhancing error handling for server-side operationsCurrently, errors are logged using
console.error(err)
before rethrowing them. On the server side, relying onconsole.error
may not be sufficient for monitoring and debugging. Consider integrating a logging library or service (e.g., Winston, Pino, or Sentry) to capture and manage errors more effectively.
191-191
: Placeholders used instead of actual data for 'Remaining' and 'IP Address'The table cells for 'Remaining' and 'IP address' display
'--'
as placeholders. If this data is not available from the current data source, consider:
- Fetching and displaying the actual values if possible.
- Providing a tooltip or note explaining why the data is unavailable.
- Removing the columns if they no longer serve a purpose.
Also applies to: 194-194
internal/clickhouse/src/ratelimits.ts (1)
111-112
: Ensure consistent use ofz.infer
vsz.input
in function argument typesIn your functions
getRatelimitsPerMonth
,getRatelimitLogs
, andgetRatelimitLastUsed
, you are usingz.input<typeof params>
for the argument types, whereas in the previous functions you usedz.infer<typeof params>
. For consistency and to prevent potential type mismatches, consider usingz.infer<typeof params>
unless there is a specific reason to usez.input
.Apply this diff for consistency:
-export function getRatelimitsPerMonth(ch: Querier) { - return async (args: z.input<typeof params>) => { +export function getRatelimitsPerMonth(ch: Querier) { + return async (args: z.infer<typeof params>) => { ... -export function getRatelimitLogs(ch: Querier) { - return async (args: z.input<typeof getRatelimitLogsParameters>) => { +export function getRatelimitLogs(ch: Querier) { + return async (args: z.infer<typeof getRatelimitLogsParameters>) => { ... -export function getRatelimitLastUsed(ch: Querier) { - return async (args: z.input<typeof getRatelimitLastUsedParameters>) => { +export function getRatelimitLastUsed(ch: Querier) { + return async (args: z.infer<typeof getRatelimitLastUsedParameters>) => {Also applies to: 154-155, 190-191
apps/dashboard/app/(app)/ratelimits/[namespaceId]/page.tsx (4)
1-1
: Remove unused import.The
StackedBarChart
import is not being used in the code. Please remove it to keep the import statements clean.-import { StackedColumnChart } from "@/components/dashboard/charts";
43-43
: Simplify the redirect logic.Instead of calling the
redirect
function separately, you can directly return its result to simplify the code.- redirect("/ratelimits"); + return redirect("/ratelimits");
84-91
: Rename variables for consistency.The variable names have been updated from
successOverTime
topassedOverTime
, andfailedOverTime
toratelimitedOverTime
. This change aligns with the new naming convention used in theratelimitEvents
data.Please ensure that all references to these variables have been updated accordingly throughout the file.
119-120
: Update the API call snippet.The headers in the API call snippet have been updated to include the
Content-Type
andAuthorization
headers. This change aligns with the current API requirements.Please ensure that the documentation and any other references to this API call are updated to reflect these changes.
apps/dashboard/app/(app)/apis/[apiId]/page.tsx (3)
Line range hint
30-33
: Avoid mutating the same Date object when calculating billing cycle dates.The variable
t
is being mutated multiple times, which can lead to unintended side effects. To ensure accurate date calculations, consider creating new Date instances when calculatingbillingCycleEnd
.Apply this diff to prevent mutations:
const t = new Date(); t.setUTCDate(1); t.setUTCHours(0, 0, 0, 0); const billingCycleStart = t.getTime(); -const billingCycleEnd = t.setUTCMonth(t.getUTCMonth() + 1) - 1; +const billingCycleEnd = new Date(t).setUTCMonth(t.getUTCMonth() + 1) - 1;Alternatively, clone the date object before modifying:
const t = new Date(); t.setUTCDate(1); t.setUTCHours(0, 0, 0, 0); const billingCycleStart = t.getTime(); +const tEnd = new Date(t.getTime()); const billingCycleEnd = tEnd.setUTCMonth(tEnd.getUTCMonth() + 1) - 1;
87-109
: Add a default case to the switch statement to handle unexpectedd.outcome
values.The switch statement lacks a default case, which means any unexpected
d.outcome
values will be silently ignored. Adding a default case improves robustness by handling unforeseen outcomes.Apply this diff to include a default case:
switch (d.outcome) { case "VALID": successOverTime.push({ x, y: d.count }); break; // Other cases... case "FORBIDDEN": forbiddenOverTime.push({ x, y: d.count }); break; + default: + // Optionally handle unexpected outcomes + console.warn(`Unhandled verification outcome: ${d.outcome}`); + break; }
Line range hint
279-319
: RefactorprepareInterval
function to reduce code duplication.The
prepareInterval
function contains repetitive code across different cases. Consider refactoring to eliminate duplication and improve maintainability.You can create a mapping object or use variables to consolidate common logic. Here's an example:
function prepareInterval(interval: Interval) { const now = new Date(); + let end: number; + let intervalMs: number; + let granularity: number; + let getVerificationsPerInterval: Function; + let getActiveKeysPerInterval: Function; switch (interval) { case "24h": { end = now.setUTCHours(now.getUTCHours() + 1, 0, 0, 0); intervalMs = 1000 * 60 * 60 * 24; granularity = 1000 * 60 * 60; getVerificationsPerInterval = clickhouse.verifications.perHour; getActiveKeysPerInterval = clickhouse.activeKeys.perHour; break; } case "7d": case "30d": case "90d": { now.setUTCDate(now.getUTCDate() + 1); end = now.setUTCHours(0, 0, 0, 0); intervalMs = interval === "7d" ? 1000 * 60 * 60 * 24 * 7 : interval === "30d" ? 1000 * 60 * 60 * 24 * 30 : 1000 * 60 * 60 * 24 * 90; granularity = 1000 * 60 * 60 * 24; getVerificationsPerInterval = clickhouse.verifications.perDay; getActiveKeysPerInterval = clickhouse.activeKeys.perDay; break; } } return { start: end - intervalMs, end, intervalMs, granularity, getVerificationsPerInterval, getActiveKeysPerInterval, }; }apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/page.tsx (1)
111-111
: Avoid mutating the originalverifications
array when sortingSorting the
verifications
array in-place can lead to unexpected side effects if it's used elsewhere in the code. Consider sorting a copy of the array to prevent potential issues.Apply this diff to sort a copy of the array:
-for (const d of verifications.sort((a, b) => a.time - b.time)) { +for (const d of [...verifications].sort((a, b) => a.time - b.time)) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
- pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
- Taskfile.yml (1 hunks)
- apps/agent/pkg/clickhouse/schema/005_create_mv_key_verifications_per_day.sql (0 hunks)
- apps/api/package.json (1 hunks)
- apps/api/src/pkg/analytics.ts (7 hunks)
- apps/api/src/pkg/keys/service.ts (1 hunks)
- apps/api/src/pkg/middleware/metrics.ts (1 hunks)
- apps/api/src/pkg/ratelimit/client.ts (8 hunks)
- apps/api/src/pkg/ratelimit/interface.ts (1 hunks)
- apps/api/src/pkg/ratelimit/noop.ts (1 hunks)
- apps/api/src/routes/v1_ratelimit_limit.ts (5 hunks)
- apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/page.tsx (11 hunks)
- apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/verification-table.tsx (3 hunks)
- apps/dashboard/app/(app)/apis/[apiId]/page.tsx (11 hunks)
- apps/dashboard/app/(app)/banner.tsx (2 hunks)
- apps/dashboard/app/(app)/identities/[identityId]/page.tsx (3 hunks)
- apps/dashboard/app/(app)/logs/page.tsx (2 hunks)
- apps/dashboard/app/(app)/ratelimits/[namespaceId]/logs/page.tsx (4 hunks)
- apps/dashboard/app/(app)/ratelimits/[namespaceId]/overrides/table.tsx (3 hunks)
- apps/dashboard/app/(app)/ratelimits/[namespaceId]/page.tsx (12 hunks)
- apps/dashboard/app/(app)/ratelimits/card.tsx (2 hunks)
- apps/dashboard/app/(app)/ratelimits/sparkline.tsx (2 hunks)
- apps/dashboard/app/(app)/settings/billing/page.tsx (3 hunks)
- apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/access-table.tsx (1 hunks)
- apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/page.tsx (3 hunks)
- apps/dashboard/app/(app)/settings/root-keys/[keyId]/layout.tsx (3 hunks)
- apps/dashboard/app/(app)/settings/root-keys/[keyId]/page.tsx (4 hunks)
- apps/dashboard/app/(app)/success/page.tsx (3 hunks)
- apps/dashboard/components/array-input.tsx (2 hunks)
- apps/dashboard/lib/analytics.tsx (0 hunks)
- apps/dashboard/lib/charts/sparkline.tsx (3 hunks)
- apps/dashboard/lib/clickhouse.ts (1 hunks)
- apps/dashboard/lib/clickhouse/index.ts (0 hunks)
- apps/dashboard/lib/tinybird.ts (1 hunks)
- apps/dashboard/middleware.ts (0 hunks)
- apps/dashboard/package.json (1 hunks)
- apps/play/app/page-bk.tsx (1 hunks)
- apps/www/.env.example (1 hunks)
- apps/www/components/stats.tsx (1 hunks)
- apps/www/lib/env.ts (0 hunks)
- apps/www/lib/tinybird.ts (0 hunks)
- deployment/grafana/grafana.yaml (1 hunks)
- internal/clickhouse/package.json (1 hunks)
- internal/clickhouse/schema/000_README.md (2 hunks)
- internal/clickhouse/schema/001_create_databases.sql (1 hunks)
- internal/clickhouse/schema/002_create_metrics_raw_api_requests_v1.sql (2 hunks)
- internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql (2 hunks)
- internal/clickhouse/schema/004_create_verifications_key_verifications_per_hour_v1.sql (1 hunks)
- internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql (1 hunks)
- internal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql (1 hunks)
- internal/clickhouse/schema/007_create_verifications_key_verifications_per_hour_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/008_create_verifications_key_verifications_per_day_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/009_create_verifications_key_verifications_per_month_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql (1 hunks)
- internal/clickhouse/schema/011_create_telemetry_raw_sdks_v1.sql (2 hunks)
- internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql (1 hunks)
- internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql (1 hunks)
- internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql (1 hunks)
- internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql (1 hunks)
- internal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql (1 hunks)
- internal/clickhouse/schema/018_create_ratelimits_ratelimits_per_minute_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/019_create_ratelimits_ratelimits_per_hour_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/020_create_ratelimits_ratelimits_per_day_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/021_create_ratelimits_ratelimits_per_month_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql (1 hunks)
- internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/024_create_ratelimits_last_used_mv_v1.sql (1 hunks)
- internal/clickhouse/src/active_keys.ts (1 hunks)
- internal/clickhouse/src/billing.ts (1 hunks)
- internal/clickhouse/src/client/client.ts (2 hunks)
- internal/clickhouse/src/client/interface.ts (2 hunks)
- internal/clickhouse/src/client/noop.ts (1 hunks)
- internal/clickhouse/src/index.ts (1 hunks)
- internal/clickhouse/src/last_used.ts (1 hunks)
- internal/clickhouse/src/latest_verifications.ts (1 hunks)
- internal/clickhouse/src/logs.ts (1 hunks)
- internal/clickhouse/src/ratelimits.ts (1 hunks)
- internal/clickhouse/src/success.ts (1 hunks)
- internal/clickhouse/src/util.ts (1 hunks)
- internal/clickhouse/src/verifications.ts (1 hunks)
- internal/db/src/schema/workspaces.ts (1 hunks)
- internal/db/src/types.ts (1 hunks)
⛔ Files not processed due to max files limit (3)
- tools/migrate/auditlog-import.ts
- tools/migrate/package.json
- tools/migrate/tinybird-export.ts
💤 Files with no reviewable changes (6)
- apps/agent/pkg/clickhouse/schema/005_create_mv_key_verifications_per_day.sql
- apps/dashboard/lib/analytics.tsx
- apps/dashboard/lib/clickhouse/index.ts
- apps/dashboard/middleware.ts
- apps/www/lib/env.ts
- apps/www/lib/tinybird.ts
✅ Files skipped from review due to trivial changes (5)
- apps/api/package.json
- apps/dashboard/lib/clickhouse.ts
- apps/dashboard/package.json
- apps/www/.env.example
- internal/clickhouse/package.json
🔇 Additional comments (124)
internal/clickhouse/src/util.ts (1)
1-2: LGTM! Clear import and helpful documentation.
The import statement is appropriate, and the comment effectively explains why this utility is needed.
internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql (1)
12-13: LGTM!
The DROP TABLE statement correctly removes the table during rollback.
internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql (1)
9-10: Good choice of SummingMergeTree engine.
The SummingMergeTree engine is perfect for this use case as it:
- Automatically sums the count column during merges
- Provides efficient querying by workspace_id, year, and month
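As a quick illustration of that behavior (a sketch only; the workspace id is a hypothetical placeholder): rows that share the sorting key are collapsed and their count summed during background merges, so reads should still aggregate explicitly because merges happen asynchronously.
```sql
-- Sketch: querying the SummingMergeTree rollup; rows with the same key collapse on merge,
-- but merges are async, so sum(count) keeps results correct either way.
SELECT
  workspace_id,
  year,
  month,
  sum(count) AS count
FROM billing.billable_verifications_per_month_v1
WHERE workspace_id = 'ws_123' -- hypothetical workspace id
GROUP BY workspace_id, year, month;
```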
internal/clickhouse/schema/001_create_databases.sql (1)
1-2
: LGTM: Migration structure follows best practices.
The migration file correctly uses Goose markers and maintains a clear separation between up and down migrations.
Also applies to: 11-11
internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql (2)
17-18
: LGTM: Clean migration rollback.
The down migration correctly drops the table, providing a clean rollback path.
12-13: Verify the ordering key selection for query patterns.
The current ORDER BY clause (workspace_id, namespace_id, time, identifier) is optimized for queries filtering by workspace and namespace first. Ensure this aligns with your most common query patterns.
Consider these alternatives based on your query patterns:
- If you frequently query recent data: ORDER BY (time, workspace_id, namespace_id, identifier)
- If you need workspace-specific time ranges: ORDER BY (workspace_id, time)
✅ Verification successful
Based on the search results, I can now generate a final response. The results show the actual query patterns in internal/clickhouse/src/ratelimits.ts
and the table structure through the schema files.
The current ORDER BY clause is optimal for the observed query patterns
The query pattern in internal/clickhouse/src/ratelimits.ts
shows that queries are indeed primarily filtered by workspace_id
and namespace_id
first, followed by time-based filtering. The current ordering (workspace_id, namespace_id, time, identifier)
aligns perfectly with this access pattern:
- Primary filters:
WHERE workspace_id = {workspaceId: String} AND namespace_id = {namespaceId: String}
- Secondary filters: Time range conditions
The ordering enables efficient data skipping for the most common query pattern, making it the optimal choice for this use case.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for query patterns in the codebase to verify the optimal ordering
rg -A 5 "ratelimits_per_hour" --type typescript --type rust
Length of output: 94
Script:
#!/bin/bash
# Search for query patterns in the codebase with broader scope
rg -A 5 "ratelimits_per_hour"
Length of output: 2180
Script:
#!/bin/bash
# Look for SQL queries or database interactions
rg -A 5 "SELECT.*FROM.*ratelimits_per_hour"
Length of output: 45
Script:
#!/bin/bash
# Check for rate limit related files
fd -t f ratelimit
Length of output: 5331
Script:
#!/bin/bash
# Look for database schema files
fd -t f -e sql
Length of output: 1917
internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql (1)
18-19: Consider data preservation in down migration.
The current down migration will permanently delete all data. Consider adding a step to backup the data before dropping the table, especially if this table will store important metrics.
-- +goose down
+-- Backup data before dropping (adjust destination table as needed)
+CREATE TABLE IF NOT EXISTS ratelimits.ratelimits_per_day_v1_backup AS
+SELECT * FROM ratelimits.ratelimits_per_day_v1;
+
DROP TABLE ratelimits.ratelimits_per_day_v1;
internal/clickhouse/schema/024_create_ratelimits_last_used_mv_v1.sql (2)
1-2: LGTM: Migration structure follows best practices.
The migration uses the standard goose format with clear up/down migrations.
Also applies to: 18-19
2-15: Verify the target table exists and indexes are optimized.
The materialized view logic looks correct, using maxSimpleState for efficient aggregation. However, we should verify:
- The target table ratelimits.ratelimits_last_used_v1 exists
- Proper indexes are set up for the grouped columns to optimize query performance
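For the first point, a minimal sketch of a target table that would be compatible with a maxSimpleState aggregation; the column names and types are illustrative assumptions, not the actual migration:
```sql
-- Sketch: a target table for a view emitting maxSimpleState(time).
-- SimpleAggregateFunction(max, Int64) keeps only the latest value per key during merges.
CREATE TABLE IF NOT EXISTS ratelimits.ratelimits_last_used_v1 (
  workspace_id String,
  namespace_id String,
  identifier   String,
  time         SimpleAggregateFunction(max, Int64)
) ENGINE = AggregatingMergeTree()
ORDER BY (workspace_id, namespace_id, identifier);
```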
internal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql (1)
18-19
: LGTM! The down migration is correct.
The down migration properly drops the table, allowing for clean rollback if needed.
internal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql (1)
18-19
: LGTM: Clean rollback implementation
The rollback (down migration) correctly drops the table.
internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql (1)
1-19
: LGTM! Migration structure and table schema look good.
The table schema is well-designed with appropriate column types and a logical ordering that supports efficient querying of rate limits by workspace, namespace, and time ranges.
internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql (2)
17-19: LGTM: Clean down migration.
The down migration correctly drops the table, providing a clean rollback path.
12-14: Verify engine change implications.
The switch from AggregatingMergeTree to SummingMergeTree is significant:
- SummingMergeTree is appropriate here as it will automatically sum the count column during merges
- The ordering looks correct for efficient querying by workspace and key space
✅ Verification successful
Engine change from AggregatingMergeTree to SummingMergeTree is safe
The materialized view that populates this table performs a simple COUNT aggregation grouped by all the relevant dimensions. SummingMergeTree is perfectly suited for this use case as it will automatically sum the count
column during merges, which matches the aggregation behavior needed. The view doesn't use any complex aggregation functions that would require AggregatingMergeTree.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any references to the old table name or views
rg -l "key_verifications_per_day_v1" --type sql
Length of output: 220
Script:
#!/bin/bash
# Let's check the content of the materialized view to understand its dependency
cat internal/clickhouse/schema/008_create_verifications_key_verifications_per_day_mv_v1.sql
Length of output: 581
internal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql (2)
16-17
: LGTM! Clean down migration.
The down migration correctly drops the table, ensuring a clean rollback if needed.
2-2
: Verify schema existence and dependencies.
Let's ensure the schema exists and check for any potential dependencies.
✅ Verification successful
Schema exists and table has dependencies - proceed with caution
The 'verifications' schema is properly created in migration 001_create_databases.sql. However, this table (key_verifications_per_month_v1) has three materialized views depending on it:
- verifications.key_verifications_per_month_mv_v1
- billing.billable_verifications_per_month_mv_v1
- business.active_workspaces_keys_per_month_mv_v1
Any schema changes to this table must ensure compatibility with these dependent views.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify schema existence and dependencies
# Check if other migration files reference the 'verifications' schema
echo "Checking for schema creation:"
rg -l "CREATE SCHEMA.*verifications" internal/clickhouse/schema/
# Check for any views or tables that might depend on this table
echo "Checking for dependencies:"
rg -l "key_verifications_per_month_v1" internal/clickhouse/schema/ | grep -v "006_create_verifications"
Length of output: 564
Script:
#!/bin/bash
# Let's check the content of the files that depend on this table
# and also look for schema creation in earlier migrations
# Check for schema creation in all SQL files
echo "=== Checking for schema creation in all SQL files ==="
rg "CREATE SCHEMA.*verifications" internal/clickhouse/schema/
# Check the content of dependent files
echo -e "\n=== Content of 009_create_verifications_key_verifications_per_month_mv_v1.sql ==="
cat internal/clickhouse/schema/009_create_verifications_key_verifications_per_month_mv_v1.sql
echo -e "\n=== Content of 013_create_billing_billable_verifications_per_month_mv_v1.sql ==="
cat internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql
echo -e "\n=== Content of 023_create_business_active_workspaces_per_month_mv_v1.sql ==="
cat internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql
# Let's also check if there's any schema creation in lower numbered migrations
echo -e "\n=== Looking for schema creation in migrations 001-005 ==="
ls -1 internal/clickhouse/schema/00[1-5]* 2>/dev/null | while read file; do
echo "=== Content of $file ==="
cat "$file"
done
Length of output: 7251
internal/clickhouse/schema/019_create_ratelimits_ratelimits_per_hour_mv_v1.sql (2)
1-22
: Overall structure looks good with minor improvements needed.
The materialized view implementation is logically sound and will effectively aggregate hourly rate limit data. The suggested improvements focus on performance optimization and operational safety, but the core functionality is correct.
1-4: Verify the target table exists.
The materialized view is being created with a target table ratelimits.ratelimits_per_hour_v1, but we need to ensure this table exists before creating the view.
✅ Verification successful
Let me verify the content of the target table creation to ensure it matches with the materialized view requirements.
Target table exists and is properly configured for the materialized view ✅
The target table ratelimits.ratelimits_per_hour_v1
was created in migration 015 with appropriate columns and a SummingMergeTree engine that supports materialized view aggregations. The table schema aligns with hourly rate limit aggregation requirements.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for the creation of the target table
rg -l "CREATE TABLE.*ratelimits\.ratelimits_per_hour_v1" internal/clickhouse/schema/
Length of output: 162
Script:
#!/bin/bash
# Check the table definition
cat internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql
Length of output: 438
internal/clickhouse/schema/020_create_ratelimits_ratelimits_per_day_mv_v1.sql (3)
1-1
: LGTM: Migration structure follows best practices
The migration is properly structured with both up and down migrations, making it reversible.
Also applies to: 22-23
13-17
: LGTM: Proper GROUP BY implementation
The GROUP BY clause correctly includes all non-aggregated columns and maintains consistency with the SELECT statement.
2-4: Verify target table existence
The materialized view writes to ratelimits.ratelimits_per_day_v1. Let's verify this table exists and has the correct schema.
internal/clickhouse/schema/021_create_ratelimits_ratelimits_per_month_mv_v1.sql (2)
1-2
: LGTM: Migration structure follows best practices.
The migration uses goose markers correctly and includes both up and down migrations for proper version control.
Also applies to: 21-22
5-18: Verify time-based aggregation performance.
The materialized view's aggregation logic has some potential considerations:
- The fromUnixTimestamp64Milli conversion followed by toStartOfMonth on every row could impact performance.
- There's no explicit sorting or partitioning strategy defined, which might affect query performance on large datasets.
Let's verify the table structure and existing indexes:
Consider these optimizations:
- Add ORDER BY clause to optimize for time-based queries
- Consider partitioning by workspace_id or time for better query performance
- Pre-calculate the month start time in the source table if possible
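A sketch of how the ORDER BY and partitioning suggestions could look on the monthly rollup table; the column set and the partition key here are illustrative assumptions rather than the project's actual definition:
```sql
-- Sketch: monthly rollup with an explicit partition and sorting key.
-- Partitioning by year keeps parts small; ORDER BY drives the primary index.
CREATE TABLE ratelimits.ratelimits_per_month_v1 (
  time         Date,
  workspace_id String,
  namespace_id String,
  identifier   String,
  passed       Int64,
  total        Int64
) ENGINE = SummingMergeTree()
PARTITION BY toYear(time)
ORDER BY (workspace_id, namespace_id, time, identifier);
```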
✅ Verification successful
Let me gather more specific information about the table structure.
Let me try a different approach to find the table structure.
Time-based aggregation performance is optimized appropriately
The current implementation is well-optimized for the following reasons:
- The source table
raw_ratelimits_v1
has an appropriateORDER BY (workspace_id, namespace_id, time, identifier)
which aligns with the materialized view's grouping columns. - The time conversion pattern is consistent across all time-based materialized views (minute, hour, day, month), showing this is a standardized approach in the codebase.
- ClickHouse optimizes these time-based transformations internally, and the functions used (
toStartOfMonth
,fromUnixTimestamp64Milli
) are native ClickHouse functions designed for high performance.
The original performance concerns can be dismissed as the table structure and query patterns follow ClickHouse best practices.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check the source table structure and existing indexes
# to ensure optimal performance for the materialized view
# Search for the source table definition
rg -A 10 "CREATE TABLE.*raw_ratelimits_v1"
Length of output: 3466
Script:
#!/bin/bash
# Search for the raw_ratelimits_v1 table definition
rg -l "CREATE TABLE.*raw_ratelimits_v1" --type sql
# Once found, let's see its content
fd "raw_ratelimits_v1" -e sql --exec cat {}
Length of output: 172
Script:
#!/bin/bash
# Try to find and read the content of the table creation SQL file
cat internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql
# Also check for any existing materialized views or indexes
rg -A 10 "CREATE.*MATERIALIZED VIEW" --type sql
Length of output: 13390
internal/clickhouse/schema/018_create_ratelimits_ratelimits_per_minute_mv_v1.sql (4)
1-1
: LGTM: Migration structure follows best practices.
The migration includes both up and down commands using Goose format, allowing for proper version control and rollback capability.
Also applies to: 22-23
9-10
: Validate rate limit pass/fail logic.
The metrics calculation assumes:
passed > 0
indicates a successful rate limit checkcount(*)
gives total attempts
#!/bin/bash
# Description: Verify rate limit pass/fail logic implementation
# Expected: Find the code that sets the 'passed' value
# Search for where 'passed' is set in rate limit logic
rg "passed.*=.*" --type ts apps/api/src/pkg/ratelimit/
2-4
: Verify target table existence.
The materialized view writes to ratelimits.ratelimits_per_minute_v1
, but we need to ensure this target table exists with the correct schema.
✅ Verification successful
Target table exists with appropriate schema ✅
The target table ratelimits.ratelimits_per_minute_v1
is properly defined in migration 014
with a suitable schema for rate limiting metrics, including workspace, namespace, time-based aggregation, and counters for passed/total requests. The materialized view can safely write to this table.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the target table exists and has the expected schema
# Expected: Find the table creation SQL in another migration file
# Search for the target table creation
rg -l "CREATE TABLE.*ratelimits\.ratelimits_per_minute_v1" internal/clickhouse/schema/
Length of output: 166
Script:
#!/bin/bash
# Let's check the schema of the target table
cat internal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql
Length of output: 445
12-18
: Review grouping columns for data granularity.
The current grouping creates a unique row per (workspace_id, namespace_id, identifier, minute). Consider if this granularity:
- Meets the application's reporting requirements
- Provides sufficient detail for rate limit analysis
- Balances storage costs with query performance
✅ Verification successful
The previous search might have been too specific. Let's broaden our search to understand the rate limiting implementation and usage patterns across the codebase.
The rate limit grouping is appropriate for the use case
After analyzing the schema and related code, the current grouping by (workspace_id, namespace_id, identifier, time) in the materialized view is well-designed because:
-
The granularity aligns with the rate limiting implementation, as seen in the schema hierarchy:
- Raw events are stored in
raw_ratelimits_v1
- Aggregated into minute-level statistics via
ratelimits_per_minute_mv_v1
- Further rolled up into hour/day/month views for different analysis needs
- Raw events are stored in
-
The columns serve specific purposes:
workspace_id
: Multi-tenant isolationnamespace_id
: Logical grouping of rate limitsidentifier
: The actual entity being rate limitedtime
: Minute-level granularity for precise rate limiting
-
The schema efficiently tracks both passed and total counts, providing necessary metrics for rate limit analysis while maintaining a good balance between detail and storage efficiency.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check how these columns are used in queries
# Expected: Find query patterns to validate grouping choices
# Search for query patterns using these columns
rg -A 5 "FROM.*ratelimits_per_minute" --type ts
Length of output: 56
Script:
#!/bin/bash
# Let's search more broadly for rate limit related queries and usage patterns
rg -l "ratelimits.*minute" --type sql --type ts
# Check the schema and related materialized views
fd -e sql . | xargs cat
# Look for rate limit related code
rg -A 5 "RateLimit" --type ts
Length of output: 21359
internal/clickhouse/schema/008_create_verifications_key_verifications_per_day_mv_v1.sql (2)
1-4
: LGTM: Clear materialized view definition.
The view name and target table follow consistent naming conventions with the _v1
versioning pattern.
13-20
: Verify source table schema and indexes.
The grouping columns look appropriate for aggregation, but we should verify the source table has proper indexes for efficient grouping operations.
✅ Verification successful
Let me gather more information about the table's indexes and ordering.
Table schema and indexes are properly configured for the materialized view
The source table raw_key_verifications_v1
has an appropriate ORDER BY
clause that covers most of the grouping columns (workspace_id
, key_space_id
, key_id
, time
). The table uses the MergeTree engine which automatically creates indexes for the ORDER BY
columns. This ordering will efficiently support the grouping operations in the materialized view. The only grouped column not in the ordering is outcome
, but since it's defined as LowCardinality
, it's optimized for grouping operations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if source table has appropriate indexes for the grouped columns
# Test: Look for the source table definition
rg -A 20 "CREATE TABLE.*raw_key_verifications_v1" internal/clickhouse/schema/
Length of output: 2349
Script:
#!/bin/bash
# Check for ORDER BY and other table properties in the source table definition
rg -A 30 "CREATE TABLE.*raw_key_verifications_v1.*\(" internal/clickhouse/schema/ | rg -A 30 "ENGINE|ORDER BY|PRIMARY KEY"
Length of output: 442
internal/clickhouse/schema/007_create_verifications_key_verifications_per_hour_mv_v1.sql (4)
1-2
: LGTM: Migration structure follows best practices.
The migration includes both up and down migrations, allowing for proper version control and rollback capabilities.
Also applies to: 24-25
5-20
: Consider adding indexes for query optimization.
The materialized view groups by multiple columns which are likely to be used in WHERE clauses. Consider adding the following optimizations to the target table:
- Add a primary key on (workspace_id, key_space_id, time)
- Create a secondary index on (key_id, time)
Additionally, verify if these columns are used in common query patterns:
#!/bin/bash
# Description: Look for queries using these columns
# Expected: Find query patterns to validate index choices
rg "SELECT.*FROM.*key_verifications_per_hour" --type ts
6-10
: Verify column consistency with raw table.
Let's ensure that all columns exist in the source table with matching types.
#!/bin/bash
# Description: Check source table schema
# Expected: Find matching column definitions
rg "CREATE TABLE.*raw_key_verifications_v1.*\([\s\S]*\)" internal/clickhouse/schema/
2-3
: Verify the target table schema.
The materialized view writes to verifications.key_verifications_per_hour_v1
, but we should verify that this table exists and has the correct schema.
✅ Verification successful
Target table schema verified and matches the materialized view
The target table verifications.key_verifications_per_hour_v1
is created in migration 004_create_verifications_key_verifications_per_hour_v1.sql
with the correct schema that matches the materialized view's output:
- Both have matching columns:
time
,workspace_id
,key_space_id
,identity_id
,key_id
,outcome
, andcount
- The table uses
SummingMergeTree
engine which is appropriate for the aggregation performed by the materialized view - The table's ordering matches the grouping structure of the materialized view
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the target table is created in another migration
# Expected: Find a CREATE TABLE statement for key_verifications_per_hour_v1
rg -l "CREATE TABLE.*key_verifications_per_hour_v1" internal/clickhouse/schema/
Length of output: 533
Script:
#!/bin/bash
# Let's search for the target table creation with a broader pattern
rg -l "CREATE TABLE.*verifications\.key_verifications_per_hour" internal/clickhouse/schema/
# Also search for any file that might contain the table name
rg -l "key_verifications_per_hour_v1" internal/clickhouse/schema/
# Let's look at the content of the migration files in order
ls -1 internal/clickhouse/schema/*.sql | sort | xargs cat
Length of output: 12953
internal/clickhouse/schema/009_create_verifications_key_verifications_per_month_mv_v1.sql (3)
1-1
: LGTM: Migration markers are properly structured.
The goose migration markers are correctly placed and include both up and down migrations.
Also applies to: 24-25
5-12
: Verify timestamp conversion accuracy.
The timestamp conversion using fromUnixTimestamp64Milli
and toStartOfMonth
looks correct, but ensure the timezone handling meets your requirements.
✅ Verification successful
Timestamp conversion is consistent across materialized views
The timestamp handling in this file follows the same pattern used consistently across all materialized views in the codebase. The conversion chain fromUnixTimestamp64Milli(time)
followed by time truncation functions (toStartOfMonth
, toStartOfDay
, etc.) is used uniformly in all similar views for both verifications and ratelimits, indicating a standardized approach to time-based aggregations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if there are any other timestamp conversions in the codebase for consistency
rg -A 2 "fromUnixTimestamp64Milli|toStartOfMonth" --type sql
Length of output: 2560
13-21
: Consider adding ORDER BY clause for optimization.
While the GROUP BY clause is correct, consider adding an ORDER BY clause to optimize query performance, especially if you frequently query the materialized view with specific time ranges.
GROUP BY
workspace_id,
key_space_id,
identity_id,
key_id,
outcome,
time
+ORDER BY time
Also, verify that the source table verifications.raw_key_verifications_v1
exists and has the correct schema.
internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql (2)
2-8: Verify potential duplicate data in target table.
Both materialized views write to the same target table business.active_workspaces_per_month_v1. This could lead to duplicate workspace_id/time pairs if a workspace uses both key verifications and rate limits in the same month.
Consider either:
- Adding a DISTINCT clause in the target table, or
- Implementing a unique constraint on (workspace_id, time)
Also applies to: 10-16
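Because ClickHouse tables do not enforce unique constraints, both options boil down to a query-level or engine-level choice; a sketch of each, using the column names from the migrations above:
```sql
-- Option A: deduplicate at read time and keep the current table as-is.
SELECT time, count(DISTINCT workspace_id) AS workspaces
FROM business.active_workspaces_per_month_v1
GROUP BY time;

-- Option B: let duplicate (time, workspace_id) rows collapse during merges.
-- Deduplication is eventual, so exact reads may still need DISTINCT or FINAL.
CREATE TABLE business.active_workspaces_per_month_v1 (
  time         Date,
  workspace_id String
) ENGINE = ReplacingMergeTree()
ORDER BY (time, workspace_id);
```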
1-20
: Verify target table schema.
The materialized views write to business.active_workspaces_per_month_v1
, but its schema is not visible in the current migration. We should ensure the target table exists and has the correct schema.
✅ Verification successful
Let me check the schema of the target table by examining the migration file found.
Target table schema is compatible with materialized views
The target table business.active_workspaces_per_month_v1
exists and its schema matches the columns selected in both materialized views:
- Table columns:
time (Date)
,workspace_id (String)
- Views select:
toDate(time) as time
,workspace_id
The schema is correctly defined in migration 022 and the materialized views in migration 023 are properly aligned with it.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for the migration that creates the target table
fd --type f --extension sql . | xargs grep -l "CREATE TABLE.*business.active_workspaces_per_month_v1"
Length of output: 185
Script:
#!/bin/bash
# Get the content of the migration that creates the target table
cat internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql
Length of output: 317
apps/api/src/pkg/ratelimit/noop.ts (1)
10-10
: LGTM! Property renaming is consistent.
The renaming from pass
to passed
aligns with the standardization across the codebase.
Also applies to: 16-16
internal/clickhouse/src/success.ts (2)
1-3
: LGTM: Clean and necessary imports
The imports are well-organized and each one serves a clear purpose in the implementation.
14-14
: Verify the materialized view exists
Let's verify that the active_workspaces_per_month_v1
table exists in the business schema.
✅ Verification successful
Table and materialized views exist and are properly configured
The verification confirms that:
- The table
business.active_workspaces_per_month_v1
exists and is created with appropriate schema - Two materialized views populate this table:
active_workspaces_keys_per_month_mv_v1
from key verifications dataactive_workspaces_ratelimits_per_month_mv_v1
from rate limits data
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the materialized view exists in SQL migrations
# Expected: Find the creation of active_workspaces_per_month_v1 table/view
# Look for table/view creation in migration files
fd -e sql | xargs rg -l "active_workspaces_per_month_v1"
# Look for specific CREATE TABLE or CREATE MATERIALIZED VIEW statements
fd -e sql | xargs rg "CREATE (TABLE|MATERIALIZED VIEW) .*active_workspaces_per_month_v1"
Length of output: 449
Script:
#!/bin/bash
# Let's check the content of both SQL files to understand the table and materialized view structure
# Check the table creation
cat internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql
# Check the materialized view
cat internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql
Length of output: 931
apps/dashboard/app/(app)/logs/page.tsx (1)
3-3
: LGTM! Import change aligns with Clickhouse migration.
The change to import the centralized clickhouse
object follows good modular design principles.
internal/clickhouse/src/client/interface.ts (1)
Line range hint 1-24
: Verify interface implementation and migration.
Let's ensure the interfaces are properly implemented and the migration from the old Clickhouse
interface is complete.
✅ Verification successful
Based on the shell script results, I can now generate the final response:
Interfaces are properly implemented and no migration issues found
The verification shows:
- The
Client
class correctly implements bothQuerier
andInserter
interfaces ininternal/clickhouse/src/client/client.ts
- A
Noop
implementation is also provided for testing/development - The interfaces are actively used throughout the codebase with proper query/insert method calls
- No references to old Clickhouse interface exist except for a type import that's being used correctly
- All query and insert operations follow the new interface patterns with proper schema validation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify interface implementation and migration
# Expected results:
# 1. Find the implementation of these interfaces
# 2. Ensure no references to old Clickhouse interface remain
# 3. Verify all query/insert calls use the new interfaces
# Check for interface implementations
echo "Checking for interface implementations..."
ast-grep --pattern 'class $_ implements $$$Querier$$$'
# Check for any remaining references to old Clickhouse interface
echo "Checking for old interface references..."
rg "Clickhouse\s+interface" --type ts
# Find all query and insert method calls to verify they match new interfaces
echo "Checking usage patterns..."
rg "\.query\(\{[\s\S]*?schema:" --type ts -A 5
rg "\.insert\(\{[\s\S]*?schema:" --type ts -A 5
Length of output: 442
Script:
#!/bin/bash
# Let's try a different approach to find implementations and usages
# Find implementations of Querier and Inserter
echo "=== Finding implementations ==="
ast-grep --pattern 'class $_ implements Querier'
ast-grep --pattern 'class $_ implements Inserter'
ast-grep --pattern 'class $_ implements $$$Querier, Inserter$$$'
# Find any class that might be implementing these interfaces
echo -e "\n=== Finding potential implementations by method signatures ==="
ast-grep --pattern 'class $_ {
$$$
query($_) {
$$$
}
$$$
}'
ast-grep --pattern 'class $_ {
$$$
insert($_) {
$$$
}
$$$
}'
# Look for query/insert method usage with a more lenient pattern
echo -e "\n=== Finding query/insert usages ==="
rg "\.query\(" --type ts -A 2
rg "\.insert\(" --type ts -A 2
# Check for any files containing both old and new patterns
echo -e "\n=== Checking migration patterns ==="
rg "Clickhouse" --type ts
Length of output: 117838
Script:
#!/bin/bash
# Let's check the implementation of the Client class and its usage
# Find the Client class implementation
echo "=== Finding Client class implementation ==="
ast-grep --pattern 'class Client implements $$$Querier$$$' --lang ts
ast-grep --pattern 'class Client implements $$$Inserter$$$' --lang ts
ast-grep --pattern 'class Client implements $$$Querier, Inserter$$$' --lang ts
# Find the actual implementation of query and insert methods
echo -e "\n=== Finding query/insert method implementations ==="
ast-grep --pattern 'query<$_>($_) {
$$$
}' --lang ts
ast-grep --pattern 'insert<$_>($_) {
$$$
}' --lang ts
# Check for any remaining old Clickhouse interface references
echo -e "\n=== Checking old interface references ==="
rg "interface\s+Clickhouse" --type ts
Length of output: 2621
Script:
#!/bin/bash
# Let's try a different approach to find the implementation
# Look for the Client class definition
echo "=== Finding Client class ==="
rg "class Client" --type ts -A 5
# Look for query and insert method implementations
echo -e "\n=== Finding method implementations ==="
rg "query.*\{" --type ts -A 5
rg "insert.*\{" --type ts -A 5
# Check for any files importing or using the old Clickhouse interface
echo -e "\n=== Checking old interface usage ==="
rg "from.*Clickhouse" --type ts
rg "import.*Clickhouse" --type ts
Length of output: 379819
internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql (1)
34-35
: LGTM: Down migration looks correct.
The down migration properly drops the table with the correct namespace.
Taskfile.yml (1)
26-26
: Consider updating migration documentation.
Since the migration path has changed, ensure this is reflected in any development setup or contribution guidelines.
Let's check for any documentation that might need updates:
#!/bin/bash
# Description: Look for documentation files that might need updates
# Search for files mentioning the old migration path
rg -l "apps/agent/pkg/clickhouse/schema" -g "*.md" -g "CONTRIBUTING*" -g "README*" -g "DEVELOPMENT*"
internal/clickhouse/src/latest_verifications.ts (2)
1-2
: LGTM! Clean and focused imports.
The imports are minimal and well-organized, with proper type imports.
9-33
: Verify the table schema in the codebase.
Let's ensure the table schema matches the query fields.
✅ Verification successful
Let me check the table schema more specifically by looking at the content of the table creation file.
The table schema matches the query fields correctly
The schema in raw_key_verifications_v1
table contains all the fields being queried:
time
(Int64)outcome
(LowCardinality(String))region
(LowCardinality(String))workspace_id
(String)key_space_id
(String)key_id
(String)
The query's field selection and WHERE clause conditions align perfectly with the table schema.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find and verify the ClickHouse table schema definition
# Search for table creation scripts
fd -e sql . | xargs rg -l "CREATE TABLE.*raw_key_verifications_v1"
# Search for references to the fields we're querying
rg "workspace_id|key_space_id|key_id|outcome|region" -t sql
Length of output: 8509
Script:
#!/bin/bash
# Get the content of the table creation file
cat internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql
Length of output: 887
internal/clickhouse/schema/002_create_metrics_raw_api_requests_v1.sql (2)
42-43
: LGTM: Down migration is properly implemented.
The down migration correctly drops the table, making the migration reversible.
Line range hint 1-43
: Verify the impact on existing data pipelines.
Since this table appears to be part of a metrics system, we should verify that all existing data pipelines and queries are compatible with the new schema.
Let's check for any SQL queries referencing this table:
✅ Verification successful
Schema changes appear compatible with existing usage
The verification shows that the new schema is compatible with existing usage:
- In
internal/clickhouse/src/logs.ts
: Only usesrequest_id
andtime
fields which are preserved - In
apps/api/src/pkg/analytics.ts
: Uses fields that match exactly with the schema definition - This appears to be the initial creation of the table (002_create_metrics_raw_api_requests_v1.sql) rather than a modification, so there are no existing data pipelines to impact
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for SQL queries referencing the raw_api_requests_v1 table
rg -t sql "raw_api_requests_v1" --glob '!**/schema/**'
# Search for TypeScript/JavaScript files that might query this table
rg "raw_api_requests_v1" -t ts -t js
Length of output: 383
Script:
#!/bin/bash
# Let's examine the actual usage in these files to understand the impact
echo "=== Content of internal/clickhouse/src/logs.ts ==="
rg -A 10 "raw_api_requests_v1" internal/clickhouse/src/logs.ts
echo -e "\n=== Content of apps/api/src/pkg/analytics.ts ==="
rg -A 10 "raw_api_requests_v1" apps/api/src/pkg/analytics.ts
# Let's also check if there are any migrations or schema changes to this table
echo -e "\n=== Related schema files ==="
fd "raw_api_requests_v1" internal/clickhouse/schema/
Length of output: 1235
internal/clickhouse/src/last_used.ts (3)
1-4
: LGTM! Well-structured imports with type safety.
Good use of zod for type validation and proper separation of environment configuration.
12-12
: LGTM! Good use of the Null Object pattern.
The conditional initialization with a Noop client is a clean way to handle optional ClickHouse functionality.
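As a rough sketch of that pattern (the module path and constructor arguments are assumptions, not the exact code in this PR):
import { Client, Noop, type Querier } from "./client"; // path assumed

// When no ClickHouse URL is configured, fall back to a Noop querier so
// callers never need to branch on whether ClickHouse is enabled.
export function createQuerier(url: string | undefined): Querier {
  return url ? new Client({ url }) : new Noop();
}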
13-36: Fix SQL syntax and optimize query performance.
There are several issues to address:
- Remove the trailing comma after time in the SELECT clause
- Consider adding an index hint for the WHERE clause filters
- Add error handling for the query execution
Fix the SQL syntax and add error handling:
const query = ch.query({
query: `
SELECT
- time,
+ time
FROM verifications.raw_key_verifications_v1
WHERE
workspace_id = {workspaceId: String}
AND key_space_id = {keySpaceId: String}
AND key_id = {keyId:String}
ORDER BY time DESC
LIMIT 1
;`,
params: z.object({
workspaceId: z.string(),
keySpaceId: z.string(),
keyId: z.string(),
}),
schema: z.object({
time: z.number().int(),
}),
});
- return query(args);
+ try {
+ return await query(args);
+ } catch (error) {
+ throw new Error(`Failed to get last used time: ${error.message}`);
+ }
Consider creating a compound index on (workspace_id, key_space_id, key_id, time) to optimize the query performance.
Let's verify if there are other similar queries that might benefit from the same index:
internal/clickhouse/src/billing.ts (2)
1-2
: LGTM! Clean and focused imports.
Good use of zod for runtime validation and proper type imports.
16-16
: Verify the existence of the ClickHouse table.
Let's ensure the required table exists in the schema.
✅ Verification successful
Let me check the content of the migration file to verify the table structure.
Table billing.billable_verifications_per_month_v1
is properly defined in the schema
The table exists in the migration file with the correct schema definition, including all necessary columns (year, month, workspace_id, count) and appropriate engine type (SummingMergeTree) for billing aggregations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the table is defined in migration files
rg -l "CREATE TABLE.*billing\.billable_verifications_per_month_v1" internal/clickhouse/schema/
Length of output: 182
Script:
#!/bin/bash
# Check the table definition
cat internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql
Length of output: 393
apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/page.tsx (2)
2-2
: LGTM: Import changes align with migration strategy.
The addition of the Clickhouse import aligns with the PR's objective of migrating from Tinybird to Clickhouse.
21-21
: Verify the data migration completeness.
The switch from using UNKEY_API_ID to only UNKEY_WORKSPACE_ID and the new Clickhouse query implementation represent a significant change in data access patterns. Let's verify that all necessary data is being captured in the new implementation.
Also applies to: 42-45
✅ Verification successful
Let me gather more information about the data fields and implementation to ensure completeness.
Let me check the specific implementation details of the Clickhouse queries and data fields.
Let me try one final check with corrected syntax to examine the implementations.
Based on the gathered information, I can now provide a final response:
Migration to Clickhouse appears complete and consistent
The switch from UNKEY_API_ID to UNKEY_WORKSPACE_ID is properly implemented across the codebase. The Clickhouse implementation shows consistent parameter usage (workspaceId, keySpaceId, keyId) across all verification queries, and the schema captures all necessary fields including request-specific data. The old Tinybird references exist only in legacy files that are either tests or scheduled for removal.
- Key verification data fields are properly mapped in the KeyVerificationRequestV1 struct
- Consistent usage of clickhouse.verifications methods across the dashboard
- Migration scripts are present in apps/api/src/pkg/key_migration/
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that the new Clickhouse implementation captures all necessary fields
# and that there are no remaining references to the old Tinybird implementation
# Check for any remaining Tinybird verification-related imports or usage
rg -l "getLatestVerifications|from ['|\"]@/lib/tinybird" apps/
# Check for consistency in parameter usage across other Clickhouse verification queries
rg -A 3 "clickhouse\.verifications\." apps/
Length of output: 5077
Script:
#!/bin/bash
# Check the implementation of both old and new verification methods
ast-grep --pattern 'export async function getLatestVerifications($_) { $$$ }'
# Check the Clickhouse implementation
ast-grep --pattern 'export const verifications = { $$$ }'
# Check for any data migration scripts or changes
rg -l "migration|migrate" --type ts apps/
Length of output: 816
Script:
#!/bin/bash
# Check the Clickhouse query implementation details
rg -A 10 "verifications = {" apps/
# Check the schema and field mappings
rg -A 5 "CREATE TABLE.*verifications" apps/
# Look for any test files that might show data field comparisons
rg -l "test.*verification" apps/
Length of output: 360
Script:
#!/bin/bash
# Check the Clickhouse query implementation details with proper escaping
rg -A 10 "verifications = \{" apps/
# Check the test implementation for verification data
rg -A 10 "describe.*verification" apps/api/src/routes/v1_keys_getVerifications.happy.test.ts
# Look for type definitions of verification data
rg -A 5 "type.*Verification" apps/
Length of output: 4230
apps/dashboard/app/(app)/ratelimits/sparkline.tsx (2)
27-27
: LGTM! Series configuration properly updated.
The series configuration has been correctly updated to use passed
while maintaining the existing color scheme and visual hierarchy.
9-9
: Verify component usage after type changes.
The renaming from success
to passed
is consistent with the broader refactoring effort. However, we should verify that all usages of this component have been updated accordingly.
✅ Verification successful
Property rename from 'success' to 'passed' is correctly implemented
The verification shows that the change is consistent across the codebase:
- The Sparkline component is only used in card.tsx
- The consuming code in card.tsx already uses the new passed property when preparing the data (line 47)
- No instances of the old success property were found in the codebase
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for Sparkline component usage to ensure all instances are updated
# with the new property name.
# Search for Sparkline component usage
rg -l "Sparkline.*success" --type tsx
Length of output: 67
Script:
#!/bin/bash
# Let's try a different approach by:
# 1. Finding all TSX files that import or use Sparkline
# 2. Then examining their content for any remaining 'success' props
# First, find all TSX files (using fd instead of file extension filter in rg)
fd "\.tsx$" | xargs rg -l "Sparkline"
# Then get the content of files that mention Sparkline
fd "\.tsx$" | xargs rg "Sparkline.*success" -A 2 -B 2
Length of output: 188
Script:
#!/bin/bash
# Let's examine the content of card.tsx which uses the Sparkline component
# to verify if it's using the new 'passed' prop instead of 'success'
cat apps/dashboard/app/\(app\)/ratelimits/card.tsx
Length of output: 2123
internal/clickhouse/src/logs.ts (1)
1-2
: LGTM! Clean and focused imports.
The imports are minimal and well-structured, with proper type usage for the database client.
apps/api/src/pkg/ratelimit/interface.ts (1)
34-34
: LGTM! Property rename improves clarity.
The rename from pass
to passed
is a good improvement as it's more grammatically correct and clearer in conveying the rate limit check result.
Let's verify that all usages of this property have been updated consistently:
✅ Verification successful
Property rename has been consistently updated across the codebase
The verification shows that:
- No instances of the old .pass property were found
- The new .passed property is consistently used across all relevant files in both the API and dashboard applications
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining instances of the old property name
# and verify all instances use the new name
# Search for any remaining uses of `.pass` that might have been missed
echo "Checking for any remaining uses of the old '.pass' property:"
rg -g '!*.{md,json}' '\.pass\b'
# Search for uses of the new '.passed' property to confirm consistent adoption
echo -e "\nConfirming usage of new '.passed' property:"
rg -g '!*.{md,json}' '\.passed\b'
Length of output: 2601
apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/access-table.tsx (1)
14-19
: Verify the intentional removal of sensitive fields.
The type definition has been simplified to remove ipAddress and userAgent. This is good for privacy, but let's verify this aligns with the requirements.
Run this script to check if these fields are still being used elsewhere:
✅ Verification successful
Let me try a different search strategy to verify the removal of these fields.
Removal of fields is part of Clickhouse migration and data privacy improvements
The search results show that ipAddress and userAgent are still actively used in the system, particularly in:
- Tinybird datasources and pipes for key verifications
- API routes for collecting verification data
- Rate limiting and audit logging
However, their removal from the access table component is intentional as part of:
- The Clickhouse migration (as mentioned in PR objectives)
- Data privacy improvements by limiting sensitive data exposure in the UI
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if ipAddress and userAgent are still being used in other components
# or if they're still being returned by the API
# Check for any remaining usage of these fields in React components
echo "Checking for ipAddress usage in components:"
rg -t tsx -t jsx 'ipAddress'
echo "Checking for userAgent usage in components:"
rg -t tsx -t jsx 'userAgent'
# Check API/backend files for these fields
echo "Checking API implementations:"
ast-grep --pattern 'interface Verification {
$$$
ipAddress: $_
$$$
}'
Length of output: 454
Script:
#!/bin/bash
# Search for ipAddress and userAgent in all files
echo "Checking for ipAddress usage:"
rg "ipAddress"
echo -e "\nChecking for userAgent usage:"
rg "userAgent"
echo -e "\nChecking verification interfaces/types:"
rg -A 5 "type Verification|interface Verification"
Length of output: 21103
internal/db/src/types.ts (1)
23-24
: LGTM! The new type definitions follow the established pattern.
The new AuditLog and AuditLogTarget types are correctly defined using InferSelectModel, maintaining consistency with other type definitions in the file.
Let's verify that the referenced schema entities exist:
✅ Verification successful
Schema entities are properly defined and match the type definitions
The verification confirms that both schema entities are correctly defined in internal/db/src/schema/audit_logs.ts
:
auditLog
is defined as a MySQL table with appropriate columns and indexesauditLogTarget
is defined as a MySQL table with proper relationships and metadata fields
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that the audit log schema entities are properly defined
# Expected: Find definitions for auditLog and auditLogTarget in schema.ts
# Search for audit log schema definitions
ast-grep --pattern 'export const auditLog = $$$'
ast-grep --pattern 'export const auditLogTarget = $$$'
Length of output: 7684
internal/clickhouse/schema/000_README.md (2)
44-46
:
Breaking Change: New materialized view naming convention requires migration plan.
The new naming convention for materialized views represents a significant change. While the new format [description]_[aggregation]_mv_[version]
improves readability by prioritizing the view's purpose, existing views will need to be renamed.
Let's verify the impact on existing materialized views:
56-56
: Verify naming consistency in examples and actual implementation.
The examples demonstrate the new naming convention correctly. However, we should ensure this consistency is maintained across the actual implementation.
Let's verify the naming consistency:
Also applies to: 62-62, 69-70
✅ Verification successful
Naming convention is consistently followed across all materialized views
After analyzing the SQL files in the repository, all materialized views follow the consistent naming pattern <name>_mv_v<version>. This includes:
- Verification views (e.g., key_verifications_per_hour_mv_v1)
- Rate limit views (e.g., ratelimits_per_minute_mv_v1)
- Business views (e.g., active_workspaces_keys_per_month_mv_v1)
- Billing views (e.g., billable_verifications_per_month_mv_v1)
The example active_users_per_day_mv_v2
mentioned in the documentation appears to be just an illustrative example, as it's not present in the actual implementation, but it correctly follows the same naming convention used throughout the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the example views exist and follow the convention
# Expected: Find any inconsistencies between documentation and implementation
# Search for the specific examples mentioned in the documentation
echo "Checking for active_users_per_day_mv_v2..."
rg "active_users_per_day_mv_v2" --type sql
echo "Checking for sales_summary_per_hour_mv_v1..."
rg "sales_summary_per_hour_mv_v1" --type sql
echo "Checking for user_activity_per_day_mv_v1..."
rg "user_activity_per_day_mv_v1" --type sql
# Search for any views not following the new convention
echo "Checking for views not following new convention..."
rg "CREATE MATERIALIZED VIEW\s+mv_" --type sql
Length of output: 582
Script:
#!/bin/bash
# Let's try a broader search to find any SQL files and materialized view definitions
# to verify naming conventions
# First, find all SQL files
echo "=== All SQL files in the repository ==="
fd -e sql
# Then search for any CREATE MATERIALIZED VIEW statements
echo -e "\n=== All Materialized View definitions ==="
rg -i "CREATE\s+MATERIALIZED\s+VIEW" --type sql -A 1
# Look for any potential view names with _mv_ pattern
echo -e "\n=== Views with _mv_ pattern ==="
rg "_mv_" --type sql
# Look for view names with mv suffix
echo -e "\n=== Views with _mv suffix ==="
rg "_mv\b" --type sql
Length of output: 8459
apps/dashboard/app/(app)/ratelimits/card.tsx (1)
31-42
: LGTM! Data processing looks robust.
The calculations for total requests, seconds, and RPS are well-implemented with proper null handling and type safety.
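For illustration, the kind of null-safe aggregation being described looks roughly like this (the bucket field names are assumptions, not the component's exact shape):
type Bucket = { time: number; passed: number; total: number };

// Sum request totals defensively and derive requests-per-second,
// guarding against an empty window to avoid division by zero.
function summarize(buckets: Bucket[] | undefined) {
  const data = buckets ?? [];
  const totalRequests = data.reduce((sum, b) => sum + (b.total ?? 0), 0);
  const seconds =
    data.length > 1 ? (data[data.length - 1].time - data[0].time) / 1000 : 0;
  const rps = seconds > 0 ? totalRequests / seconds : 0;
  return { totalRequests, seconds, rps };
}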
apps/dashboard/app/(app)/banner.tsx (3)
2-2
: LGTM: Clean import addition
The new ClickHouse client import is properly placed and follows the project's module aliasing conventions.
32-37
: LGTM: Clean implementation of usage display
The usage comparison logic and display formatting are well implemented. The banner provides clear information and appropriate actions for users who exceed their quota.
26-30
: Verify complete migration from Tinybird
Let's ensure all verification-related Tinybird code has been properly migrated to ClickHouse.
✅ Verification successful
Migration from Tinybird to ClickHouse appears complete for verifications
The verification shows:
- All Tinybird verification references are either in test files or legacy code
- The new ClickHouse billableVerifications endpoint is consistently used across the dashboard app
- The migration appears to be properly implemented with new ClickHouse-specific files in place
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for any remaining Tinybird verification references
echo "Checking for remaining Tinybird verification references..."
rg -l "verifications.*tinybird" --type ts --type tsx
# Check for consistent usage of new ClickHouse billing verifications
echo "Checking for ClickHouse billing verification usage..."
rg -l "clickhouse\.billing\.billableVerifications" --type ts --type tsx
Length of output: 428
Script:
#!/bin/bash
# Check for any remaining Tinybird verification references with correct file type flags
echo "Checking for remaining Tinybird verification references..."
rg -t typescript -t tsx "verifications.*tinybird"
# Check for consistent usage of new ClickHouse billing verifications
echo "Checking for ClickHouse billing verification usage..."
rg -t typescript -t tsx "clickhouse\.billing\.billableVerifications" -A 2
# Also check for any other Tinybird references that might be related
echo "Checking for other Tinybird references..."
rg -t typescript -t tsx "tinybird" -A 2
Length of output: 609
Script:
#!/bin/bash
# Search for Tinybird references using just the file extension
echo "Checking for Tinybird references..."
rg "tinybird" -g "*.ts" -g "*.tsx" -A 2
# Search for ClickHouse billing verifications
echo "Checking ClickHouse billing verifications..."
rg "clickhouse\.billing\.billableVerifications" -g "*.ts" -g "*.tsx" -A 2
# List all files that might be related to verifications
echo "Listing verification-related files..."
fd -e ts -e tsx . | grep -i "verif"
Length of output: 10233
apps/dashboard/app/(app)/settings/root-keys/[keyId]/layout.tsx (2)
3-3
: LGTM: Clean import of Clickhouse client
The import aligns with the PR's objective of migrating from Tinybird to Clickhouse.
58-58
: Verify the keyAuthId to keySpaceId mapping
The mapping of key.keyAuthId to the keySpaceId prop should be verified to ensure this is the correct field mapping during the Clickhouse migration.
✅ Verification successful
Based on the search results, I can now provide a definitive response about the keyAuthId to keySpaceId mapping. The codebase shows consistent usage where:
- In the database schema (internal/db/src/schema), the field is named
keyAuthId
- In Clickhouse (internal/clickhouse/src), it's stored as
key_space_id
- The mapping between them is consistent across the codebase, where
keyAuthId
from the key object is passed askeySpaceId
to Clickhouse-related functions
The mapping in the LastUsed component is correct as it follows this established pattern, taking key.keyAuthId
and passing it as keySpaceId
to the Clickhouse verifications interface.
The mapping is verified and follows the codebase convention
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other usages of keyAuthId and keySpaceId to verify consistent mapping
rg -A 3 "keyAuthId|keySpaceId" --type ts --type tsx
Length of output: 81
Script:
#!/bin/bash
# Search for usages with correct file extensions
rg -A 3 "keyAuthId|keySpaceId" --type-add 'tsx:*.tsx' --type-add 'ts:*.ts' --type tsx --type ts
Length of output: 94166
apps/dashboard/app/(app)/ratelimits/[namespaceId]/overrides/table.tsx (2)
11-11
: LGTM: Import statement aligns with Clickhouse migration.
The addition of the Clickhouse client import is consistent with the PR's objective of migrating from Tinybird to Clickhouse.
42-42
: LGTM: Improved React key placement.
Moving the key prop to the TableRow component follows React best practices for list rendering, as keys should be placed on the outermost mapped element.
apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/verification-table.tsx (1)
Line range hint 31-38
: Verify the removal of sensitive data columns.
The removal of IP address and user agent columns appears to be part of this change. Let's verify if this was intentional and properly documented.
✅ Verification successful
Removal of IP and User Agent columns was intentional and properly implemented
The git history shows this was an intentional change in commit 538392b, where the IP address and User Agent columns were completely removed from the verification table. This aligns with privacy best practices, as evidenced by:
- The removal included both the display columns and associated UI controls (Eye/EyeOff toggle)
- The change was made cleanly without leaving any residual references
- The codebase contains proper privacy documentation (privacy.mdx) showing the company's commitment to data privacy
- The change is consistent with GDPR compliance mentioned in the audit-logs documentation
The removal of these sensitive data columns appears to be a deliberate privacy-enhancing change and has been properly implemented.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for any remaining references to IP addresses or user agents in related files
# and look for privacy-related documentation
echo "Checking for remaining references to IP/UA in verification components..."
rg -i "ip.?address|user.?agent" --type tsx --type ts
echo "Looking for privacy-related documentation..."
fd -e md -e mdx | xargs rg -i "privacy|gdpr|sensitive.?data"
Length of output: 6499
Script:
#!/bin/bash
# Let's check the git history for this specific file to see if this was an intentional removal
git log -p -- apps/dashboard/app/\(app\)/apis/[apiId]/keys/[keyAuthId]/[keyId]/verification-table.tsx
# Also check for any related types or interfaces that might define the verification data structure
ast-grep --pattern 'interface $_ {
$$$
ip$_?: $_
$$$
}'
ast-grep --pattern 'interface $_ {
$$$
userAgent$_?: $_
$$$
}'
# Check for any audit or verification types
fd verification -e ts -e tsx | xargs cat
Length of output: 16666
apps/dashboard/components/array-input.tsx (1)
17-17
: Verify usage of ArrayInput component after props change.
The removal of the default value for selected
prop is a good change as it aligns with the type definition. However, this is a breaking change that requires verification.
✅ Verification successful
Props change verification completed successfully
All usages of the ArrayInput component properly provide both the required selected and setSelected props:
apps/dashboard/app/(app)/audit/[bucket]/filter.tsx
apps/dashboard/app/(app)/ratelimits/[namespaceId]/logs/filter.tsx
apps/dashboard/app/(app)/ratelimits/[namespaceId]/filters.tsx
The removal of the default value is safe as all consumers are already providing the required props explicitly.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all usages of ArrayInput to ensure they provide the selected prop
# Expected: All instances should explicitly pass the selected prop
# Search for ArrayInput usage in tsx/jsx files
rg -t tsx -t jsx '<ArrayInput' -A 3
Length of output: 65
Script:
#!/bin/bash
# Let's try a different approach to find ArrayInput usages
# First, let's find all TypeScript/JavaScript files
fd '\.(tsx?|jsx?)$' | xargs rg '<ArrayInput' -A 3
Length of output: 1196
internal/clickhouse/src/active_keys.ts (3)
1-2
: LGTM! Dependencies are well-defined.
The imports are minimal and appropriate for the required functionality.
35-74
: LGTM! Well-structured daily aggregation query.
The function follows good practices with:
- Consistent parameter validation
- Well-formatted SQL query
- Appropriate time granularity handling
47-47
: Verify ClickHouse table schemas.
Ensure that the referenced tables exist and have the expected schema:
- verifications.key_verifications_per_hour_v1
- verifications.key_verifications_per_day_v1
- verifications.key_verifications_per_month_v1
Also applies to: 87-87
✅ Verification successful
ClickHouse table schemas verified successfully
All referenced tables exist with consistent schemas in the migration files:
verifications.key_verifications_per_hour_v1
verifications.key_verifications_per_day_v1
verifications.key_verifications_per_month_v1
Each table has the required columns (time, workspace_id, key_space_id, identity_id, key_id, outcome, count) with appropriate data types and is using SummingMergeTree engine.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for ClickHouse table creation migrations
# Test: Look for table creation SQL files
fd -e sql -e SQL . internal/clickhouse/schema -x grep -l "CREATE TABLE.*key_verifications_per"
Length of output: 354
Script:
#!/bin/bash
# Check the schema definitions for the tables
cat internal/clickhouse/schema/004_create_verifications_key_verifications_per_hour_v1.sql \
internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql \
internal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql
Length of output: 1541
internal/db/src/schema/workspaces.ts (1)
136-138
: LGTM! Consistent relation naming.
The addition of the relationName
property follows the established naming pattern seen in other workspace relations (e.g., "workspace_key_relation"). This enhances schema clarity and maintainability.
Let's verify the consistency of relation naming patterns:
✅ Verification successful
Relation naming pattern is consistent across schema files
The verification confirms that the workspace_audit_log_bucket_relation
follows the same naming pattern as other workspace relations in the schema:
workspace_key_relation
in workspaces.tsworkspace_key_relation
in keys.ts (reciprocal relation)workspace_audit_log_bucket_relation
in workspaces.ts (the new addition)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify relation naming patterns in schema files
# Expected: All workspace relations should follow similar naming pattern
# Search for relation names in schema files
rg -U "relationName: \"workspace_.*?\"" internal/db/src/schema/
Length of output: 313
apps/dashboard/app/(app)/success/page.tsx (2)
8-8
: LGTM: Import change aligns with Clickhouse migration.
The replacement of Tinybird with Clickhouse import is consistent with the PR's migration objectives.
121-121
: LGTM: Added key prop improves React rendering.
The addition of the key
prop to the Suspense component follows React best practices for list rendering and will improve performance during re-renders.
apps/api/src/pkg/middleware/metrics.ts (1)
128-135
: LGTM: Geographical data collection aligns with Clickhouse migration.
The addition of geographical data collection is a valuable enhancement that aligns with the PR objectives of migrating to Clickhouse. This will enable better geographical analytics and insights.
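A minimal sketch of what collecting that geographical context from a Cloudflare Workers request could look like; the exact field names used by the middleware are not shown in this diff, so treat them as assumptions:
type GeoContext = { continent?: string; country?: string; colo?: string; city?: string };

// request.cf is populated by Cloudflare at the edge; individual fields can be
// missing, so default them before attaching the context to the metric event.
function geoFromRequest(req: Request & { cf?: GeoContext }) {
  return {
    continent: req.cf?.continent ?? "",
    country: req.cf?.country ?? "",
    colo: req.cf?.colo ?? "",
    city: req.cf?.city ?? "",
  };
}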
apps/dashboard/app/(app)/settings/root-keys/[keyId]/page.tsx (2)
5-5
: LGTM: Import change aligns with Clickhouse migration.
The addition of the Clickhouse client import and removal of Tinybird import correctly reflects the migration strategy.
155-155
: LGTM: Clean prop passing.
The simplified prop passing maintains type safety while adapting to the new Clickhouse data structure.
apps/dashboard/lib/charts/sparkline.tsx (2)
64-64
: LGTM! Optimized useMemo dependencies.
The dependency array correctly includes only the data prop since it's the only value used in the calculation. This optimization prevents unnecessary recalculations of startDate and endDate.
76-76: LGTM! Removed unnecessary padding dependencies.
The dependency array has been correctly optimized to only include [data, series] since padding values are not used in the calculation of minY and maxY. This prevents unnecessary recalculations when padding changes.
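For reference, the dependency shape being described is roughly the following (names are illustrative, not the component's exact code):
import { useMemo } from "react";

type Series<T> = { value: (d: T) => number };

// Recompute the y-domain only when the chart data or series definitions
// change; padding is deliberately left out of the dependency array because
// it does not participate in this calculation.
function useYDomain<T>(data: T[], series: Series<T>[]) {
  return useMemo(() => {
    const values = data.flatMap((d) => series.map((s) => s.value(d)));
    return [Math.min(...values), Math.max(...values)] as const;
  }, [data, series]);
}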
apps/play/app/page-bk.tsx (1)
Line range hint 17-17
: Document the purpose of the 24-hour timestamp offset.
The code adds a 24-hour offset to the current timestamp without explanation. This magic number could confuse future maintainers.
Could you clarify why this offset is necessary? Consider adding a comment explaining its purpose or extracting it to a named constant.
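One possible shape for that change, as a sketch (the constant name is illustrative):
// Naming the offset documents the intent at the call site
// instead of leaving a bare magic number.
const ONE_DAY_IN_MS = 24 * 60 * 60 * 1000;

const timestamp = Date.now() + ONE_DAY_IN_MS;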
apps/api/src/pkg/analytics.ts (2)
2-2
: LGTM: Clean initialization of ClickHouse client.
The migration from namespace import to named import and the new ClickHouse client initialization look good. The dual client setup allows for a gradual migration from Tinybird to ClickHouse.
Also applies to: 21-21, 41-41
110-122
: LGTM: Well-structured ratelimit implementation.
The new insertRatelimit method is well-implemented with a comprehensive schema that captures all necessary rate limiting data points.
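To make that concrete, the event shape presumably looks something like the sketch below; the column names mirror the ratelimits.raw_ratelimits_v1 table discussed in this PR, but the exact schema is an assumption:
import { z } from "zod";

// One raw ratelimit event as it would be inserted into ClickHouse.
export const ratelimitEvent = z.object({
  request_id: z.string(),
  time: z.number().int(), // unix milliseconds
  workspace_id: z.string(),
  namespace_id: z.string(),
  identifier: z.string(),
  passed: z.boolean(),
});

export type RatelimitEvent = z.infer<typeof ratelimitEvent>;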
apps/api/src/pkg/ratelimit/client.ts (5)
Line range hint 90-94: LGTM: Consistent property naming in Cloudflare rate limiter response.
The renaming from pass to passed aligns with the standardized response structure.
112-113: LGTM: Updated metrics emission to use new property name.
The metrics emission correctly uses the renamed passed property.
130-132: LGTM: Updated multi-limit check condition.
The condition correctly uses the renamed passed property for checking rate limit results.
Line range hint 140-144: LGTM: Updated default response structure in multi-limit.
The default response correctly uses the renamed passed property.
291-294: LGTM: Updated agent response mapping.
The agent response correctly maps the success flag to the renamed passed property.
apps/api/src/routes/v1_ratelimit_limit.ts (2)
345-355
: LGTM: Clean analytics implementation for Clickhouse.
The analytics insertion is well-structured and includes all necessary fields for rate limit tracking.
Line range hint 445-453
: LGTM: Response structure is well-defined and consistent.
The response structure correctly implements the OpenAPI schema and uses appropriate fields from the rate limit response.
apps/dashboard/app/(app)/settings/billing/page.tsx (1)
59-63
: LGTM! Clean migration to Clickhouse for free tier billing.
The implementation correctly fetches billable verifications from Clickhouse with appropriate parameters.
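As a usage sketch (the import path, argument names, and return shape are assumptions based on the billable_verifications_per_month_v1 columns):
import { clickhouse } from "@/lib/clickhouse";

// Fetch billable verifications for a workspace in the current UTC month.
export async function currentBillableVerifications(workspaceId: string) {
  const now = new Date();
  return clickhouse.billing.billableVerifications({
    workspaceId,
    year: now.getUTCFullYear(),
    month: now.getUTCMonth() + 1, // the table stores month as 1-12
  });
}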
internal/clickhouse/src/client/noop.ts (1)
2-3: Implementation of Querier and Inserter interfaces enhances functionality
The Noop class now correctly implements both the Querier and Inserter interfaces, allowing it to handle both querying and inserting operations as per the new architecture.
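A condensed sketch of the pattern; the interface shapes below are assumptions rather than the exact definitions in internal/clickhouse/src/client:
import { z } from "zod";

interface Querier {
  query<T extends z.ZodTypeAny>(req: { query: string; schema: T }): (params: unknown) => Promise<z.infer<T>[]>;
}
interface Inserter {
  insert<T extends z.ZodTypeAny>(req: { table: string; schema: T }): (rows: z.infer<T>[]) => Promise<void>;
}

// Satisfies both interfaces but performs no I/O, so the application can run
// without a configured ClickHouse instance.
class Noop implements Querier, Inserter {
  query<T extends z.ZodTypeAny>(_req: { query: string; schema: T }) {
    return async (_params: unknown): Promise<z.infer<T>[]> => [];
  }
  insert<T extends z.ZodTypeAny>(_req: { table: string; schema: T }) {
    return async (_rows: z.infer<T>[]): Promise<void> => {};
  }
}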
internal/clickhouse/src/client/client.ts (2)
3-3: Correctly Importing 'Inserter' and 'Querier' Interfaces
The addition of the Inserter and Querier interfaces in the import statement aligns with their usage in the class.
9-9: Class Now Implements 'Querier' and 'Inserter' Interfaces
The Client class now implements both the Querier and Inserter interfaces, properly reflecting its responsibilities.
internal/clickhouse/src/verifications.ts (1)
23-26
: Ensure 'dateTimeToUnix' handles time zones correctly
The time field uses dateTimeToUnix for conversion. Verify that the function correctly handles time zones and daylight saving time transitions to prevent any discrepancies in timestamp conversions.
Provide additional details or tests to confirm the correctness of dateTimeToUnix.
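To make the concern concrete, a conversion that pins the parse to UTC avoids local-time and DST ambiguity; this is only a sketch, not the project's actual dateTimeToUnix:
// ClickHouse renders DateTime values as "YYYY-MM-DD HH:MM:SS" without a zone
// marker. Appending "Z" after converting to ISO form parses the value as UTC,
// so the result does not shift with the server's local timezone or DST.
function dateTimeToUnixMs(dateTime: string): number {
  return new Date(`${dateTime.replace(" ", "T")}Z`).getTime();
}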
apps/dashboard/lib/tinybird.ts (1)
17-17: Ensure all references to dateToUnixMilli are updated
The function dateToUnixMilli has been renamed to _dateToUnixMilli, possibly to indicate internal use. Please verify that all usages of dateToUnixMilli in the codebase have been updated to _dateToUnixMilli to prevent any reference errors.
Run the following script to check for any remaining references to dateToUnixMilli:
apps/dashboard/app/(app)/identities/[identityId]/page.tsx (2)
17-17
: LGTM!
The import statement correctly includes the clickhouse library for data retrieval.
145-145: Verify the correctness of the keySpaceId parameter.
Ensure that key.keyAuthId correctly maps to keySpaceId. If keyAuthId represents the key space identifier, then this is appropriate. Otherwise, please confirm that the correct property is being used.
apps/dashboard/app/(app)/ratelimits/[namespaceId]/logs/page.tsx (3)
171-171: Verify the existence of 'request_id' in log entries
The key prop in <TableRow> is set to l.request_id. Ensure that the logs data structure returned by clickhouse.ratelimits.logs includes the request_id property. If the property was previously requestId, confirm that all references and data mappings have been updated accordingly.
183-185: Confirm the use of 'l.passed' for the success indicator
The success status is determined by l.passed. Please verify that the logs data includes a passed property reflecting the success state. If this was previously l.success, ensure that the data source and all related code have been updated to use passed.
117-118: Possible reversal of 'start' and 'end' parameters in query
In the query object, start is assigned selected.before?.getTime(), and end is assigned selected.after?.getTime(). This might be reversed. Typically, start should correspond to the earlier date (after), and end to the later date (before). Please verify if the assignments should be swapped to reflect the correct time range.
Apply this diff to fix the assignments:
const query = {
workspaceId: workspaceId,
namespaceId: namespaceId,
- start: selected.before?.getTime() ?? undefined,
- end: selected.after?.getTime() ?? undefined,
+ start: selected.after?.getTime() ?? undefined,
+ end: selected.before?.getTime() ?? undefined,
identifier: selected.identifier.length > 0 ? selected.identifier : undefined,
country: selected.country.length > 0 ? selected.country : undefined,
ipAddress: selected.ipAddress.length > 0 ? selected.ipAddress : undefined,
success: selected.success ?? undefined,
};
To verify the correct usage of the start and end parameters, you can review the expected behavior of the date filters in your application or check the implementation details of clickhouse.ratelimits.logs.
internal/clickhouse/src/ratelimits.ts (1)
167-168: Verify time comparison consistency in SQL queries
In the getRatelimitLogs function, the time comparisons use AND time >= {start: Int64} without converting start and end using fromUnixTimestamp64Milli, unlike the other functions where time is compared using fromUnixTimestamp64Milli({start: Int64}). Please verify that the data type of the time column in ratelimits.raw_ratelimits is compatible with direct integer comparison.
To confirm the data type of the time column, run the following script:
This script searches for the table creation script to verify if the time column is stored as an Int64 or DateTime64. Adjust the time comparisons accordingly based on the data type.
✅ Verification successful
Direct integer comparison with the time column is correct
The schema shows that the time column in ratelimits.raw_ratelimits_v1 is defined as Int64 with a comment indicating it stores unix milliseconds. Therefore, the direct integer comparison in the query is correct and consistent with the data type.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for the schema definition of 'ratelimits.raw_ratelimits' to check the 'time' column type.
# Test: Find the CREATE TABLE statement and display the 'time' column definition.
rg -A 5 -B 2 'CREATE TABLE.*ratelimits\.raw_ratelimits' --type sql
Length of output: 763
apps/dashboard/app/(app)/ratelimits/[namespaceId]/page.tsx (13)
8-8
: LGTM!
The import of the clickhouse
library is necessary for the changes made in this file to fetch rate limit data from ClickHouse.
100-111
: Verify the removal of commented code.
The commented-out code block that constructs the dataOverTime array using ratelimitedOverTime and successOverTime has been removed. Instead, a new implementation using ratelimitEvents and the updated metric names (passed instead of success) is being used.
Please confirm that this change is intentional and that the new implementation covers all necessary scenarios.
136-136: LGTM!
The metric calculation has been updated to use the passed property instead of success, which is consistent with the changes made to the ratelimitEvents data structure.
168-169: LGTM!
The label and value for the "Passed" metric have been updated to reflect the changes in the ratelimitEvents data structure, using the passed property instead of success.
174-174: LGTM!
The calculation for the "Ratelimited" metric has been updated to use the difference between total and passed from the ratelimitEvents data, which is consistent with the changes made to the data structure.
179-179: LGTM!
The calculation for the "Total" metric has been updated to use the total property from the ratelimitEvents data, which is consistent with the changes made to the data structure.
184-186: LGTM!
The calculation for the "Success Rate" metric has been updated to use the passed and total properties from the ratelimitEvents data, which is consistent with the changes made to the data structure.
238-238: LGTM!
The getRatelimitsPerInterval function for the "60m" interval has been updated to use clickhouse.ratelimits.perMinute, which aligns with the shift from Tinybird to ClickHouse for fetching rate limit data.
249-249: LGTM!
The getRatelimitsPerInterval function for the "24h" interval has been updated to use clickhouse.ratelimits.perHour, which aligns with the shift from Tinybird to ClickHouse for fetching rate limit data.
261-261: LGTM!
The getRatelimitsPerInterval function for the "7d" interval has been updated to use clickhouse.ratelimits.perDay, which aligns with the shift from Tinybird to ClickHouse for fetching rate limit data.
273-273: LGTM!
The getRatelimitsPerInterval function for the "30d" interval has been updated to use clickhouse.ratelimits.perDay, which aligns with the shift from Tinybird to ClickHouse for fetching rate limit data.
285-285: LGTM!
The getRatelimitsPerInterval function for the "90d" interval has been updated to use clickhouse.ratelimits.perDay, which aligns with the shift from Tinybird to ClickHouse for fetching rate limit data.
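Taken together, the selection logic is roughly the following sketch (the import path and exact signatures are assumptions; the function names match the calls above):
import { clickhouse } from "@/lib/clickhouse";

type Interval = "60m" | "24h" | "7d" | "30d" | "90d";

// Pick the pre-aggregated ClickHouse rollup with the coarsest granularity
// that still resolves the requested window.
function getRatelimitsPerInterval(interval: Interval) {
  switch (interval) {
    case "60m":
      return clickhouse.ratelimits.perMinute;
    case "24h":
      return clickhouse.ratelimits.perHour;
    default: // "7d", "30d", "90d"
      return clickhouse.ratelimits.perDay;
  }
}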
57-82
: Verify the removal of Tinybird-related functions.
The changes in this section indicate a shift from using Tinybird to ClickHouse for fetching rate limit data. The getIdentifiers, getRatelimitsPerInterval, and other Tinybird-specific functions have been removed.
Please ensure that all references to these functions have been properly updated throughout the codebase to avoid any breakages.
✅ Verification successful
Based on the search results, I can generate the final response as we have sufficient information. The results show that all references to these rate-limiting functions are now properly using the ClickHouse implementation, with no lingering Tinybird references.
Migration to ClickHouse rate-limiting functions is complete and consistent
The codebase shows a clean migration where all rate-limiting functions are now properly sourced from the ClickHouse implementation:
- Functions are defined in internal/clickhouse/src/ratelimits.ts
- They are exported through internal/clickhouse/src/index.ts
- Used correctly in apps/dashboard/app/(app)/ratelimits/[namespaceId]/page.tsx with the ClickHouse client
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for references to the removed Tinybird functions
rg -w 'getIdentifiers|getRatelimitsPerInterval|getRatelimitsPerDay|getRatelimitsPerHour|getRatelimitsPerMinute|getRatelimitsPerMonth'
Length of output: 2018
apps/dashboard/app/(app)/apis/[apiId]/page.tsx (4)
55-55: Ensure api.keyAuthId is not null before database queries.
Similar to the previous comment, the use of api.keyAuthId! assumes it is always defined. To prevent potential issues when querying the database, add a null check for api.keyAuthId.
62-63: Add null checks for api.keyAuthId when calling ClickHouse methods.
Again, api.keyAuthId! is used without verifying if it is defined. Ensure that api.keyAuthId is checked for null or undefined before passing it to ClickHouse methods to avoid runtime errors.
69-70: Verify api.keyAuthId before passing to function parameters.
Consistently check for null or undefined values of api.keyAuthId when using it in function calls. This practice enhances code safety and prevents unexpected exceptions.
140-140: Ensure verificationsInBillingCycle is correctly initialized and contains data.
When using reduce on verificationsInBillingCycle, confirm that it is not undefined or empty to prevent errors. If there's a possibility it could be undefined, provide a default value.
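A minimal sketch of that defensive default (the count field name is an assumption):
type VerificationBucket = { time: number; count: number };

// Defaulting to an empty array keeps the reduce safe even if the query
// resolved to undefined.
function sumVerifications(buckets: VerificationBucket[] | undefined): number {
  return (buckets ?? []).reduce((sum, b) => sum + b.count, 0);
}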
Run the following script to check if verificationsInBillingCycle is always defined:
If the script returns matches where verificationsInBillingCycle may not be properly initialized, consider adding default values or checks.
✅ Verification successful
Let me check the initialization context to verify that verificationsInBillingCycle is properly handled.
Let me check the full Promise.all array to see how verificationsInBillingCycle is being populated.
Let me check the implementation of getVerificationsPerInterval, since it's being used to populate verificationsInBillingCycle.
Let me check the implementation of the clickhouse verifications methods to understand how the data is structured.
Based on the analysis of the code, I can now provide the final response:
The verificationsInBillingCycle array is properly initialized and safe to use with reduce.
The value comes from a Promise.all array where it's populated by getVerificationsPerInterval({...}), which is dynamically set to either clickhouse.verifications.perHour or clickhouse.verifications.perDay based on the time interval. The data is always initialized because it's part of a server-side data fetching operation that must complete before the component renders. If the query returns no data, it would still be an empty array rather than undefined, making it safe for the reduce operation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for all initializations of `verificationsInBillingCycle` to ensure it is defined.
# Test: Ensure `verificationsInBillingCycle` is assigned a value.
rg 'const \[.*verificationsInBillingCycle.*\] ='
Length of output: 187
Script:
#!/bin/bash
# Get more context around the initialization
rg -B 5 -A 5 'const \[.*verificationsInBillingCycle.*\] ='
Length of output: 962
Script:
#!/bin/bash
# Get the full Promise.all context
rg -B 2 -A 30 'const \[.*verificationsInBillingCycle.*\] ='
Length of output: 2817
Script:
#!/bin/bash
# Search for getVerificationsPerInterval implementation
ast-grep --pattern 'function getVerificationsPerInterval($_) {
$$$
}'
# Also search with rg in case it's defined differently
rg -B 2 -A 10 'getVerificationsPerInterval'
Length of output: 14593
Script:
#!/bin/bash
# Search for clickhouse verifications implementation
rg -B 2 -A 10 'clickhouse\.verifications\.'
Length of output: 14462
internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql (outdated)
internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql
internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql
internal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql
@@ -50,40 +43,35 @@ export default async function ApiPage(props: {
prepareInterval(interval);
const query = {
workspaceId: api.workspaceId,
apiId: api.id,
keySpaceId: api.keyAuthId!,
Handle potential null or undefined api.keyAuthId to prevent runtime errors.
Using the non-null assertion operator ! on api.keyAuthId assumes that it is neither null nor undefined. If api.keyAuthId can be null, this may lead to runtime errors. Consider implementing a null check or providing a default value to ensure robustness.
Apply this diff to add a null check:
const query = {
workspaceId: api.workspaceId,
- keySpaceId: api.keyAuthId!,
+ keySpaceId: api.keyAuthId,
start,
end,
};
+if (!query.keySpaceId) {
+ // Handle the null or undefined case appropriately
+ throw new Error("api.keyAuthId is null or undefined.");
+}
Committable suggestion skipped: line range outside the PR's diff.
Actionable comments posted: 46
🧹 Outside diff range and nitpick comments (82)
apps/www/.env.example (1)
Line range hint
4-5
: Remove extra whitespace linesThere are unnecessary blank lines at the end of the file. While having one newline at the end of the file is a good practice, multiple blank lines should be removed.
DATABASE_HOST= DATABASE_USERNAME= DATABASE_PASSWORD= - - +apps/dashboard/lib/clickhouse.ts (1)
4-4
: Consider enhancing error handling and client lifecycle management.While the implementation is functional, consider these improvements for production readiness:
- Add error handling for connection issues
- Implement proper client cleanup
- Add connection validation
Consider this enhanced implementation:
-export const clickhouse = new ClickHouse({ url: env().CLICKHOUSE_URL }); +export class ClickHouseClient { + private static instance: ClickHouse | null = null; + + public static getInstance(): ClickHouse { + if (!this.instance) { + const url = env().CLICKHOUSE_URL; + if (!url) { + throw new Error("CLICKHOUSE_URL is required"); + } + this.instance = new ClickHouse({ url }); + } + return this.instance; + } + + public static async cleanup(): Promise<void> { + if (this.instance) { + // Add cleanup logic here when needed + this.instance = null; + } + } +} + +export const clickhouse = ClickHouseClient.getInstance();This implementation provides:
- Singleton pattern to prevent multiple connections
- Better error handling for missing configuration
- A cleanup method for proper resource management
- Easy extension for connection pooling if needed
internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql (2)
2-9
: Consider optimizing table structure for better query performance.The current table structure could benefit from several performance optimizations:
- Add a composite primary key to improve data locality and query performance
- Consider partitioning by month to optimize data pruning
- Add a TTL policy if historical data isn't needed indefinitely
Consider applying these optimizations:
CREATE TABLE business.active_workspaces_per_month_v1 ( time Date, - workspace_id String + workspace_id String, + count UInt32 DEFAULT 1 ) ENGINE = SummingMergeTree() -ORDER BY (time) +PARTITION BY toYYYYMM(time) +ORDER BY (workspace_id, time) +TTL time + INTERVAL 2 YEAR ;The changes:
- Added
count
column for SummingMergeTree to aggregate- Added monthly partitioning for efficient data pruning
- Changed sort order to optimize workspace-specific queries
- Added TTL policy to automatically clean old data
2-6
: Consider adding more business-relevant metrics.For better business analytics, consider adding columns that provide more insights:
CREATE TABLE business.active_workspaces_per_month_v1 ( time Date, - workspace_id String + workspace_id String, + organization_id String, + tier String, + region String, + active_users UInt32, + total_actions UInt64 )These additional columns would enable:
- Organization-level analytics
- Tier-based usage patterns
- Geographic distribution analysis
- User engagement metrics
internal/clickhouse/package.json (1)
Line range hint
2-14
: Consider enhancing package metadata.The package.json could benefit from additional metadata to improve its maintainability and documentation:
- Add a meaningful description
- Include repository field
- Add scripts section for common operations
- Consider adding keywords related to ClickHouse
Here's a suggested enhancement:
"name": "@unkey/clickhouse", "version": "1.0.0", - "description": "", + "description": "ClickHouse integration for Unkey", "main": "./src/index.ts", "types": "./src/index.ts", - "keywords": [], + "keywords": ["clickhouse", "database", "analytics", "unkey"], "author": "Andreas Thomas", "license": "AGPL-3.0", + "repository": { + "type": "git", + "url": "https://github.com/unkeyed/unkey" + }, + "scripts": { + "build": "tsc", + "test": "jest" + },internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql (1)
2-8
: Consider adding constraints and column descriptions.The schema looks good, but could benefit from additional constraints and documentation:
- Add constraints to ensure year and month are within valid ranges
- Add column descriptions using COMMENT clauses to document the purpose of each field
CREATE TABLE billing.billable_verifications_per_month_v1 ( - year Int, - month Int, - workspace_id String, - count Int64 + year Int COMMENT 'Calendar year of the billing period' CHECK year >= 2000, + month Int COMMENT 'Calendar month (1-12) of the billing period' CHECK month BETWEEN 1 AND 12, + workspace_id String COMMENT 'Unique identifier of the workspace being billed', + count Int64 COMMENT 'Number of billable verifications in this period' CHECK count >= 0 )internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql (1)
2-11
: Consider adding TTL and partitioning for data lifecycle management.The table structure looks good with appropriate column types. However, for time-series data like rate limits, consider:
- Adding TTL to automatically clean up old data
- Adding partitioning by time for better query performance and data management
CREATE TABLE ratelimits.ratelimits_per_day_v1 ( time DateTime, workspace_id String, namespace_id String, identifier String, passed Int64, total Int64 ) +PARTITION BY toYYYYMM(time) +TTL time + INTERVAL 3 MONTH ENGINE = SummingMergeTree()internal/clickhouse/schema/024_create_ratelimits_last_used_mv_v1.sql (1)
16-17
: Remove extra blank lineFor consistency, remove one of the blank lines between the up and down migrations.
identifier ; - -- +goose downinternal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql (1)
12-13
: Consider adding a TTL policy for data retention.The SummingMergeTree engine is a good choice for aggregating rate limit data. However, consider adding a TTL policy to automatically clean up old data, as rate limit statistics typically don't need to be kept indefinitely.
ENGINE = SummingMergeTree() -ORDER BY (workspace_id, namespace_id, time, identifier) +ORDER BY (workspace_id, namespace_id, time, identifier) +TTL time + INTERVAL 6 MONTH DELETEinternal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql (2)
2-11
: Consider adding TTL for data retention.The table schema looks good for rate limiting purposes. However, since this is time-series data that grows continuously, consider adding a TTL clause to automatically clean up old data.
Example addition:
TTL time + INTERVAL 30 DAY DELETE
1-14
: Consider adding distributed table configuration.Since ClickHouse is often deployed in a distributed setup, consider whether this table needs a corresponding Distributed table wrapper for scalability.
Example configuration to discuss with the team:
CREATE TABLE ratelimits.ratelimits_per_minute_v1_distributed ON CLUSTER '{cluster}' AS ratelimits.ratelimits_per_minute_v1 ENGINE = Distributed('{cluster}', ratelimits, ratelimits_per_minute_v1, rand());internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql (1)
18-19
: Add IF EXISTS clause to DROP TABLE statement.For safer rollbacks, consider adding the IF EXISTS clause to prevent errors if the table doesn't exist.
-- +goose down -DROP TABLE ratelimits.raw_ratelimits_v1; +DROP TABLE IF EXISTS ratelimits.raw_ratelimits_v1;internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql (1)
2-14
: Well-structured schema with performance optimizations!The table design shows good practices:
- Using
LowCardinality(String)
for the outcome column optimizes storage and query performance for enum-like valuesSummingMergeTree
engine is a better choice than the previousAggregatingMergeTree
for this use case, as it will automatically sum the count column during background merges- The composite
ORDER BY
clause enables efficient filtering and aggregation queries by workspace and key spaceConsider adding a TTL policy if historical data doesn't need to be kept indefinitely. This can help manage storage growth:
TTL time + INTERVAL 1 YEAR DELETEinternal/clickhouse/schema/011_create_telemetry_raw_sdks_v1.sql (1)
Line range hint
2-19
: Table structure looks well-designed for telemetry data collection.The table schema is well-structured with appropriate data types and includes essential fields for tracking SDK usage. The use of
MergeTree
engine with ordering by(request_id, time)
is a good choice for efficient querying of telemetry data.Consider these optimizations if query patterns require them:
- Add a
TTL
clause if you need automatic data retention/cleanup- Consider adding
PARTITION BY toYYYYMM(fromUnixTimestamp64Milli(time))
if you'll be frequently querying specific time rangesinternal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql (2)
2-11
: Consider adding TTL and compression settings for efficient data management.The table structure looks good with appropriate data types. However, since this is a time-series table storing monthly metrics, consider:
- Adding a TTL policy to automatically manage historical data
- Configuring compression settings for better storage efficiency
CREATE TABLE verifications.key_verifications_per_month_v1 ( time DateTime, workspace_id String, key_space_id String, identity_id String, key_id String, outcome LowCardinality(String), count Int64 ) ENGINE = SummingMergeTree() -ORDER BY (workspace_id, key_space_id, time, identity_id, key_id) +ORDER BY (workspace_id, key_space_id, time, identity_id, key_id) +TTL time + INTERVAL 12 MONTH +SETTINGS + index_granularity = 8192, + storage_policy = 'default', + compression_codec = 'LZ4'
1-17
: Consider adding data validation constraints.While the schema looks solid, consider adding additional constraints to ensure data integrity:
CREATE TABLE verifications.key_verifications_per_month_v1 ( - time DateTime, + time DateTime CODEC(DoubleDelta), workspace_id String, key_space_id String, identity_id String, key_id String, - outcome LowCardinality(String), + outcome LowCardinality(String) DEFAULT '', - count Int64 + count Int64 DEFAULT 1 ) ENGINE = SummingMergeTree() ORDER BY (workspace_id, key_space_id, time, identity_id, key_id) +PARTITION BY toYYYYMM(time)Changes suggested:
- Added CODEC(DoubleDelta) for better DateTime compression
- Added default values for outcome and count
- Added monthly partitioning for better query performance and data management
internal/clickhouse/schema/004_create_verifications_key_verifications_per_hour_v1.sql (2)
2-11
: Consider adding TTL and partitioning for better data management.The table structure looks good, but consider adding:
- TTL (Time-To-Live) for automatic data cleanup
- Partitioning by time for better query performance and data management
CREATE TABLE verifications.key_verifications_per_hour_v1 ( time DateTime, workspace_id String, key_space_id String, identity_id String, key_id String, outcome LowCardinality(String), count Int64 ) ENGINE = SummingMergeTree() +PARTITION BY toYYYYMM(time) ORDER BY (workspace_id, key_space_id, time, identity_id, key_id) +TTL time + INTERVAL 6 MONTH DELETE
12-13
: Good choice of engine and ordering for aggregation use case.The use of SummingMergeTree is appropriate for this table as it will automatically sum the
count
column during merges. The ordering by workspace_id and key_space_id first is optimal for typical queries filtering by these fields.For optimal query performance:
- Queries should include workspace_id and key_space_id in WHERE clauses when possible
- Use time ranges in queries to leverage the ordering by time
internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql (1)
18-20
: Clean up file formatting.Consider removing extra blank lines between migrations and ensure there's a single newline at the end of the file.
internal/clickhouse/schema/019_create_ratelimits_ratelimits_per_hour_mv_v1.sql (1)
12-18
: Consider adding ORDER BY clause for better performance.While the current implementation is functionally correct, consider adding an ORDER BY clause to optimize query performance. In ClickHouse, materialized views benefit from ordered storage, especially for time-series data.
 GROUP BY
   workspace_id,
   namespace_id,
   identifier,
-  time
+  time
+ORDER BY
+  time,
+  workspace_id,
+  namespace_id,
+  identifier
internal/clickhouse/schema/020_create_ratelimits_ratelimits_per_day_mv_v1.sql (1)
12-17
: Consider adding ORDER BY clause for optimization.While the current implementation is functional, adding an ORDER BY clause could improve query performance, especially for time-series data.
 GROUP BY
   workspace_id,
   namespace_id,
   identifier,
-  time
+  time
+ORDER BY
+  workspace_id,
+  namespace_id,
+  identifier,
+  time
internal/clickhouse/schema/021_create_ratelimits_ratelimits_per_month_mv_v1.sql (1)
13-17
: Verify index requirements for grouped columns.The GROUP BY clause uses multiple columns that might benefit from proper indexing in the source table for better query performance.
Consider adding appropriate indices on
workspace_id
,namespace_id
,identifier
, andtime
columns in the source table if not already present.internal/clickhouse/schema/018_create_ratelimits_ratelimits_per_minute_mv_v1.sql (1)
13-17
: Consider indexing implications for grouped columns.The GROUP BY clause uses workspace_id, namespace_id, identifier, and time. Ensure these columns are properly indexed in the source table for optimal performance.
Consider:
- Adding appropriate indices if not already present
- Monitoring query performance with large datasets
- Setting up TTL policies if historical data isn't needed indefinitely
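A rough sketch of the first and third bullets, assuming the source table is ratelimits.raw_ratelimits_v1 with the ordering reported later in this review (workspace_id and namespace_id already lead the sort key, so a skipping index mainly helps identifier lookups) and a Unix-millisecond time column; treat the names, index type, and retention interval as placeholders:
-- data-skipping index for identifier lookups that bypass the sort-key prefix
ALTER TABLE ratelimits.raw_ratelimits_v1
    ADD INDEX idx_identifier identifier TYPE bloom_filter GRANULARITY 4;

-- build the index for parts that were written before the ALTER
ALTER TABLE ratelimits.raw_ratelimits_v1 MATERIALIZE INDEX idx_identifier;

-- optional retention policy on the raw events
ALTER TABLE ratelimits.raw_ratelimits_v1
    MODIFY TTL toDateTime(fromUnixTimestamp64Milli(time)) + INTERVAL 90 DAY DELETE;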
deployment/grafana/grafana.yaml (2)
19-23
: Consider adding additional connection settings.The current configuration lacks important connection parameters that could affect reliability and performance.
Consider adding these settings:
jsonData:
  defaultDatabase: database
  port: 9000
  host: clickhouse
  username: default
  # Add these settings
  queryTimeout: 60
  connectionTimeout: 10
  maxOpenConns: 10
  maxIdleConns: 5
20-20
: Use a more specific database name.The generic database name "database" might cause confusion.
Consider using a more descriptive name that reflects its purpose, for example:
- defaultDatabase: database
+ defaultDatabase: analytics_db
5-12
: Consider adding a WHERE clause for time-based filtering.The time conversion using
fromUnixTimestamp64Milli
andtoStartOfHour
is correct, but consider adding a WHERE clause to limit historical data processing, especially during the initial materialization.SELECT workspace_id, key_space_id, identity_id, key_id, outcome, count(*) as count, toStartOfHour(fromUnixTimestamp64Milli(time)) AS time FROM verifications.raw_key_verifications_v1 +WHERE time >= toUnixTimestamp64Milli(now() - INTERVAL 30 DAY) GROUP BY
24-25
: Consider using CASCADE for view removal.The down migration should consider using CASCADE to ensure all dependent objects are also removed.
-DROP VIEW verifications.key_verifications_per_hour_mv_v1;
+DROP VIEW IF EXISTS verifications.key_verifications_per_hour_mv_v1 CASCADE;
13-20
: Consider adding a WHERE clause for data filtering.The current implementation aggregates all records without filtering. Consider whether you need to:
- Filter out specific outcomes (e.g., failed verifications)
- Add a time range limit to prevent processing historical data
Example addition:
FROM verifications.raw_key_verifications_v1 +WHERE time >= toUnixTimestamp64Milli(toStartOfMonth(now() - INTERVAL 12 MONTH)) GROUP BY
1-25
: Consider adding indexes for query optimization.Since this materialized view will likely be queried frequently for analytics, consider adding indexes on commonly filtered columns like
workspace_id
,time
, and potentiallyoutcome
.Example:
ALTER TABLE verifications.key_verifications_per_month_v1 ADD INDEX idx_workspace_time (workspace_id, time) TYPE minmax GRANULARITY 3;internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql (1)
6-7
: Consider adding workspace activity timestampThe views only track the date portion using
toDate(time)
. Consider preserving the full timestamp of the last activity to enable more granular analytics.- workspace_id, toDate(time) as time + workspace_id, + toDate(time) as date, + max(time) as last_activity_at FROM verifications.key_verifications_per_month_v1 +GROUP BY workspace_id, toDate(time)Also applies to: 14-15
internal/clickhouse/src/success.ts (2)
5-7
: Remove or update misleading comment.The comment about "month is not zero-indexed" seems unrelated to the current implementation, as the function doesn't take any month parameter as input.
-// get the billable verifications for a workspace in a specific month.
-// month is not zero-indexed -> January = 1
+// Get the count of active workspaces grouped by time
18-22
: Consider adding error handling and input validation.The current implementation lacks error handling for query failures and timezone considerations for the date conversion.
Consider wrapping the query in a try-catch and adding timezone handling:
schema: z.object({ - time: dateTimeToUnix, + time: z.preprocess((val) => { + if (val instanceof Date) { + return dateTimeToUnix(val); + } + throw new Error('Expected DateTime value'); + }, z.number()), workspaces: z.number().int(), }),apps/dashboard/app/(app)/logs/page.tsx (1)
19-19
: Consider adding error handling and improving the UX.While the migration to
clickhouse.api.logs
is correct, consider these improvements:
- Add error handling for the Clickhouse API call
- Implement loading states during data fetching
- Consider using a structured log viewer instead of raw JSON output
Example implementation with error handling:
- const logs = await clickhouse.api.logs({ workspaceId: workspace.id, limit: 10 }); + try { + const logs = await clickhouse.api.logs({ workspaceId: workspace.id, limit: 10 }); + return ( + <div> + <PageHeader title="Logs" /> + {/* Consider using a structured log viewer component here */} + <pre>{JSON.stringify(logs, null, 2)}</pre> + </div> + ); + } catch (error) { + console.error('Failed to fetch logs:', error); + return ( + <div> + <PageHeader title="Logs" /> + <div className="text-red-500">Failed to load logs. Please try again later.</div> + </div> + ); + }internal/clickhouse/src/client/interface.ts (2)
Line range hint
3-16
: LGTM! Consider adding JSDoc documentation.The
Querier
interface is well-designed with strong typing and clear separation of concerns. The inline comments are helpful, but consider adding JSDoc documentation for better IDE support and API documentation generation.Example enhancement:
+/** + * Interface for executing strongly-typed SQL queries against Clickhouse. + * Uses Zod schemas for runtime type validation of inputs and outputs. + */ export interface Querier { + /** + * Creates a typed query function with the given SQL and schemas. + * @param req.query - SQL query with {paramName: Type} parameter placeholders + * @param req.params - Zod schema for validating query parameters + * @param req.schema - Zod schema for validating query results + * @returns A function that executes the query with validated parameters + */ query<TIn extends z.ZodSchema<any>, TOut extends z.ZodSchema<any>>(req: {
Line range hint
18-24
: Consider adding batch size limits and error handling specifications.The
Inserter
interface is well-typed and flexible, but could benefit from additional safeguards and specifications:
- Batch size limits to prevent memory issues
- Explicit error handling contract
- Optional batch options (e.g., timeout, retry policy)
Consider extending the interface like this:
export interface InsertOptions { maxBatchSize?: number; timeout?: number; retryPolicy?: { maxAttempts: number; backoffMs: number; }; } export interface Inserter { insert<TSchema extends z.ZodSchema<any>>(req: { table: string; schema: TSchema; options?: InsertOptions; }): ( events: z.input<TSchema> | z.input<TSchema>[], ) => Promise<{ executed: boolean; query_id: string; failed?: z.input<TSchema>[] }>; }internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql (2)
Line range hint
2-31
: Consider adding data retention policy and partitioning.The table structure is well-defined with appropriate comments and column types. However, consider the following improvements:
- Add TTL policy to automatically manage data retention:
TTL time + INTERVAL 90 DAY DELETE
- Add partitioning by time to improve query performance and data management:
PARTITION BY toYYYYMM(FROM_UNIXTIME(time/1000))
32-32
: Consider optimizing table settings.While the MergeTree engine and ordering are appropriate, consider adding these settings for better performance:
SETTINGS index_granularity = 8192, enable_mixed_granularity_parts = 1, min_bytes_for_wide_part = 0
The current ordering by (workspace_id, key_space_id, key_id, time) is good for filtering specific keys and their history.
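For instance, a per-key history lookup of the sort this ordering favors (a sketch; the parameter placeholders and LIMIT are illustrative, the columns come from the table definition):
SELECT time, outcome, region
FROM verifications.raw_key_verifications_v1
WHERE workspace_id = {workspaceId: String}
  AND key_space_id = {keySpaceId: String}
  AND key_id = {keyId: String}
ORDER BY time DESC
LIMIT 50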
Taskfile.yml (1)
26-26
: Consider updating documentation.Since this represents a significant structural change in how ClickHouse migrations are organized, it would be helpful to document this in the project's README or migration documentation.
Consider adding a note about the new migration path and the reasoning behind centralizing ClickHouse schemas in the internal directory.
internal/clickhouse/src/latest_verifications.ts (4)
4-8
: Consider adding stricter input validation.While the basic schema is correct, consider adding length constraints and format validation for the ID fields to prevent potential issues with malformed inputs.
const params = z.object({
-  workspaceId: z.string(),
-  keySpaceId: z.string(),
-  keyId: z.string(),
+  workspaceId: z.string().min(1).max(255),
+  keySpaceId: z.string().min(1).max(255),
+  keyId: z.string().min(1).max(255),
});
17-21
: Consider optimizing the query performance.The current query uses
ORDER BY time DESC LIMIT 1
which might not be optimal for large datasets. Consider adding an index on the time column and potentially using a materialized view for frequently accessed latest verifications.
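If that lookup becomes hot, one possible shape for such a rollup is sketched below. The table and view names are hypothetical and this is not the last-used migration already included in this PR; it only illustrates the "latest timestamp per key" pattern:
-- target table keeps one row per key, holding only the max timestamp
CREATE TABLE verifications.key_last_used_v1 (
    workspace_id String,
    key_space_id String,
    key_id String,
    time SimpleAggregateFunction(max, Int64)
)
ENGINE = AggregatingMergeTree()
ORDER BY (workspace_id, key_space_id, key_id);

CREATE MATERIALIZED VIEW verifications.key_last_used_mv_v1
TO verifications.key_last_used_v1
AS SELECT
    workspace_id,
    key_space_id,
    key_id,
    max(time) AS time
FROM verifications.raw_key_verifications_v1
GROUP BY workspace_id, key_space_id, key_id;
Reads then become a point lookup, e.g. SELECT max(time) FROM verifications.key_last_used_v1 WHERE workspace_id = ... AND key_space_id = ... AND key_id = ...; the outer max() covers parts that have not been merged yet.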
17-17
: Extract table name as a constant.The table name
default.raw_key_verifications_v1
should be defined as a constant at the module level for better maintainability.+const VERIFICATIONS_TABLE = 'default.raw_key_verifications_v1'; + export function getLatestVerifications(ch: Querier) { // ... query: ` SELECT time, outcome, region, - FROM default.raw_key_verifications_v1 + FROM ${VERIFICATIONS_TABLE} WHERE workspace_id = {workspaceId: String} // ...
24-28
: Consider adding value constraints to the return schema.The return schema could benefit from additional validation rules for the outcome and region fields.
schema: z.object({ time: z.number(), - outcome: z.string(), - region: z.string(), + outcome: z.enum(['success', 'failure']), // adjust based on possible values + region: z.string().min(1).max(50), }),internal/clickhouse/schema/002_create_metrics_raw_api_requests_v1.sql (3)
31-34
: Consider adding NOT NULL constraints for location columns.The newly added location columns (
country
,city
,colo
,continent
) might benefit from NOT NULL constraints with default values if these fields are expected to always have values. This helps maintain data consistency.- country String, - city String, - colo String, - continent String, + country String NOT NULL DEFAULT '', + city String NOT NULL DEFAULT '', + colo String NOT NULL DEFAULT '', + continent String NOT NULL DEFAULT '',
38-40
: Consider adding TTL for data retention.Since this table stores raw API requests which can grow significantly over time, consider adding a TTL clause to automatically manage data retention. This helps control storage costs and maintain performance.
ENGINE = MergeTree() ORDER BY (workspace_id, time, request_id) +TTL time + INTERVAL 90 DAY DELETE
Update required: Query in
internal/clickhouse/src/logs.ts
needs to be modifiedThe schema change from
default.raw_api_requests_v1
tometrics.raw_api_requests_v1
requires updating the query ininternal/clickhouse/src/logs.ts
. The query currently uses the old table namespace which will cause it to fail after the schema change.
internal/clickhouse/src/logs.ts
: Update the FROM clause fromdefault.raw_api_requests_v1
tometrics.raw_api_requests_v1
🔗 Analysis chain
Line range hint
2-40
: Verify the impact of schema changes on existing queries.The table namespace has changed from 'default' to 'metrics'. Ensure all existing queries referencing this table are updated accordingly.
Let me check the content of the file that references the old table name to understand the impact.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for any references to the old table name that might need updating rg -l "default.raw_api_requests_v1" --type sql --type ts --type go # Search for any materialized views or other dependencies on this table ast-grep --pattern 'CREATE MATERIALIZED VIEW $_ TO metrics.raw_api_requests_v1'Length of output: 180
Script:
#!/bin/bash # Check the content of the file that references the old table name rg -A 5 -B 5 "default.raw_api_requests_v1" internal/clickhouse/src/logs.tsLength of output: 367
internal/clickhouse/src/billing.ts (2)
4-11
: Enhance function documentation with JSDoc.While the current comment is helpful, using JSDoc would improve IDE integration and provide better documentation.
Consider this enhancement:
-// get the billable verifications for a workspace in a specific month. -// month is not zero-indexed -> January = 1 +/** + * Get the total count of billable verifications for a workspace in a specific month. + * @param ch - ClickHouse querier instance + * @returns Async function that accepts workspace and date parameters + * @example + * const getBillable = getBillableVerifications(ch); + * const count = await getBillable({ workspaceId: "123", year: 2024, month: 1 }); + * @note Month is not zero-indexed (January = 1) + */
32-37
: LGTM! Robust result handling.The code handles edge cases well with proper null checks and default values. Consider adding a type assertion for better type safety:
- const res = await query(args); + const res = (await query(args)) as Array<{ count: number }> | null;apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/page.tsx (1)
42-46
: Consider enhancing error handling for Clickhouse queryWhile the migration to Clickhouse looks good, consider adding error handling for the
clickhouse.verifications.logs
call to gracefully handle potential query failures.- const history = await clickhouse.verifications.logs({ - workspaceId: UNKEY_WORKSPACE_ID, - keySpaceId: key.keyAuthId, - keyId: key.id, - }); + try { + const history = await clickhouse.verifications.logs({ + workspaceId: UNKEY_WORKSPACE_ID, + keySpaceId: key.keyAuthId, + keyId: key.id, + }); + } catch (error) { + console.error('Failed to fetch verification logs:', error); + // Consider implementing appropriate error UI + return <div>Failed to load verification history</div>; + }apps/dashboard/app/(app)/ratelimits/sparkline.tsx (1)
Line range hint
33-38
: Consider improving tooltip clarity.While the change to use
passed
is correct, consider making the tooltip text more descriptive to better explain what "Passed" means in the context of rate limits.- Passed + Requests Passed Rate Limitinternal/clickhouse/src/logs.ts (1)
4-25
: Consider improving query robustness and maintainability.A few suggestions to enhance the implementation:
- Consider extracting the table name to a configuration constant for easier management across environments
- Add a secondary sort key to ensure consistent pagination when timestamps are identical
- Consider adding index hints if available for the workspace_id column
+const RAW_API_REQUESTS_TABLE = "default.raw_api_requests_v1"; + export function getLogs(ch: Querier) { return async (args: { workspaceId: string; limit: number }) => { const query = ch.query({ query: ` SELECT request_id, time, workspace_id, host, method, path, request_headers, request_body, response_status, response_headers, response_body, error, service_latency - FROM default.raw_api_requests_v1 + FROM {table: String} WHERE workspace_id = {workspaceId: String} - ORDER BY time DESC + ORDER BY time DESC, request_id DESC LIMIT {limit: Int}`, params: z.object({ + table: z.string(), workspaceId: z.string(), limit: z.number().int(), }),apps/api/src/pkg/ratelimit/interface.ts (1)
Line range hint
33-39
: Add JSDoc comments for schema properties.Consider adding JSDoc comments to document the
passed
andcurrent
properties, similar to howtriggered
is documented. This would improve the API documentation.export const ratelimitResponseSchema = z.object({ + /** Current count of requests made in the current interval */ current: z.number(), + /** Number of requests remaining in the current interval */ remaining: z.number(), + /** Timestamp in seconds when the current interval will reset */ reset: z.number(), + /** Whether the rate limit check passed */ passed: z.boolean(), /** * The name of the limit that triggered a rejection */ triggered: z.string().nullable(), });apps/www/components/stats.tsx (1)
Line range hint
6-27
: Consider enhancing error handling and loading states.While the current error handling is functional, consider these improvements:
- Add a retry mechanism for failed queries
- Implement loading states for better UX
- Consider using React Query or SWR for automatic retries and caching
Example implementation with React Query:
import { useQueries } from '@tanstack/react-query'; const useStats = () => { return useQueries({ queries: [ { queryKey: ['workspaces-count'], queryFn: () => db .select({ count: sql<number>`count(*)` }) .from(schema.workspaces) .then((res) => res.at(0)?.count ?? 0), retry: 3, }, // Similar queries for apis and keys ], }); };apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/access-table.tsx (3)
Line range hint
37-45
: Consider using a date formatting library for consistent time display.The current implementation uses native Date methods which might not handle all edge cases consistently. Consider using a library like
date-fns
orluxon
for more robust date/time formatting and localization support.-<span className="text-content">{new Date(verification.time).toDateString()}</span> -<span className="text-xs text-content-subtle"> - {new Date(verification.time).toTimeString().split("(").at(0)} -</span> +import { format } from "date-fns"; + +<span className="text-content"> + {format(verification.time, "PPP")} +</span> +<span className="text-xs text-content-subtle"> + {format(verification.time, "p")} +</span>
Line range hint
46-48
: Enhance accessibility for verification outcomes.The Check icon and outcome Badge could benefit from improved accessibility:
- Add aria-label to the Check icon
- Use semantic colors for different outcomes
-{verification.outcome === "VALID" ? <Check /> : <Badge>{verification.outcome}</Badge>} +{verification.outcome === "VALID" ? ( + <Check aria-label="Valid verification" className="text-green-500" /> +) : ( + <Badge variant={verification.outcome === "INVALID" ? "destructive" : "secondary"}> + {verification.outcome} + </Badge> +)}
Line range hint
34-34
: Consider using a more reliable key for list items.The current implementation uses array indices as keys with a biome-ignore comment. While this works, it's not ideal for list stability and performance.
Consider creating a composite key using multiple fields:
-// biome-ignore lint/suspicious/noArrayIndexKey: I got nothing better right now -<TableRow key={i}> +<TableRow key={`${verification.time}-${verification.region}-${verification.outcome}`}>apps/dashboard/app/(app)/banner.tsx (1)
32-37
: Consider adding error handling for the Clickhouse queryThe display logic is correct, but the Clickhouse query could benefit from error handling to ensure a graceful user experience if the billing data fetch fails.
- const billableVerifications = await clickhouse.billing.billableVerifications({ + let billableVerifications = 0; + try { + billableVerifications = await clickhouse.billing.billableVerifications({ + workspaceId: workspace.id, + year, + month, + }); + } catch (error) { + console.error('Failed to fetch billing verifications:', error); + // Optionally show an error banner instead + return ( + <Banner variant="alert"> + <p className="text-xs text-center"> + Unable to fetch usage data. Please try again later or contact support. + </p> + </Banner> + ); + }apps/dashboard/app/(app)/settings/root-keys/[keyId]/layout.tsx (1)
80-87
: Consider enhancing error handling and readability.The implementation is functionally correct but could benefit from some improvements:
- Add error handling for the clickhouse query
- Make the array access more explicit
- Extract the time comparison logic
Consider this refactoring:
const LastUsed: React.FC<{ workspaceId: string; keySpaceId: string; keyId: string }> = async ({ workspaceId, keySpaceId, keyId, }) => { - const lastUsed = await clickhouse.verifications - .latest({ workspaceId, keySpaceId, keyId }) - .then((res) => res.at(0)?.time ?? 0); + try { + const results = await clickhouse.verifications.latest({ workspaceId, keySpaceId, keyId }); + const lastUsedTime = results[0]?.time ?? 0; + const timeAgo = lastUsedTime ? ms(Date.now() - lastUsedTime) : "Never"; + + return <Metric label="Last Used" value={lastUsedTime ? `${timeAgo} ago` : "Never"} />; + } catch (error) { + console.error("Failed to fetch last used time:", error); + return <Metric label="Last Used" value="Error fetching data" />; + } - return ( - <Metric label="Last Used" value={lastUsed ? `${ms(Date.now() - lastUsed)} ago` : "Never"} /> - ); };apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/verification-table.tsx (1)
65-65
: Consider using a more stable unique identifierWhile combining timestamp with array index works, it could lead to unnecessary re-renders if the array order changes. If available, consider using a unique verification ID or a more stable combination of fields.
-key={`${verification.time}-${i}`} +key={verification.id ?? `${verification.time}-${verification.region}-${verification.outcome}`}apps/dashboard/app/(app)/success/page.tsx (1)
121-121
: Consider removing redundant key prop.While adding a key to Suspense is good practice, the Chart component already has a key prop with the same value. Since they're in the same iteration, one key is sufficient.
- <Suspense fallback={<Loading />} key={title}> + <Suspense fallback={<Loading />}>apps/dashboard/app/(app)/identities/[identityId]/page.tsx (1)
162-171
: Add error handling for Clickhouse query.While the implementation is functionally correct, consider adding error handling for the Clickhouse query to gracefully handle potential failures.
Consider updating the implementation like this:
const LastUsed: React.FC<{ workspaceId: string; keySpaceId: string; keyId: string }> = async ( props, ) => { - const lastUsed = await clickhouse.verifications - .latest({ - workspaceId: props.workspaceId, - keySpaceId: props.keySpaceId, - keyId: props.keyId, - }) - .then((res) => res.at(0)?.time ?? null); + const lastUsed = await clickhouse.verifications + .latest({ + workspaceId: props.workspaceId, + keySpaceId: props.keySpaceId, + keyId: props.keyId, + }) + .then((res) => res.at(0)?.time ?? null) + .catch((error) => { + console.error('Failed to fetch last used time:', error); + return null; + });apps/play/app/page-bk.tsx (1)
Line range hint
52-58
: Enhance error handling for ClickHouse integration.Given the migration to ClickHouse, the error handling should be more specific to catch and handle ClickHouse-related errors appropriately. The current generic error handling might not provide enough context for debugging issues with the new implementation.
Consider implementing more specific error handling:
if (response) { const resJson = JSON.parse(JSON.stringify(response, null, 2)); if (resJson.error) { - postNewLine(JSON.stringify(response, null, 2), "text-red-500"); + const errorMessage = resJson.error.clickhouse_error + ? `ClickHouse Error: ${resJson.error.message}` + : JSON.stringify(response, null, 2); + postNewLine(errorMessage, "text-red-500"); return; }apps/api/src/pkg/analytics.ts (2)
175-178
: LGTM! Consider documenting the geo fields.The addition of geographical fields (continent, city, country, colo) is well-implemented with proper null handling and defaults. Consider adding documentation about the expected format and source of these fields.
Line range hint
1-178
: Plan for post-migration cleanup.The file currently maintains both Tinybird and ClickHouse implementations during the migration. Consider creating a follow-up task to:
- Remove all Tinybird-related code once the migration is complete
- Clean up commented-out code and TODO markers
- Update class documentation to reflect the ClickHouse implementation
This will help maintain code clarity and reduce technical debt.
apps/api/src/routes/v1_ratelimit_limit.ts (2)
Line range hint
356-394
: Consider adding proper type definitions for Cloudflare request properties.Instead of using
@ts-expect-error
, consider creating proper type definitions for the Cloudflare request object. This would improve type safety and maintainability.// Create a type definition file (e.g., types/cloudflare.d.ts) interface CloudflareRequestProperties { cf?: { country: string; continent: string; city: string; colo: string; }; } // Extend the base Request type declare global { interface Request extends CloudflareRequestProperties {} }
Line range hint
395-444
: Consider refactoring duplicate audit logging logic.The audit logging code is duplicated between
insertGenericAuditLogs
andingestGenericAuditLogsTinybird
. Consider extracting the common data structure and logic into a shared function.function createAuditLogData(c: Context, namespace: Namespace, rootKey: RootKey, ratelimitResponse: RatelimitResponse) { return { auditLogId: newId("auditLog"), workspaceId: rootKey.authorizedWorkspaceId, bucket: namespace.id, actor: { type: "key", id: rootKey.key.id, }, description: "ratelimit", event: ratelimitResponse.passed ? "ratelimit.success" : "ratelimit.denied", meta: { requestId: c.get("requestId"), namespacId: namespace.id, identifier: req.identifier, success: ratelimitResponse.passed, }, time: Date.now(), resources: req.resources, context: { location: c.req.header("True-Client-IP") ?? "", userAgent: c.req.header("User-Agent") ?? "", }, }; }apps/dashboard/app/(app)/settings/billing/page.tsx (1)
188-192
: Consider consistent error handling for Clickhouse queriesWhile the migration to Clickhouse looks good, consider adding error handling for the Clickhouse query similar to how we handle the ratelimits response.
Consider this pattern:
const [usedVerifications, usedRatelimits] = await Promise.all([ clickhouse.billing.billableVerifications({ workspaceId: workspace.id, year, month, - }), + }).then(res => res ?? 0), ratelimits({ workspaceId: workspace.id, year, month, }).then((res) => res.data.at(0)?.success ?? 0), ]);apps/api/src/pkg/keys/service.ts (1)
Line range hint
677-686
: Consider updating the variable name in_verifyKey
for consistency.The result of
ratelimit()
is stored in a variable namedpass
(line 603:const [pass, ratelimit] = await this.ratelimit(c, data.key, ratelimits);
), while the returned property is now namedpassed
. Consider updating the variable name for consistency.- const [pass, ratelimit] = await this.ratelimit(c, data.key, ratelimits); + const [passed, ratelimit] = await this.ratelimit(c, data.key, ratelimits); - if (!pass) { + if (!passed) {internal/clickhouse/src/client/noop.ts (3)
3-3
: Update Class Documentation to Reflect Interface ChangesSince the class signature has changed to implement
Querier
andInserter
instead ofClickhouse
, consider updating any associated documentation or comments to reflect this change for clarity and maintainability.
Line range hint
5-15
: Enhance Error Handling in thequery
MethodIn the
query
method, when validating parameters withsafeParse
, consider handling the case when validation fails. Currently, the result ofsafeParse
is not checked for success or failure. Providing feedback or throwing an error when validation fails will improve robustness.Apply this diff to handle validation errors:
return async (params: z.input<TIn>): Promise<z.output<TOut>[]> => { - req.params?.safeParse(params); + const validation = req.params?.safeParse(params); + if (validation && !validation.success) { + throw new Error(validation.error.message); + } return []; };
Line range hint
17-34
: Ensure Consistent Error Handling in theinsert
MethodIn the
insert
method, when validation fails, an error is thrown usingv.error.message
. To maintain consistency and provide clearer context, consider prefixing the error message or handling it in a way that aligns with the rest of the codebase's error handling practices.For example:
if (!v.success) { - throw new Error(v.error.message); + throw new Error(`Insert validation error: ${v.error.message}`); }apps/dashboard/app/(app)/ratelimits/card.tsx (1)
Line range hint
49-60
: Update static date and time to dynamic valuesIn the
<time>
element, thedateTime
andtitle
attributes are hardcoded with static values. To accurately reflect thelastUsed
timestamp, these attributes should be dynamically generated based onlastUsed
.Apply this diff to use dynamic time values:
{lastUsed ? ( <> Last request{" "} - <time dateTime="2024-03-11T19:38:06.192Z" title="20:38:06 CET"> + <time dateTime={new Date(lastUsed).toISOString()} title={new Date(lastUsed).toLocaleTimeString()}> {ms(Date.now() - lastUsed)} ago </time> </> ) : ( "Never used" )}internal/clickhouse/src/index.ts (1)
41-45
: Potential redundancy in 'verifications' getter: 'logs' and 'latest' return the same dataIn the
verifications
getter, bothlogs
andlatest
are assigned the result ofgetLatestVerifications(this.client)
. Is it intentional for them to return the same data, or shouldlogs
retrieve different information, such as detailed logs or a broader dataset?internal/clickhouse/src/verifications.ts (1)
28-55
: Refactor to reduce code duplication across functions.The functions
getVerificationsPerHour
,getVerificationsPerDay
, andgetVerificationsPerMonth
have a significant amount of duplicated code. Consider refactoring them into a single generic function that accepts an interval parameter (e.g.,'HOUR'
,'DAY'
,'MONTH'
) to improve maintainability and reduce repetition.Here's an example of how you might refactor the code:
export function getVerificationsPerInterval(ch: Querier, interval: 'HOUR' | 'DAY' | 'MONTH') { return async (args: z.input<typeof params>) => { const tableMap = { HOUR: 'verifications.key_verifications_per_hour_v1', DAY: 'verifications.key_verifications_per_day_v1', MONTH: 'verifications.key_verifications_per_month_v1', }; const startFunctionMap = { HOUR: 'toStartOfHour', DAY: 'toStartOfDay', MONTH: 'toStartOfMonth', }; const intervalStepMap = { HOUR: '1 HOUR', DAY: '1 DAY', MONTH: '1 MONTH', }; const query = ` SELECT time, outcome, count FROM ${tableMap[interval]} WHERE workspace_id = {workspaceId: String} AND key_space_id = {keySpaceId: String} AND time >= {start: Int64} AND time < {end: Int64} ${args.keyId ? 'AND key_id = {keyId: String}' : ''} GROUP BY time ORDER BY time ASC WITH FILL FROM ${startFunctionMap[interval]}(fromUnixTimestamp64Milli({start: Int64})) TO ${startFunctionMap[interval]}(fromUnixTimestamp64Milli({end: Int64})) STEP INTERVAL ${intervalStepMap[interval]} ;`; return ch.query({ query, params, schema })(args); }; }Then, you can export specific functions for each interval:
export const getVerificationsPerHour = (ch: Querier) => getVerificationsPerInterval(ch, 'HOUR'); export const getVerificationsPerDay = (ch: Querier) => getVerificationsPerInterval(ch, 'DAY'); export const getVerificationsPerMonth = (ch: Querier) => getVerificationsPerInterval(ch, 'MONTH');Also applies to: 57-80, 82-109
apps/dashboard/app/(app)/ratelimits/[namespaceId]/logs/page.tsx (1)
191-191
: Consider removing unused columns or displaying relevant dataThe "Remaining" and "IP address" columns display placeholder values
"--"
. If this data is no longer available or relevant, consider removing these columns from the table to simplify the UI and avoid potential confusion for users.Also applies to: 194-194
internal/clickhouse/src/ratelimits.ts (1)
13-44
: Refactor Suggestion: Reduce Code Duplication in Query FunctionsThe functions
getRatelimitsPerMinute
,getRatelimitsPerHour
,getRatelimitsPerDay
, andgetRatelimitsPerMonth
have similar structures and logic. Consider refactoring to create a generic function that accepts parameters for the time interval and corresponding table names to reduce code duplication.Example refactor:
function getRatelimitsPerInterval(ch: Querier, interval: 'minute' | 'hour' | 'day' | 'month') { const tableMap = { minute: { table: 'ratelimits.ratelimits_per_minute_v1', startFunc: 'toStartOfMinute', step: '1 MINUTE', }, // ... similar mappings for hour, day, month }; return async (args: z.infer<typeof params>) => { const { table, startFunc, step } = tableMap[interval]; const query = ch.query({ query: ` SELECT time, sum(passed) as passed, sum(total) as total FROM ${table} WHERE workspace_id = {workspaceId: String} AND namespace_id = {namespaceId: String} ${args.identifier ? 'AND multiSearchAny(identifier, {identifier: Array(String)}) > 0' : ''} AND time >= fromUnixTimestamp64Milli({start: Int64}) AND time <= fromUnixTimestamp64Milli({end: Int64}) GROUP BY time ORDER BY time ASC WITH FILL FROM ${startFunc}(fromUnixTimestamp64Milli({start: Int64})) TO ${startFunc}(fromUnixTimestamp64Milli({end: Int64})) STEP INTERVAL ${step} ;`, params: args, schema: z.object({ time: dateTimeToUnix, passed: z.number(), total: z.number(), }), }); return query(args); }; }This refactor simplifies the codebase and makes it easier to maintain.
Also applies to: 46-77, 78-109, 110-141
apps/dashboard/app/(app)/ratelimits/[namespaceId]/page.tsx (2)
65-82
: Handle Potential Errors in PromisesWhile using
Promise.all
, if any of the promises reject, the entire operation will reject. Consider adding error handling to manage potential rejections gracefully and provide fallback logic if needed.You might wrap each promise with a
catch
block or usePromise.allSettled
to handle each promise individually.
100-109
: Optimize Data TransformationUsing
flatMap
to constructdataOverTime
is efficient. However, consider precomputingx
outside the array to avoid creating multipleDate
objects for the same timestamp.Apply this diff to optimize:
const dataOverTime = ratelimitEvents.flatMap((d) => { const x = new Date(d.time).toISOString(); return [ { x, y: d.total - d.passed, category: "Ratelimited", }, { x, y: d.passed, category: "Passed", }, ]; });apps/dashboard/app/(app)/apis/[apiId]/page.tsx (3)
84-84
: Avoid mutating theverifications
array directly withsort
Using
verifications.sort()
mutates the original array, which might lead to unintended side effects ifverifications
is used elsewhere. Consider creating a copy of the array before sorting to prevent side effects.Apply this diff to avoid mutating the original array:
- for (const d of verifications.sort((a, b) => a.time - b.time)) { + for (const d of [...verifications].sort((a, b) => a.time - b.time)) {
167-193
: Refactor repetitive<Metric>
components into a dynamic loopMultiple
<Metric>
components are defined with similar structures for different verification outcomes. Refactoring this into a loop that maps over an array of outcome data reduces code duplication and enhances maintainability.Here's an example of how you might refactor the code:
- <Metric - label="Valid" - value={formatNumber(successOverTime.reduce((sum, day) => sum + day.y, 0))} - /> - <Metric - label="Ratelimited" - value={formatNumber(ratelimitedOverTime.reduce((sum, day) => sum + day.y, 0))} - /> - // Repeat for other outcomes + const metrics = [ + { label: "Valid", data: successOverTime }, + { label: "Ratelimited", data: ratelimitedOverTime }, + { label: "Usage Exceeded", data: usageExceededOverTime }, + { label: "Disabled", data: disabledOverTime }, + { label: "Insufficient Permissions", data: insufficientPermissionsOverTime }, + { label: "Expired", data: expiredOverTime }, + { label: "Forbidden", data: forbiddenOverTime }, + ]; + + {metrics.map((metric) => ( + <Metric + key={metric.label} + label={metric.label} + value={formatNumber(metric.data.reduce((sum, day) => sum + day.y, 0))} + /> + ))}
Line range hint
279-319
: RefactorprepareInterval
function to eliminate repetitive codeThe
prepareInterval
function contains repetitive code for each case in the switch statement. By parameterizing the differing values, you can reduce duplication and make the code cleaner.Here's how you might refactor the function:
function prepareInterval(interval: Interval) { const now = new Date(); + const intervals = { + "24h": { + adjustTime: () => now.setUTCHours(now.getUTCHours() + 1, 0, 0, 0), + intervalMs: 1000 * 60 * 60 * 24, + granularity: 1000 * 60 * 60, + getVerificationsPerInterval: clickhouse.verifications.perHour, + getActiveKeysPerInterval: clickhouse.activeKeys.perHour, + }, + "7d": { + adjustTime: () => { + now.setUTCDate(now.getUTCDate() + 1); + return now.setUTCHours(0, 0, 0, 0); + }, + intervalMs: 1000 * 60 * 60 * 24 * 7, + granularity: 1000 * 60 * 60 * 24, + getVerificationsPerInterval: clickhouse.verifications.perDay, + getActiveKeysPerInterval: clickhouse.activeKeys.perDay, + }, + // Add other intervals similarly + }; + + const config = intervals[interval]; + const end = config.adjustTime(); + return { + start: end - config.intervalMs, + end, + intervalMs: config.intervalMs, + granularity: config.granularity, + getVerificationsPerInterval: config.getVerificationsPerInterval, + getActiveKeysPerInterval: config.getActiveKeysPerInterval, + }; - - switch (interval) { - case "24h": { - const end = now.setUTCHours(now.getUTCHours() + 1, 0, 0, 0); - const intervalMs = 1000 * 60 * 60 * 24; - return { - start: end - intervalMs, - end, - intervalMs, - granularity: 1000 * 60 * 60, - getVerificationsPerInterval: clickhouse.verifications.perHour, - getActiveKeysPerInterval: clickhouse.activeKeys.perHour, - }; - } - // other cases - } }apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/page.tsx (2)
Line range hint
252-258
: Improve handling of expired keys to avoid negative time display.When a key has expired,
key.expires.getTime() - Date.now()
results in a negative value, which might be confusing when displayed to the user. Consider showing a static label like "Expired" or displaying the expiration date instead of a negative duration.Apply this diff to adjust the display:
-<Metric - label={key.expires && key.expires.getTime() < Date.now() ? "Expired" : "Expires in"} - value={key.expires ? ms(key.expires.getTime() - Date.now()) : <Minus />} -/> +<Metric + label={key.expires && key.expires.getTime() < Date.now() ? "Expired" : "Expires in"} + value={ + key.expires + ? key.expires.getTime() < Date.now() + ? <Minus /> + : ms(key.expires.getTime() - Date.now()) + : <Minus /> + } +/>
Line range hint
368-407
: Add default case to handle unexpected intervals inprepareInterval
.The
switch
statement oninterval
lacks a default case. Without a default, passing an unsupported interval could lead to undefined behavior. Adding a default case ensures that any unexpected values are properly handled.Apply this diff to add a default case:
} + default: { + throw new Error(`Unsupported interval: ${interval}`); + } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml
is excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (82)
- Taskfile.yml (1 hunks)
- apps/agent/pkg/clickhouse/schema/005_create_mv_key_verifications_per_day.sql (0 hunks)
- apps/api/package.json (1 hunks)
- apps/api/src/pkg/analytics.ts (7 hunks)
- apps/api/src/pkg/keys/service.ts (1 hunks)
- apps/api/src/pkg/middleware/metrics.ts (1 hunks)
- apps/api/src/pkg/ratelimit/client.ts (8 hunks)
- apps/api/src/pkg/ratelimit/interface.ts (1 hunks)
- apps/api/src/pkg/ratelimit/noop.ts (1 hunks)
- apps/api/src/routes/v1_ratelimit_limit.ts (5 hunks)
- apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/page.tsx (11 hunks)
- apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/verification-table.tsx (3 hunks)
- apps/dashboard/app/(app)/apis/[apiId]/page.tsx (11 hunks)
- apps/dashboard/app/(app)/banner.tsx (2 hunks)
- apps/dashboard/app/(app)/identities/[identityId]/page.tsx (3 hunks)
- apps/dashboard/app/(app)/logs/page.tsx (2 hunks)
- apps/dashboard/app/(app)/ratelimits/[namespaceId]/logs/page.tsx (4 hunks)
- apps/dashboard/app/(app)/ratelimits/[namespaceId]/overrides/table.tsx (3 hunks)
- apps/dashboard/app/(app)/ratelimits/[namespaceId]/page.tsx (12 hunks)
- apps/dashboard/app/(app)/ratelimits/card.tsx (2 hunks)
- apps/dashboard/app/(app)/ratelimits/sparkline.tsx (2 hunks)
- apps/dashboard/app/(app)/settings/billing/page.tsx (3 hunks)
- apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/access-table.tsx (1 hunks)
- apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/page.tsx (3 hunks)
- apps/dashboard/app/(app)/settings/root-keys/[keyId]/layout.tsx (3 hunks)
- apps/dashboard/app/(app)/settings/root-keys/[keyId]/page.tsx (4 hunks)
- apps/dashboard/app/(app)/success/page.tsx (3 hunks)
- apps/dashboard/components/array-input.tsx (2 hunks)
- apps/dashboard/lib/analytics.tsx (0 hunks)
- apps/dashboard/lib/charts/sparkline.tsx (3 hunks)
- apps/dashboard/lib/clickhouse.ts (1 hunks)
- apps/dashboard/lib/clickhouse/index.ts (0 hunks)
- apps/dashboard/lib/tinybird.ts (1 hunks)
- apps/dashboard/middleware.ts (0 hunks)
- apps/dashboard/package.json (1 hunks)
- apps/play/app/page-bk.tsx (1 hunks)
- apps/www/.env.example (1 hunks)
- apps/www/components/stats.tsx (1 hunks)
- apps/www/lib/env.ts (0 hunks)
- apps/www/lib/tinybird.ts (0 hunks)
- deployment/grafana/grafana.yaml (1 hunks)
- internal/clickhouse/package.json (1 hunks)
- internal/clickhouse/schema/000_README.md (2 hunks)
- internal/clickhouse/schema/001_create_databases.sql (1 hunks)
- internal/clickhouse/schema/002_create_metrics_raw_api_requests_v1.sql (2 hunks)
- internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql (2 hunks)
- internal/clickhouse/schema/004_create_verifications_key_verifications_per_hour_v1.sql (1 hunks)
- internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql (1 hunks)
- internal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql (1 hunks)
- internal/clickhouse/schema/007_create_verifications_key_verifications_per_hour_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/008_create_verifications_key_verifications_per_day_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/009_create_verifications_key_verifications_per_month_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql (1 hunks)
- internal/clickhouse/schema/011_create_telemetry_raw_sdks_v1.sql (2 hunks)
- internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql (1 hunks)
- internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql (1 hunks)
- internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql (1 hunks)
- internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql (1 hunks)
- internal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql (1 hunks)
- internal/clickhouse/schema/018_create_ratelimits_ratelimits_per_minute_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/019_create_ratelimits_ratelimits_per_hour_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/020_create_ratelimits_ratelimits_per_day_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/021_create_ratelimits_ratelimits_per_month_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql (1 hunks)
- internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql (1 hunks)
- internal/clickhouse/schema/024_create_ratelimits_last_used_mv_v1.sql (1 hunks)
- internal/clickhouse/src/active_keys.ts (1 hunks)
- internal/clickhouse/src/billing.ts (1 hunks)
- internal/clickhouse/src/client/client.ts (2 hunks)
- internal/clickhouse/src/client/interface.ts (2 hunks)
- internal/clickhouse/src/client/noop.ts (1 hunks)
- internal/clickhouse/src/index.ts (1 hunks)
- internal/clickhouse/src/last_used.ts (1 hunks)
- internal/clickhouse/src/latest_verifications.ts (1 hunks)
- internal/clickhouse/src/logs.ts (1 hunks)
- internal/clickhouse/src/ratelimits.ts (1 hunks)
- internal/clickhouse/src/success.ts (1 hunks)
- internal/clickhouse/src/util.ts (1 hunks)
- internal/clickhouse/src/verifications.ts (1 hunks)
- internal/db/src/schema/workspaces.ts (1 hunks)
- internal/db/src/types.ts (1 hunks)
⛔ Files not processed due to max files limit (3)
- tools/migrate/auditlog-import.ts
- tools/migrate/package.json
- tools/migrate/tinybird-export.ts
💤 Files with no reviewable changes (6)
- apps/agent/pkg/clickhouse/schema/005_create_mv_key_verifications_per_day.sql
- apps/dashboard/lib/analytics.tsx
- apps/dashboard/lib/clickhouse/index.ts
- apps/dashboard/middleware.ts
- apps/www/lib/env.ts
- apps/www/lib/tinybird.ts
🔇 Additional comments (143)
apps/dashboard/lib/clickhouse.ts (2)
1-2
: LGTM! Clean and focused imports.
The imports are well-organized and properly scoped for the file's purpose.
1-4
: Verify ClickHouse schema migrations.
Since this is part of a migration from Tinybird to ClickHouse, it's important to ensure all required schemas are properly migrated.
Let's check for migration files:
✅ Verification successful
Migration files are properly structured with comprehensive schema definitions
The codebase shows a well-organized set of ClickHouse migration files with:
- Raw tables for core data: api requests, key verifications, rate limits, and SDK telemetry
- Aggregation tables for different time intervals (minute/hour/day/month)
- Materialized views for efficient querying
- Business-specific tables for billing and workspace analytics
The migration appears complete with proper schema progression from raw data to aggregated views.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify ClickHouse migration files exist and contain necessary table definitions
# Look for migration files
fd -e sql . | grep -i "clickhouse"
# Check for specific table definitions
rg -i "create table|create materialized view"
Length of output: 5474
internal/clickhouse/src/util.ts (2)
1-2
: LGTM! Clear import and helpful comment.
The import of Zod and the explanatory comment effectively communicate the purpose of this utility file.
3-3
: Consider adding error handling and type safety improvements.
While the implementation is concise, it could benefit from additional robustness:
- Add error handling for invalid date strings
- Consider timezone handling
- Add explicit return type annotation
Here's a suggested improvement:
-export const dateTimeToUnix = z.string().transform((t) => new Date(t).getTime());
+export const dateTimeToUnix = z.string()
+ .refine((t) => !isNaN(new Date(t).getTime()), {
+ message: "Invalid DateTime string",
+ })
+ .transform((t): number => {
+ // Assuming ClickHouse returns UTC timestamps
+ return new Date(t).getTime();
+ });
Let's verify how this utility is being used across the codebase:
internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql (1)
12-13
: LGTM! Clean down migration.
The down migration correctly drops the table, providing a clean rollback mechanism.
internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql (3)
9-10
: Good choice of SummingMergeTree engine for billing data.
The SummingMergeTree engine is an excellent choice here as it:
- Automatically aggregates count values during merges
- Provides efficient querying by workspace_id and time periods
- Supports high-performance inserts needed for billing data
The ordering by (workspace_id, year, month) is optimal for typical billing queries.
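For example, the kind of per-workspace, per-month lookup this layout is built for (a sketch; the parameter placeholders are illustrative):
SELECT sum(count) AS billable_verifications   -- sum() at read time; SummingMergeTree collapses duplicates only during merges
FROM billing.billable_verifications_per_month_v1
WHERE workspace_id = {workspaceId: String}
  AND year  = {year: Int64}
  AND month = {month: Int64}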
14-15
: LGTM: Clean rollback provided.
The down migration correctly drops the table, providing a clean rollback path.
1-15
: Verify integration with billing system.
Since this is a billing-related table, we should verify:
- Integration with existing billing queries
- Impact on billing materialized views
✅ Verification successful
Let me check the materialized view and billing query implementation to fully verify the integration.
Integration with billing system is properly implemented ✅
The table is correctly integrated with:
- Materialized view
billing.billable_verifications_per_month_mv_v1
that populates it fromverifications.key_verifications_per_month_v1
- Billing query function
getBillableVerifications
that uses the table with proper type safety and parameter validation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for related billing views and dependencies
echo "Checking for related billing materialized views..."
rg -l "billing\.billable_verifications_per_month" --type sql
echo "Checking for billing-related queries using this table..."
rg "FROM\s+billing\.billable_verifications_per_month" --type ts
Length of output: 625
Script:
#!/bin/bash
# Check the materialized view definition
echo "=== Materialized View Definition ==="
cat internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql
echo -e "\n=== Billing Query Implementation ==="
cat internal/clickhouse/src/billing.ts
Length of output: 1705
internal/clickhouse/schema/001_create_databases.sql (1)
1-17
: Verify migration sequence and database completeness.
As this appears to be the first migration (001), please verify:
- That this migration runs before any table creation migrations
- That all necessary databases for the Tinybird to ClickHouse migration are included
✅ Verification successful
Migration sequence and database completeness verified successfully
The verification confirms:
- Migration 001 correctly runs first and creates all databases before any table creation migrations
- All necessary databases are included, as evidenced by subsequent migrations:
verifications
: Used in migrations 003-009telemetry
: Used in migration 011metrics
: Used in migration 002ratelimits
: Used in migrations 010, 014-021, 024business
: Used in migrations 022-023billing
: Used in migrations 012-013
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for other migration files and their dependencies
echo "Checking migration files order and dependencies..."
# List all migration files in order
fd -e sql . internal/clickhouse/schema | sort
# Check for any table creation statements that might need these databases
echo -e "\nChecking for dependent table creation statements..."
rg -A 2 "CREATE TABLE" internal/clickhouse/schema
Length of output: 6511
internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql (3)
12-13
: LGTM! Appropriate engine and sorting key configuration.
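As an illustration of the typical access pattern this sort key supports (a minimal sketch; the placeholders and time window are assumptions):
SELECT time,
       sum(passed) AS passed,
       sum(total)  AS total
FROM ratelimits.ratelimits_per_hour_v1
WHERE workspace_id = {workspaceId: String}
  AND namespace_id = {namespaceId: String}
  AND time >= {start: DateTime}
  AND time <  {end: DateTime}
GROUP BY time
ORDER BY time ASC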
The SummingMergeTree engine is well-suited for aggregating rate limit metrics, and the sorting key order is optimal for typical query patterns.
17-18
: LGTM! Clean and reversible down migration.
The down migration properly cleans up by dropping the table.
1-18
: Verify column naming consistency across the codebase.
Let's ensure the column names, especially the change from 'pass' to 'passed', are consistent across all related files.
✅ Verification successful
Let me analyze the code further to verify the column naming consistency.
Based on the analysis of the code, I can now generate the final response:
Column naming is consistent across the codebase ✅
The column name passed
is consistently used throughout the codebase:
- In the SQL schema for raw table:
passed Bool
inraw_ratelimits_v1
- In materialized views:
countIf(passed > 0) as passed
- In aggregation tables:
passed Int64
- In analytics code:
passed: z.boolean()
in the schema definition
The column naming follows a consistent pattern and there are no instances of the old 'pass' property being used in the rate limiting context. The occurrences of 'pass' found in the codebase are unrelated to this rate limiting functionality.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining instances of the old 'pass' property name
# and verify the usage of the new 'passed' column name
echo "Checking for old 'pass' property usage..."
rg -g '!*.sql' -i '\bpass\b'
echo "Verifying 'passed' column usage..."
rg -g '!*.sql' -i '\bpassed\b'
Length of output: 27747
Script:
#!/bin/bash
# Let's check the analytics code that was mentioned in the AI summary
# and any other relevant SQL files for column naming
echo "Checking analytics.ts files..."
fd -e ts "analytics" --exec cat {}
echo -e "\nChecking all SQL files for ratelimit tables..."
fd -e sql "ratelimit" --exec cat {}
Length of output: 13545
internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql (3)
12-13
: Good choice of engine and ordering columns.
The use of SummingMergeTree is appropriate for aggregating rate limit data, and the ordering by (workspace_id, namespace_id, time, identifier) will optimize queries filtering by workspace and namespace within time ranges.
19-19
: LGTM: Clean down migration.
The down migration correctly drops the table, providing a clean rollback mechanism.
1-19
: Verify consistency with related rate limit tables.
Let's ensure this table structure aligns with other rate limit tables in the schema.
✅ Verification successful
Table structure is consistent across all rate limit tables ✓
The verification confirms perfect alignment across all rate limit tables (minute
, hour
, day
, month
). They share identical:
- Column definitions
- Data types
- Engine type (SummingMergeTree)
- Order by clause structure
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for consistency across rate limit table definitions
# Expected: Similar structure across minute/hour/day tables
# Find and display all rate limit table definitions
fd -e sql -x cat {} \; | rg -A 15 "CREATE TABLE ratelimits\.ratelimits_per_"
Length of output: 1242
internal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql (3)
2-11
: LGTM: Table schema looks well-designed.
The schema includes all necessary columns for rate limit tracking with appropriate data types. The combination of workspace_id, namespace_id, and identifier provides good granularity for rate limiting.
18-19
: LGTM: Clean down migration.
The down migration correctly drops the table.
12-13
: Verify the ordering strategy across rate limit tables.
Let's ensure the ordering strategy is consistent with other rate limit tables (per_minute, per_hour, per_day) for optimal query performance.
✅ Verification successful
Ordering strategy is consistent across all rate limit tables
The verification shows that all rate limit tables (per_minute, per_hour, per_day, per_month, and raw_ratelimits) use identical ordering: ORDER BY (workspace_id, namespace_id, time, identifier)
. This consistent ordering strategy is optimal for query performance across all time granularities.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check ordering consistency across rate limit tables
# Expected: All rate limit tables should have similar ordering strategies
# Search for other rate limit table definitions
rg -l "CREATE TABLE.*ratelimits.*" internal/clickhouse/schema/
rg "ORDER BY.*workspace_id.*namespace_id.*time" internal/clickhouse/schema/
Length of output: 1182
internal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql (2)
1-1
: LGTM: Migration format is correct.
The migration follows the goose format with clear up/down migrations.
Also applies to: 18-19
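For readers who have not used goose before, the annotation pattern these files follow is simply paired up/down sections; the table below is a made-up placeholder, not part of this PR:
-- +goose up
CREATE TABLE ratelimits.example_v1 (
    time DateTime,
    count Int64
)
ENGINE = MergeTree()
ORDER BY (time);

-- +goose down
DROP TABLE ratelimits.example_v1;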
12-14
: Verify the ordering strategy for query patterns.
The SummingMergeTree engine is a good choice for aggregating rate limit data. The ordering by (workspace_id, namespace_id, time, identifier) seems appropriate for typical queries, but let's verify the common query patterns.
✅ Verification successful
The ordering strategy is correctly optimized for the query patterns
Based on the evidence found in the codebase:
-
The main query pattern in
internal/clickhouse/src/ratelimits.ts
filters by:- workspace_id (first)
- namespace_id (second)
- time range
- identifier (optional)
-
The materialized view
ratelimits_per_minute_mv_v1
groups by the same columns in a matching order:- workspace_id
- namespace_id
- identifier
- time
The current ORDER BY (workspace_id, namespace_id, time, identifier) perfectly aligns with these access patterns, enabling efficient filtering and aggregation operations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check query patterns in the codebase to ensure the ordering aligns with typical usage
# Look for ClickHouse queries related to ratelimits_per_minute_v1
rg -l "ratelimits_per_minute_v1.*SELECT" --type ts --type go
# Look for WHERE clauses to verify column usage patterns
rg "WHERE.*workspace_id.*namespace_id.*time" --type ts --type go
Length of output: 129
Script:
#!/bin/bash
# Let's try a broader search to find any references to this table and analyze the query patterns
# Search for any mentions of the table name
rg -A 5 "ratelimits_per_minute_v1" --type sql --type go --type ts
# Look for rate limit related queries in general
rg -A 5 "rate.?limit.*SELECT|SELECT.*rate.?limit" --type sql --type go --type ts
# Check for any SQL files that might contain query patterns
fd -e sql -x cat {} \;
Length of output: 12695
internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql (1)
17-19
: Clean down migration provided.
The down migration correctly drops the table from the verifications schema.
internal/clickhouse/schema/011_create_telemetry_raw_sdks_v1.sql (2)
24-24
: Verify the safety of the DROP TABLE statement.
The down migration correctly drops the table, but ensure there's a data backup strategy in place before running this in production.
Consider adding a safety check before dropping:
-- +goose down
-DROP TABLE telemetry.raw_sdks_v1;
+DROP TABLE IF EXISTS telemetry.raw_sdks_v1;
Line range hint 2-24
: Verify the impact on existing queries.
Since this is part of a migration from Tinybird to Clickhouse, we should verify that all queries have been updated to use the new table name.
✅ Verification successful
Let me check the analytics code to understand how this table is being used.
No issues found with table usage in queries
The code shows a clean migration path from Tinybird to Clickhouse. The telemetry.raw_sdks_v1
table is only used in the new Clickhouse implementation via insertSdkTelemetry
, while the old Tinybird implementation (ingestSdkTelemetry
) is marked with a comment "tinybird, to be removed". The schema matches the SQL definition and there are no other references to this table in the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any SQL queries that might need updating
rg -t sql "raw_sdks" --glob '!**/schema/**'
# Look for potential hardcoded table references in code
rg "raw_sdks" --type-not sql --glob '!**/schema/**'
Length of output: 312
Script:
#!/bin/bash
# Check the analytics code implementation
cat apps/api/src/pkg/analytics.ts
Length of output: 7926
internal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql (2)
16-17
: LGTM: Clean rollback strategy.
The down migration correctly drops the table, providing a clean rollback path.
12-13
: Verify the ordering optimization for common query patterns.
The current ORDER BY clause suggests queries will commonly filter by workspace_id and key_space_id first. Let's verify if this matches the intended query patterns.
✅ Verification successful
Based on the query patterns shown in the results, I can now generate the final response:
The ORDER BY clause is well-optimized for the observed query patterns
The ordering (workspace_id, key_space_id, time, identity_id, key_id) aligns perfectly with the actual query patterns found in the codebase:
- Both query implementations in
verifications.ts
and active_keys.ts
filter by workspace_id and key_space_id first - Time ranges are consistently used as the next filter condition
- The optional key_id filter matches its position at the end of the ORDER BY clause
This ordering will enable efficient range scans and skip index utilization for the common query patterns.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check query patterns in the codebase to verify if the ordering aligns with common usage
# Look for SQL queries using this table
rg -A 5 "key_verifications_per_month_v1" --type ts
# Look for related ClickHouse query builders
ast-grep --pattern 'function get$_VerificationsPerMonth($_) {
$$$
}'
Length of output: 1080
internal/clickhouse/schema/004_create_verifications_key_verifications_per_hour_v1.sql (1)
17-18
: LGTM! Clean rollback command.
The down migration correctly drops the table for clean rollback.
internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql (1)
2-16
: Verify target table existence and consider performance optimizations.
The materialized view looks good overall, but there are a few important considerations:
- Ensure that the target table
billing.billable_verifications_per_month_v1
exists before creating this view - Consider adding indexes on
workspace_id
and(year, month)
in the target table to optimize billing queries
✅ Verification successful
Target table exists with appropriate schema and optimizations
The materialized view implementation is correct and well-designed:
- Target table
billing.billable_verifications_per_month_v1
is properly created in migration 012 - The table uses
SummingMergeTree
engine which is optimal for aggregation - The
ORDER BY (workspace_id, year, month)
clause serves as an index and matches the materialized view's grouping, ensuring good query performance
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the target table exists in other migration files
rg -l "CREATE TABLE.*billing\.billable_verifications_per_month_v1" internal/clickhouse/schema/
Length of output: 182
Script:
#!/bin/bash
# Let's check the content of the target table creation migration
cat internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql
Length of output: 393
internal/clickhouse/schema/019_create_ratelimits_ratelimits_per_hour_mv_v1.sql (3)
5-11
: LGTM! Well-structured aggregations with proper timestamp handling.
The implementation:
- Correctly transforms millisecond timestamps to hour buckets using
toStartOfHour(fromUnixTimestamp64Milli())
- Appropriately calculates success rate with
countIf(passed > 0)
vscount(*)
21-22
: LGTM! Clean down migration.
The down migration correctly drops the materialized view.
1-4
: Verify target table existence.
The materialized view writes to ratelimits.ratelimits_per_hour_v1
. Ensure this target table exists and is created in a previous migration.
✅ Verification successful
Let me verify the content of the target table creation to ensure proper ordering and compatibility.
Target table exists and is properly configured
The target table ratelimits.ratelimits_per_hour_v1
is created in migration 015, which correctly precedes the materialized view creation in migration 019. The table has appropriate columns and uses SummingMergeTree engine with correct ordering, making it suitable as a target for aggregation.
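For context, the pattern under review is a materialized view that writes its aggregates into that pre-created table; a rough reconstruction based on the aggregations quoted in this review (not copied from migration 019):

```sql
-- Reconstruction of the hourly rollup pattern: the view must be created after its target table exists
CREATE MATERIALIZED VIEW ratelimits.ratelimits_per_hour_mv_v1
TO ratelimits.ratelimits_per_hour_v1
AS
SELECT
    workspace_id,
    namespace_id,
    identifier,
    countIf(passed > 0) AS passed,
    count(*)            AS total,
    toStartOfHour(fromUnixTimestamp64Milli(time)) AS time
FROM ratelimits.raw_ratelimits_v1
GROUP BY workspace_id, namespace_id, identifier, time;
```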
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for the creation of the target table
rg -l "CREATE TABLE.*ratelimits\.ratelimits_per_hour_v1" internal/clickhouse/schema/
Length of output: 162
Script:
#!/bin/bash
# Check the content of the target table creation
cat internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql
Length of output: 438
internal/clickhouse/schema/020_create_ratelimits_ratelimits_per_day_mv_v1.sql (3)
5-11
: LGTM: Well-structured aggregation logic.
The aggregation logic is well-designed:
- Maintains dimensional data (workspace_id, namespace_id, identifier)
- Computes meaningful metrics (total attempts and successful passes)
- Properly handles timestamp conversion to daily granularity
1-23
: Verify data retention policy.
Since this is a materialized view that aggregates data, we should ensure there's a TTL (Time To Live) policy in place for data retention, either on this view or the target table.
#!/bin/bash
# Search for TTL settings in the schema files
rg "TTL|SETTINGS" internal/clickhouse/schema/
Consider adding a TTL policy if one doesn't exist, to manage data retention and storage costs effectively.
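If retention does need to be bounded, a minimal sketch of what that could look like on one of the aggregate tables (the table choice and the 90-day window are placeholders, not recommendations):

```sql
-- Example only: expire rows 90 days after their bucket timestamp
ALTER TABLE ratelimits.ratelimits_per_day_v1
    MODIFY TTL time + INTERVAL 90 DAY;
```

A TTL can also be declared directly in the CREATE TABLE statement; either way it applies to tables, not views, so the raw table and each rollup target would need their own policy.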
1-4
: Verify the target table existence.
The materialized view is being created with a target table ratelimits.ratelimits_per_day_v1
, but we need to ensure this table exists before creating the view.
✅ Verification successful
Target table exists and is properly defined
The target table ratelimits.ratelimits_per_day_v1
is correctly created in migration 016
before the materialized view creation in migration 020
. The table has the appropriate schema with DateTime, workspace_id, namespace_id, identifier fields, and the necessary metrics (passed, total) using a SummingMergeTree engine, which is suitable for aggregation purposes.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for the creation of the target table
rg -l "CREATE TABLE.*ratelimits\.ratelimits_per_day_v1" internal/clickhouse/schema/
Length of output: 141
Script:
#!/bin/bash
# Let's try a broader search for the target table name without regex
rg -l "ratelimits_per_day_v1" internal/clickhouse/schema/
Length of output: 210
Script:
#!/bin/bash
# Also search for any table creation in the schema directory
fd . internal/clickhouse/schema/ -t f -x cat {} \; | grep -A 5 "CREATE TABLE"
Length of output: 2109
Script:
#!/bin/bash
# List all migration files to understand the order
ls -1 internal/clickhouse/schema/
Length of output: 1302
Script:
#!/bin/bash
# Let's check the content of the target table creation file
cat internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql
Length of output: 436
internal/clickhouse/schema/021_create_ratelimits_ratelimits_per_month_mv_v1.sql (4)
1-3
: LGTM: Clear naming and structure.
The materialized view follows consistent naming conventions with the _v1
version suffix and clear schema organization.
21-22
: LGTM: Clean down migration.
The down migration properly removes the view using the correct syntax.
1-22
: Verify data consistency during migration.
Since this materialized view aggregates rate limit data, ensure there's no data loss during the transition period.
#!/bin/bash
# Check for related tables and views that might be affected
echo "Checking for related rate limit tables and views..."
rg -g '*.sql' 'CREATE.*ratelimits\.' -A 5
# Check for dependencies on the raw table
echo "Checking for other views depending on raw_ratelimits_v1..."
rg -g '*.sql' 'FROM.*raw_ratelimits_v1' -A 5
9-11
: Consider adding error handling for time conversion.
The time conversion using fromUnixTimestamp64Milli
could potentially fail with invalid timestamps, so consider validating the input before conversion.
- countIf(passed > 0) as passed,
- count(*) as total,
- toStartOfMonth(fromUnixTimestamp64Milli(time)) AS time
+ countIf(passed > 0) as passed,
+ count(*) as total,
+ toStartOfMonth(assumeNotNull(fromUnixTimestamp64Milli(time))) AS time
internal/clickhouse/schema/018_create_ratelimits_ratelimits_per_minute_mv_v1.sql (4)
1-4
: LGTM: View creation follows naming conventions.
The materialized view name includes version suffix _v1
and follows the pattern <schema>.<entity>_<aggregation>_mv_v1
, which is consistent with versioning best practices.
22-23
: LGTM: Clean down migration.
The down migration correctly drops the view. The use of DROP VIEW is appropriate for cleaning up the materialized view.
9-10
: Verify the passed field logic.
The aggregation countIf(passed > 0)
assumes the passed
field is numeric. This could potentially miss edge cases if passed
is nullable or has special values.
Let's verify the schema of the passed field:
#!/bin/bash
# Check the schema definition of the passed field
rg -A 5 'CREATE TABLE.*raw_ratelimits_v1.*passed'
5-11
: Consider potential precision loss in time aggregation.
The time conversion and aggregation logic looks correct, but be aware that:
toStartOfMinute
rounds down to minute precision, which means you'll lose sub-minute granularity
- fromUnixTimestamp64Milli
expects millisecond precision input
Let's verify the time precision in the source table:
Consider documenting these precision details in a comment, especially if sub-minute granularity might be needed in the future.
✅ Verification successful
Time precision handling is correctly implemented
The source table schema confirms that the time
column is explicitly documented as "unix milli" (milliseconds) and stored as Int64
, which perfectly matches with the fromUnixTimestamp64Milli
function used in the materialized view. The intentional rounding to minute precision using toStartOfMinute
is appropriate for rate limiting aggregations per minute.
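A quick way to observe the behaviour described above from any client session (the literal is an arbitrary unix-millisecond timestamp):

```sql
-- toInt64() is only needed because the value here is a bare literal rather than an Int64 column
SELECT
    fromUnixTimestamp64Milli(toInt64(1717551032123))                  AS ms_precision,
    toStartOfMinute(fromUnixTimestamp64Milli(toInt64(1717551032123))) AS minute_bucket;
```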
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check the schema of raw_ratelimits_v1 to confirm time column precision
ast-grep --pattern 'CREATE TABLE ratelimits.raw_ratelimits_v1'
Length of output: 145
Script:
#!/bin/bash
# Let's try to find the raw_ratelimits_v1 table definition using ripgrep
rg -A 10 "CREATE TABLE.*raw_ratelimits_v1"
# Also search for any files that might contain the table definition
fd "raw_ratelimits.*\.sql"
Length of output: 1198
internal/clickhouse/schema/008_create_verifications_key_verifications_per_day_mv_v1.sql (4)
1-4
: LGTM: View creation follows naming conventions.
The materialized view creation follows good practices with:
- Clear versioning (v1) in the name
- Consistent schema naming (verifications)
- Proper separation between source and target tables
24-25
: LGTM: Clean down migration.
The down migration correctly drops the view. Since this is a materialized view, the target table data will be preserved, which is the expected behavior.
13-20
: Verify the existence of the source table.
The materialized view depends on verifications.raw_key_verifications_v1
. Let's ensure this table exists and has the expected schema.
#!/bin/bash
# Search for the source table definition
rg "CREATE TABLE.*raw_key_verifications_v1"
# Verify the required columns exist in the source table
ast-grep --pattern 'CREATE TABLE $_ (
$$$
workspace_id $_,
$$$
key_space_id $_,
$$$
identity_id $_,
$$$
key_id $_,
$$$
outcome $_,
$$$
time $_,
$$$
)'
5-12
: Verification script for the target table:
#!/bin/bash
# Let's check the content of the table creation file
cat internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql
internal/clickhouse/schema/007_create_verifications_key_verifications_per_hour_mv_v1.sql (3)
1-4
: LGTM: View creation follows best practices.
The materialized view creation follows proper naming conventions with versioning (_v1) and clear purpose indication (key_verifications_per_hour).
1-25
: Verify data consistency with existing views.
Since this is part of a migration from Tinybird (as mentioned in PR objectives), ensure that the aggregation logic matches any existing views or queries that might depend on this data.
#!/bin/bash
# Description: Look for related verification views or tables that might need to maintain consistency
# Search for other verification-related views
rg "CREATE.*VIEW.*verification" --type sql
# Search for queries that might be using verification data
rg "FROM.*verification" --type ts
13-20
: Verify index availability for grouped columns.
The GROUP BY clause includes multiple columns that will be frequently queried. Ensure proper indexes exist on the source table for these columns to optimize the materialized view's performance.
internal/clickhouse/schema/009_create_verifications_key_verifications_per_month_mv_v1.sql (3)
1-4
: LGTM! Clear and well-structured materialized view definition.
The naming convention follows best practices with clear versioning (_v1
), and the materialized view is properly created with a dedicated target table.
24-25
: LGTM! Clean down migration.
The down migration correctly drops the view. Note that this will also automatically drop the target table verifications.key_verifications_per_month_v1
.
5-12
: Verify timestamp conversion behavior across time zones.
The toStartOfMonth(fromUnixTimestamp64Milli(time))
conversion needs careful consideration:
- Ensure the timestamp conversion handles milliseconds correctly
- Verify the time zone handling meets requirements, as
toStartOfMonth
behavior depends on the server's time zone setting
✅ Verification successful
Timestamp conversion pattern is consistent across materialized views
The codebase shows a clear and consistent pattern for timestamp handling:
- All timestamps use
fromUnixTimestamp64Milli
for millisecond precision conversion - Time bucketing functions (
toStartOfMonth
,toStartOfDay
, etc.) are used appropriately for different time granularities - The same conversion approach is used across all rate limits and verifications materialized views
This consistent implementation indicates the timestamp conversion behavior has been standardized across the system, reducing the risk of timezone-related issues.
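If server-timezone dependence ever becomes a concern, the conversion can be pinned explicitly; a sketch assuming the optional timezone argument of fromUnixTimestamp64Milli, with UTC purely as an example:

```sql
-- Pin the conversion to a fixed zone so month boundaries do not shift with server settings
SELECT toStartOfMonth(fromUnixTimestamp64Milli(toInt64(1717551032123), 'UTC')) AS month_utc;
```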
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other timestamp conversions to verify consistency
rg "fromUnixTimestamp64Milli|toStartOfMonth" --type sql
Length of output: 1036
internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql (1)
2-16
: Verify target table structure
The materialized views write to business.active_workspaces_per_month_v1
, but we need to ensure this table exists and has the correct schema to accept these columns.
Let's check the target table definition:
✅ Verification successful
Target table schema is compatible with materialized views ✅
The target table business.active_workspaces_per_month_v1
exists and has the correct schema to accept data from both materialized views:
- It has
time
(Date) andworkspace_id
(String) columns which match the columns selected in both views - The table uses SummingMergeTree engine and is ordered by time, suitable for time-series aggregation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find the migration that creates the target table
# Expected: Should find a migration file that creates business.active_workspaces_per_month_v1
fd --type f --extension sql . internal/clickhouse/schema/ --exec grep -l "CREATE TABLE.*business.active_workspaces_per_month_v1"
Length of output: 212
Script:
#!/bin/bash
# Let's check the table definition in the found migration file
cat internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql
Length of output: 317
apps/api/src/pkg/ratelimit/noop.ts (2)
Line range hint 1-18
: LGTM! Well-structured null object pattern implementation.
The NoopRateLimiter class provides a clean implementation of the null object pattern for rate limiting, which is useful for testing or scenarios where rate limiting needs to be disabled.
10-10
: Property rename maintains consistency across the codebase.
The renaming of pass
to passed
aligns with the broader changes across related files (RateLimiter interface, KeyService, AgentRatelimiter) as mentioned in the summary, ensuring consistent property naming throughout the rate limiting functionality.
Let's verify that this property rename has been consistently applied across all rate limiter implementations:
Also applies to: 16-16
✅ Verification successful
Let me gather more information about the rate limiter implementations to verify the property usage more thoroughly.
Let me try another search with a fixed regex pattern to examine the rate limiter implementations.
Let me check one more thing to verify the schema definition and all rate limiter implementations.
Based on the gathered information, I can now provide the final response:
Property rename is consistently implemented across the codebase
The verification confirms that:
- The
passed
property is correctly defined in theratelimitResponseSchema
interface - Both implementations (NoopRateLimiter and AgentRatelimiter) consistently use
passed
property - No instances of the old
pass
property remain in return statements - The schema and all implementations align with this naming convention
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining instances of the old property name
# and confirm the new property name is used consistently
# Search for any remaining instances of the old "pass" property in rate limiter related files
echo "Checking for old 'pass' property usage:"
rg -l "pass:\s*(true|false)" --type ts
# Verify the new "passed" property is used consistently
echo "Verifying new 'passed' property usage:"
rg -l "passed:\s*(true|false)" --type ts
Length of output: 343
Script:
#!/bin/bash
# Let's examine the actual implementation details of rate limiter files
# to verify the property usage in interfaces and implementations
# First, let's find all rate limiter related files
echo "Rate limiter related files:"
fd "ratelimit" --type f --extension ts
# Then, let's check the content of these files focusing on return types and property usage
echo -e "\nChecking rate limiter implementations:"
for file in $(fd "ratelimit" --type f --extension ts); do
echo -e "\n=== $file ==="
rg -A 5 -B 5 "(interface|type|return.*{.*passed|return.*{.*pass)" "$file" || true
done
Length of output: 8084
Script:
#!/bin/bash
# Let's examine the rate limiter interface and implementations
# First, let's check the interface definition
echo "=== Checking RateLimiter interface ==="
rg -B 3 -A 3 "interface.*RateLimiter" --type ts
# Then check all return statements in rate limiter files
echo -e "\n=== Checking return statements in rate limiter implementations ==="
rg "return.*\{.*passed.*\}" --type ts
# Also check for any remaining old "pass" property usage in return statements
echo -e "\n=== Checking for old 'pass' property usage in return statements ==="
rg "return.*\{.*pass:.*\}" --type ts
Length of output: 1284
Script:
#!/bin/bash
# Let's check the schema definition and all rate limiter implementations
# First, check the schema definition
echo "=== Checking ratelimit response schema ==="
rg -B 5 -A 5 "ratelimitResponseSchema" --type ts
# Check all rate limiter implementation files
echo -e "\n=== Checking all rate limiter implementations ==="
rg -l "class.*implements.*RateLimiter" --type ts
# Show the content of these implementation files
echo -e "\n=== Content of rate limiter implementations ==="
for file in $(rg -l "class.*implements.*RateLimiter" --type ts); do
echo -e "\n--- $file ---"
cat "$file"
done
Length of output: 12023
internal/clickhouse/src/success.ts (2)
1-3
: LGTM! Dependencies are well-organized.
The imports are clean and all are actively used in the implementation.
7-7
: Verify usage patterns before implementing suggested changes.
Let's analyze how this function is currently being used in the codebase to ensure our suggested changes align with existing usage patterns.
✅ Verification successful
Let me gather more specific information about the usage patterns.
Based on the gathered information, I can now provide the final response:
Function usage patterns are clear and consistent
The getActiveWorkspacesPerMonth
function follows a clean pattern where it:
- Is properly exported and imported as a named export
- Returns an async function that queries the
business.active_workspaces_per_month_v1
table - Is used as part of an object property assignment in index.ts
- Follows similar patterns to other time-based queries (like
getVerificationsPerDay
,getVerificationsPerMonth
)
The implementation and usage are straightforward and well-structured, requiring no additional changes.
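For reference, the query behind this helper has roughly the following shape (reconstructed from the table and fields named above; the parameter is illustrative):

```sql
-- Count distinct workspaces in each month bucket
SELECT
    count(DISTINCT workspace_id) AS workspaces,
    time
FROM business.active_workspaces_per_month_v1
WHERE time >= {start: Date}
GROUP BY time
ORDER BY time ASC;
```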
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for usage of getActiveWorkspacesPerMonth
echo "Searching for function usage patterns..."
rg -l "getActiveWorkspacesPerMonth" --type ts
# Look for similar time-based query patterns
echo -e "\nSearching for similar time-based query patterns..."
ast-grep --pattern 'SELECT $$$
FROM business.$table
WHERE time $op $time
$$$'
Length of output: 407
Script:
#!/bin/bash
# Check the actual implementation and usage in both files
echo "Content of success.ts:"
rg -A 10 "getActiveWorkspacesPerMonth" internal/clickhouse/src/success.ts
echo -e "\nUsage in index.ts:"
rg -A 5 "getActiveWorkspacesPerMonth" internal/clickhouse/src/index.ts
Length of output: 809
apps/dashboard/app/(app)/logs/page.tsx (1)
3-3
: LGTM: Import statement correctly updated for Clickhouse migration.
The change from importing getLogs
to clickhouse
aligns with the PR's objective of migrating to Clickhouse.
internal/clickhouse/src/client/interface.ts (1)
Line range hint 3-24
: Verify interface implementations in Client and Noop classes.
Let's ensure that both interfaces are properly implemented in the mentioned classes.
✅ Verification successful
Let me try a different approach to verify the implementations.
Both Client and Noop classes correctly implement the Querier and Inserter interfaces
Based on the verification results:
- Both
Client
andNoop
classes properly implement theQuerier
andInserter
interfaces - The implementations include all required methods with matching signatures:
query<TIn, TOut>
with correct parameter and return typesinsert<TSchema>
with correct parameter and return types
- No other classes in the codebase implement these interfaces
- Both implementations include proper type validation using the Zod schemas
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that Client and Noop classes implement both interfaces correctly
# Check Client class implementation
echo "Checking Client class implementation..."
ast-grep --pattern 'class Client implements $_ {
$$$
query<$_>($_): $_
$$$
insert<$_>($_): $_
$$$
}'
# Check Noop class implementation
echo "Checking Noop class implementation..."
ast-grep --pattern 'class Noop implements $_ {
$$$
query<$_>($_): $_
$$$
insert<$_>($_): $_
$$$
}'
Length of output: 397
Script:
#!/bin/bash
# Search for Client and Noop class definitions with more context
echo "Searching for Client class..."
rg "class Client" -A 30
echo -e "\nSearching for Noop class..."
rg "class Noop" -A 30
# Also search for interface implementations
echo -e "\nSearching for interface implementations..."
rg "implements (Querier|Inserter)" -A 30
Length of output: 12856
Script:
#!/bin/bash
# Let's verify the insert method implementation in Client class
echo "Checking Client's insert method implementation..."
rg "insert<" internal/clickhouse/src/client/client.ts -A 15
# Also check if there are any other files implementing these interfaces
echo -e "\nChecking for any other implementations..."
rg "implements.*(Querier|Inserter)" --type ts
Length of output: 1108
internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql (2)
34-35
: LGTM! Clean down migration.
The down migration correctly drops the table.
Line range hint 2-31
: Verify data consistency with existing systems.
Since this is part of a migration from Tinybird to ClickHouse, ensure that:
- The column types match the existing data
- The timestamp precision (Int64 for unix milli) is consistent
- The LowCardinality String types are appropriate for the cardinality of region and outcome fields
✅ Verification successful
Based on the results, I can now generate the final response:
Schema consistency verified across ClickHouse tables
The verification shows consistent patterns across the schema:
- All timestamp fields use
Int64
type consistently for unix milliseconds LowCardinality(String)
is appropriately used for fields with limited distinct values:
- HTTP methods
- Verification outcomes
- Region codes
- The schema aligns with aggregation tables (
key_verifications_per_hour_v1
,key_verifications_per_day_v1
,key_verifications_per_month_v1
) which use the same data types
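To illustrate the LowCardinality point, a cut-down sketch of a raw-events table (the _sketch name, engine, and ordering are illustrative only; column names follow the review context above):

```sql
-- LowCardinality(String) dictionary-encodes columns with few distinct values, such as regions and outcomes
CREATE TABLE verifications.raw_key_verifications_v1_sketch
(
    time         Int64,                   -- unix milliseconds, matching the raw tables discussed above
    workspace_id String,
    key_space_id String,
    key_id       String,
    region       LowCardinality(String),
    outcome      LowCardinality(String)
)
ENGINE = MergeTree()
ORDER BY (workspace_id, key_space_id, key_id, time);
```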
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for any other timestamp formats or inconsistencies in schema
rg -g '*.sql' -A 5 'time.*Int64|timestamp' internal/clickhouse/schema/
# Check for other uses of LowCardinality to ensure consistency
rg -g '*.sql' 'LowCardinality' internal/clickhouse/schema/
Length of output: 2950
internal/clickhouse/src/latest_verifications.ts (2)
1-2
: LGTM! Clean and minimal imports.
The imports are well-organized and include only the necessary dependencies.
11-23
: Verify query parameterization patterns across the codebase.
Let's ensure that the query parameterization approach is consistent with other ClickHouse queries in the codebase.
✅ Verification successful
Let me analyze the query parameterization patterns across the codebase to verify security concerns.
Query parameterization is properly implemented and secure
The implementation follows consistent and secure parameterization patterns across the codebase:
- All queries use typed parameters with explicit type annotations (e.g.,
{workspaceId: String}
,{start: Int64}
) - Parameters are validated using Zod schemas before query execution
- The ClickHouse client interface enforces proper parameter typing and validation
- No instances of string concatenation or direct parameter interpolation were found
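Concretely, the parameter style being relied on looks like this on the SQL side (the query text is illustrative; the typed placeholders are bound by the client rather than interpolated):

```sql
-- {name: Type} placeholders are sent as query parameters, so user input never touches the SQL string
SELECT
    outcome,
    count(*) AS count
FROM verifications.raw_key_verifications_v1
WHERE workspace_id = {workspaceId: String}
  AND time >= {start: Int64}
  AND time <  {end: Int64}
GROUP BY outcome;
```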
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for similar ClickHouse query patterns to verify consistency
rg -A 5 "ch\.query\(\{" internal/clickhouse/src/
Length of output: 6532
Script:
#!/bin/bash
# Search for parameter usage patterns in queries
rg -U "params: \{.*?\}" internal/clickhouse/src/
Length of output: 50
Script:
#!/bin/bash
# Look for the Querier type definition to understand the query interface
rg -l "type Querier" internal/clickhouse/src/
Length of output: 80
Script:
#!/bin/bash
# Search for parameter validation or schema definitions
rg "schema = z\." internal/clickhouse/src/
Length of output: 111
Script:
#!/bin/bash
# Check the Querier interface definition
cat internal/clickhouse/src/index.ts
# Look for parameter definitions in a different way
rg -A 5 "params = \{" internal/clickhouse/src/
# Search for actual parameter usage in queries
rg -A 5 "\{.*?:\s*(?:String|Int64|DateTime)" internal/clickhouse/src/
Length of output: 17983
internal/clickhouse/schema/002_create_metrics_raw_api_requests_v1.sql (1)
42-43
: LGTM: Clean down migration.
The down migration properly drops the table, making the migration reversible.
internal/clickhouse/src/last_used.ts (2)
1-4
: LGTM! Well-structured imports with type safety.
The imports are well-organized and demonstrate good practices by using the typed ClickHouse client with zod integration for schema validation.
17-23
: Verify index existence for query optimization.
The query would benefit from an index on (workspace_id, key_space_id, key_id, time).
internal/clickhouse/src/billing.ts (2)
1-2
: LGTM! Clean and minimal imports.
The imports are well-organized with proper type imports.
12-30
: Optimize query performance.
The query is well-structured with proper parameter validation, but could benefit from some optimizations.
Consider these improvements:
SELECT
sum(count) as count
FROM billing.billable_verifications_per_month_v1
WHERE workspace_id = {workspaceId: String}
AND year = {year: Int64}
AND month = {month: Int64}
- GROUP BY workspace_id, year, month
+ GROUP BY workspace_id, year, month
+ LIMIT 1
Also, verify that there's an index on (workspace_id, year, month) for optimal performance.
apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/page.tsx (3)
2-2
: LGTM: Clean import changes for Clickhouse migration
The import changes correctly reflect the migration from Tinybird to Clickhouse.
48-48
: LGTM: Clean and straightforward rendering logic
The simplified return statement improves code readability while maintaining functionality.
21-21
: Verify environment variable configuration
The switch to UNKEY_WORKSPACE_ID
aligns with the migration, but let's verify its configuration.
✅ Verification successful
Environment variable is properly configured
The UNKEY_WORKSPACE_ID
is correctly typed as a required string in the environment configuration (apps/dashboard/lib/env.ts
). The variable is being used consistently across the codebase, particularly in the dashboard app for workspace-related operations. The migration from UNKEY_API_ID
to UNKEY_WORKSPACE_ID
appears to be complete and properly implemented.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify UNKEY_WORKSPACE_ID is properly configured
echo "Checking env configuration..."
rg -l "UNKEY_WORKSPACE_ID" --type typescript
Length of output: 148
Script:
#!/bin/bash
# Search for UNKEY_WORKSPACE_ID in all files
rg "UNKEY_WORKSPACE_ID"
# Look for env configuration files
fd -e ts -e js env
Length of output: 2136
Script:
#!/bin/bash
# Check the env configuration in dashboard app
cat apps/dashboard/lib/env.ts
Length of output: 2166
apps/dashboard/app/(app)/ratelimits/sparkline.tsx (2)
27-27
: LGTM: Series configuration updated correctly.
The series configuration has been properly updated to use the new passed
property while maintaining the existing color scheme and chart structure.
9-9
: Verify all consumers of the Sparkline component.
The renaming of the type property from success
to passed
is a breaking change. While this aligns with the broader refactoring effort, we should ensure all consuming components have been updated.
✅ Verification successful
Property rename from success
to passed
is consistent across the codebase
The verification shows that the property rename is correctly implemented:
- The only consumer of the Sparkline component is
card.tsx
- In
card.tsx
, the data passed to Sparkline already uses thepassed
property name in the values object - No remaining references to
success
were found in rate limit related files
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all usages of the Sparkline component to verify they've been updated
# to use the new 'passed' property instead of 'success'
# Search for Sparkline component usage
rg -l "Sparkline" --type tsx
# Search for any remaining references to 'success' in rate limit related files
rg "success.*values" --type tsx
Length of output: 118
Script:
#!/bin/bash
# Let's try without the file type flag since it's not recognized
# First, find all TSX files that contain Sparkline
rg -l "Sparkline" --glob "**/*.tsx"
# Then look for any remaining success prop references
rg "success.*values?" --glob "**/*.tsx"
# Also search for direct Sparkline prop usage
rg "Sparkline.*success" --glob "**/*.tsx"
Length of output: 216
Script:
#!/bin/bash
# Let's examine the content of card.tsx to see how Sparkline is used
cat "apps/dashboard/app/(app)/ratelimits/card.tsx"
Length of output: 2123
apps/api/package.json (1)
30-30
: LGTM: Dependency update aligns with ClickHouse migration.
The addition of @unkey/clickhouse
is consistent with the PR's objective of migrating to ClickHouse.
internal/clickhouse/src/logs.ts (2)
1-3
: LGTM! Clean and minimal imports.
The imports are well-organized, using type imports appropriately for TypeScript types.
30-44
: LGTM! Comprehensive response schema validation.
The schema validation is thorough and correctly typed for all returned fields.
apps/api/src/pkg/ratelimit/interface.ts (1)
34-34
: Property rename looks good but requires verification.
The rename from pass
to passed
is more grammatically accurate. However, as this is a breaking change, we should verify all consumers have been updated.
✅ Verification successful
Property rename is safe to proceed
The verification shows that the schema definition in apps/api/src/pkg/ratelimit/interface.ts
is already using the new passed
property name, and there are no remaining instances of the old pass
property in the codebase. The change appears to be complete and consistent.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify all usages of the rate limit response schema have been updated
# to use 'passed' instead of 'pass'
# Search for any remaining references to '.pass' in rate limit related code
echo "Checking for old '.pass' references:"
rg -g '!*.{md,json}' -g '!{.git,node_modules}/*' --multiline '(ratelimit|RateLimit).*\.pass\W'
Length of output: 185
Script:
#!/bin/bash
# Let's try a more comprehensive search strategy:
# 1. First, find all files that might be related to rate limiting
echo "=== Files containing 'ratelimit' or 'RateLimit' ==="
rg -l -i "ratelimit|RateLimit"
# 2. Search for any usage of 'pass' property in the context of objects
echo -e "\n=== Searching for '.pass' property access ==="
rg "\.pass\b" --type ts --type js
# 3. Look for object literals that might be constructing rate limit responses
echo -e "\n=== Searching for object literals with 'pass:' property ==="
rg "pass:\s*" --type ts --type js
# 4. Search for imports of the rate limit interface
echo -e "\n=== Files importing from ratelimit interface ==="
rg "from.*ratelimit/interface"
Length of output: 18683
Script:
#!/bin/bash
# Let's search specifically in the ratelimit-related files we found
# and look for any references to the response schema or its properties
# 1. Search in the ratelimit package files
echo "=== Checking ratelimit package files ==="
rg -A 5 -B 5 "ratelimitResponseSchema|\.pass|passed:" packages/ratelimit/src/
# 2. Search in the API routes that handle rate limiting
echo -e "\n=== Checking API route handlers ==="
rg -A 5 -B 5 "ratelimitResponseSchema|\.pass|passed:" "apps/api/src/routes/v1_ratelimit_"
# 3. Check the agent service implementation
echo -e "\n=== Checking agent service implementation ==="
rg -A 5 -B 5 "ratelimitResponseSchema|\.pass|passed:" "apps/agent/services/ratelimit/"
Length of output: 615
Script:
#!/bin/bash
# Let's try a different approach using ast-grep to find schema and response patterns
# 1. Look for the schema definition and its usage
echo "=== Checking schema definitions and usage ==="
ast-grep --pattern 'const ratelimitResponseSchema = z.object({
$$$
})'
# 2. Look for response object construction
echo -e "\n=== Checking response object construction ==="
ast-grep --pattern 'return {
$$$,
passed: $_,
$$$
}'
# 3. Look for any remaining pass property usage
echo -e "\n=== Checking for any pass property usage ==="
ast-grep --pattern '{
$$$
pass: $_,
$$$
}'
Length of output: 1177
apps/www/components/stats.tsx (2)
Line range hint 6-27
: LGTM! Clean implementation of concurrent data fetching.
The Promise.all implementation with proper error handling for each query is well-structured and efficient.
Line range hint 29-43
: LGTM! Clean component structure.
The component follows React best practices with proper composition and styling.
apps/dashboard/app/(app)/settings/root-keys/[keyId]/history/access-table.tsx (1)
12-19
: Verify the security implications of removing IP and UserAgent tracking.
The removal of ipAddress
and userAgent
fields from the verification data could impact security auditing capabilities. While this simplification aligns with the ClickHouse migration, please ensure that:
- These fields are still being logged elsewhere for security purposes
- The team has reviewed and approved this reduction in tracked verification metadata
✅ Verification successful
Based on the shell script results, I can now generate a final response as the findings are conclusive.
IP and UserAgent tracking are properly maintained in the system
The removal of these fields from the UI component does not impact security auditing capabilities. The data is still being tracked in multiple critical locations:
- The
metrics.raw_api_requests_v1
ClickHouse table explicitly stores bothip_address
anduser_agent
- IP tracking is present in key verification and rate-limiting systems (apps/api/src/routes/v1_keys_verifyKey.ts)
- UserAgent is captured across numerous endpoints including key operations, permissions, and audit logging
- The audit system maintains these fields (apps/dashboard/lib/audit/index.ts, internal/schema/src/auditlog.ts)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if IP and UserAgent are tracked elsewhere in the codebase
# Test 1: Search for IP address tracking in ClickHouse related files
echo "Checking for IP tracking in ClickHouse files..."
rg -l "ipAddress|ip_address" --type ts
# Test 2: Search for UserAgent tracking in ClickHouse related files
echo "Checking for UserAgent tracking in ClickHouse files..."
rg -l "userAgent|user_agent" --type ts
# Test 3: Check ClickHouse schema definitions
echo "Checking ClickHouse schema definitions..."
fd -e sql . | xargs rg "CREATE TABLE|ip|agent"
Length of output: 7111
internal/db/src/types.ts (2)
23-24
: LGTM! The new type definitions follow the established patterns.
The new AuditLog
and AuditLogTarget
types are properly exported and consistently follow the same type inference pattern using InferSelectModel
as other types in the file.
23-24
: Verify the corresponding schema entities exist.
Let's ensure the referenced schema entities auditLog
and auditLogTarget
are properly defined in the schema file.
✅ Verification successful
Both schema entities are properly defined
Both auditLog
and auditLogTarget
are correctly defined in internal/db/src/schema/audit_logs.ts
:
auditLog
is defined at line 51 as a MySQL table with proper columns and indexesauditLogTarget
is defined at line 100 as a MySQL table with proper columns and indexes
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the existence of audit log schema entities
# Test: Search for auditLog and auditLogTarget table definitions
echo "Searching for audit log schema definitions..."
ast-grep --pattern 'export const auditLog = $$$'
ast-grep --pattern 'export const auditLogTarget = $$$'
Length of output: 7785
internal/clickhouse/schema/000_README.md (2)
56-56
: LGTM! Examples are consistent with the new naming convention.
The examples have been properly updated to reflect the new naming convention, providing clear illustrations of how to apply it in different scenarios. The consistency example particularly well demonstrates the relationship between raw tables and their corresponding materialized views.
Also applies to: 62-62, 69-70
44-46
: LGTM! Verify the impact of the naming convention change.
The new materialized view naming convention is well-documented and follows a logical structure. However, this change requires updates to existing materialized views to maintain consistency.
Let's verify the impact on existing materialized views:
✅ Verification successful
All materialized views already follow the new naming convention
The verification results show that:
- No materialized views were found using the old format with
mv_
prefix - All existing materialized views (10 files) already follow the new naming convention with
_mv_v[version]
suffix
This confirms that the documentation update in 000_README.md
is formalizing an existing convention rather than introducing a breaking change.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any materialized views that don't follow the new naming convention
# Test 1: Search for old format (mv_ prefix)
echo "Checking for materialized views using old naming convention..."
rg -l 'CREATE MATERIALIZED VIEW mv_' 'internal/clickhouse'
# Test 2: Search for new format (_mv_ suffix)
echo "Checking for materialized views using new naming convention..."
rg -l 'CREATE MATERIALIZED VIEW .*_mv_v[0-9]+' 'internal/clickhouse'
Length of output: 1225
apps/dashboard/app/(app)/banner.tsx (2)
2-2
: LGTM: Import changes align with Clickhouse migration
The import change correctly reflects the migration from Tinybird to Clickhouse for billing verification data.
26-30
: Verify Clickhouse billing method implementation
The method call looks correct, but let's verify the implementation of clickhouse.billing.billableVerifications
.
✅ Verification successful
Method implementation verified and correctly used
The clickhouse.billing.billableVerifications
method is properly implemented in the codebase:
- Defined in
internal/clickhouse/src/billing.ts
- Accepts correct parameters:
workspaceId
(string),year
(number), andmonth
(number) - Returns a Promise with the billable verifications count
- Implementation queries the
billing.billable_verifications_per_month_v1
table with proper parameter validation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the implementation of billableVerifications method
# Expected: Find the implementation and verify its parameters match the usage
# Search for the billableVerifications implementation
ast-grep --pattern 'export const billing = {
$$$
billableVerifications: async ({ workspaceId, year, month }: { workspaceId: string; year: number; month: number }) $$$
$$$
}'
# Backup: Search using ripgrep if ast-grep doesn't find TypeScript implementation
rg -A 10 "billableVerifications.*=.*async.*\{.*workspaceId.*year.*month.*\}"
Length of output: 260
Script:
#!/bin/bash
# Let's try a broader search to find the implementation
# First, search for files containing 'clickhouse' and 'billing'
rg -l "clickhouse.*billing|billing.*clickhouse"
# Then search for any mentions of billableVerifications
rg -l "billableVerifications"
# Search for the implementation with more context
rg -A 10 "billableVerifications"
# Look for files that might contain the implementation
fd -e ts -e tsx clickhouse
Length of output: 5458
Script:
#!/bin/bash
# Let's check the implementation details in the found files
cat internal/clickhouse/src/index.ts
# Also check for the getBillableVerifications function implementation
rg -A 20 "getBillableVerifications"
# Check the clickhouse lib file as well
cat apps/dashboard/lib/clickhouse.ts
Length of output: 5907
apps/dashboard/app/(app)/settings/root-keys/[keyId]/layout.tsx (3)
3-3
: LGTM: Import aligns with Clickhouse migration.
The addition of the clickhouse import is consistent with the PR's objective of migrating from Tinybird to Clickhouse.
58-58
: LGTM: Component usage is properly updated.
The LastUsed component is correctly implemented with the new required props, properly wrapped in Suspense for async operation.
85-87
: Verify consistent usage of Clickhouse verifications across the codebase.
Let's ensure all verification-related queries have been migrated from Tinybird to Clickhouse.
apps/dashboard/app/(app)/ratelimits/[namespaceId]/overrides/table.tsx (3)
11-11
: LGTM: Import change aligns with ClickHouse migration.
The import change correctly reflects the migration from Tinybird to ClickHouse for rate limit data retrieval.
42-42
: LGTM: Added required key prop for list rendering.
The addition of the key prop using rl.id
follows React best practices for list rendering and will improve performance during reconciliation.
89-93
: LGTM: Successfully migrated to ClickHouse API.
The implementation correctly uses the new ClickHouse API for fetching rate limit data.
apps/dashboard/app/(app)/apis/[apiId]/keys/[keyAuthId]/[keyId]/verification-table.tsx (3)
12-12
: LGTM: Clean migration to ClickHouse types
The import and type definitions have been properly updated to use the ClickHouse client, maintaining type safety throughout the component.
Also applies to: 20-20
43-54
: Excellent documentation of visual behavior
The ASCII art comments effectively illustrate the visual grouping logic, making it easier for future maintainers to understand the intended UI behavior.
Line range hint 67-71
: Verify handling of all possible verification outcomes
The component currently handles specific outcomes in YELLOW_STATES
and RED_STATES
. Let's verify that these cover all possible outcomes from the ClickHouse integration.
✅ Verification successful
Based on the search results, I can now generate the final response:
All verification outcomes are properly handled in the UI
The component correctly handles all possible verification outcomes from the ClickHouse integration:
- Success states:
"VALID"
outcomes are handled implicitly (no special styling) - Warning states:
YELLOW_STATES
covers["RATE_LIMITED", "EXPIRED", "USAGE_EXCEEDED"]
- Error states:
RED_STATES
covers["DISABLED", "FORBIDDEN", "INSUFFICIENT_PERMISSIONS"]
This matches the complete set of outcomes defined in the ClickHouse schema and used throughout the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for all possible verification outcomes in the codebase
rg -g '*.{ts,tsx}' --no-filename -A 2 'type\s+VerificationOutcome.*=.*{'
Length of output: 202
Script:
#!/bin/bash
# Search for verification outcome related constants and types
rg -g '*.{ts,tsx}' 'VerificationOutcome' -A 3
# Also search for the state arrays used in the component
rg -g '*.{ts,tsx}' '(YELLOW_STATES|RED_STATES)' -A 2
Length of output: 1483
Script:
#!/bin/bash
# Let's search for the clickhouse verifications interface or type definitions
rg -g '*.{ts,tsx}' 'clickhouse\.verifications' -B 5 -A 5
# Also search for any enum or type definition related to verification outcomes
rg -g '*.{ts,tsx}' 'export (type|enum|const).*Verification.*=' -B 2 -A 5
Length of output: 17947
Script:
#!/bin/bash
# Let's search for any successful or valid states since we only see yellow and red states
rg -g '*.{ts,tsx}' 'SUCCESS|VALID|OK' -B 2 -A 2
# Also search for any enum or type definition that might contain verification outcomes
ast-grep --pattern 'type $_Outcome = $_'
Length of output: 31750
apps/dashboard/components/array-input.tsx (2)
44-44
: Bug Fix: Added missing dependency to useCallback
Good catch adding setSelected
to the dependency array. This fixes a potential stale closure issue where the callback might use an outdated version of setSelected
.
17-17
: Breaking Change: Removal of default value for selected
prop
The removal of the default value selected = []
makes this a breaking change. Parent components that previously relied on the default empty array will need to explicitly provide a value.
Let's verify the usage of this component to identify affected components:
internal/clickhouse/src/active_keys.ts (2)
1-2
: LGTM: Clean and minimal imports
The imports are well-structured and all dependencies are actively used.
11-29
: LGTM: Secure and efficient query implementation
The implementation demonstrates several good practices:
- Proper use of parameterized queries preventing SQL injection
- Strong type validation using zod
- Efficient use of ClickHouse-specific features for time-series data
Also applies to: 42-70, 82-110
internal/db/src/schema/workspaces.ts (1)
136-138
: LGTM! Consistent relation naming pattern.
The addition of an explicit relationName
for the auditLogBuckets
relation follows the established pattern used by other relations in the file (e.g., workspace_key_relation
, vercel_workspace_relation
). This enhances schema clarity and maintainability.
apps/dashboard/app/(app)/success/page.tsx (2)
8-8
: LGTM: Import change aligns with Clickhouse migration.
The addition of the Clickhouse import and removal of Tinybird import is consistent with the PR's migration objectives.
67-68
: Verify data consistency after Clickhouse migration.
The migration to Clickhouse looks good, with proper data transformation for the chart component. However, let's verify that the new query returns consistent data compared to the previous Tinybird implementation.
✅ Verification successful
Let me gather more specific information about the data structure and usage.
Let me try one more search to find the implementation details of the Clickhouse query.
Migration to Clickhouse implementation verified and consistent
The Clickhouse implementation maintains the same data structure as expected by the chart component:
- Query correctly selects distinct workspace counts grouped by time
- Returns data with
workspaces
andtime
fields that match the chart's transformation - Uses the same table structure (
business.active_workspaces_per_month_v1
)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any other Tinybird references that might need migration
# and verify the structure of activeWorkspaces usage
# Check for any remaining Tinybird imports or references
rg -l 'tinybird|getMonthlyActiveWorkspaces' --type ts
# Check the structure expected by the chart component
ast-grep --pattern 'ColumnChart.*data={$_}'
Length of output: 809
Script:
#!/bin/bash
# Check the implementation of activeWorkspaces in clickhouse.business
ast-grep --pattern 'activeWorkspaces = async function() {
$$$
}'
# Check for any previous implementation or tests that might show the expected structure
rg -A 5 'activeWorkspaces' --type ts
# Look for chart component usage with similar data structure
rg -A 3 'workspaces.*time' --type ts
Length of output: 885
Script:
#!/bin/bash
# Look for the implementation of getActiveWorkspacesPerMonth
rg -A 10 'getActiveWorkspacesPerMonth' --type ts
# Check for any test files that might contain expected data structure
fd -e test.ts -e spec.ts -X rg -l 'activeWorkspaces|getActiveWorkspacesPerMonth'
Length of output: 1625
apps/dashboard/app/(app)/settings/root-keys/[keyId]/page.tsx (4)
5-5
: LGTM: Import change aligns with ClickHouse migration.
The import change correctly reflects the migration from Tinybird to ClickHouse for verification data.
94-97
: Verify error handling for ClickHouse query.
While the migration to ClickHouse looks good, we should ensure proper error handling is in place.
#!/bin/bash
# Description: Check error handling for clickhouse.verifications.latest
# Test: Look for error handling patterns in clickhouse implementation
echo "Checking clickhouse verifications implementation..."
fd "clickhouse.ts" --exec rg -A 5 "verifications.latest" {}
Consider wrapping the ClickHouse query in a try-catch block to handle potential query failures gracefully:
- const history = await clickhouse.verifications.latest({
- workspaceId: UNKEY_WORKSPACE_ID,
- keySpaceId: key.keyAuthId,
- keyId: key.id,
- });
+ let history = [];
+ try {
+ history = await clickhouse.verifications.latest({
+ workspaceId: UNKEY_WORKSPACE_ID,
+ keySpaceId: key.keyAuthId,
+ keyId: key.id,
+ });
+ } catch (error) {
+ console.error('Failed to fetch verification history:', error);
+ // Optionally show an error UI component
+ }
73-73
: Verify removal of UNKEY_API_ID environment variable.
The removal of UNKEY_API_ID
aligns with the migration, but let's verify it's not needed elsewhere.
155-155
: Verify AccessTable compatibility with new data structure.
The prop passing has been simplified, but we should verify that the AccessTable component is updated to handle the new ClickHouse data structure.
✅ Verification successful
Let me gather more information about the AccessTable component and its type definitions.
Based on the code and git history shown, I can now generate the final response:
AccessTable component is compatible with the simplified data structure
The AccessTable component has been recently refactored to match the new data structure. The type definition and implementation perfectly align with the simplified verifications prop that contains only:
time
: numberregion
: stringoutcome
: string
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check AccessTable component for data structure compatibility
# Test: Look for AccessTable component definition and type declarations
echo "Checking AccessTable component..."
fd "access-table.tsx" --exec rg -B 5 "verifications:" {}
Length of output: 250
Script:
#!/bin/bash
# Get the full AccessTable component implementation and type definitions
fd "access-table.tsx" --exec cat {}
# Also check for any recent changes to this file
fd "access-table.tsx" --exec git log -p {} | head -n 50
Length of output: 3306
apps/dashboard/lib/charts/sparkline.tsx (2)
64-64
: LGTM! Correct dependency added to useMemo.
Adding 'data' to the dependency array is essential as the startDate and endDate calculations directly depend on the data array. This prevents stale calculations when the data changes.
76-76
: LGTM! Optimized dependencies by removing unused ones.
The removal of padding.bottom and padding.top from the dependency array is correct as these values are not used in the minY and maxY calculations. This optimization prevents unnecessary recalculations.
apps/dashboard/app/(app)/identities/[identityId]/page.tsx (2)
17-17
: LGTM: Import change aligns with Clickhouse migration.
The addition of the Clickhouse import is consistent with the PR's objective of migrating from Tinybird to Clickhouse.
145-145
: LGTM: Component usage properly updated with required props.
The LastUsed component invocation has been correctly updated to include all necessary props (workspaceId, keySpaceId, keyId) with appropriate values from the parent context.
apps/api/src/pkg/analytics.ts (3)
110-122
: LGTM! Well-structured rate limit implementation.
The new ClickHouse implementation for rate limits includes all necessary fields and follows a consistent pattern with other insertions.
133-134
: Verify the completeness of verification outcomes.
The implementation looks good, but ensure that all possible verification outcomes from the old implementation are covered in the new enum.
#!/bin/bash
# Description: Compare verification outcomes between old and new implementations
echo "Checking for any additional verification outcomes in the codebase..."
rg "enum.*VALID|RATE_LIMITED|EXPIRED|DISABLED|FORBIDDEN|USAGE_EXCEEDED|INSUFFICIENT_PERMISSIONS" --type ts
45-46
: Consider removing the deprecated Tinybird implementation.
The file contains both ClickHouse and Tinybird implementations for SDK telemetry. Since the migration to ClickHouse is in progress, consider removing the Tinybird implementation (ingestSdkTelemetry) once the migration is complete.
apps/api/src/pkg/ratelimit/client.ts (8)
Line range hint 90-94
: LGTM: Property rename is consistent with schema changes.
The renaming of pass
to passed
in the Cloudflare rate limiter response maintains interface compatibility while using more descriptive terminology.
112-112
: LGTM: Metrics properly updated to use new property name.
The metrics emission correctly uses optional chaining to safely access the renamed passed
property.
130-132
: LGTM: Multi-limit check properly updated.
The multi-limit validation correctly uses the renamed passed
property to determine if any rate limits were exceeded.
140-140
: LGTM: Default response properly updated.
The default success response correctly uses the renamed passed
property with appropriate default values.
Line range hint 165-170
: LGTM: Cache hit response properly updated.
The cache hit response correctly uses the renamed passed
property while maintaining the existing logic.
Line range hint 227-232
: LGTM: Rate limit exceeded response properly updated.
The rate limit exceeded response correctly uses the renamed passed
property while maintaining the existing logic.
Line range hint 237-242
: LGTM: Success response properly updated.
The success response correctly uses the renamed passed
property while maintaining the existing logic.
291-294
: LGTM: Agent response mapping properly updated.
The agent response mapping correctly uses the renamed passed
property. However, let's verify that the agent's response schema has been updated to match this change.
✅ Verification successful
Agent's response schema is correctly aligned with the mapping
The agent's response type in apps/api/src/pkg/ratelimit/agent.ts
uses success: boolean
which matches the property being used in the response mapping. The schema is consistent with the implementation, and no changes are needed.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that the agent's response schema matches the new property name
# Search for the agent's response type or schema definition
ast-grep --pattern 'interface $interface {
$$$
success: boolean;
$$$
}'
# Search for any remaining uses of the old property name
rg -g '!*.{log,md}' '\bpass\b.*:.*boolean'
Length of output: 124
Script:
#!/bin/bash
# Let's try a broader search for the agent's response type and schema
# Search for files containing 'agent' and 'response' related types/interfaces
ast-grep --pattern 'interface $name {
$$$
}'
# Search for success property usage in type definitions
ast-grep --pattern 'type $name = {
$$$
success: boolean;
$$$
}'
# Search for any response-related types with success field
rg -A 5 -B 5 'success.*:.*boolean' apps/api/src/pkg/ratelimit/
Length of output: 764
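As a reference for the rename, a minimal sketch of the mapping these comments describe; every field other than the success → passed translation is an assumption:
type AgentRatelimitResponse = {
  // Shape reported by the agent, per the verified schema above.
  success: boolean;
  limit: number;
  remaining: number;
  reset: number;
};

type RatelimitResponse = {
  // Shape consumed by the rest of the codebase after the rename.
  passed: boolean;
  limit: number;
  remaining: number;
  reset: number;
};

function fromAgentResponse(res: AgentRatelimitResponse): RatelimitResponse {
  return {
    passed: res.success, // the renamed property
    limit: res.limit,
    remaining: res.remaining,
    reset: res.reset,
  };
}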
apps/api/src/routes/v1_ratelimit_limit.ts (2)
345-355
: LGTM: Analytics insertion is well-implemented.
The implementation correctly uses waitUntil
for non-blocking analytics insertion and captures all essential rate limit data.
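For readers unfamiliar with the pattern, a rough sketch of fire-and-forget analytics via waitUntil; the function and event shapes here are illustrative, not the route's actual code:
type RatelimitEvent = { workspaceId: string; identifier: string; passed: boolean; time: number };

function recordRatelimitEvent(
  waitUntil: (p: Promise<unknown>) => void,
  insertRatelimit: (event: RatelimitEvent) => Promise<void>,
  event: RatelimitEvent,
): void {
  // The insert runs after the response is sent; failures are logged, never surfaced to the caller.
  waitUntil(
    insertRatelimit(event).catch((err) => {
      console.error("failed to insert ratelimit event", err);
    }),
  );
}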
Line range hint 445-453
: LGTM: Response structure is well-defined and consistent.
The response structure correctly includes all required fields and maintains consistency with the OpenAPI schema.
apps/dashboard/app/(app)/settings/billing/page.tsx (2)
59-63
: LGTM: Clean migration to Clickhouse for verifications
The implementation correctly fetches billable verifications from Clickhouse with proper parameters.
12-15
: Verify completion of Tinybird migration
While the migration to Clickhouse for verifications is implemented, ratelimits
is still being imported from Tinybird. This suggests a partial migration.
Let's check if there are any remaining Tinybird dependencies:
internal/clickhouse/src/client/noop.ts (2)
2-2
: Confirm Implementation of All Required Interfaces
The Noop
class now implements Querier
and Inserter
. Please verify that all necessary methods from these interfaces are correctly implemented and that any methods from the previously implemented Clickhouse
interface are no longer needed.
2-3
: Verify Removal of Obsolete Interfaces and References
Since Clickhouse
is no longer implemented by the Noop
class, please ensure that any references to the Clickhouse
interface are removed or updated throughout the codebase to prevent potential issues.
You can run the following script to check for any remaining references to Clickhouse
:
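For context, a hedged sketch of what splitting the old monolithic interface into Querier and Inserter could look like, with a Noop satisfying both; the real signatures in the package may differ:
import { z } from "zod";

// Assumed, simplified interface shapes.
interface Querier {
  query<T extends z.ZodTypeAny>(req: { query: string; schema: T }): (
    params?: Record<string, unknown>,
  ) => Promise<z.output<T>[]>;
}

interface Inserter {
  insert<T extends z.ZodTypeAny>(req: { table: string; schema: T }): (
    events: z.input<T> | z.input<T>[],
  ) => Promise<void>;
}

// A no-op client: queries resolve to empty result sets, inserts are dropped.
class Noop implements Querier, Inserter {
  public query<T extends z.ZodTypeAny>(_req: { query: string; schema: T }) {
    return async (_params?: Record<string, unknown>): Promise<z.output<T>[]> => [];
  }

  public insert<T extends z.ZodTypeAny>(_req: { table: string; schema: T }) {
    return async (_events: z.input<T> | z.input<T>[]): Promise<void> => {};
  }
}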
apps/dashboard/app/(app)/ratelimits/card.tsx (4)
1-1
: Import statement updated correctly
The import of clickhouse
from @/lib/clickhouse
is appropriately added to utilize ClickHouse functionalities.
20-25
: Verify the correctness of start
and end
timestamps
While the calculation for start
and end
seems to fetch data from the past hour up to the next minute, please ensure that the timestamps are accurate and account for any potential timezone issues. It's important that end
is indeed after start
and both are in the expected format for the ClickHouse query.
26-28
: Handle potential empty responses from clickhouse.ratelimits.latest
When using .then((res) => res.at(0)?.time), res could be an empty array, in which case res.at(0) returns undefined. Optional chaining covers the immediate access, but make sure downstream logic gracefully handles lastUsed being undefined to prevent unexpected behavior.
40-40
: Confirm that passed
aligns with the data structure
You've changed the key from success
to passed
in values
. Ensure that the data returned from clickhouse.ratelimits.perMinute
indeed has a passed
property. If the data structure hasn't been updated accordingly, this could lead to undefined
values in your chart.
internal/clickhouse/src/client/client.ts (2)
3-3
: Appropriate import of Inserter
and Querier
interfaces
The addition of the import statement for Inserter
and Querier
interfaces enhances type safety and code clarity.
9-9
: Class now implements Querier
and Inserter
interfaces
Updating the Client
class to implement Querier
and Inserter
interfaces aligns with the interface segregation principle and improves the modularity of the code.
apps/dashboard/lib/tinybird.ts (1)
17-17
: LGTM
The function _dateToUnixMilli
has been correctly renamed, and the transformation logic remains accurate.
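For readers who have not seen the helper, a hedged sketch of what such a transform typically looks like; the actual implementation in tinybird.ts may differ:
import { z } from "zod";

// Parses a datetime string from the datasource and converts it to unix milliseconds.
const dateToUnixMilli = z.string().transform((d) => new Date(d).getTime());

// e.g. dateToUnixMilli.parse("2024-01-01T00:00:00Z") === 1704067200000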
apps/dashboard/app/(app)/ratelimits/[namespaceId]/logs/page.tsx (2)
183-187
: Confirm the correctness of the passed
property usage
The code uses l.passed
to determine the success status. Ensure that the passed
property accurately reflects the intended success state of each log entry and aligns with the data returned from ClickHouse.
170-171
: Ensure request_id
is unique when used as a key in TableRow
When mapping over logs, l.request_id
is used as the key for each TableRow
. Please verify that request_id
values are unique across all log entries to prevent potential rendering issues in React. If request_id
is not guaranteed to be unique, consider using a combination of unique identifiers.
You can run the following script to check for duplicate request_id
entries:
✅ Verification successful
From the search results, I can see in internal/clickhouse/src/ratelimits.ts
that request_id
is being selected from the ratelimits.raw_ratelimits
table. Let me verify if this is coming from a unique column.
Looking at the schema in internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql
, I can see that request_id
is defined as a String column in the raw_ratelimits_v1
table. However, I need to check if it's used as a unique identifier in the table definition.
Based on the schema inspection from earlier results, I can now provide a final response:
request_id
is safe to use as a React key
The request_id
field comes from the raw_ratelimits_v1
table in ClickHouse which stores individual rate limit events. While there's no explicit UNIQUE constraint (as ClickHouse doesn't support traditional constraints), the nature of rate limit events means each request generates a unique request_id
for tracking. The table's MergeTree engine with ordering by (workspace_id, namespace_id, time, identifier)
ensures we get distinct records for each rate limit check.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for duplicate request_id values in logs.
# Test: Count occurrences of each request_id. Expect: All counts should be 1.
ast-grep --pattern 'request_id: $_' | sort | uniq -c | awk '$1 > 1 { print "Duplicate request_id:", $2 }'
Length of output: 107
Script:
#!/bin/bash
# Let's first check the type definition or interface for the logs data structure
rg -g "*.ts*" "type.*Log.*=|interface.*Log.*{" -A 10
# Also search for any API response type definitions related to logs
rg -g "*.ts*" "type.*LogResponse.*=|interface.*LogResponse.*{" -A 10
# Additionally, let's check the actual API call that fetches the logs
ast-grep --pattern 'const $_ = $$$fetch$$$("/api/ratelimits/$$$logs$$$")'
Length of output: 481
Script:
#!/bin/bash
# Fix the regex syntax and search for type definitions
rg -g "*.ts*" "(type|interface).*Log.*[={]" -A 10
# Look for the API route handler to understand the response structure
rg -g "*.ts*" "route.*logs" -A 10
# Check the actual component implementation for logs mapping
rg -g "*.tsx" "logs\\.map.*request_id" -A 5
Length of output: 5262
Script:
#!/bin/bash
# Check the schema definition or table creation for ratelimits table
rg -g "*.sql" "CREATE TABLE.*ratelimits" -A 15
# Also check if there's any type definition for the rate limits response
rg -g "*.ts*" "type.*RateLimit.*Response" -A 10
# Look for the actual API endpoint implementation
rg -g "*.ts*" "export.*async.*function.*getRatelimitLogs" -A 15
Length of output: 8506
Script:
#!/bin/bash
# Check for any unique constraints or indexes on request_id
rg -g "*.sql" "UNIQUE.*request_id|PRIMARY KEY.*request_id"
# Look for the code that generates request_id
rg -g "*.ts*" "request_id.*=" -B 5 -A 5
# Check the API route implementation for logs
rg -g "*.ts*" "app.get.*logs" -A 10
Length of output: 140
internal/clickhouse/src/ratelimits.ts (2)
24-24
: Confirm Correct Use of multiSearchAny
Function
The use of multiSearchAny
with identifier
might not be the most efficient method for filtering identifiers. Consider verifying that this function is appropriate for your use case and that the syntax is correct.
Run the following script to check for correct usage:
#!/bin/bash
# Description: Check the usage of `multiSearchAny` with `identifier`.
# Test: Search for occurrences of `multiSearchAny` in codebase. Expect: Correct syntax and parameters.
rg 'multiSearchAny\(' --glob '*.ts' -A 2
Also applies to: 57-57, 89-89, 121-121, 166-166, 200-200
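To make the efficiency question concrete, here is a hedged sketch contrasting substring matching via multiSearchAny with exact matching via IN, written as parameterized ClickHouse SQL inside TypeScript template strings; the table and column names follow the review thread but are otherwise assumptions:
// Substring-style filter: matches when any needle occurs anywhere inside `identifier`.
const fuzzyFilter = `
  SELECT identifier, count(*) AS total
  FROM ratelimits.raw_ratelimits_v1
  WHERE workspace_id = {workspaceId: String}
    AND multiSearchAny(identifier, {identifiers: Array(String)})
  GROUP BY identifier
`;

// Exact-match filter: usually cheaper when callers pass full identifiers rather than fragments.
const exactFilter = `
  SELECT identifier, count(*) AS total
  FROM ratelimits.raw_ratelimits_v1
  WHERE workspace_id = {workspaceId: String}
    AND identifier IN {identifiers: Array(String)}
  GROUP BY identifier
`;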
167-168
: Verify Time Units Consistency in getRatelimitLogs
The time comparisons in getRatelimitLogs
use {start: Int64}
and {end: Int64}
directly, unlike other functions where fromUnixTimestamp64Milli
is used. Ensure that the time
field in raw_ratelimits
is in milliseconds since epoch to match the units of {start}
and {end}
.
Run the following script to verify the time units:
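To illustrate the two comparison styles the comment contrasts — which one is correct depends on whether the time column stores DateTime64 values or raw Int64 milliseconds, and the schema should confirm that:
// If `time` is a DateTime64 column, convert the millisecond parameters first:
const dateTimeComparison = `
  WHERE time >= fromUnixTimestamp64Milli({start: Int64})
    AND time <= fromUnixTimestamp64Milli({end: Int64})
`;

// If `time` already stores milliseconds since epoch as Int64, compare directly:
const rawMillisComparison = `
  WHERE time >= {start: Int64}
    AND time <= {end: Int64}
`;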
apps/dashboard/app/(app)/ratelimits/[namespaceId]/page.tsx (6)
8-8
: Migration to ClickHouse
The addition of import { clickhouse } from "@/lib/clickhouse";
aligns with the migration from Tinybird to ClickHouse. This ensures that the subsequent functions utilize ClickHouse for data retrieval.
43-43
: Simplify Redirection Logic
Returning the redirect
directly enhances code readability and efficiency. This change streamlines the flow when the namespace is not found or unauthorized.
87-90
: Consistent Sorting of ratelimitEvents
The sorting of ratelimitEvents using .sort((a, b) => a.time - b.time) ensures chronological order. Confirm that each event's time is always a valid number to prevent sorting issues.
Line range hint 238-285
: Validate Interval Configuration in prepareInterval
In the prepareInterval
function, verify that the granularity
and getRatelimitsPerInterval
mappings correspond correctly for each case, especially for longer intervals like "30d" and "90d" where perDay
is used.
Ensure that using perDay
for longer intervals provides the desired data granularity.
65-82
:
Handle Null or Undefined Results from Database Queries
When accessing results from database queries, such as res.at(0)?.count
and res.at(0)?.time
, ensure that you handle cases where the result might be undefined
to prevent runtime errors.
Apply this diff to provide default values:
.then((res) => res.at(0)?.count ?? 0),
...
.then((res) => res.at(0)?.time ?? null),
Likely invalid or redundant comment.
57-82
: Efficient Data Fetching with ClickHouse
The consolidation of data fetching using getRatelimitsPerInterval
and ClickHouse methods improves performance and aligns with the new architecture. Ensure that all parameters in the query
object are correctly passed and that the ClickHouse functions handle them as expected.
To verify parameter consistency in getRatelimitsPerInterval
, you can run:
✅ Verification successful
Based on the search results, I can now generate the final response as we have sufficient information about the implementation. The code shows that getRatelimitsPerInterval
is correctly assigned from the ClickHouse client methods (perMinute
, perHour
, perDay
) based on the selected time interval, and these methods are properly defined in the ClickHouse client implementation.
Parameter Consistency in Rate Limit Data Fetching Verified
The implementation correctly uses ClickHouse's rate limit methods with consistent parameter passing. The query object contains all required fields (workspaceId, namespaceId, start, end, identifier) and matches the expected interface across all usage patterns.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check the usage of getRatelimitsPerInterval across the codebase.
# Test: Search for all usages and ensure consistent parameter passing.
rg 'getRatelimitsPerInterval\(\s*{[^}]*}\s*\)'
Length of output: 202
Script:
#!/bin/bash
# Let's try a simpler search pattern first to find usages of getRatelimitsPerInterval
rg "getRatelimitsPerInterval" -A 3
# Also search for the function definition
ast-grep --pattern 'function getRatelimitsPerInterval($_) {
$$$
}'
Length of output: 2663
Script:
#!/bin/bash
# Let's examine the clickhouse.ratelimits methods that are being used
ast-grep --pattern 'clickhouse.ratelimits = {
$$$
}'
# Also look at the actual implementation of these methods
rg "perMinute|perHour|perDay" -A 5
Length of output: 9277
DATABASE_HOST=
DATABASE_USERNAME=
DATABASE_PASSWORD=
💡 Codebase verification
Add CLICKHOUSE_URL to the environment variables template
Based on the codebase analysis, the Clickhouse connection requires the CLICKHOUSE_URL
environment variable, which is used across multiple files:
internal/clickhouse/src/index.ts
apps/api/src/pkg/middleware/init.ts
apps/dashboard/lib/clickhouse.ts
Please add the following to .env.example
:
CLICKHOUSE_URL=
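Beyond the template entry, a minimal sketch of validating the variable at startup, assuming zod is available as it is elsewhere in the repo:
import { z } from "zod";

// Fail fast if the ClickHouse connection string is missing or malformed.
const env = z
  .object({
    CLICKHOUSE_URL: z.string().url(),
  })
  .parse(process.env);

export const clickhouseUrl = env.CLICKHOUSE_URL;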
🔗 Analysis chain
Verify if Clickhouse credentials are needed
Since this PR is migrating from Tinybird to Clickhouse, please verify if any Clickhouse-specific environment variables need to be added to this template.
Consider adding environment variables for Clickhouse configuration if they are required for the migration, such as:
- CLICKHOUSE_HOST
- CLICKHOUSE_USER
- CLICKHOUSE_PASSWORD
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for Clickhouse-related environment variable usage in the codebase
rg -i 'process\.env.*clickhouse'
rg -i 'env.*clickhouse'
Length of output: 576
@@ -1,5 +1,5 @@
 {
-  "name": "@unkey/clickhouse-zod",
+  "name": "@unkey/clickhouse",
💡 Codebase verification
Update required: Found outdated import in last_used.ts
The package name change requires one update:
internal/clickhouse/src/last_used.ts
still imports from@unkey/clickhouse-zod
instead of@unkey/clickhouse
All package.json dependencies are correctly using the new package name @unkey/clickhouse
, but there is one remaining reference to the old package name in the import statement.
🔗 Analysis chain
Package name change looks good, but verify dependencies.
The package name change from @unkey/clickhouse-zod
to @unkey/clickhouse
makes sense as part of the ClickHouse migration.
Let's verify that all dependencies are updated across the codebase:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any remaining references to the old package name
rg "@unkey/clickhouse-zod"
# Search for package.json files that might need updating
fd "package.json" --exec grep -l "@unkey/clickhouse"
Length of output: 303
Script:
#!/bin/bash
# Check the content of package.json files that reference @unkey/clickhouse
echo "=== tools/migrate/package.json ==="
cat tools/migrate/package.json | grep -A 2 -B 2 "@unkey/clickhouse"
echo -e "\n=== apps/api/package.json ==="
cat apps/api/package.json | grep -A 2 -B 2 "@unkey/clickhouse"
echo -e "\n=== apps/dashboard/package.json ==="
cat apps/dashboard/package.json | grep -A 2 -B 2 "@unkey/clickhouse"
# Check the specific file with old reference
echo -e "\n=== internal/clickhouse/src/last_used.ts ==="
cat internal/clickhouse/src/last_used.ts | grep -A 5 -B 5 "@unkey/clickhouse-zod"
Length of output: 1360
-- +goose down
DROP DATABASE verifications;
DROP DATABASE telemetry;
DROP DATABASE metrics;
DROP DATABASE ratelimits;
DROP DATABASE business;
DROP DATABASE billing;
🛠️ Refactor suggestion
Enhance rollback safety with IF EXISTS and warnings.
The database drop statements should include IF EXISTS clauses and consider dependencies between databases. Also, add a warning comment about potential data loss.
Apply this diff to improve safety:
-- +goose down
-DROP DATABASE verifications;
-DROP DATABASE telemetry;
-DROP DATABASE metrics;
-DROP DATABASE ratelimits;
-DROP DATABASE business;
-DROP DATABASE billing;
+-- WARNING: This will permanently delete all data in these databases
+-- Ensure you have backups before proceeding
+
+-- Drop in reverse order of potential dependencies
+DROP DATABASE IF EXISTS billing;
+DROP DATABASE IF EXISTS business;
+DROP DATABASE IF EXISTS ratelimits;
+DROP DATABASE IF EXISTS metrics;
+DROP DATABASE IF EXISTS telemetry;
+DROP DATABASE IF EXISTS verifications;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-- +goose down
DROP DATABASE verifications;
DROP DATABASE telemetry;
DROP DATABASE metrics;
DROP DATABASE ratelimits;
DROP DATABASE business;
DROP DATABASE billing;

-- +goose down
-- WARNING: This will permanently delete all data in these databases
-- Ensure you have backups before proceeding

-- Drop in reverse order of potential dependencies
DROP DATABASE IF EXISTS billing;
DROP DATABASE IF EXISTS business;
DROP DATABASE IF EXISTS ratelimits;
DROP DATABASE IF EXISTS metrics;
DROP DATABASE IF EXISTS telemetry;
DROP DATABASE IF EXISTS verifications;
internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql
label="Passed" | ||
value={formatNumber(ratelimitEvents.reduce((sum, day) => sum + day.passed, 0))} | ||
/> | ||
<Metric | ||
label="Ratelimited" | ||
value={formatNumber( | ||
ratelimitEvents.data.reduce((sum, day) => sum + (day.total - day.success), 0), | ||
ratelimitEvents.reduce((sum, day) => sum + (day.total - day.passed), 0), | ||
)} | ||
/> | ||
<Metric | ||
label="Total" | ||
value={formatNumber(ratelimitEvents.data.reduce((sum, day) => sum + day.total, 0))} | ||
value={formatNumber(ratelimitEvents.reduce((sum, day) => sum + day.total, 0))} | ||
/> | ||
<Metric | ||
label="Success Rate" | ||
value={`${formatNumber( | ||
(ratelimitEvents.data.reduce((sum, day) => sum + day.success, 0) / | ||
ratelimitEvents.data.reduce((sum, day) => sum + day.total, 0)) * | ||
(ratelimitEvents.reduce((sum, day) => sum + day.passed, 0) / | ||
ratelimitEvents.reduce((sum, day) => sum + day.total, 0)) * |
Prevent Division by Zero in Success Rate Calculation
When computing the "Success Rate", ensure that the total number of events is not zero to avoid a division by zero error.
Apply this diff to handle the edge case:
const totalEvents = ratelimitEvents.reduce((sum, day) => sum + day.total, 0);
const successRate = totalEvents > 0
? (ratelimitEvents.reduce((sum, day) => sum + day.passed, 0) / totalEvents) * 100
: 0;
value={`${formatNumber(successRate)}%`}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
label="Passed" | |
value={formatNumber(ratelimitEvents.reduce((sum, day) => sum + day.passed, 0))} | |
/> | |
<Metric | |
label="Ratelimited" | |
value={formatNumber( | |
ratelimitEvents.data.reduce((sum, day) => sum + (day.total - day.success), 0), | |
ratelimitEvents.reduce((sum, day) => sum + (day.total - day.passed), 0), | |
)} | |
/> | |
<Metric | |
label="Total" | |
value={formatNumber(ratelimitEvents.data.reduce((sum, day) => sum + day.total, 0))} | |
value={formatNumber(ratelimitEvents.reduce((sum, day) => sum + day.total, 0))} | |
/> | |
<Metric | |
label="Success Rate" | |
value={`${formatNumber( | |
(ratelimitEvents.data.reduce((sum, day) => sum + day.success, 0) / | |
ratelimitEvents.data.reduce((sum, day) => sum + day.total, 0)) * | |
(ratelimitEvents.reduce((sum, day) => sum + day.passed, 0) / | |
ratelimitEvents.reduce((sum, day) => sum + day.total, 0)) * | |
label="Passed" | |
value={formatNumber(ratelimitEvents.reduce((sum, day) => sum + day.passed, 0))} | |
/> | |
<Metric | |
label="Ratelimited" | |
value={formatNumber( | |
ratelimitEvents.reduce((sum, day) => sum + (day.total - day.passed), 0), | |
)} | |
/> | |
<Metric | |
label="Total" | |
value={formatNumber(ratelimitEvents.reduce((sum, day) => sum + day.total, 0))} | |
/> | |
<Metric | |
label="Success Rate" | |
value={`${formatNumber( | |
(() => { | |
const totalEvents = ratelimitEvents.reduce((sum, day) => sum + day.total, 0); | |
const successRate = totalEvents > 0 | |
? (ratelimitEvents.reduce((sum, day) => sum + day.passed, 0) / totalEvents) * 100 | |
: 0; | |
return successRate; | |
})() | |
)}%`} |
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
This looks good to me.
Checked the tables and they aren't missing anything.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Documentation