Updates for Prism #1224
Conversation
Walkthrough

This pull request makes significant modifications across multiple modules to revise statistical aggregation, asynchronous operations, and type safety. It reworks the functions for fetching and computing ingestion metrics, shifting from asynchronous HTTP calls to synchronous metadata accumulation and converting size fields from strings to numeric types. Additionally, the PR updates dashboard and filter handling to use session key extraction with asynchronous operations and improved error variants, and introduces a helper function for aggregating stream statistics in the Prism home module.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App
    participant FetchDailyStats
    App->>FetchDailyStats: call fetch_daily_stats_from_ingestors(date, stream_meta_list)
    loop For each manifest in stream_meta_list
        FetchDailyStats->>FetchDailyStats: Check manifest date & accumulate events, ingestion, storage sizes
    end
    FetchDailyStats->>App: Return computed Stats
```

```mermaid
sequenceDiagram
    participant Client
    participant SessionExtractor
    participant DashboardHandler
    participant FilterHandler
    Client->>SessionExtractor: extract_session_key_from_req(request)
    SessionExtractor-->>DashboardHandler: Process session key for dashboards
    DashboardHandler->>DashboardHandler: Async operations (list, get, update, delete)
    DashboardHandler-->>Client: Return dashboard response
    Client->>SessionExtractor: extract_session_key_from_req(request)
    SessionExtractor-->>FilterHandler: Process session key for filters
    FilterHandler->>FilterHandler: Async operations (get, update, delete, list)
    FilterHandler-->>Client: Return filter response
```
Actionable comments posted: 3
🧹 Nitpick comments (9)
src/prism/logstream/mod.rs (1)

99-106: Add logging or comments for non-user-defined streams.

If the stream type is not `UserDefined`, the code quietly returns `None` for `ingestor_stats`. Consider logging a debug message or adding a comment to clarify that ingestor retrieval is skipped for these streams.

src/handlers/http/cluster/mod.rs (1)
370-394: Potential performance concerns with large datasets.

The function iterates over all manifests to accumulate daily stats. For a large `stream_meta_list`, this could become slow. Consider indexing or short-circuiting if you have performance constraints.

src/prism/home/mod.rs (1)
189-195: Possible performance issue when cloning a large HashMap.

You are cloning `stream_wise_ingestor_stream_json` for each date iteration, which can be expensive if there are many streams or large metadata. Consider refactoring to pass references or to compute stats without cloning.

src/users/filters.rs (2)
127-128: Concurrent updates/deletes via async RwLock.

Using `async fn update` and `async fn delete_filter` with `await` on a write-locked vector is straightforward. Note that all reads and writes are serialized on the same lock, which might limit concurrency if the number of filters grows large. Consider storing filters in a map keyed by `filter_id` or `user_id` to reduce lock contention during frequent updates.

Also applies to: 133-137, 139-142
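The keyed layout suggested above can be sketched in a few lines. This is a minimal std-only illustration, not the PR's code: `Filter` and `FilterStore` are hypothetical stand-ins, and the real store would wrap the map in `tokio::sync::RwLock` rather than use it bare.

```rust
use std::collections::HashMap;

// Hypothetical sketch: key filters by filter_id so lookups, updates,
// and removals touch a single entry instead of scanning a Vec.
#[derive(Clone, Debug, PartialEq)]
struct Filter {
    filter_id: String,
    user_id: String,
}

#[derive(Default)]
struct FilterStore {
    by_id: HashMap<String, Filter>,
}

impl FilterStore {
    // Insert or replace a filter in O(1) average time.
    fn upsert(&mut self, f: Filter) {
        self.by_id.insert(f.filter_id.clone(), f);
    }
    fn delete(&mut self, filter_id: &str) -> Option<Filter> {
        self.by_id.remove(filter_id)
    }
    fn get(&self, filter_id: &str) -> Option<&Filter> {
        self.by_id.get(filter_id)
    }
}

fn main() {
    let mut store = FilterStore::default();
    store.upsert(Filter { filter_id: "f1".into(), user_id: "u1".into() });
    assert!(store.get("f1").is_some());
    store.delete("f1");
    assert!(store.get("f1").is_none());
}
```

A map keyed by `user_id` holding a `Vec<Filter>` per user would similarly narrow write contention to one user's entries at a time.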
155-172: `list_filters` with async authorization.

The approach of skipping filters lacking a `filter_query` and then auth-checking is sensible. If further concurrency is desired, consider parallelizing the `user_auth_for_query` checks, but ensure that it doesn't introduce undue complexity.

src/handlers/http/users/filters.rs (1)
146-147: `Custom` error variant for session key extraction.

Introducing a custom error variant lets you surface more specific problems (e.g., session key parse failures). The chosen status code of 500 might be reconsidered if user input errors are to be treated differently.

Also applies to: 157-157
src/handlers/http/cluster/utils.rs (1)

132-132: Nitpick: Consider renaming the function to fix the spelling mistake.

The function name `merge_quried_stats` contains a minor typographical error. Renaming it to `merge_queried_stats` would improve clarity.

-pub fn merge_quried_stats(stats: Vec<QueriedStats>) -> QueriedStats {
+pub fn merge_queried_stats(stats: Vec<QueriedStats>) -> QueriedStats {

src/users/dashboards.rs (1)
203-225: Performance note: potential optimization for multi-tile dashboards.

The loop checks user authorization (`user_auth_for_query`) against every tile in a dashboard to decide if the entire dashboard should be discarded. If there are many dashboards with many tiles, this can be expensive. Consider caching, pre-checking whether the user's session scope covers the queries, or short-circuiting as soon as a tile fails.

src/handlers/http/users/dashboards.rs (1)
168-169: Consider returning more precise HTTP codes for custom errors.

All custom errors currently map to `INTERNAL_SERVER_ERROR`, which might conflate unrelated conditions. Returning more specific status codes such as `403 Forbidden` or `400 Bad Request` can provide better feedback.

Also applies to: 178-178
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)

- src/handlers/http/cluster/mod.rs (2 hunks)
- src/handlers/http/cluster/utils.rs (3 hunks)
- src/handlers/http/logstream.rs (1 hunks)
- src/handlers/http/modal/query/querier_logstream.rs (4 hunks)
- src/handlers/http/users/dashboards.rs (9 hunks)
- src/handlers/http/users/filters.rs (9 hunks)
- src/prism/home/mod.rs (7 hunks)
- src/prism/logstream/mod.rs (2 hunks)
- src/users/dashboards.rs (3 hunks)
- src/users/filters.rs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: coverage
- GitHub Check: Quest Smoke and Load Tests for Distributed deployments
- GitHub Check: Build Default aarch64-unknown-linux-gnu
- GitHub Check: Build Default x86_64-unknown-linux-gnu
- GitHub Check: Quest Smoke and Load Tests for Standalone deployments
- GitHub Check: Build Default x86_64-pc-windows-msvc
- GitHub Check: Build Default x86_64-apple-darwin
- GitHub Check: Build Kafka x86_64-unknown-linux-gnu
- GitHub Check: Build Default aarch64-apple-darwin
- GitHub Check: Build Kafka aarch64-apple-darwin
🔇 Additional comments (20)
src/handlers/http/logstream.rs (1)

265-275: Consider verifying potential integer overflow.

Now that ingestion and storage stats use direct `u64` numeric fields, extremely large ingestion or storage volumes might cause overflow. If there's a scenario with extremely large data, it might be prudent to incorporate overflow checks or saturating additions.

src/prism/logstream/mod.rs (4)
29-32: New imports appear consistent.

These imports are relevant for fetching stats from ingestors. No immediate concerns.

38-38: Import usage is aligned with new references.

`StreamType` is used for determining whether a stream is user-defined. The usage looks correct.

113-117: Direct numeric ingestion fields.

Switching to raw `u64` ingestion fields is consistent with the rest of the codebase. No issues found.

121-123: Direct numeric storage fields.

Switching to raw `u64` storage fields is consistent as well. No issues found.

src/handlers/http/modal/query/querier_logstream.rs (3)
29-29: RelativePathBuf import is valid.

No immediate concerns. This import is necessary for constructing paths to object store resources.

48-48: Additional import for `StreamType` and `STREAM_ROOT_DIRECTORY`.

This import aligns with new usage in the code. No issues found.

207-217: Check for potential overflow in large-scale scenarios.

Since these fields directly add to `u64` counters, ensure there's no risk of overflow under high ingestion volumes. Adding a saturating or overflow check might be prudent if extremely large data is expected.

src/prism/home/mod.rs (4)
19-25: Imports updated to accommodate fetch and stats operations.

These additional imports (e.g., `HashMap`, `RelativePathBuf`, `fetch_daily_stats_from_ingestors`, and `get_stats_date`) seem necessary for the new functionality. No concerns here.

Also applies to: 31-34, 38-38
201-201: Including `stream_titles.clone()` in HomeResponse.

No issues in reusing the same vector for the response. This supports consistency in your UI.
210-230: `stats_for_date` function logic.

The async function succinctly aggregates querier and ingestor stats. The approach is clear, but be mindful of error handling, depending on whether partial data is acceptable. Currently, a single stream's error aborts the whole process. Would you like to allow partial success if one of the streams fails?
242-243: New error variant introduced.

Adding `ObjectStorageError` to the local error type integrates well with existing `ResponseError` handling. This ensures relevant HTTP 500 responses when storage errors occur.

Also applies to: 254-254
src/users/filters.rs (2)

22-22: Switched to `tokio::sync::RwLock` and introduced additional utility functions.

Shifting from a synchronous RwLock to an async one is suitable for asynchronous contexts. Likewise, referencing new modules (e.g., `user_auth_for_query`) aligns with the new flow.

Also applies to: 26-28

144-144: Asynchronous `get_filter`.

Reading from the RwLock is now asynchronous. This is consistent with the rest of the codebase and is properly awaited.

Also applies to: 147-148
src/handlers/http/users/filters.rs (2)

36-39: Extraction of session key for listing filters.

Replacing manual user ID handling with `extract_session_key_from_req` is consistent with the new session-based approach. Converting the extraction error to `FiltersError::Custom` also simplifies error handling.

50-50: Async usage of `FILTERS`.

These updates ensure that `get_filter`, `update`, and `delete_filter` are performed asynchronously. This is correct for an async environment and helps align filter operations with other asynchronous flows.

Also applies to: 67-67, 90-90, 96-96, 120-121, 131-131
src/handlers/http/cluster/utils.rs (3)

83-88: Enhancement: switching size fields to unsigned 64-bit integers is a good move.

All changes from `String` to `u64` for ingestion stats (`size`, `lifetime_size`, `deleted_size`) strengthen type safety and reduce overhead by avoiding string parsing. Good job on this update!

Also applies to: 94-98

115-118: Consistent numeric usage in storage stats.

Changing the storage fields (`size`, `lifetime_size`, `deleted_size`) to `u64` ensures consistency with the ingestion stats and avoids runtime conversions. This simplifies the codebase and improves performance.

Also applies to: 122-122, 158-162
134-134: ❓ Verification inconclusive

Potential out-of-bounds panic.

The code uses `stats[1]` to retrieve the stream name. If the `stats` vector is empty or has only one element, this will panic. Consider using safe indexing or a check on the length of `stats`.

-let stream_name = stats[1].stream.clone();
+let stream_name = if stats.len() > 1 {
+    stats[1].stream.clone()
+} else if !stats.is_empty() {
+    stats[0].stream.clone()
+} else {
+    "unknown_stream".to_string()
+};

Action required: validate the safe-indexing logic for `stats`.

The suggested fix prevents a potential out-of-bounds panic by checking the length of `stats`: if `stats` contains more than one element, the second element is used; if only one element exists, it falls back to that single element; otherwise, it defaults to "unknown_stream". Please confirm that:

- Falling back to `stats[0].stream` when there is exactly one element aligns with the intended logic.
- The default value "unknown_stream" is appropriate in cases where `stats` is empty.

src/users/dashboards.rs (1)
22-22: Good choice using asynchronous RwLock.

Replacing `std::sync::RwLock` with `tokio::sync::RwLock` aligns with async operations. This helps avoid blocking the entire thread during reads and writes.
- Modified dashboards and filters to have an async-safe RwLock
- Modified dashboards and filters to send back a list of all available items (any user can GET all items)
- Bugfix for stats
- Modified stats format to be sent back as u64 instead of `format!("{number} Bytes")`
- `fetch_daily_stats_from_ingestors()` will read `stream.json` files for each ingestor and calculate date-wise stats instead of calling all ingestors' /metric endpoint
5e83d9e to 233f8ba (Compare)
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/handlers/http/modal/query/querier_logstream.rs (1)

185-188: Consider adding more context to the error log.

When logging a parsing error, consider including the stream name or file path to make troubleshooting easier.

-error!("Failed to parse stream metadata: {:?}", e);
+error!("Failed to parse stream metadata for stream {}: {:?}", stream_name, e);

src/prism/home/mod.rs (1)
196-196: Avoid unnecessary cloning of HashMap.

The `stream_wise_ingestor_stream_json.clone()` creates a full copy of the HashMap on each iteration, which may impact performance, especially with many streams.

-let dated_stats = stats_for_date(date, stream_wise_ingestor_stream_json.clone()).await?;
+let dated_stats = stats_for_date(date, &stream_wise_ingestor_stream_json).await?;

And update the function signature:

-async fn stats_for_date(
-    date: String,
-    stream_wise_meta: HashMap<String, Vec<ObjectStoreFormat>>,
-) -> Result<DatedStats, PrismHomeError> {
+async fn stats_for_date(
+    date: String,
+    stream_wise_meta: &HashMap<String, Vec<ObjectStoreFormat>>,
+) -> Result<DatedStats, PrismHomeError> {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)

- src/handlers/http/cluster/mod.rs (2 hunks)
- src/handlers/http/cluster/utils.rs (3 hunks)
- src/handlers/http/logstream.rs (1 hunks)
- src/handlers/http/modal/query/querier_logstream.rs (4 hunks)
- src/handlers/http/users/dashboards.rs (9 hunks)
- src/handlers/http/users/filters.rs (9 hunks)
- src/prism/home/mod.rs (7 hunks)
- src/prism/logstream/mod.rs (2 hunks)
- src/users/dashboards.rs (3 hunks)
- src/users/filters.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- src/prism/logstream/mod.rs
- src/handlers/http/logstream.rs
- src/users/filters.rs
- src/handlers/http/cluster/utils.rs
- src/handlers/http/users/filters.rs
- src/handlers/http/users/dashboards.rs
- src/users/dashboards.rs
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: Build Default x86_64-unknown-linux-gnu
- GitHub Check: Build Default aarch64-unknown-linux-gnu
- GitHub Check: Build Default x86_64-apple-darwin
- GitHub Check: Build Kafka x86_64-unknown-linux-gnu
- GitHub Check: Build Default x86_64-pc-windows-msvc
- GitHub Check: Build Kafka aarch64-apple-darwin
- GitHub Check: Build Default aarch64-apple-darwin
- GitHub Check: Quest Smoke and Load Tests for Standalone deployments
- GitHub Check: Quest Smoke and Load Tests for Distributed deployments
- GitHub Check: coverage
🔇 Additional comments (12)
src/handlers/http/modal/query/querier_logstream.rs (3)

29-29: Good addition of RelativePathBuf for path handling.

The addition of `relative_path::RelativePathBuf` is a good choice for handling paths in a cross-platform manner and working with object storage paths.

156-182: Improved error handling for stream metadata parsing.

The implementation now properly handles JSON parsing errors instead of using `.expect()`, which prevents runtime panics when encountering malformed JSON files. This is a significant improvement for application stability. The change from HTTP requests to direct object store access should also improve performance by reducing network overhead.
209-223: Improved type safety by using raw numeric values.

The code now passes raw numeric values to the `IngestionStats` and `StorageStats` constructors instead of formatted strings. This is more type-safe and avoids unnecessary string conversions.

src/handlers/http/cluster/mod.rs (3)
370-395: Good refactoring of fetch_daily_stats_from_ingestors to be synchronous.

Changing this function from async to synchronous and modifying its parameters to accept pre-fetched metadata is a good design improvement. This makes the code:

- More efficient by eliminating unnecessary HTTP requests
- More flexible by allowing callers to control how metadata is retrieved
- Easier to test by reducing external dependencies

The implementation effectively iterates through the metadata to accumulate statistics for the specified date.
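Since earlier comments flag overflow on `u64` accumulation, the loop's additions could use saturating arithmetic. This is a hedged sketch: the `Stats` struct below is a simplified stand-in for the crate's real type, shown only to demonstrate `saturating_add` capping at `u64::MAX` instead of wrapping or panicking.

```rust
// Simplified stand-in for the accumulated daily stats.
#[derive(Default, Debug, PartialEq)]
struct Stats {
    events: u64,
    ingestion: u64,
    storage: u64,
}

// Overflow-safe accumulation: saturates at u64::MAX rather than wrapping.
fn accumulate(acc: &mut Stats, events: u64, ingestion: u64, storage: u64) {
    acc.events = acc.events.saturating_add(events);
    acc.ingestion = acc.ingestion.saturating_add(ingestion);
    acc.storage = acc.storage.saturating_add(storage);
}

fn main() {
    let mut s = Stats::default();
    accumulate(&mut s, 10, 1024, 2048);
    accumulate(&mut s, u64::MAX, 0, 0); // would overflow with plain `+` in debug builds
    assert_eq!(s.events, u64::MAX);
    assert_eq!(s.ingestion, 1024);
    assert_eq!(s.storage, 2048);
}
```

`checked_add` is the alternative if an overflow should surface as an error instead of silently capping.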
423-425: ⚠️ Potential issue — add error handling to prevent panic on invalid JSON.

Using `.expect()` here can cause runtime panics if the JSON is malformed. This is inconsistent with the error handling patterns used elsewhere in the codebase.

-let stream_metadata: ObjectStoreFormat =
-    serde_json::from_slice(&ob).expect("stream.json is valid json");
+let stream_metadata: ObjectStoreFormat = match serde_json::from_slice(&ob) {
+    Ok(metadata) => metadata,
+    Err(e) => {
+        error!("Failed to parse stream metadata: {:?}", e);
+        continue;
+    }
+};
437-454: Improved type safety in statistic constructors.

The code now uses raw numeric values in the `IngestionStats` and `StorageStats` constructors, which is a better design than using formatted strings. This provides:

- Better type safety
- Reduced conversion overhead
- More flexibility for formatting at the display layer
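Formatting at the display layer, as suggested above, can then be a single pure function over the raw `u64`. This is a hedged sketch, not code from the PR; it assumes binary (1024-based) units, which is one of several reasonable conventions.

```rust
// Convert a raw u64 byte count into a human-readable string at display
// time, keeping storage and wire formats purely numeric.
fn human_bytes(n: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = n as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{n} B") // exact count below 1 KiB
    } else {
        format!("{value:.2} {}", UNITS[unit])
    }
}

fn main() {
    assert_eq!(human_bytes(512), "512 B");
    assert_eq!(human_bytes(2048), "2.00 KiB");
    assert_eq!(human_bytes(5 * 1024 * 1024), "5.00 MiB");
}
```

This replaces the old `format!("{number} Bytes")` that was previously baked into the stats themselves.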
src/prism/home/mod.rs (6)

19-40: Good module organization with proper imports.

The addition of `HashMap` and other necessary imports helps structure the code for better data management.

118-120: Simplified dashboard and filter retrieval using session key.

Using session keys directly for dashboard and filter retrieval improves the code by:

- Removing unnecessary user ID extraction
- Reducing potential error paths
- Making authorization more consistent across the codebase

Also applies to: 134-136
167-193: Well-structured metadata collection with proper error handling.

The implementation now:

- Effectively uses a HashMap to organize stream metadata
- Properly handles JSON parsing errors without panicking
- Improves organization and readability

This is a significant improvement over nested loops for better maintainability.
195-202: Efficient date-wise statistics processing.

Using the new helper function `stats_for_date` creates a cleaner, more modular design. The stream summary accumulation is straightforward and easy to understand.

216-236: Good separation of concerns with the stats_for_date helper function.

Creating a dedicated function for date-specific statistics aggregation improves code organization and readability. This function effectively:

- Collects stats for all streams for a given date
- Aggregates querier and ingestor statistics
- Follows the single-responsibility principle

The code is well-structured and properly handles errors.
248-250: Properly extended error handling for ObjectStorageError.

Adding the new variant to `PrismHomeError` improves error handling and propagation throughout the application. This ensures that object storage errors are properly reported and handled.

Also applies to: 259-259
looks good
This PR also solves #1187
- Stats are sent back as u64 instead of `format!("{number} Bytes")`
- The `fetch_daily_stats_from_ingestors()` function will read `stream.json` files for each ingestor and calculate date-wise stats instead of calling all ingestors' /metric endpoint

Fixes #XXXX.
Description
This PR has:
Summary by CodeRabbit
New Features
Bug Fixes
Refactor