-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core): introduce data control attributes for wren MDL base #1014
Conversation
WalkthroughThe pull request introduces a comprehensive enhancement to the data model security and expression normalization in the Rust codebase. New procedural macros and structs are added to support row-level and column-level security features. The changes include the implementation of comparison operators, normalized expression handling, and methods for evaluating security constraints. These modifications enable more granular access control and expression validation within the data modeling framework, with support for both Python and non-Python configurations. Changes
Sequence DiagramsequenceDiagram
participant ColumnBuilder
participant NormalizedExpr
participant ColumnLevelSecurity
ColumnBuilder->>NormalizedExpr: Create normalized expression
NormalizedExpr-->>ColumnBuilder: Return normalized expr
ColumnBuilder->>ColumnLevelSecurity: Set security parameters
ColumnLevelSecurity->>ColumnLevelSecurity: Evaluate input
ColumnLevelSecurity-->>ColumnBuilder: Return evaluation result
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (8)
wren-core-base/src/mdl/cls.rs (2)
26-37
: Consider handling parse errors gracefully.
Right now, numeric parsing insideeval
can panic ifinput_expr
or the threshold cannot be converted to a float. This might be okay for strictly validated inputs, but in production, it’s safer to handle potential parsing errors.
72-82
: Avoid usingunwrap()
inNumeric
comparisons.
Usingunwrap()
may cause unwanted panics on invalid numeric inputs. Consider usingparse().ok()
to handle errors gracefully or to returnfalse
when parsing fails.wren-core-base/src/mdl/builder.rs (2)
210-216
: Encourage stronger validation for rls config.
Therow_level_security
method setsname
andoperator
; consider verifying thatname
is not empty if that is a requirement.
218-230
: Check threshold usage incolumn_level_security
.
Currently,NormalizedExpr::new(threshold)
will panic on empty strings. Consider handling an empty or invalid numeric threshold gracefully.wren-core-base/tests/data/mdl.json (4)
67-71
: Consider performance implications of nested aggregationThe
totalcost
calculation performs a nested aggregation through relationships (sum(customer.orders.o_totalprice)
). This could impact performance with large datasets. Consider:
- Adding appropriate indexes
- Implementing materialization if the calculation is frequently used
- Adding filters to limit the aggregation scope
113-117
: Document the purpose of hash_orderkeyThe purpose of hashing the order key isn't clear. If this is for security purposes, MD5 might not be the best choice as it's cryptographically broken.
Consider adding a comment explaining the intended use case.
156-159
: Consider explicit column selection in viewUsing
SELECT *
is generally discouraged as it:
- Makes the view sensitive to schema changes
- Might expose unnecessary columns
- Could impact performance
Consider explicitly listing required columns.
161-162
: Consider specifying MySQL version compatibilityAdding a minimum supported MySQL version would help ensure compatibility with features used in the schema (especially for security features and functions like MD5).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
wren-core-base/manifest-macro/src/lib.rs
(2 hunks)wren-core-base/src/mdl/builder.rs
(6 hunks)wren-core-base/src/mdl/cls.rs
(1 hunks)wren-core-base/src/mdl/manifest.rs
(3 hunks)wren-core-base/src/mdl/mod.rs
(1 hunks)wren-core-base/tests/data/mdl.json
(1 hunks)
🔇 Additional comments (14)
wren-core-base/src/mdl/cls.rs (3)
40-59
: Validate string detection logic.
is_string
relies on single quotes as a strict delimiter. This could be prone to edge cases (e.g., escaped quotes). Consider allowing or validating possible alternative string formats, or re-checking if this is the intended strict usage.
96-102
: Check logical consistency of gte
and lte
.
gte
invokes gt
or eq
; lte
invokes lt
or eq
. This is correct but verify that the type mismatch logic aligns with your overall error handling strategy (i.e., no parse fallback or different data types yield false
).
122-264
: Thorough test coverage.
The test suite appears robust, covering string vs. numeric types, boundary conditions, and scenarios with mismatched types. This is a good practice to ensure correctness.
wren-core-base/src/mdl/manifest.rs (2)
Line range hint 27-52
: Macros usage looks consistent.
All newly introduced macros (e.g., column_level_operator
, column_level_security
) align with the existing pattern. Their usage in Python and non-Python bindings is coherent.
47-52
: Validate serde_with
usage.
DeserializeFromStr
and SerializeDisplay
are used for advanced serialization logic. Be sure to confirm that it doesn’t conflict with existing custom deserializers.
wren-core-base/manifest-macro/src/lib.rs (3)
172-173
: New fields rls
and cls
in Column
.
Adding optional security fields is a minimal intrusiveness approach. Confirm that existing code handles these new fields gracefully in older manifests or partial updates.
348-367
: Row-level security macro.
Generates a straightforward struct with name
and operator
. Implementation looks consistent with the pattern used in other macros.
447-467
: NormalizedExpr
macro approach is flexible.
Using SerializeDisplay
and DeserializeFromStr
in the generated struct ensures smooth serialization and deserialization. Good job.
wren-core-base/src/mdl/builder.rs (2)
398-399
: Test coverage for row_level_security
.
The test test_column_roundtrip
includes a row-level security scenario. This further confirms that RLS is integrated into the serialization/deserialization flow.
693-708
: Full integration test for RLS and CLS.
Verifying the behavior in mdl.json
merges row-level and column-level security with expression-based logic. This is a comprehensive integration test.
wren-core-base/src/mdl/mod.rs (1)
21-21
: Public cls
module export.
Exposing cls
publicly is consistent with the code changes for column-level security.
wren-core-base/tests/data/mdl.json (3)
1-4
: LGTM!
The schema configuration follows standard patterns.
12-41
: LGTM!
The customer model is well-structured with:
- Appropriate column types
- Clear calculated column definition
- Well-documented properties
141-154
: LGTM!
Relationships are well-defined with:
- Clear join conditions using primary keys
- Appropriate join types (ONE_TO_MANY, ONE_TO_ONE)
Describe
Introduce the
RowLevelSecurity
andColumnLevelSecurity
for MDL.Now, we can define a column in the MDL JSON like
TODO works