-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add power transformation logic to forecaster transforms #185
Conversation
WalkthroughThe changes in this pull request involve modifications to the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (3)
crates/augurs-forecaster/src/power_transforms.rs (1)
49-62
: Expand test coverage to include edge cases and invalid inputsThe current tests cover standard scenarios but do not account for edge cases such as non-positive data values or extreme lambda values. Testing these cases ensures the robustness of the functions.
Consider adding test cases for:
- Data containing zero or negative values.
- Empty or very large datasets.
- Extreme
lambda
values, including negative values and zero.This will help identify potential issues with input validation and function stability.
crates/augurs-forecaster/src/transforms.rs (2)
15-15
: Review the necessity of makingTransforms
publicly accessibleChanging the visibility of the
Transforms
struct topub
exposes its internal structure to external users. Ensure that this is intentional and doesn't compromise encapsulation or future flexibility.If only certain functionalities need to be public, consider exposing them through public methods while keeping the struct fields private.
605-621
: Add test cases for edge conditions in Box-Cox transformationsThe tests currently cover basic functionality. Including edge cases enhances test coverage and reliability.
Suggested test cases:
- Very small
x
values approaching zero.- Negative
lambda
values.lambda
equal to zero.- Non-positive
x
values to ensure error handling works as expected.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
crates/augurs-forecaster/Cargo.toml
(1 hunks)crates/augurs-forecaster/src/lib.rs
(1 hunks)crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🔇 Additional comments (5)
crates/augurs-forecaster/src/transforms.rs (1)
94-99
:
Validate input values in the Box-Cox transformation
The box_cox
function assumes x > 0
, but this is not enforced. Passing non-positive x
values will result in math errors.
Add input validation to ensure x
is positive:
pub fn box_cox(x: f64, lambda: f64) -> f64 {
+ if x <= 0.0 {
+ panic!("Input x must be positive for Box-Cox transformation.");
+ }
if lambda == 0.0 {
x.ln()
} else {
(x.powf(lambda) - 1.0) / lambda
}
}
Alternatively, handle the error by returning a Result<f64, Error>
or skipping invalid values.
Likely invalid or redundant comment.
crates/augurs-forecaster/src/lib.rs (2)
6-6
: Verify integration of the power_transforms
module
The addition of the power_transforms
module enhances functionality. Ensure all components are correctly integrated and documented.
13-14
: Confirm the public exposure of Transforms
and optimize_lambda
Making Transforms
and optimize_lambda
public increases their accessibility. Verify that this aligns with the intended API design and that appropriate documentation is provided for external users.
crates/augurs-forecaster/Cargo.toml (2)
19-19
: Include the argmin
crate for optimization capabilities
Adding argmin = "0.10.0"
as a dependency provides necessary optimization routines for the Box-Cox transformation logic.
22-22
: Specify features for the augurs
dependency
Updating augurs
to include specific features ensures only required functionalities are compiled, potentially improving build times and reducing binary size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
crates/augurs-forecaster/src/power_transforms.rs (2)
38-40
: Avoid unnecessary data copying by borrowing the data sliceIn the
optimize_lambda
function, you're cloning the input data withdata.to_vec()
, which introduces unnecessary memory overhead. Consider borrowing the data slice inBoxCoxProblem
to improve performance and reduce memory usage.Modify
BoxCoxProblem
to borrow the data:-pub struct BoxCoxProblem { - data: Vec<f64>, +pub struct BoxCoxProblem<'a> { + data: &'a [f64], } -impl CostFunction for BoxCoxProblem { +impl<'a> CostFunction for BoxCoxProblem<'a> { // ... }Update the
optimize_lambda
function accordingly:pub fn optimize_lambda(data: &[f64]) -> f64 { let cost = BoxCoxProblem { - data: data.to_vec(), + data, }; // Rest of the function... }
37-49
: Add documentation comments to public functionsThe public function
optimize_lambda
lacks documentation. Adding///
comments explaining the purpose, parameters, and return value of the function will improve code readability and maintainability.Consider adding documentation:
/// Optimize the lambda parameter for the Box-Cox transformation. /// /// # Arguments /// /// * `data` - A slice of positive real numbers. /// /// # Returns /// /// * The optimal lambda value that maximizes the log-likelihood function. pub fn optimize_lambda(data: &[f64]) -> f64 { // Function implementation... }crates/augurs-forecaster/src/transforms.rs (1)
609-624
: Enhance test coverage for edge casesThe test cases
boxcox_test
andinverse_boxcox_test
currently use positive input values. Consider adding tests with zero and negative inputs to verify that the functions handle invalid inputs appropriately.For example, add a test case for zero input:
#[test] #[should_panic(expected = "Input x must be positive for Box-Cox transformation.")] fn boxcox_zero_input() { let data = vec![0.0]; let lambda = 0.5; let _ = data.into_iter().boxcox(lambda).collect::<Vec<_>>(); }And a test case for negative input:
#[test] #[should_panic(expected = "Input x must be positive for Box-Cox transformation.")] fn boxcox_negative_input() { let data = vec![-1.0]; let lambda = 0.5; let _ = data.into_iter().boxcox(lambda).collect::<Vec<_>>(); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
crates/augurs-forecaster/src/lib.rs
(1 hunks)crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/augurs-forecaster/src/lib.rs
🔇 Additional comments (3)
crates/augurs-forecaster/src/power_transforms.rs (2)
5-19
: Add input validation for data containing non-positive values
The function box_cox_log_likelihood
assumes that all elements in data
are positive because it uses logarithmic functions (x.ln()
) which are undefined for non-positive values. If data
contains zero or negative values, this will lead to runtime errors or invalid calculations.
44-48
: Handle potential errors from the optimizer to prevent panics
Unwrapping the result of run()
and best_param
can lead to a panic if the optimization fails. It's safer to handle possible errors explicitly to prevent unexpected crashes.
crates/augurs-forecaster/src/transforms.rs (1)
436-443
: Ensure valid computations in the inverse Box-Cox transformation
The inverse_box_cox
function may encounter invalid operations if y * lambda + 1.0
is negative, especially when lambda < 0
and y
has certain values. This can lead to undefined behavior or runtime errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
crates/augurs-forecaster/src/transforms.rs (1)
611-627
: Enhance test coverage for Box-Cox transformationsConsider adding tests for:
- Edge case:
lambda = 0
(logarithmic case)- Error cases: negative inputs for Box-Cox and invalid inputs for inverse Box-Cox
Example test cases to add:
#[test] fn boxcox_edge_cases() { // Test lambda = 0 case (logarithmic) let data = vec![1.0, 2.0, 3.0]; let lambda = 0.0; let expected = vec![0.0, 2.0_f64.ln(), 3.0_f64.ln()]; let actual: Vec<_> = data.into_iter().boxcox(lambda).collect(); assert_all_close(&expected, &actual); } #[test] #[should_panic(expected = "Input x must be positive")] fn boxcox_negative_input() { let data = vec![-1.0]; let lambda = 0.5; let _: Vec<_> = data.into_iter().boxcox(lambda).collect(); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🔇 Additional comments (5)
crates/augurs-forecaster/src/transforms.rs (5)
10-15
: LGTM! Clear documentation and appropriate visibility.
The documentation clearly explains the purpose of the Transforms
struct and its relationship with the Transform
enum. The visibility change to pub
is appropriate for the module's public API.
56-57
: LGTM! Clean implementation of Box-Cox transform variant.
The BoxCox
variant and its constructor method follow Rust's idiomatic patterns and are well-documented.
Also applies to: 93-98
Line range hint 100-111
: LGTM! Consistent implementation of transform methods.
The Box-Cox transform handling in both transform
and inverse_transform
methods follows the established pattern of other transformations.
Also applies to: 114-125
408-478
: LGTM! Well-structured iterator adapters.
The Box-Cox iterator adapters follow Rust's iterator patterns consistently and are well-documented.
439-446
:
Add validation for inverse Box-Cox transformation
The inverse_box_cox
function needs validation to prevent invalid operations when y * lambda + 1.0
is negative.
Apply this fix:
fn inverse_box_cox(y: f64, lambda: f64) -> f64 {
+ let value = y * lambda + 1.0;
+ if value <= 0.0 {
+ panic!("Invalid value encountered in inverse Box-Cox transformation.");
+ }
if lambda == 0.0 {
y.exp()
} else {
- (y * lambda + 1.0).powf(1.0 / lambda)
+ value.powf(1.0 / lambda)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, would be very cool to see this in! Good excuse to finish off #184 too.
I've added loads of inline comments but this is awesome for your first Rust contribution 🥳
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (5)
crates/augurs-forecaster/src/power_transforms.rs (1)
55-74
: Enhance test coverage with edge casesThe current tests only cover happy path scenarios. Consider adding tests for:
- Empty data array
- Data containing zero or negative values
- Data with identical values (zero variance)
- Optimization failure cases
Example additional test:
#[test] fn test_edge_cases() { // Empty data let data: &[f64] = &[]; assert!(optimize_lambda(data).is_err()); // Negative values let data = &[-0.1, 0.2, 0.3]; let llf = box_cox_log_likelihood(data, 1.0); assert_eq!(llf, f64::NEG_INFINITY); // Identical values (zero variance) let data = &[0.5, 0.5, 0.5]; let llf = box_cox_log_likelihood(data, 1.0); assert_eq!(llf, f64::NEG_INFINITY); }crates/augurs-forecaster/src/transforms.rs (4)
58-64
: Convert BoxCox to a struct variant for better clarity.Following the suggestion from the previous review, convert the BoxCox variant to a struct for better documentation and clarity of the lambda parameter.
- BoxCox{ - /// The lambda parameter for the Box-Cox transformation. - /// If lambda == 0, the transformation is equivalent to the natural logarithm. - /// Otherwise, the transformation is (x^lambda - 1) / lambda. - lambda: f64, - }, + /// Box-Cox transformation with a configurable lambda parameter. + BoxCox(BoxCoxParams), + /// Parameters for the Box-Cox transformation. + #[derive(Debug, Clone)] + pub struct BoxCoxParams { + /// The lambda parameter for the Box-Cox transformation. + /// If lambda == 0, the transformation is equivalent to the natural logarithm. + /// Otherwise, the transformation is (x^lambda - 1) / lambda. + pub lambda: f64, + }
110-118
: Improve documentation for power_transform method.The documentation for
power_transform
is incomplete. Add details about:
- How lambda is optimized
- The constraints on input data
- The relationship between power transform and Box-Cox transform
/// Create the power transform that optimizes the lambda parameter for the Box-Cox transformation. /// /// This transform applies the Power transformation to each item. - /// The Power transformation is defined as: + /// The Power transformation is a Box-Cox transformation where lambda is optimized to + /// maximize the log-likelihood of the transformed data being normally distributed. + /// + /// # Arguments + /// * `data` - The input data used to optimize lambda. Must be positive values. + /// + /// # Returns + /// A Box-Cox transformation with the optimized lambda parameter. /// pub fn power_transform(data: &[f64]) -> Self {
417-424
: Consider using Result instead of assert for input validation.Replace the assertion with a Result return type for more graceful error handling.
-pub fn box_cox(x: f64, lambda: f64) -> f64 { - assert!(x > 0.0, "Input x must be positive for Box-Cox transformation."); +pub fn box_cox(x: f64, lambda: f64) -> Result<f64, &'static str> { + if x <= 0.0 { + return Err("Input x must be positive for Box-Cox transformation"); + } if lambda == 0.0 { - x.ln() + Ok(x.ln()) } else { - (x.powf(lambda) - 1.0) / lambda + Ok((x.powf(lambda) - 1.0) / lambda) } }
Based on the verification results, I can now provide a final response:
Box-Cox transformation implementation needs additional error handling
The Box-Cox transformation has been properly integrated into the Transform enum and its match patterns, but there are some concerns:
- The
box_cox
function only asserts for positive input values, but this error handling isn't propagated through the iterator adaptors- No test cases were found specifically for Box-Cox transformation error cases
- The
power_transform
function that optimizes lambda doesn't include input validationSuggested improvements:
- Add proper error handling in
BoxCox
andInverseBoxCox
iterator implementations- Add test cases for invalid inputs (non-positive values)
- Add input validation in
power_transform
function🔗 Analysis chain
Line range hint
1-645
: Verify usage of Box-Cox transformation in the codebase.Let's ensure all callers are properly updated to handle the new Box-Cox transformation.
Let me generate another verification script to check for the actual usage patterns and error handling in the codebase.
Let me try one more verification script to check the match patterns and error handling in the correct file path:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Find all usages of BoxCox transformation and direct calls to box_cox function # Find all uses of BoxCox variant in Transform enum echo "Checking BoxCox variant usage:" rg "Transform::BoxCox" -A 2 # Find direct calls to box_cox function echo "Checking direct box_cox function calls:" rg "box_cox\(" -A 2 # Find potential places where error handling might be needed echo "Checking for potential error handling sites:" ast-grep --pattern 'match $expr { $$$ Transform::BoxCox { $$ } => $$, $$$ }'Length of output: 1657
Script:
#!/bin/bash # Check for Transform enum match patterns and BoxCox usage in tests/examples # Find all match expressions on Transform enum echo "Checking Transform enum match patterns:" rg "match.*self.*\{" -A 10 "transforms.rs" # Find test cases or examples using BoxCox echo "Checking test cases and examples:" rg "BoxCox" --type rust -A 5 # Check for any error handling patterns around box_cox echo "Checking error handling patterns:" rg "assert!.*box_cox" --type rustLength of output: 7679
Script:
#!/bin/bash # Check for Transform enum match patterns and error handling # Find all match expressions on Transform enum with correct path echo "Checking Transform enum match patterns:" rg "match.*self.*\{" -A 10 "crates/augurs-forecaster/src/transforms.rs" # Check for any panic or error handling around BoxCox echo "Checking error handling:" rg "assert!|panic!|Error" -A 2 "crates/augurs-forecaster/src/transforms.rs" # Check for any tests related to BoxCox transformation echo "Checking BoxCox tests:" rg "#\[test\].*box" -A 10 "crates/augurs-forecaster/src/transforms.rs"Length of output: 1610
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
crates/augurs-forecaster/Cargo.toml
(1 hunks)crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/augurs-forecaster/Cargo.toml
🔇 Additional comments (4)
crates/augurs-forecaster/src/power_transforms.rs (3)
5-20
:
Add input validation to prevent mathematical domain errors
The Box-Cox transformation requires strictly positive values, and the log-likelihood calculation has potential mathematical domain errors.
Apply these changes to handle edge cases:
fn box_cox_log_likelihood(data: &[f64], lambda: f64) -> f64 {
let n = data.len() as f64;
assert!(n > 0.0, "Data must not be empty");
+ // Validate positive values
+ if data.iter().any(|&x| x <= 0.0) {
+ return f64::NEG_INFINITY; // Invalid transformation
+ }
let transformed_data: Vec<f64> = data.iter().map(|&x| box_cox(x, lambda)).collect();
let mean_transformed: f64 = transformed_data.iter().copied().sum::<f64>() / n;
let variance: f64 = transformed_data
.iter()
.map(|&x| (x - mean_transformed).powi(2))
.sum::<f64>()
/ n;
+ // Handle zero variance case
+ if variance <= 0.0 {
+ return f64::NEG_INFINITY; // All values are identical
+ }
let log_likelihood =
-0.5 * n * variance.ln() + (lambda - 1.0) * data.iter().map(|&x| x.ln()).sum::<f64>();
log_likelihood
}
22-35
: 🛠️ Refactor suggestion
Optimize memory usage by using references
The struct can be more memory efficient by using a reference to the data instead of cloning it.
Apply this change:
#[derive(Clone)]
-struct BoxCoxProblem {
- data: Vec<f64>,
+struct BoxCoxProblem<'a> {
+ data: &'a [f64],
}
-impl CostFunction for BoxCoxProblem {
+impl<'a> CostFunction for BoxCoxProblem<'a> {
type Param = f64;
type Output = f64;
fn cost(&self, lambda: &Self::Param) -> Result<Self::Output, Error> {
Ok(-box_cox_log_likelihood(self.data, *lambda))
}
}
37-53
:
Replace panic with proper error handling
The function should return a Result instead of panicking on optimization failure.
Apply these changes:
-pub fn optimize_lambda(data: &[f64]) -> f64 {
+pub fn optimize_lambda(data: &[f64]) -> Result<f64, Error> {
let cost = BoxCoxProblem {
- data: data.to_vec(),
+ data,
};
let init_param = 0.5;
let solver = BrentOpt::new(-2.0, 2.0);
- let result = Executor::new(cost, solver)
+ let result = Executor::new(cost, solver)
.configure(|state| state.param(init_param).max_iters(100))
.run();
- match result {
- Ok(result) => result.state().best_param.unwrap(),
- Err(error) => panic!("Optimization failed: {}", error),
- }
+ result.and_then(|res| {
+ res.state()
+ .best_param
+ .ok_or_else(|| Error::new(argmin::core::ErrorKind::NotConverged, "Failed to find optimal parameter"))
+ })
}
crates/augurs-forecaster/src/transforms.rs (1)
458-464
:
Add input validation to inverse_box_cox.
The function needs to validate that y * lambda + 1.0
is positive when lambda != 0.0
to prevent invalid operations.
-fn inverse_box_cox(y: f64, lambda: f64) -> f64 {
+fn inverse_box_cox(y: f64, lambda: f64) -> Result<f64, &'static str> {
if lambda == 0.0 {
- y.exp()
+ Ok(y.exp())
} else {
- (y * lambda + 1.0).powf(1.0 / lambda)
+ let value = y * lambda + 1.0;
+ if value <= 0.0 {
+ return Err("Invalid value encountered in inverse Box-Cox transformation");
+ }
+ Ok(value.powf(1.0 / lambda))
}
}
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
crates/augurs-forecaster/src/power_transforms.rs (3)
37-38
: Improve function documentationAdd documentation explaining the meaning of the returned lambda value and its valid range.
/// Optimize the lambda parameter for the Box-Cox transformation +/// +/// Returns the optimal lambda value that maximizes the log-likelihood of the Box-Cox transformation. +/// The lambda value typically falls between -2 and 2, where: +/// * λ = 1.0: No transformation needed +/// * λ = 0.0: Log transformation +/// * λ = 0.5: Square root transformation +/// * λ = -1.0: Reciprocal transformation pub fn optimize_lambda(data: &[f64]) -> f64{
42-47
: Consider making optimization parameters configurableThe optimization bounds (-2.0, 2.0) and maximum iterations (100) are currently hard-coded. Consider making these configurable for more flexibility.
+#[derive(Debug, Clone)] +pub struct OptimizationConfig { + pub lower_bound: f64, + pub upper_bound: f64, + pub max_iterations: u64, + pub initial_guess: f64, +} + +impl Default for OptimizationConfig { + fn default() -> Self { + Self { + lower_bound: -2.0, + upper_bound: 2.0, + max_iterations: 100, + initial_guess: 0.5, + } + } +} + -pub fn optimize_lambda(data: &[f64]) -> Result<f64, Error> { +pub fn optimize_lambda(data: &[f64], config: Option<OptimizationConfig>) -> Result<f64, Error> { + let config = config.unwrap_or_default(); let cost = BoxCoxProblem { data: data, }; - let init_param = 0.5; - let solver = BrentOpt::new(-2.0, 2.0); + let solver = BrentOpt::new(config.lower_bound, config.upper_bound); let result = Executor::new(cost, solver) - .configure(|state| state.param(init_param).max_iters(100)) + .configure(|state| state.param(config.initial_guess).max_iters(config.max_iterations))
55-74
: Add more comprehensive test casesThe current tests only cover happy paths. Consider adding tests for:
- Edge cases (empty data, single value)
- Error cases (negative values, zero values)
- Boundary conditions (very small/large values)
- Known transformations (log transform λ=0, no transform λ=1)
#[test] fn test_edge_cases() { // Single value should return lambda=1 (no transformation needed) let data = &[1.0]; assert_approx_eq!(optimize_lambda(data).unwrap(), 1.0); } #[test] fn test_known_transformations() { // Data that should prefer log transformation (λ≈0) let data = &[1.0, 10.0, 100.0, 1000.0]; assert_approx_eq!(optimize_lambda(data).unwrap(), 0.0, 0.1); } #[test] fn test_error_cases() { // Negative values should return error let data = &[-1.0, 1.0, 2.0]; assert!(optimize_lambda(data).is_err()); // Zero values should return error let data = &[0.0, 1.0, 2.0]; assert!(optimize_lambda(data).is_err()); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)
🔇 Additional comments (2)
crates/augurs-forecaster/src/power_transforms.rs (2)
1-4
: LGTM!
The imports are appropriate for implementing Box-Cox transformation optimization.
5-20
:
Add comprehensive input validation and error handling
The log-likelihood calculation needs more robust error handling:
- Input data must be validated for positive values (Box-Cox requires x > 0)
- Variance calculation should be protected against zero/negative values
- NaN/infinite values should be handled
Consider implementing these safeguards:
-fn box_cox_log_likelihood(data: &[f64], lambda: f64) -> f64 {
+fn box_cox_log_likelihood(data: &[f64], lambda: f64) -> Result<f64, &'static str> {
let n = data.len() as f64;
assert!(n > 0.0, "Data must not be empty");
+
+ // Validate input data
+ if data.iter().any(|&x| x <= 0.0) {
+ return Err("All data values must be positive for Box-Cox transformation");
+ }
+ if data.iter().any(|&x| x.is_nan() || x.is_infinite()) {
+ return Err("Data contains NaN or infinite values");
+ }
+
let transformed_data: Vec<f64> = data.iter().map(|&x| box_cox(x, lambda)).collect();
let mean_transformed: f64 = transformed_data.iter().copied().sum::<f64>() / n;
let variance: f64 = transformed_data
.iter()
.map(|&x| (x - mean_transformed).powi(2))
.sum::<f64>()
/ n;
+ // Ensure variance is valid
+ if variance <= 0.0 {
+ return Err("Variance must be positive");
+ }
+
// Avoid log(0) by ensuring variance is positive
let log_likelihood =
-0.5 * n * variance.ln() + (lambda - 1.0) * data.iter().map(|&x| x.ln()).sum::<f64>();
- log_likelihood
+ Ok(log_likelihood)
}
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
crates/augurs-forecaster/src/lib.rs
(1 hunks)crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/augurs-forecaster/src/lib.rs
- crates/augurs-forecaster/src/power_transforms.rs
🔇 Additional comments (9)
crates/augurs-forecaster/src/transforms.rs (9)
10-11
: Ensure optimize_lambda
is utilized correctly
The import of optimize_lambda
suggests you are optimizing the lambda parameter for the Box-Cox transformation. Verify that this function is correctly implemented and used in the code.
12-15
: Documentation enhancements are clear and informative
The added documentation for Transforms
and Transform
improves clarity on their roles and usage in the codebase.
58-64
: Addition of BoxCox
variant to Transform
enum
The introduction of the BoxCox
variant with detailed documentation properly extends the Transform
enum to support Box-Cox transformations.
106-108
: Method boxcox
correctly creates a Box-Cox transform
The boxcox
method provides a straightforward way to create a new Box-Cox transform with a specified lambda
parameter.
121-121
: Consider restricting the visibility of transform
method
To reduce the public API surface and encapsulate implementation details, consider changing the visibility of the transform
method from pub
to pub(crate)
if it is not intended for external use.
135-135
: Consider restricting the visibility of inverse_transform
method
Similar to the transform
method, assess whether inverse_transform
needs to be publicly accessible. If not, changing it to pub(crate)
can enhance encapsulation.
149-149
: Consider restricting the visibility of inverse_transform_forecast
method
Evaluate if inverse_transform_forecast
should be part of the public API. If it is used internally, consider changing its visibility to pub(crate)
to maintain a cleaner public interface.
632-648
: Expand test coverage for Box-Cox transformation
The current tests cover basic scenarios. To enhance robustness, consider adding:
- Edge case where
lambda = 0
: Test the logarithmic case of the Box-Cox transformation. - Input validation tests: Ensure that the transformation functions handle non-positive inputs appropriately, either by panicking or returning an error.
- Inverse transformation accuracy: Add tests to verify that
inverse_box_cox(box_cox(x, lambda), lambda)
approximatesx
within an acceptable tolerance for various values ofx
andlambda
. - Negative
lambda
values: Include tests wherelambda
is negative to ensure correct behavior in these cases.
461-467
:
Ensure valid computations in the inverse Box-Cox transformation
The inverse_box_cox
function may produce invalid results if y * lambda + 1.0
is non-positive when lambda < 0
and y
has certain values. This results in attempting to calculate the root of a non-positive number, leading to a runtime error.
Add input validation to prevent invalid computations:
fn inverse_box_cox(y: f64, lambda: f64) -> f64 {
+ let value = y * lambda + 1.0;
+ if value <= 0.0 {
+ panic!("Invalid value encountered in inverse Box-Cox transformation: y * lambda + 1.0 must be positive.");
+ }
if lambda == 0.0 {
y.exp()
} else {
- (y * lambda + 1.0).powf(1.0 / lambda)
+ value.powf(1.0 / lambda)
}
}
Alternatively, handle errors gracefully by returning a Result<f64, Error>
instead of panicking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (2)
crates/augurs-forecaster/src/transforms.rs (2)
58-64
: Consider using a newtype pattern for lambdaThe lambda parameter is a critical value that should be constrained to valid ranges. Consider wrapping it in a newtype with validation.
+#[derive(Debug, Clone, Copy)] +pub struct BoxCoxLambda(f64); + +impl BoxCoxLambda { + pub fn new(lambda: f64) -> Self { + Self(lambda) + } + + pub fn value(&self) -> f64 { + self.0 + } +} pub enum Transform { - BoxCox { - lambda: f64, - }, + BoxCox(BoxCoxLambda), }
429-458
: Consider implementing custom Debug for iterator adaptersThe derived Debug implementation might not be ideal for the iterator adapters. Consider implementing a custom Debug that shows the lambda value.
-#[derive(Clone, Debug)] +#[derive(Clone)] struct BoxCox<T> { inner: T, lambda: f64, } +impl<T: fmt::Debug> fmt::Debug for BoxCox<T> { + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { + f.debug_struct("BoxCox") + .field("inner", &self.inner) + .field("lambda", &self.lambda) + .finish() + } +}Also applies to: 469-498
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/augurs-forecaster/src/power_transforms.rs
🔇 Additional comments (3)
crates/augurs-forecaster/src/transforms.rs (3)
10-11
: LGTM: Import and dependency addition
The addition of optimize_lambda
from the power transforms module is appropriate for the Box-Cox transformation implementation.
110-118
: Add input validation in power_transform
The function should validate that all input data points are positive before attempting the transformation.
460-467
: Add validation for inverse Box-Cox transformation
The inverse transformation should validate that y * lambda + 1.0
is positive when lambda != 0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
crates/augurs-forecaster/src/power_transforms.rs (3)
5-28
: Consider improving numerical stability and performanceThe implementation could benefit from the following improvements:
- Use a two-pass algorithm or Welford's online algorithm for variance calculation to improve numerical stability
- Use itertools'
sum()
for more efficient summation operationsfn box_cox_log_likelihood(data: &[f64], lambda: f64) -> Result<f64, Error> { let n = data.len() as f64; if n == 0.0 { return Err(Error::msg("Data must not be empty")); } if data.iter().any(|&x| x <= 0.0) { return Err(Error::msg("All data must be greater than 0")); } let transformed_data: Vec<f64> = data.iter().map(|&x| box_cox(x, lambda)).collect(); - let mean_transformed: f64 = transformed_data.iter().copied().sum::<f64>() / n; - let variance: f64 = transformed_data - .iter() - .map(|&x| (x - mean_transformed).powi(2)) - .sum::<f64>() - / n; + // Use Welford's online algorithm for better numerical stability + let (mean_transformed, variance) = transformed_data.iter().fold( + (0.0, 0.0), + |(m, s2), &x| { + let delta = x - m; + let m_next = m + delta / n; + let s2_next = s2 + delta * (x - m_next); + (m_next, s2_next) + }, + ); + let variance = variance / n; // Avoid log(0) by ensuring variance is positive if variance <= 0.0 { return Err(Error::msg("Variance must be positive")); } let log_likelihood = - -0.5 * n * variance.ln() + (lambda - 1.0) * data.iter().map(|&x| x.ln()).sum::<f64>(); + -0.5 * n * variance.ln() + (lambda - 1.0) * data.iter().map(|&x| x.ln()).sum(); Ok(log_likelihood) }
45-61
: Consider making optimization parameters configurableThe hardcoded optimization parameters (-2.0, 2.0 bounds, 0.5 initial value, 100 max iterations) might not be optimal for all datasets. Consider making these configurable through a struct.
+#[derive(Debug, Clone)] +pub struct BoxCoxOptimizeParams { + pub lower_bound: f64, + pub upper_bound: f64, + pub init_param: f64, + pub max_iters: u64, +} + +impl Default for BoxCoxOptimizeParams { + fn default() -> Self { + Self { + lower_bound: -2.0, + upper_bound: 2.0, + init_param: 0.5, + max_iters: 100, + } + } +} -pub(crate) fn optimize_lambda(data: &[f64]) -> Result<f64, Error> { +pub(crate) fn optimize_lambda(data: &[f64], params: Option<BoxCoxOptimizeParams>) -> Result<f64, Error> { + let params = params.unwrap_or_default(); let cost = BoxCoxProblem { data }; - let init_param = 0.5; - let solver = BrentOpt::new(-2.0, 2.0); + let solver = BrentOpt::new(params.lower_bound, params.upper_bound); let result = Executor::new(cost, solver) - .configure(|state| state.param(init_param).max_iters(100)) + .configure(|state| state.param(params.init_param).max_iters(params.max_iters)) .run();
63-108
: Consider adding more test casesThe current test coverage is good, but consider adding these edge cases:
- Data with very large values to test numerical stability
- Data with very small values (close to zero)
- Data that results in different optimal lambda values
- Test with max iterations reached
#[test] fn test_boxcox_llf_large_values() { let data = &[1e10, 2e10, 3e10, 4e10, 5e10]; let lambda = 1.0; let got = box_cox_log_likelihood(data, lambda); assert!(got.is_ok()); } #[test] fn test_boxcox_llf_small_values() { let data = &[1e-10, 2e-10, 3e-10, 4e-10, 5e-10]; let lambda = 1.0; let got = box_cox_log_likelihood(data, lambda); assert!(got.is_ok()); }crates/augurs-forecaster/src/forecaster.rs (1)
128-139
: Improve test readability and coverageThe test could be improved in the following ways:
- Document why these specific expected values are correct
- Test additional properties of the power transformation
- Add assertions for the transformed data shape
#[test] fn test_forecaster_power() { let data = &[1.0_f64, 2.0, 3.0, 4.0, 5.0]; let transforms = vec![ Transform::power_transform(data), ]; let model = MSTLModel::new(vec![2], NaiveTrend::new()); let mut forecaster = Forecaster::new(model).with_transforms(transforms); forecaster.fit(data).unwrap(); let forecasts = forecaster.predict(4, None).unwrap(); + // The expected values represent the forecasted values after applying + // the power transformation and its inverse. The alternating pattern + // is due to the seasonal component with period 2. assert_all_approx_eq(&forecasts.point, &[5.084499064884572, 5.000000030329821, 5.084499064884572, 5.000000030329821]); + + // Verify that the transformation preserves the data shape + assert_eq!(forecasts.point.len(), 4); + + // Verify that the transformed values are within reasonable bounds + for value in forecasts.point.iter() { + assert!(*value > 0.0, "Transformed values should be positive"); + assert!(*value < 10.0, "Transformed values should be bounded"); + } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
crates/augurs-forecaster/src/forecaster.rs
(1 hunks)crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/augurs-forecaster/src/transforms.rs
🔇 Additional comments (2)
crates/augurs-forecaster/src/power_transforms.rs (2)
30-33
: LGTM!
The struct is well-designed with proper lifetime annotations for the borrowed data.
35-43
: LGTM!
The implementation correctly maps the log-likelihood to its negative value for minimization and properly propagates errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (3)
crates/augurs-forecaster/src/power_transforms.rs (2)
6-34
: LGTM! Consider using a custom error type.The error handling is thorough and well-implemented. However, using a string literal for errors could be improved by defining a custom error type for better error handling across the codebase.
Consider creating a custom error type:
#[derive(Debug, Error)] pub enum TransformError { #[error("Data must not be empty")] EmptyData, #[error("All data must be greater than 0")] NonPositiveData, #[error("Variance must be positive")] NonPositiveVariance, }
90-103
: Consider reducing code duplication with BoxCoxProblem.The implementation is correct but shares significant code structure with BoxCoxProblem. Consider extracting common functionality into a trait or generic struct.
Example approach:
trait OptimizationProblem { fn compute_log_likelihood(&self, lambda: f64) -> Result<f64, Error>; } struct TransformProblem<'a, F> { data: &'a [f64], transform_fn: F, } impl<F> CostFunction for TransformProblem<'_, F> where F: Fn(&[f64], f64) -> Result<f64, Error>, { type Param = f64; type Output = f64; fn cost(&self, lambda: &Self::Param) -> Result<Self::Output, Error> { (self.transform_fn)(self.data, *lambda).map(|ll| -ll) } }crates/augurs-forecaster/src/transforms.rs (1)
756-772
: Enhance test coverage with edge cases.The current tests cover basic functionality but should be expanded to include:
- Edge cases (lambda = 0, lambda = 1)
- Error cases (invalid inputs)
- Property-based tests
Example additional tests:
#[test] fn box_cox_edge_cases() { // Test lambda = 0 (log transform) let data = vec![1.0, 2.0, 3.0]; let lambda = 0.0; let expected: Vec<_> = data.iter().map(|x| x.ln()).collect(); let actual: Vec<_> = data.into_iter().box_cox(lambda).collect(); assert_all_close(&expected, &actual); } #[test] fn box_cox_invalid_input() { let data = vec![-1.0, 0.0, 1.0]; let lambda = 0.5; let result: Vec<_> = data.into_iter().box_cox(lambda).collect(); assert!(result.iter().any(|&x| x.is_nan())); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
crates/augurs-forecaster/src/forecaster.rs
(1 hunks)crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/augurs-forecaster/src/forecaster.rs
🔇 Additional comments (5)
crates/augurs-forecaster/src/power_transforms.rs (3)
36-73
: LGTM! Well-structured implementation.
The function handles all edge cases appropriately, including NaN checks and proper error handling. The mathematical implementation is correct for both positive and negative values.
75-88
: LGTM! Clean and efficient implementation.
The struct and trait implementation are well-designed, with proper lifetime handling and error propagation.
123-138
: Reduce code duplication with optimize_box_cox_lambda.
The function is nearly identical to optimize_box_cox_lambda. Consider extracting the common optimization logic into a shared function.
crates/augurs-forecaster/src/transforms.rs (2)
58-71
: LGTM! Well-documented enum variants.
The BoxCox and YeoJohnson variants are well-structured with clear documentation explaining the lambda parameter and transformation behavior.
444-474
: LGTM! Robust implementation with proper error handling.
The functions include appropriate input validation and clear error messages. The implementations correctly handle edge cases and return Results instead of panicking.
/// An iterator adapter that applies the Box-Cox transformation to each item. | ||
#[derive(Clone, Debug)] | ||
struct BoxCox<T> { | ||
inner: T, | ||
lambda: f64, | ||
} | ||
|
||
impl<T> Iterator for BoxCox<T> | ||
where | ||
T: Iterator<Item = f64>, | ||
{ | ||
type Item = f64; | ||
fn next(&mut self) -> Option<Self::Item> { | ||
self.inner | ||
.next() | ||
.map(|x| box_cox(x, self.lambda).unwrap_or(f64::NAN)) | ||
} | ||
} | ||
|
||
trait BoxCoxExt: Iterator<Item = f64> { | ||
fn box_cox(self, lambda: f64) -> BoxCox<Self> | ||
where | ||
Self: Sized, | ||
{ | ||
BoxCox { | ||
inner: self, | ||
lambda, | ||
} | ||
} | ||
} | ||
|
||
impl<T> BoxCoxExt for T where T: Iterator<Item = f64> {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Reconsider error handling strategy in iterator implementation.
Converting errors to NaN using unwrap_or(f64::NAN)
could mask underlying issues and make debugging more difficult.
Consider propagating errors through the iterator:
pub struct BoxCoxIter<T> {
inner: T,
lambda: f64,
error: Option<&'static str>,
}
impl<T> Iterator for BoxCoxIter<T>
where
T: Iterator<Item = f64>,
{
type Item = Result<f64, &'static str>;
fn next(&mut self) -> Option<Self::Item> {
if let Some(err) = self.error {
return Some(Err(err));
}
self.inner.next().map(|x| box_cox(x, self.lambda))
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable, but maybe worth doing in another PR? since we'd have to change the output type for T for everything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edwardcqian, understood. It makes sense to address this in a separate PR to avoid widespread changes right now. Would you like me to open a GitHub issue to track this enhancement?
/// Returns the inverse Box-Cox transformation of the given value. | ||
fn inverse_box_cox(y: f64, lambda: f64) -> f64 { | ||
if lambda == 0.0 { | ||
y.exp() | ||
} else { | ||
(y * lambda + 1.0).powf(1.0 / lambda) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add domain validation to prevent invalid computations.
The function should validate that y * lambda + 1.0
is positive when computing the power function.
fn inverse_box_cox(y: f64, lambda: f64) -> Result<f64, &'static str> {
if lambda == 0.0 {
Ok(y.exp())
} else {
let value = y * lambda + 1.0;
if value <= 0.0 {
Err("Invalid domain for inverse Box-Cox transformation")
} else {
Ok(value.powf(1.0 / lambda))
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edwardcqian should we be doing this? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think we will reasonably get non-pos value here but doesn't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a couple of comments on other comments 😁 there are some other unresolved ones too, e.g. about removing pub
for some items
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
crates/augurs-forecaster/src/transforms.rs (1)
58-64
: Enhance BoxCox documentation to cover edge cases.The documentation should clarify that Box-Cox transformation is only defined for positive values, similar to how the Yeo-Johnson documentation explains its behavior with negative values.
/// Box-Cox transform. + /// This transform is only defined for positive values. + /// For non-positive values, the transformation will return NaN. BoxCox { /// The lambda parameter for the Box-Cox transformation. /// If lambda == 0, the transformation is equivalent to the natural logarithm. /// Otherwise, the transformation is (x^lambda - 1) / lambda. lambda: f64, },
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🔇 Additional comments (4)
crates/augurs-forecaster/src/transforms.rs (4)
756-772
: Add more test cases for transformations.
Current tests only cover happy paths. Consider adding:
- Edge case where lambda = 0
- Error cases with negative or zero inputs
- Error cases in inverse transformation
- Property-based tests to verify that inverse(transform(x)) ≈ x
Example additional tests:
#[test]
fn box_cox_zero_lambda() {
let data = vec![1.0, 2.0, 3.0];
let lambda = 0.0;
let expected: Vec<_> = data.iter().map(|x| x.ln()).collect();
let actual: Vec<_> = data.into_iter().box_cox(lambda).collect();
assert_all_close(&expected, &actual);
}
#[test]
fn box_cox_negative_input() {
let data = vec![-1.0];
let lambda = 0.5;
let result: Result<f64, _> = box_cox(-1.0, lambda);
assert!(result.is_err());
}
129-145
:
Replace expect() with proper error handling.
The current implementation can panic at runtime. This is particularly problematic for a library function. Consider returning a Result instead.
- pub fn power_transform(data: &[f64]) -> Self {
+ pub fn power_transform(data: &[f64]) -> Result<Self, Box<dyn std::error::Error>> {
if data.iter().all(|&x| x > 0.0) {
- let lambda =
- optimize_box_cox_lambda(data).expect("Failed to optimize lambda (box-cox)");
- Self::BoxCox { lambda }
+ let lambda = optimize_box_cox_lambda(data)?;
+ Ok(Self::BoxCox { lambda })
} else {
- let lambda =
- optimize_yeo_johnson_lambda(data).expect("Failed to optimize lambda (yeo-johnson)");
- Self::YeoJohnson { lambda }
+ let lambda = optimize_yeo_johnson_lambda(data)?;
+ Ok(Self::YeoJohnson { lambda })
}
}
509-516
:
Add domain validation for inverse Box-Cox transformation.
The function should validate that y * lambda + 1.0
is positive when computing the power function.
-fn inverse_box_cox(y: f64, lambda: f64) -> f64 {
+fn inverse_box_cox(y: f64, lambda: f64) -> Result<f64, &'static str> {
if lambda == 0.0 {
- y.exp()
+ Ok(y.exp())
} else {
let value = y * lambda + 1.0;
+ if value <= 0.0 {
+ return Err("Invalid domain for inverse Box-Cox transformation");
+ }
- (y * lambda + 1.0).powf(1.0 / lambda)
+ Ok(value.powf(1.0 / lambda))
}
}
476-493
:
Improve error handling in transformation iterators.
Converting errors to NaN silently masks issues and makes debugging harder. Consider propagating errors through the iterator.
Example for BoxCox (apply similar changes to YeoJohnson):
struct BoxCox<T> {
inner: T,
lambda: f64,
+ error: Option<&'static str>,
}
impl<T> Iterator for BoxCox<T>
where
T: Iterator<Item = f64>,
{
- type Item = f64;
+ type Item = Result<f64, &'static str>;
fn next(&mut self) -> Option<Self::Item> {
+ if let Some(err) = self.error {
+ return Some(Err(err));
+ }
- self.inner.next().map(|x| box_cox(x, self.lambda).unwrap_or(f64::NAN))
+ self.inner.next().map(|x| box_cox(x, self.lambda))
}
}
Also applies to: 549-566
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (6)
crates/augurs-forecaster/src/power_transforms.rs (4)
6-34
: Consider using custom error types and extracting variance calculation.The error handling is good, but using custom error types would make the API more robust and maintainable.
Consider creating a custom error enum:
#[derive(Debug, Error)] pub enum TransformError { #[error("Data must not be empty")] EmptyData, #[error("All data must be greater than 0")] NonPositiveData, #[error("Variance must be positive")] NonPositiveVariance, }Also, consider extracting the variance calculation into a separate function for better readability:
fn calculate_variance(data: &[f64], mean: f64) -> f64 { data.iter() .map(|&x| (x - mean).powi(2)) .sum::<f64>() / data.len() as f64 }
36-73
: Extract shared functionality with box_cox_log_likelihood.There's code duplication in the variance calculation between this function and
box_cox_log_likelihood
.Consider extracting shared functionality:
fn validate_data_length(data: &[f64]) -> Result<(), TransformError> { if data.is_empty() { Err(TransformError::EmptyData) } else { Ok(()) } } fn calculate_mean_and_variance(data: &[f64]) -> Result<(f64, f64), TransformError> { let n = data.len() as f64; let mean = data.iter().sum::<f64>() / n; let variance = calculate_variance(data, mean); if variance <= 0.0 { Err(TransformError::NonPositiveVariance) } else { Ok((mean, variance)) } }
112-130
: Enhance documentation for optimization parameters.The function is well-structured, but the documentation could be more detailed about the optimization parameters and their impact.
Add more detailed documentation:
/// Optimize a parameter using Brent's method. /// /// # Parameters /// * `cost` - The cost function to minimize /// * `params` - Optimization parameters: /// * `initial_param` - Starting point for optimization /// * `lower_bound` - Lower bound for parameter search /// * `upper_bound` - Upper bound for parameter search /// * `max_iterations` - Maximum number of iterations /// /// # Returns /// * `Ok(f64)` - The optimal parameter value /// * `Err(Error)` - If optimization fails or no best parameter is found
162-220
: Enhance test coverage with property-based tests and edge cases.While the current tests are good, adding property-based tests and more edge cases would improve reliability.
Consider adding:
- Property-based tests using proptest:
use proptest::prelude::*; proptest! { #[test] fn box_cox_inverse_property(x in 0.1f64..100.0, lambda in -2.0f64..2.0) { let transformed = box_cox(x, lambda)?; let original = inverse_box_cox(transformed, lambda); prop_assert!((x - original).abs() < 1e-10); } }
- Edge cases:
#[test] fn test_box_cox_extreme_values() { let data = &[f64::MIN_POSITIVE, f64::MAX]; let lambda = 0.5; let result = optimize_box_cox_lambda(data); assert!(result.is_ok()); }crates/augurs-forecaster/src/transforms.rs (2)
Line range hint
58-145
: Improve error handling and documentation in power_transform.The
power_transform
method usesexpect()
which can panic, and the documentation could be more detailed.
- Return Result instead of using expect:
-pub fn power_transform(data: &[f64]) -> Self { +pub fn power_transform(data: &[f64]) -> Result<Self, Error> { if data.iter().all(|&x| x > 0.0) { - let lambda = optimize_box_cox_lambda(data).expect("Failed to optimize lambda (box-cox)"); - Self::BoxCox { lambda } + optimize_box_cox_lambda(data).map(|lambda| Self::BoxCox { lambda }) } else { - let lambda = optimize_yeo_johnson_lambda(data).expect("Failed to optimize lambda (yeo-johnson)"); - Self::YeoJohnson { lambda } + optimize_yeo_johnson_lambda(data).map(|lambda| Self::YeoJohnson { lambda }) } }
- Enhance documentation:
/// Create a power transform that automatically selects and optimizes the appropriate transformation. /// /// # Algorithm Selection /// - If all values are positive: Uses Box-Cox transformation /// - If any values are zero or negative: Uses Yeo-Johnson transformation /// /// # Errors /// Returns an error if: /// - The optimization algorithm fails to converge /// - The data is empty /// - The data has invalid values (NaN, Infinity)
766-800
: Add tests for error cases and edge values.The current tests cover the happy path well, but could benefit from more comprehensive testing.
Add tests for:
- Error cases:
#[test] fn test_box_cox_invalid_input() { let data = vec![-1.0, 0.0, 1.0]; let lambda = 0.5; let result: Vec<_> = data.into_iter().box_cox(lambda).collect(); assert!(result.iter().any(|&x| x.is_nan())); }
- Edge cases:
#[test] fn test_yeo_johnson_edge_cases() { let data = vec![f64::MIN_POSITIVE, 0.0, f64::MAX]; let lambda = 0.5; let result: Vec<_> = data.into_iter().yeo_johnson(lambda).collect(); assert!(!result.iter().any(|&x| x.is_nan())); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
crates/augurs-forecaster/src/forecaster.rs
(1 hunks)crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/augurs-forecaster/src/forecaster.rs
🔇 Additional comments (1)
crates/augurs-forecaster/src/transforms.rs (1)
444-474
: Implementation looks good!
The functions have proper error handling, clear input validation, and handle edge cases well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
crates/augurs-forecaster/src/power_transforms.rs (1)
Line range hint
589-605
: Implement Error Handling ininverse_yeo_johnson
FunctionCurrently, the
inverse_yeo_johnson
function does not handle potential mathematical errors and directly returns anf64
. To ensure robustness and consistency withinverse_box_cox
, consider modifying the function to return aResult<f64, &'static str>
and handle possible domain errors explicitly.Apply this diff to implement error handling:
-fn inverse_yeo_johnson(y: f64, lambda: f64) -> f64 { +fn inverse_yeo_johnson(y: f64, lambda: f64) -> Result<f64, &'static str> { const EPSILON: f64 = 1e-6; if y >= 0.0 && lambda.abs() < EPSILON { // For lambda close to 0 (positive values) - Ok((y.exp()) - 1.0) + Ok((y.exp()) - 1.0) } else if y >= 0.0 { // For positive values (lambda not close to 0) let value = y * lambda + 1.0; if value <= 0.0 { Err("Invalid domain for inverse Yeo-Johnson transformation") } else { - Ok(value.powf(1.0 / lambda) - 1.0) + Ok(value.powf(1.0 / lambda) - 1.0) } } else if (lambda - 2.0).abs() < EPSILON { // For lambda close to 2 (negative values) - Ok(-(-y.exp() - 1.0)) + Ok(-(-y.exp() - 1.0)) } else { // For negative values (lambda not close to 2) let value = -((2.0 - lambda) * y) + 1.0; if value <= 0.0 { Err("Invalid domain for inverse Yeo-Johnson transformation") } else { - Ok(-((value.powf(1.0 / (2.0 - lambda))) - 1.0)) + Ok(-((value.powf(1.0 / (2.0 - lambda))) - 1.0)) } } }This change ensures that any invalid computations are caught and handled appropriately, improving the reliability of the transformation.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
crates/augurs-forecaster/src/forecaster.rs
(1 hunks)crates/augurs-forecaster/src/power_transforms.rs
(1 hunks)crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/augurs-forecaster/src/forecaster.rs
🔇 Additional comments (3)
crates/augurs-forecaster/src/transforms.rs (3)
487-491
: 🛠️ Refactor suggestion
Propagate Errors in BoxCox
Iterator Instead of Using NaN
Using unwrap_or(f64::NAN)
in the BoxCox
iterator can mask underlying errors and make debugging difficult. Consider modifying the iterator to return Result<f64, &'static str>
, allowing errors to be handled explicitly.
Apply this diff to refactor the iterator:
impl<T> Iterator for BoxCox<T>
where
T: Iterator<Item = f64>,
{
- type Item = f64;
+ type Item = Result<f64, &'static str>;
fn next(&mut self) -> Option<Self::Item> {
self.inner
.next()
- .map(|x| box_cox(x, self.lambda).unwrap_or(f64::NAN))
+ .map(|x| box_cox(x, self.lambda))
}
}
Similarly, update the trait and usage:
trait BoxCoxExt: Iterator<Item = f64> {
- fn box_cox(self, lambda: f64) -> BoxCox<Self>
+ fn box_cox(self, lambda: f64) -> BoxCox<Self>
+ where
+ Self: Sized,
+ {
+ BoxCox {
+ inner: self,
+ lambda,
+ }
+ }
}
This change ensures that errors are not silently converted to NaN
, improving the robustness of the transformation.
568-571
: 🛠️ Refactor suggestion
Propagate Errors in YeoJohnson
Iterator Instead of Using NaN
Similar to the BoxCox
iterator, the YeoJohnson
iterator uses unwrap_or(f64::NAN)
, which can conceal errors. Modify the iterator to return Result<f64, &'static str>
to handle errors explicitly.
Apply this diff to refactor the YeoJohnson
iterator:
impl<T> Iterator for YeoJohnson<T>
where
T: Iterator<Item = f64>,
{
- type Item = f64;
+ type Item = Result<f64, &'static str>;
fn next(&mut self) -> Option<Self::Item> {
self.inner
.next()
- .map(|x| yeo_johnson(x, self.lambda).unwrap_or(f64::NAN))
+ .map(|x| yeo_johnson(x, self.lambda))
}
}
And update the trait accordingly.
154-154
:
Correct the Self::Log
Transformation in the transform
Method
In the transform
method, the Self::Log
variant currently applies input.exp()
, which computes the exponential of the input. This seems incorrect because the Log
transformation should apply the natural logarithm. Please change it to input.log()
to correctly perform the logarithm transformation.
Apply this diff to fix the issue:
- Self::Log => Box::new(input.exp()),
+ Self::Log => Box::new(input.log()),
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for making all those changes @edwardcqian 🥳
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
crates/augurs-forecaster/src/transforms.rs (2)
59-65
: Add precondition to BoxCox doc comment.The doc comment should mention that the input values must be positive (x > 0) for the Box-Cox transformation to be valid.
/// Box-Cox transform. + /// + /// # Preconditions + /// - Input values must be positive (x > 0) + /// BoxCox {
779-813
: Consider adding edge case tests.While the current test coverage is good, consider adding tests for:
- Box-Cox with lambda = 0 (log transform case)
- Yeo-Johnson with lambda = 0 and lambda = 2 (special cases)
- Error cases (NaN, infinity)
Example test cases:
#[test] fn box_cox_lambda_zero() { let data = vec![1.0, 2.0, 3.0]; let lambda = 0.0; let expected: Vec<_> = data.iter().map(|x| x.ln()).collect(); let actual: Vec<_> = data.into_iter().box_cox(lambda).collect(); assert_all_close(&expected, &actual); } #[test] fn yeo_johnson_special_cases() { let data = vec![-1.0, 0.0, 1.0]; // Test lambda = 0 let expected_zero = vec![ -(-1.0 + 1.0).ln(), 0.0, (1.0 + 1.0).ln() ]; let actual_zero: Vec<_> = data.clone().into_iter().yeo_johnson(0.0).collect(); assert_all_close(&expected_zero, &actual_zero); // Test lambda = 2 let expected_two = vec![ -(-(-1.0) + 1.0).ln(), 0.0, ((1.0 + 1.0).powi(2) - 1.0) / 2.0 ]; let actual_two: Vec<_> = data.into_iter().yeo_johnson(2.0).collect(); assert_all_close(&expected_two, &actual_two); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
crates/augurs-forecaster/src/transforms.rs
(8 hunks)
🔇 Additional comments (5)
crates/augurs-forecaster/src/transforms.rs (5)
135-151
: LGTM! Well-implemented power transform selection.
The implementation correctly handles the selection between Box-Cox and Yeo-Johnson transformations based on data characteristics, with proper error handling and clear documentation.
515-527
: LGTM! Good domain validation.
The implementation correctly validates that y * lambda + 1.0
is positive before computing the power function, preventing potential domain errors.
463-480
: LGTM! Robust implementation of Yeo-Johnson transform.
The implementation includes:
- Proper error handling for NaN inputs
- Correct handling of both positive and negative values
- Clear implementation of the piecewise function
595-612
: LGTM! Well-implemented inverse Yeo-Johnson transform.
The implementation includes:
- Good use of EPSILON for floating-point comparisons
- Clear handling of all cases in the piecewise function
- Well-documented code with clear comments for each case
482-498
: 🛠️ Refactor suggestion
Consider propagating errors in BoxCox iterator.
The current implementation silently converts errors to NaN using unwrap_or(f64::NAN)
, which could mask underlying issues and make debugging more difficult.
Consider propagating errors through the iterator:
struct BoxCox<T> {
inner: T,
lambda: f64,
+ error: Option<&'static str>,
}
impl<T> Iterator for BoxCox<T>
where
T: Iterator<Item = f64>,
{
- type Item = f64;
+ type Item = Result<f64, &'static str>;
fn next(&mut self) -> Option<Self::Item> {
+ if let Some(err) = self.error {
+ return Some(Err(err));
+ }
- self.inner.next().map(|x| box_cox(x, self.lambda).unwrap_or(f64::NAN))
+ self.inner.next().map(|x| box_cox(x, self.lambda))
}
}
Likely invalid or redundant comment.
Happily, cargo-semver-checks caught the addition of some variants in #185, but it's a shame that had to be a minor bump. This marks the enum as non-exhaustive so future transform additions won't necessarily be considered breaking changes. It's unlikely users will ever have to match on this anyway so this shouldn't cause any problems (it could probably be a struct with an 'inner' enum, really).
Summary by CodeRabbit
New Features
Improvements
Transforms
struct with new transformation capabilities.Tests
Forecaster
module using power transformations.