feat(engine): add sheet_name parameter to excel_write() #568

Merged
soma00333 merged 2 commits into main from feat/engine/addparameter-to-excelwrite on Oct 14, 2024

Conversation

soma00333
Contributor

@soma00333 soma00333 commented Oct 13, 2024

Overview

Add sheet_name parameter to excel_write()
Related issue: #567

What I've done

Fix example_excel.yml

Changed the example workflow to pass the sheetName parameter.

Add parameter to excel_write()

Add a sheet_name parameter to excel_write().
If the parameter is given, it is used as the sheet name; otherwise the sheet is named Sheet1.
Define an ExcelWriterParam struct so that further parameters can be added later without changing the function signature again.
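
The fallback described above can be sketched as follows. This is a minimal, standard-library-only illustration rather than the code from the PR: the struct reuses the name ExcelWriterParam from the description, while resolve_sheet_name and DEFAULT_SHEET_NAME are hypothetical names introduced here, and the serde and schema derives used in the real struct are left out.

```rust
/// Parameters for Excel output (a simplified stand-in for the struct in the PR).
#[derive(Debug, Clone, Default)]
pub struct ExcelWriterParam {
    /// Optional worksheet name supplied by the workflow.
    pub sheet_name: Option<String>,
}

/// Hypothetical constant holding the fallback worksheet name.
pub const DEFAULT_SHEET_NAME: &str = "Sheet1";

/// Hypothetical helper: returns the configured name, or "Sheet1" if none was given.
pub fn resolve_sheet_name(params: &ExcelWriterParam) -> String {
    params
        .sheet_name
        .clone()
        .unwrap_or_else(|| DEFAULT_SHEET_NAME.to_string())
}
```

Calling resolve_sheet_name with sheet_name set to Some("SampleSheet".to_string()) yields "SampleSheet"; with None it yields "Sheet1", so existing workflows that do not set sheetName keep their current behavior.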

What I haven't done

How I tested

Ran the following command.

cargo run --package reearth-flow-cli -- run --workflow ./runtime/examples/plateau/testdata/workflow/example_excel.yml
   Compiling reearth-flow-action-sink v0.0.2 (/Users/soma.utsumi/workspace/eukarya/reearth-flow/engine/runtime/action-sink)
   Compiling reearth-flow-cli v0.0.2 (/Users/soma.utsumi/workspace/eukarya/reearth-flow/engine/cli)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 6.63s
     Running `target/debug/reearth-flow run --workflow ./runtime/examples/plateau/testdata/workflow/example_excel.yml`
2024-10-13T13:44:06.041Z  INFO root{otel.name="ExcelExample" otel.kind="runner" workflow.id="00caad2a-9f7d-4189-b479-153fa9ea36dc"}: reearth_flow_runner::runner: Start workflow = "ExcelExample"
2024-10-13T13:44:06.052Z  INFO action{otel.name="Source Node" otel.kind="Source Node" workflow.id="a22dbf2e-e5f1-406f-82bd-c992a4b708c9"}: reearth_flow_runtime::executor::source_node: "FeatureCreator" finish source complete. elapsed = 1.333417ms
2024-10-13T13:44:06.052Z  INFO action{otel.name="FileWriter" otel.kind="Sink Node" workflow.id="a22dbf2e-e5f1-406f-82bd-c992a4b708c9" node.id="2ae560e9-0745-4913-bf2b-49d383ce43de"}: reearth_flow_runtime::executor::sink_node: "FileWriter" sink finish. elapsed = 1.428666ms
2024-10-13T13:44:06.079Z  INFO action{otel.name="FileWriter" otel.kind="Sink Node" workflow.id="a22dbf2e-e5f1-406f-82bd-c992a4b708c9" node.id="2ae560e9-0745-4913-bf2b-49d383ce43de"}: reearth_flow_runtime::executor::sink_node: "FileWriter" finish sink complete. elapsed = 26.443042ms
2024-10-13T13:44:07.311Z  INFO root{otel.name="ExcelExample" otel.kind="runner" workflow.id="00caad2a-9f7d-4189-b479-153fa9ea36dc"}: reearth_flow_runner::runner: Finish workflow = "ExcelExample", duration = 1.270766042s

Screenshot

image

Which point I want you to review particularly

Memo

Summary by CodeRabbit

  • New Features

    • Enhanced Excel writing functionality with customizable worksheet names.
    • Added support for writing Excel files alongside existing formats (CSV, TSV, JSON).
    • New property sheetName allows users to specify the name of the Excel sheet in configurations.
  • Bug Fixes

    • Improved handling of Excel-specific parameters during file writing operations.
  • Documentation

    • Updated schema to include the new sheetName property for better user guidance.


netlify bot commented Oct 13, 2024

Deploy Preview for reearth-flow canceled.

Name Link
🔨 Latest commit 8a4ded2
🔍 Latest deploy log https://app.netlify.com/sites/reearth-flow/deploys/670c692df536fc0008219078

@soma00333 soma00333 force-pushed the feat/engine/addparameter-to-excelwrite branch from b716fbb to 741f85d Compare October 13, 2024 13:51
@soma00333 soma00333 marked this pull request as ready for review October 13, 2024 13:56
@soma00333 soma00333 requested a review from a team as a code owner October 13, 2024 13:56
Contributor

coderabbitai bot commented Oct 13, 2024

Walkthrough

The changes primarily involve enhancements to the write_excel functionality in the excel.rs file, introducing a new struct ExcelWriterParam to encapsulate parameters, including an optional sheet_name. The write_excel function is updated to accept this struct, allowing for dynamic worksheet naming. Additionally, the writer.rs file is modified to support Excel writing alongside other formats by adding an Excel variant to the FileWriterParam enum. The YAML example and schema for FileWriter are also updated to include the new sheetName property.

Changes

File path, followed by its change summary:

  • engine/runtime/action-sink/src/file/excel.rs
    • Added struct ExcelWriterParam with field sheet_name.
    • Updated write_excel function to accept &ExcelWriterParam.
  • engine/runtime/action-sink/src/file/writer.rs
    • Added Excel variant to FileWriterParam enum with excel_property.
    • Updated to_common_param and finish methods to handle Excel variant.
  • engine/runtime/examples/plateau/testdata/workflow/example_excel.yml
    • Added sheetName: "SampleSheet" property to FileWriter action.
  • engine/schema/actions.json
    • Added optional property sheetName to FileWriterParam.

Possibly related PRs

Suggested reviewers

  • miseyu

Poem

🐰 In the fields where the data flows,
A new sheet name, how it glows!
With Excel's charm, we write with glee,
Configured sheets, as sweet as can be.
Hopping through code, with joy we sing,
A rabbit's delight in every new thing! 🌼


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 917cd44 and 8a4ded2.

⛔ Files ignored due to path filters (1)
  • engine/docs/mdbook/src/action.md is excluded by !**/*.md
📒 Files selected for processing (2)
  • engine/runtime/action-sink/src/file/excel.rs (1 hunks)
  • engine/schema/actions.json (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • engine/schema/actions.json
🧰 Additional context used
🔇 Additional comments (5)
engine/runtime/action-sink/src/file/excel.rs (5)

13-14: LGTM: New imports for serialization and schema generation.

The added imports for JsonSchema, Deserialize, and Serialize are appropriate for the new ExcelWriterParam struct, enabling serialization, deserialization, and JSON schema generation.


16-20: LGTM: Well-defined ExcelWriterParam struct.

The ExcelWriterParam struct is correctly defined with:

  • Appropriate derive macros for serialization, deserialization, debugging, cloning, and JSON schema generation.
  • An optional sheet_name field, providing flexibility for users.
  • Consistent use of camelCase for JSON serialization.

This structure will facilitate future expansions of parameters for the excel_write() function.


24-24: LGTM: Updated write_excel function signature.

The addition of the params: &ExcelWriterParam parameter to the write_excel function signature is correct and aligns with the PR objective. This change allows for the new sheet naming functionality while maintaining efficiency by using a reference.


31-36: LGTM: Improved worksheet naming logic.

The changes to the worksheet naming logic are well-implemented:

  • Correctly utilizes the new ExcelWriterParam struct.
  • Maintains backward compatibility by defaulting to "Sheet1" when no name is provided.
  • Properly handles the Option<String> with clone() and unwrap_or_else.

This implementation successfully adds the requested feature while preserving existing behavior.


13-36: Summary: Excellent implementation of the new sheet naming feature.

The changes in this file successfully implement the new sheet_name parameter for the excel_write() function:

  1. New imports for serialization and schema generation are added.
  2. A well-defined ExcelWriterParam struct is introduced to encapsulate the sheet_name parameter.
  3. The write_excel function signature is updated to include the new params argument.
  4. Worksheet naming logic is improved to use the new parameter while maintaining backward compatibility.

These changes align perfectly with the PR objectives and are implemented in a clean, efficient manner. The code maintains good practices and is ready for integration.


@soma00333 soma00333 requested a review from miseyu October 13, 2024 13:56
@soma00333 soma00333 enabled auto-merge (squash) October 13, 2024 13:57
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
engine/runtime/examples/plateau/testdata/workflow/example_excel.yml (1)

Line range hint 1-41: Suggestions for enhancing the workflow example

While the current implementation is correct, here are some suggestions to improve the example:

  1. Consider expanding the sample data in the FeatureCreator to include more diverse entries or additional fields. This could help demonstrate the full capabilities of the Excel output.

  2. It might be helpful to add a comment explaining the usage of env.get("outputPath") for clarity, especially for new users.

  3. To showcase more complex workflows, you could consider adding intermediate nodes for data processing or validation between FeatureCreator and FileWriter.

Here's an example of how you might expand the FeatureCreator data:

creator: |
  [
    #{
      country: "Japan",
      city: "Tokyo",
      population: 37977000,
      area: 2194,
      density: 6158
    },
    #{
      country: "Japan",
      city: "Osaka",
      population: 14977000,
      area: 2722,
      density: 4639
    },
    #{
      country: "Japan",
      city: "Nagoya",
      population: 9552000,
      area: 5172,
      density: 2625
    }
  ]

And here's how you might add a comment for the output path:

# outputPath is set in the environment and defines the directory for output files
output: |
  env.get("outputPath") + "sample.xlsx"

These changes would make the example more comprehensive and easier to understand for users.

engine/schema/actions.json (1)

Line range hint 1363-1392: Consider similar enhancements for other formats and maintain consistency.

While the addition of the sheetName property for Excel format is a good improvement, it's worth considering if similar flexibility could benefit other output formats. For example:

  1. For CSV/TSV: Consider adding an optional headerRow property to allow customization of the first row.
  2. For JSON: An optional rootElementName could allow customization of the root JSON object name.

Additionally, maintain consistency across formats where possible. For instance, if converter is applicable to formats other than JSON, consider adding it to those formats as well.

Would you like me to propose a more generalized schema structure that could accommodate format-specific options more flexibly?

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between cc83c2d and 741f85d.

⛔ Files ignored due to path filters (2)
  • engine/docs/mdbook/src/action.md is excluded by !**/*.md
  • engine/sample.xlsx is excluded by !**/*.xlsx
📒 Files selected for processing (4)
  • engine/runtime/action-sink/src/file/excel.rs (1 hunks)
  • engine/runtime/action-sink/src/file/writer.rs (4 hunks)
  • engine/runtime/examples/plateau/testdata/workflow/example_excel.yml (1 hunks)
  • engine/schema/actions.json (1 hunks)
🧰 Additional context used
🔇 Additional comments (8)
engine/schema/actions.json (2)

1383-1392: LGTM: New sheetName property added for Excel format in FileWriter.

The addition of the sheetName property to the Excel format option in the FileWriter action is a good enhancement. It allows users to specify a custom sheet name when writing to Excel files, which increases flexibility and usability.

A few observations:

  1. The sheetName property is correctly defined as optional (can be null).
  2. It uses the #/definitions/Expr reference, allowing for dynamic sheet naming through expressions.
  3. The property is added only to the Excel format option, maintaining backwards compatibility for other formats.

Line range hint 1-2112: Overall, well-structured schema with appropriate enhancement.

The actions.json file presents a comprehensive and well-organized schema for various data processing actions. The addition of the sheetName property to the FileWriter action for Excel format is an appropriate enhancement that integrates seamlessly into the existing structure.

Key points:

  1. The schema maintains a consistent structure across different actions.
  2. Each action is clearly defined with appropriate parameters, input/output ports, and categories.
  3. The new sheetName property adheres to the established pattern for optional properties.

The change improves the functionality of the FileWriter action without disrupting the overall schema design or backwards compatibility.

engine/runtime/action-sink/src/file/writer.rs (4)

18-18: Add necessary import for ExcelWriterParam

The import of ExcelWriterParam is essential for the new Excel functionality introduced.


113-114: Include Excel-specific properties in FileWriterParam

Adding excel_property to the Excel variant of FileWriterParam allows for Excel-specific parameters, aligning with the existing pattern used in the Json variant.


126-128: Update to_common_param method to handle Excel variant

The to_common_param method now includes the Excel variant, ensuring that the common properties are correctly returned for Excel file operations.


165-167: Handle Excel variant in finish method

The finish method is updated to handle the Excel variant, invoking write_excel with the appropriate parameters. This integration ensures that Excel files are written correctly using the new functionality.
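
The variant-based dispatch described in these comments can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code in writer.rs: FileWriterParam, ExcelWriterParam, excel_property and to_common_param are names taken from the review, whereas CommonParam, its output field, the Csv variant's shape and the describe method (a stand-in for finish that only reports what would be written) are hypothetical.

```rust
/// Settings shared by every output format (hypothetical, simplified).
#[derive(Debug, Clone, PartialEq)]
pub struct CommonParam {
    pub output: String,
}

/// Excel-specific settings, reusing the struct name from the PR.
#[derive(Debug, Clone, PartialEq)]
pub struct ExcelWriterParam {
    pub sheet_name: Option<String>,
}

/// One variant per output format; only Excel carries extra settings in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum FileWriterParam {
    Csv { common: CommonParam },
    Excel { common: CommonParam, excel_property: ExcelWriterParam },
}

impl FileWriterParam {
    /// Returns the shared settings regardless of the variant.
    pub fn to_common_param(&self) -> &CommonParam {
        match self {
            FileWriterParam::Csv { common } => common,
            FileWriterParam::Excel { common, .. } => common,
        }
    }

    /// Hypothetical stand-in for `finish`: reports what would be written
    /// instead of touching the file system.
    pub fn describe(&self) -> String {
        match self {
            FileWriterParam::Csv { common } => format!("csv -> {}", common.output),
            FileWriterParam::Excel { common, excel_property } => {
                let sheet = excel_property.sheet_name.as_deref().unwrap_or("Sheet1");
                format!("excel -> {} (sheet {})", common.output, sheet)
            }
        }
    }
}
```

Keeping the format-specific settings inside the variant means the compiler forces every match over FileWriterParam to account for the Excel case, which is presumably why both to_common_param and finish had to change in this PR.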

engine/runtime/action-sink/src/file/excel.rs (2)

16-20: Definition of ExcelWriterParam struct is appropriate

The struct is correctly defined with the necessary derives for serialization, deserialization, debugging, cloning, and JSON schema generation. The sheet_name field is properly declared with the correct visibility.


31-36: Setting worksheet name with sheet_name parameter works correctly

The code appropriately retrieves the sheet_name from params, defaulting to "Sheet1" if none is provided, and sets it using set_name. This enhances the flexibility of the Excel writing functionality.
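
Because the sheet name now comes from user configuration, set_name can be handed a name that Excel does not accept. The sketch below shows one way such a name could be checked up front. It is an assumption-laden illustration, not part of the PR: validate_sheet_name is a hypothetical helper, and the rules it encodes (non-empty, at most 31 characters, none of [ ] : * ? / \, no leading or trailing apostrophe) are Excel's commonly documented worksheet-name limits, not necessarily the exact checks the Excel library used by the engine performs.

```rust
/// Hypothetical pre-check for a worksheet name, based on Excel's documented limits.
pub fn validate_sheet_name(name: &str) -> Result<(), String> {
    // Characters Excel rejects in worksheet names.
    const FORBIDDEN: [char; 7] = ['[', ']', ':', '*', '?', '/', '\\'];

    if name.is_empty() {
        return Err("sheet name must not be empty".to_string());
    }
    if name.chars().count() > 31 {
        return Err(format!("sheet name '{}' is longer than 31 characters", name));
    }
    if let Some(c) = name.chars().find(|c| FORBIDDEN.contains(c)) {
        return Err(format!("sheet name '{}' contains forbidden character '{}'", name, c));
    }
    if name.starts_with('\'') || name.ends_with('\'') {
        return Err(format!("sheet name '{}' must not start or end with an apostrophe", name));
    }
    Ok(())
}
```

A check like this would only serve to produce a clearer error message earlier; the library call remains the authority on what is accepted.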

@@ -33,6 +33,7 @@ graphs:
format: excel
output: |
env.get("outputPath") + "sample.xlsx"
sheetName: "SampleSheet"
Contributor

💡 Codebase verification

Schema Missing sheetName Property

The JSON schema does not include the sheetName property for the FileWriter action. Please update the schema to include this property to ensure consistency and proper validation.

🔗 Analysis chain

LGTM: Sheet name property added correctly.

The sheetName property has been added to the FileWriter action with the value "SampleSheet". This change aligns with the PR objectives of allowing users to specify the name of the sheet when writing to an Excel file.

A few observations:

  1. The indentation is correct, maintaining the YAML structure.
  2. The property name sheetName matches the camelCase convention used in the rest of the file.
  3. The value "SampleSheet" is appropriately quoted as a string.

To ensure this change is reflected in the schema, let's verify the JSON schema file:

If the result is true, it confirms that the schema has been updated to include the new sheetName property.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the JSON schema includes the new sheetName property
# Expected: The schema should include a sheetName property for the FileWriter action

jq '.definitions.FileWriter.properties.with.properties | keys | contains(["sheetName"])' < <(curl -s https://raw.githubusercontent.com/reearth/reearth-flow/main/engine/schema/workflow.json)

Length of output: 275


Script:

#!/bin/bash
# Description: Verify that the JSON schema includes the new sheetName property
# Expected: The schema should include a sheetName property for the FileWriter action

# Download the JSON schema and save it to a temporary file
curl -s https://raw.githubusercontent.com/reearth/reearth-flow/main/engine/schema/workflow.json -o workflow.json

# Check if the sheetName property exists in the FileWriter definition
jq '.definitions.FileWriter.properties.with.properties | has("sheetName")' workflow.json

# Clean up the temporary file
rm workflow.json

Length of output: 230


pub(super) fn write_excel(
    output: &Uri,
    params: &ExcelWriterParam,
Contributor

💡 Codebase verification

Incomplete update of write_excel function calls

Found a call to write_excel in writer.rs that is missing the required params argument:

  • engine/runtime/action-sink/src/file/writer.rs: write_excel(&output, excel_property, &self.buffer, storage_resolver)

Please update all calls to include the params argument to align with the updated function signature and prevent compile-time errors.

🔗 Analysis chain

Function signature of write_excel updated correctly

The write_excel function now includes the params parameter, aligning with the added functionality for specifying the sheet name.

Please ensure that all calls to write_excel have been updated to include the new params argument to prevent any compile-time errors.

You can run the following script to verify:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all calls to `write_excel` have been updated.

# Search for calls to `write_excel` that may not have been updated
rg --type rust 'write_excel\(' -A 2

Length of output: 499

#[derive(Serialize, Deserialize, Debug, Clone, JsonSchema)]
#[serde(rename_all = "camelCase")]
pub struct ExcelWriterParam {
    pub(super) sheet_name: Option<Expr>,
Contributor

Important

The Expr type is intended for values that are evaluated by the eval engine (https://github.com/reearth/reearth-flow/blob/main/engine/runtime/eval-expr/src/engine.rs). If sheet_name is just a plain string, then String is fine.

Contributor Author

Thank you for the comment 😊
Fixed in 8a4ded2 👍

@miseyu
Contributor

miseyu commented Oct 13, 2024

@soma00333
I left a comment on one point.
Please take a look. 🙇

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
engine/runtime/action-sink/src/file/excel.rs (3)

16-20: LGTM: Well-defined ExcelWriterParam struct.

The ExcelWriterParam struct is well-designed with appropriate derive macros and an optional sheet_name field. The camelCase renaming is consistent with JSON conventions.

Consider adding a doc comment to explain the purpose of this struct and its field:

/// Parameters for Excel writing operations.
#[derive(Serialize, Deserialize, Debug, Clone, JsonSchema)]
#[serde(rename_all = "camelCase")]
pub struct ExcelWriterParam {
    /// The name of the sheet. If not provided, defaults to "Sheet1".
    pub(super) sheet_name: Option<String>,
}

24-36: LGTM: Proper implementation of customizable sheet names.

The changes to the write_excel function correctly implement the new feature of customizable sheet names. The use of params.sheet_name.clone().unwrap_or_else(|| "Sheet1".to_string()) is a good approach to handle the Option<String>.

Consider using as_deref() to avoid cloning the String:

let sheet_name = params.sheet_name.as_deref().unwrap_or("Sheet1");
worksheet
    .set_name(sheet_name)
    .map_err(crate::errors::SinkError::file_writer)?;

This approach is more efficient because it does not allocate in either case: a provided name is borrowed as &str instead of being cloned, and the default is a string literal.
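
The two variants discussed in this comment can be compared side by side. The sketch below uses only the standard library and a simplified ExcelWriterParam; the function names sheet_name_owned and sheet_name_borrowed are hypothetical and exist only for this comparison.

```rust
/// Simplified stand-in for the struct in the PR.
pub struct ExcelWriterParam {
    pub sheet_name: Option<String>,
}

/// Cloning variant: always produces an owned String, allocating on both paths.
pub fn sheet_name_owned(params: &ExcelWriterParam) -> String {
    params
        .sheet_name
        .clone()
        .unwrap_or_else(|| "Sheet1".to_string())
}

/// Borrowing variant: `as_deref` turns Option<String> into Option<&str>,
/// so the result borrows either from `params` or from the string literal.
pub fn sheet_name_borrowed(params: &ExcelWriterParam) -> &str {
    params.sheet_name.as_deref().unwrap_or("Sheet1")
}
```

Both functions return the same text for the same input. The borrowed form fits here because the name is only needed for the duration of the set_name call and does not have to outlive params.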


Line range hint 1-438: Overall assessment: Well-implemented feature addition.

The changes to add a customizable sheet name feature are well-implemented and consistent with the existing codebase. The new ExcelWriterParam struct and the modifications to the write_excel function are appropriate and don't introduce any significant issues.

As the ExcelWriterParam struct is designed to be extensible, consider the following for future development:

  1. If more parameters are added in the future, you might want to implement a builder pattern for ExcelWriterParam to make it easier to construct with optional parameters.
  2. Consider adding a Default implementation for ExcelWriterParam to provide sensible defaults for all fields.

These suggestions are not necessary for the current implementation but could be helpful as the feature set grows.
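
For reference, the two suggestions could look roughly like this. It is a sketch of the general pattern only: ExcelWriterParamBuilder and its methods are hypothetical and not part of the PR, the struct is reduced to its single field, and the serde and schema derives are omitted so the example stays standard-library-only.

```rust
/// Simplified stand-in for the struct in the PR; `Default` leaves the name unset.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ExcelWriterParam {
    pub sheet_name: Option<String>,
}

/// Hypothetical builder: each setter consumes and returns the builder,
/// so optional fields can be chained and omitted freely.
#[derive(Debug, Default)]
pub struct ExcelWriterParamBuilder {
    sheet_name: Option<String>,
}

impl ExcelWriterParamBuilder {
    /// Starts with every optional field unset.
    pub fn new() -> Self {
        Self::default()
    }

    /// Sets the worksheet name; accepts &str or String.
    pub fn sheet_name(mut self, name: impl Into<String>) -> Self {
        self.sheet_name = Some(name.into());
        self
    }

    /// Produces the final parameter struct.
    pub fn build(self) -> ExcelWriterParam {
        ExcelWriterParam {
            sheet_name: self.sheet_name,
        }
    }
}
```

With a single optional field the builder adds little over ExcelWriterParam::default(); it starts to pay off once several independent options exist.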

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 741f85d and 917cd44.

⛔ Files ignored due to path filters (1)
  • engine/docs/mdbook/src/action.md is excluded by !**/*.md
📒 Files selected for processing (2)
  • engine/runtime/action-sink/src/file/excel.rs (1 hunks)
  • engine/schema/actions.json (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • engine/schema/actions.json
🧰 Additional context used
🔇 Additional comments (1)
engine/runtime/action-sink/src/file/excel.rs (1)

13-14: LGTM: New imports for serialization and schema generation.

The added imports for JsonSchema, Serialize, and Deserialize are appropriate for the new ExcelWriterParam struct, enabling JSON schema generation and serialization/deserialization.

@soma00333 soma00333 force-pushed the feat/engine/addparameter-to-excelwrite branch from 917cd44 to 8a4ded2 Compare October 14, 2024 00:43
@soma00333 soma00333 requested a review from miseyu October 14, 2024 00:49
Contributor

@miseyu miseyu left a comment

Thank you so much.
LGTM!!

Please self merge

@soma00333 soma00333 merged commit 6a36e5a into main Oct 14, 2024
17 checks passed
@soma00333 soma00333 deleted the feat/engine/addparameter-to-excelwrite branch October 14, 2024 02:26