feat: add dataframe operations component #5341

Merged
merged 10 commits into from
Dec 19, 2024

@rodrigosnader (Contributor) commented Dec 18, 2024

This pull request adds a new component to handle various operations on DataFrames in the langflow library. The main changes include the addition of a new class, DataFrameOperationsComponent, which supports multiple DataFrame operations and dynamically updates its configuration based on the selected operation.

New Component Addition:

Key features of the new component:

  • Supports a variety of DataFrame operations (e.g., Add Column, Drop Column, Filter, Sort, etc.)
  • Dynamically updates input fields based on the operation selected
  • Includes methods to perform each supported operation on a DataFrame
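
The dynamic update of input fields described above could look roughly like the sketch below. It is a minimal illustration, not the component's code: the `OPERATION_FIELDS` mapping, the field names other than `columns_to_select`, the "Select Columns" label, and the `update_visibility` helper are all hypothetical.

```python
# Hypothetical mapping from operation name to the input fields it needs.
# Only "columns_to_select" is a field name that appears in this PR's review;
# every other field name here is an assumption made for illustration.
OPERATION_FIELDS = {
    "Add Column": ["new_column_name", "new_column_value"],
    "Drop Column": ["column_name"],
    "Filter": ["column_name", "filter_value"],
    "Sort": ["column_name", "ascending"],
    "Head": ["num_rows"],
    "Tail": ["num_rows"],
    "Select Columns": ["columns_to_select"],
}

# Every field that any operation might need.
ALL_FIELDS = sorted({f for fields in OPERATION_FIELDS.values() for f in fields})


def update_visibility(build_config: dict, operation: str) -> dict:
    """Show only the fields the selected operation needs and hide the rest."""
    needed = set(OPERATION_FIELDS.get(operation, []))
    for name in ALL_FIELDS:
        build_config.setdefault(name, {})["show"] = name in needed
    return build_config


config = update_visibility({}, "Sort")
visible = sorted(name for name, opts in config.items() if opts["show"])
print(visible)  # ['ascending', 'column_name']
```

Keeping the operation-to-fields mapping in one table means a new operation only needs one new entry rather than another branch in the update logic.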

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Dec 18, 2024
@ogabrielluiz ogabrielluiz changed the title add dataframe operations component feat: add dataframe operations component Dec 18, 2024
@dosubot dosubot bot added the enhancement New feature or request label Dec 18, 2024
@ogabrielluiz ogabrielluiz requested a review from Copilot December 18, 2024 17:58
@Copilot Copilot AI left a comment

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/backend/base/langflow/components/processing/dataframe_operations.py:207

  • The `columns_to_select` input should be split on a delimiter (e.g., a comma) before stripping. Use `columns = [col.strip() for col in self.columns_to_select.split(',')]` to ensure correct column selection.
    Line under review: `columns = [col.strip() for col in self.columns_to_select]`
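
A small sketch of the behavior this comment is about, assuming `columns_to_select` arrives as a single comma-separated string (the sample value is made up). If the input is already a list of names, iterating over it directly is fine, which may be why the comment was flagged as low confidence.

```python
# Made-up sample input: a single comma-separated string.
columns_to_select = "name, age ,city"

# Iterating over the string directly yields single characters, not column names.
per_char = [col.strip() for col in columns_to_select]

# Splitting on the delimiter first yields one entry per column name.
per_column = [col.strip() for col in columns_to_select.split(",")]

print(per_char[:4])  # ['n', 'a', 'm', 'e']
print(per_column)    # ['name', 'age', 'city']
```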

@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Dec 18, 2024
rodrigosnader and others added 7 commits December 18, 2024 17:30
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
….py`

* **Import modules**
  - Import `pytest` and `pandas` for testing DataFrame operations

* **Define test cases**
  - Define test cases for edge cases like empty DataFrames and invalid column names
  - Include tests for operations like "Head", "Tail", and "Replace Value"
  - Use `pytest.mark.parametrize` to test multiple operations with different inputs
  - Add detailed assertions to verify the correctness of DataFrame operations
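
A rough sketch of the style of test this commit message describes. It exercises plain pandas calls rather than the component itself, since the component's interface is not shown in this conversation. The test names and sample data are assumptions.

```python
import pandas as pd
import pytest


@pytest.mark.parametrize(
    ("operation", "num_rows", "expected"),
    [
        ("Head", 2, [1, 2]),
        ("Tail", 2, [3, 4]),
        ("Head", 0, []),
    ],
)
def test_head_and_tail(operation, num_rows, expected):
    # One test body covers several operation/input combinations.
    df = pd.DataFrame({"a": [1, 2, 3, 4]})
    result = df.head(num_rows) if operation == "Head" else df.tail(num_rows)
    assert result["a"].tolist() == expected


def test_empty_dataframe():
    # Edge case: an operation on an empty DataFrame should stay empty.
    df = pd.DataFrame({"a": []})
    assert df.head(3).empty


def test_invalid_column_name():
    # Edge case: sorting by a column that does not exist raises KeyError.
    df = pd.DataFrame({"a": [1, 2]})
    with pytest.raises(KeyError):
        df.sort_values(by="missing")
```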
…tions.py`. This deletion includes all unit tests related to various DataFrame operations such as adding, dropping, filtering, and renaming columns, as well as handling edge cases like empty DataFrames and invalid operations. The removal streamlines the test suite by eliminating outdated or redundant tests.
- Introduced a new test file  for organizing test components.
- Updated import paths for  to reflect the new module structure.
- Refactored test cases to use  for better readability and maintainability.
- Enhanced assertions in tests for various DataFrame operations, including handling of empty DataFrames and invalid operations.
- Improved code formatting for consistency and clarity.
…intainability

- Consolidated import statements for clarity.
- Renamed variable `df` to `dataframe_copy` for better understanding.
- Streamlined the `perform_operation` method by replacing `elif` with `if` statements for clearer logic flow.
- Enhanced error message for unsupported operations to improve debugging.

These changes aim to enhance the code structure and make future modifications easier.
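
The `elif`-to-`if` change and the clearer error for unsupported operations might look something like the following. This is a sketch under assumptions, not the merged code: the free-function signature, the parameter names (`num_rows`, `column_name`, `ascending`), and the exact error wording are hypothetical.

```python
import pandas as pd


def perform_operation(df: pd.DataFrame, operation: str, **params) -> pd.DataFrame:
    """Dispatch with early returns; each branch returns, so no elif chain is needed."""
    dataframe_copy = df.copy()  # work on a copy so the caller's data is untouched
    if operation == "Head":
        return dataframe_copy.head(params.get("num_rows", 5))
    if operation == "Tail":
        return dataframe_copy.tail(params.get("num_rows", 5))
    if operation == "Sort":
        return dataframe_copy.sort_values(
            by=params["column_name"], ascending=params.get("ascending", True)
        )
    if operation == "Drop Column":
        return dataframe_copy.drop(columns=[params["column_name"]])
    # Naming the offending operation makes the failure easier to debug.
    msg = f"Unsupported operation: {operation}"
    raise ValueError(msg)


df = pd.DataFrame({"a": [3, 1, 2], "b": ["x", "y", "z"]})
sorted_df = perform_operation(df, "Sort", column_name="a")
print(sorted_df["a"].tolist())  # [1, 2, 3]
```

Because every branch returns, falling through to the final `raise` can only happen for an operation name that none of the checks recognized.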
…ons.py`

- Modified expected values in parameterized tests for various DataFrame operations, including "Add Column", "Filter", "Sort", "Head", "Tail", and "Replace Value" to reflect new test scenarios.
- Adjusted assertions to ensure they correctly validate the output of operations, particularly for lists of expected values.
- Enhanced error handling in the test for invalid operations to provide clearer feedback on unsupported operation types.

These changes improve the accuracy and robustness of the unit tests for DataFrame operations.
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Dec 18, 2024
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Dec 19, 2024
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Dec 19, 2024
@ogabrielluiz ogabrielluiz added this pull request to the merge queue Dec 19, 2024
Merged via the queue into main with commit 62c13ad Dec 19, 2024
40 of 41 checks passed
@ogabrielluiz ogabrielluiz deleted the rn/dataframe branch December 19, 2024 16:24
anovazzi1 pushed a commit that referenced this pull request Dec 19, 2024
* add dataframe operations component

* populate entire new column with value

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* [autofix.ci] apply automated fixes

* Add unit tests for DataFrame operations in `test_dataframe_operations.py`

* **Import modules**
  - Import `pytest` and `pandas` for testing DataFrame operations

* **Define test cases**
  - Define test cases for edge cases like empty DataFrames and invalid column names
  - Include tests for operations like "Head", "Tail", and "Replace Value"
  - Use `pytest.mark.parametrize` to test multiple operations with different inputs
  - Add detailed assertions to verify the correctness of DataFrame operations

* [autofix.ci] apply automated fixes

* Remove test cases for DataFrame operations from `test_dataframe_operations.py`. This deletion includes all unit tests related to various DataFrame operations such as adding, dropping, filtering, and renaming columns, as well as handling edge cases like empty DataFrames and invalid operations. The removal streamlines the test suite by eliminating outdated or redundant tests.

* Add unit tests for DataFrame operations in

- Introduced a new test file  for organizing test components.
- Updated import paths for  to reflect the new module structure.
- Refactored test cases to use  for better readability and maintainability.
- Enhanced assertions in tests for various DataFrame operations, including handling of empty DataFrames and invalid operations.
- Improved code formatting for consistency and clarity.

* Refactor DataFrameOperationsComponent for improved readability and maintainability

- Consolidated import statements for clarity.
- Renamed variable `df` to `dataframe_copy` for better understanding.
- Streamlined the `perform_operation` method by replacing `elif` with `if` statements for clearer logic flow.
- Enhanced error message for unsupported operations to improve debugging.

These changes aim to enhance the code structure and make future modifications easier.

* Update unit tests for DataFrame operations in `test_dataframe_operations.py`

- Modified expected values in parameterized tests for various DataFrame operations, including "Add Column", "Filter", "Sort", "Head", "Tail", and "Replace Value" to reflect new test scenarios.
- Adjusted assertions to ensure they correctly validate the output of operations, particularly for lists of expected values.
- Enhanced error handling in the test for invalid operations to provide clearer feedback on unsupported operation types.

These changes improve the accuracy and robustness of the unit tests for DataFrame operations.

* Refactor DataFrameOperationsComponent methods to return DataFrame instances consistently

---------

Co-authored-by: Gabriel Luiz Freitas Almeida <gabriel@langflow.org>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.

3 participants