clean up docstrings: TextCleaner #8202

Merged
merged 2 commits into from
Aug 13, 2024
Changes from 1 commit
21 changes: 10 additions & 11 deletions haystack/components/preprocessors/text_cleaner.py
@@ -12,14 +12,14 @@
 @component
 class TextCleaner:
     """
-    A PreProcessor component to clean text data.
+    Cleans the text strings.

-    It can remove substrings matching a list of regular expressions, convert text to lowercase, remove punctuation,
-    and remove numbers.
+    It can remove substrings matching a list of regular expressions, convert text to lowercase,
+    remove punctuation, and remove numbers.
+    Use it to clean up text data before evaluation.

-    This is useful to clean up text data before evaluation.
+    ### Usage example

-    Usage example:
     ```python
     from haystack.components.preprocessors import TextCleaner

@@ -38,13 +38,12 @@ def __init__(
         remove_numbers: bool = False,
     ):
         """
-        Initialize the TextCleaner component.
+        Initializes the TextCleaner component.

-        :param remove_regexps: A list of regular expressions. If provided, it removes substrings
-            matching these regular expressions from the text.
-        :param convert_to_lowercase: If True, converts all characters to lowercase.
-        :param remove_punctuation: If True, removes punctuation from the text.
-        :param remove_numbers: If True, removes numerical digits from the text.
+        :param remove_regexps: A list of regex patterns to remove matching substrings from the text.
+        :param convert_to_lowercase: If `True`, converts all characters to lowercase.
+        :param remove_punctuation: If `True`, removes punctuation from the text.
+        :param remove_numbers: If `True`, removes numerical digits from the text.
         """
         self._remove_regexps = remove_regexps
         self._convert_to_lowercase = convert_to_lowercase
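
For readers who want to see what the four documented options amount to, here is a minimal standalone sketch using only the Python standard library. It is not the Haystack implementation: the function name `clean_texts` is hypothetical, and the order in which the steps run (regex removal, then lowercasing, then character removal) is an assumption, since the diff above does not show it. Only the parameter names are taken from the docstring.

```python
import re
import string
from typing import List, Optional


def clean_texts(
    texts: List[str],
    remove_regexps: Optional[List[str]] = None,
    convert_to_lowercase: bool = False,
    remove_punctuation: bool = False,
    remove_numbers: bool = False,
) -> List[str]:
    """Hypothetical helper mirroring the documented TextCleaner options."""
    # Compile each pattern once so it can be reused for every text.
    patterns = [re.compile(p) for p in (remove_regexps or [])]

    # Build a single translation table that deletes the unwanted characters.
    to_delete = ""
    if remove_punctuation:
        to_delete += string.punctuation
    if remove_numbers:
        to_delete += string.digits
    table = str.maketrans("", "", to_delete)

    cleaned = []
    for text in texts:
        # Assumed order: regex removal, lowercasing, then character removal.
        for pattern in patterns:
            text = pattern.sub("", text)
        if convert_to_lowercase:
            text = text.lower()
        cleaned.append(text.translate(table))
    return cleaned


if __name__ == "__main__":
    print(clean_texts(["Hello, World 123!"], convert_to_lowercase=True,
                      remove_punctuation=True, remove_numbers=True))
```

With every flag left at its default, the sketch returns the input unchanged, which matches the docstring's framing of each option as opt-in.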