markdown
AmazonReviewScraperWorkflow
The AmazonReviewScraperWorkflow is a Yeager component designed to scrape Amazon product reviews and return a spreadsheet URL containing the scraped data. This component is part of a larger Yeager Workflow that performs specific data scraping and transformation tasks.
product_name
(str): Name of the Amazon product for which the reviews will be scraped.
spreadsheet_url
(str): URL of the spreadsheet containing the scraped reviews.
Both input and output data are modeled using Pydantic's BaseModel to define and validate the required fields.
args
(InputProduct): An instance of the InputProduct model containing theproduct_name
.callbacks
(typing.Any): Callback functions for various events within the transform method. Default value isNone
.
The transform
method of the AmazonReviewScraperWorkflow component takes the args
and callbacks
as input and performs the following steps:
- Invokes the transform method of the base
AbstractWorkflow
class using the inputargs
andcallbacks
. This returns a dictionary containing the results of the data transformation tasks. - Retrieves the spreadsheet URL from the
results_dict
using thespreadsheet_created
key. - Creates an instance of the
OutputProductReviews
model with thespreadsheet_url
. - Returns the
OutputProductReviews
instance containing the spreadsheet URL.
dotenv
: Library used to load environment variables from a.env
file.fastapi
: Library used to build the API for handling input and output.pydantic
: Library used for data validation and serialization through data models (InputProduct and OutputProductReviews).
The component doesn't explicitly make external API calls within the given source code. However, it is part of a larger Yeager Workflow that may require external API calls to access Amazon product reviews and create the spreadsheet containing the scraped data.
Errors are not explicitly handled within the provided source code of this component. However, any exceptions that occur during the execution of the transform
method will propagate to the caller for handling. Usage of Pydantic's BaseModel for input and output validation ensures that invalid data will be automatically rejected.