Add consumer complaints generation dataset (#3685)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justin Zhao <justinxzhao@gmail.com>
1 parent 4b3ccad, commit 2772e9a
Showing 1 changed file with 28 additions and 0 deletions.
ludwig/datasets/configs/consumer_complaints_generation.yaml
@@ -0,0 +1,28 @@
version: 1.0
name: consumer_complaints_generation
download_urls: https://predibase-public-us-west-2.s3.us-west-2.amazonaws.com/datasets/consumer_complaints_gen_tutorial.csv
train_filenames: consumer_complaints_gen_tutorial.csv
description: |
  The dataset contains information about complaints that customers have made about multiple products and
  services in the financial sector, such as Credit Reports, Student Loans, Money Transfer, etc. The complaint
  dates range from November 2011 to May 2019. The dataset has been modified to be used for text generation.
  We have added a structured JSON field that contains a company-generated response to the raised complaint. The idea
  is to fine-tune an LLM to generate this output JSON field.
columns:
  - name: Date received
    type: Date
  - name: Generated Company Response
    type: text
  - name: Complaint ID
    type: number
  - name: Issue
    type: text
  - name: Product
    type: text
  - name: Structured JSON Output
    type: text
  - name: Consumer complaint narrative
    type: text
output_features:
  - name: Structured JSON Output
    type: text