Add consumer complaints generation dataset (#3685)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justin Zhao <justinxzhao@gmail.com>
3 people authored Oct 10, 2023
1 parent 4b3ccad commit 2772e9a
Showing 1 changed file with 28 additions and 0 deletions.
28 changes: 28 additions & 0 deletions ludwig/datasets/configs/consumer_complaints_generation.yaml
@@ -0,0 +1,28 @@
version: 1.0
name: consumer_complaints_generation
download_urls: https://predibase-public-us-west-2.s3.us-west-2.amazonaws.com/datasets/consumer_complaints_gen_tutorial.csv
train_filenames: consumer_complaints_gen_tutorial.csv
description: |
  The dataset contains information about complaints that customers have made about multiple products and
  services in the financial sector, such as Credit Reports, Student Loans, Money Transfer, etc. The date of each
  complaint ranges from November 2011 to May 2019. The dataset has been modified to be used for text generation.
  We have added a structured JSON field that contains a company-generated response to the raised complaint. The idea
  is to fine-tune an LLM to generate this output JSON field.
columns:
  - name: Date received
    type: Date
  - name: Generated Company Response
    type: text
  - name: Complaint ID
    type: number
  - name: Issue
    type: text
  - name: Product
    type: text
  - name: Structured JSON Output
    type: text
  - name: Consumer complaint narrative
    type: text
output_features:
  - name: Structured JSON Output
    type: text
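
The description says the goal is to fine-tune an LLM to generate the `Structured JSON Output` field. A minimal sketch of a Ludwig model config that could consume this dataset is shown below. It is not part of this commit: the base model, the choice of `Consumer complaint narrative` as the input column, and the LoRA adapter are all assumptions made for illustration.

```yaml
# Hypothetical fine-tuning config (not part of this commit).
model_type: llm
# Assumption: any Hugging Face causal LM identifier could be used here.
base_model: meta-llama/Llama-2-7b-hf

input_features:
  # Assumption: the free-text complaint is used as the model input.
  - name: Consumer complaint narrative
    type: text

output_features:
  # Matches the output feature declared in the dataset config above.
  - name: Structured JSON Output
    type: text

# Assumption: parameter-efficient fine-tuning with a LoRA adapter.
adapter:
  type: lora

trainer:
  type: finetune
```

The output feature name has to match the column name in the CSV exactly, including spaces and capitalization, since Ludwig looks features up by column name.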
