Commit

support training with only positive pairs (DatasetFormats.C)
SeanLee97 committed Feb 7, 2024
1 parent 21e9032 commit ee91924
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -229,6 +229,8 @@ We support two dataset formats:

2) `DatasetFormats.B`: it is a triple format with three columns: `text`, `positive`, and `negative`. `positive` and `negative` store the positive and negative samples of `text`.

3) `DatasetFormats.C`: it is a pair format with two columns: `text` and `positive`. `positive` stores the positive sample of `text`.

You need to prepare your data as a Hugging Face `datasets.Dataset` in whichever of these formats matches your supervised data.
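
As a rough illustration of the column layouts described above, the following sketch builds a `DatasetFormats.C` style dataset from a list of pairs. The column names come from the README text; the helper `to_columns`, the `FORMAT_COLUMNS` table, and the sample sentences are hypothetical and not part of the library. The final step assumes the Hugging Face `datasets` package is installed and is skipped otherwise.

```python
# Column names are taken from the README; everything else here is illustrative.
FORMAT_COLUMNS = {
    "B": ("text", "positive", "negative"),  # triple format
    "C": ("text", "positive"),              # pair format (positive pairs only)
}


def to_columns(rows, fmt):
    """Turn a list of row dicts into a column dict, checking the expected columns."""
    expected = FORMAT_COLUMNS[fmt]
    for i, row in enumerate(rows):
        if set(row) != set(expected):
            raise ValueError(
                f"row {i} has columns {sorted(row)}, expected {sorted(expected)}"
            )
    return {name: [row[name] for row in rows] for name in expected}


# Made-up positive pairs, for illustration only.
pairs = [
    {"text": "A man is playing a guitar.", "positive": "Someone plays the guitar."},
    {"text": "A dog runs on the beach.", "positive": "A dog is running by the sea."},
]
columns_c = to_columns(pairs, "C")

try:
    from datasets import Dataset  # Hugging Face `datasets`

    train_ds = Dataset.from_dict(columns_c)
except ImportError:
    train_ds = None  # package not installed; columns_c still shows the layout
```

Checking the columns up front is a design choice for this sketch: a row with a missing `positive` (or a stray `negative` in pair data) fails early with a readable message rather than later during training.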

### 2. Train