Commit

Revert "quick fix on concatenating text to support more datasets (huggingface#8474)"

This reverts commit 677c6d9.
fabiocapsouza authored Nov 15, 2020
1 parent 408656c commit a3bad48
Showing 3 changed files with 3 additions and 3 deletions.
2 changes: 1 addition & 1 deletion examples/language-modeling/run_clm.py
@@ -254,7 +254,7 @@ def tokenize_function(examples):
         tokenize_function,
         batched=True,
         num_proc=data_args.preprocessing_num_workers,
-        remove_columns=column_names,
+        remove_columns=[text_column_name],
         load_from_cache_file=not data_args.overwrite_cache,
     )

2 changes: 1 addition & 1 deletion examples/language-modeling/run_mlm.py
@@ -292,7 +292,7 @@ def tokenize_function(examples):
         tokenize_function,
         batched=True,
         num_proc=data_args.preprocessing_num_workers,
-        remove_columns=column_names,
+        remove_columns=[text_column_name],
         load_from_cache_file=not data_args.overwrite_cache,
     )

2 changes: 1 addition & 1 deletion examples/language-modeling/run_plm.py
@@ -279,7 +279,7 @@ def tokenize_function(examples):
         tokenize_function,
         batched=True,
         num_proc=data_args.preprocessing_num_workers,
-        remove_columns=column_names,
+        remove_columns=[text_column_name],
         load_from_cache_file=not data_args.overwrite_cache,
     )
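
Note on the changed argument: all three hunks swap `remove_columns=column_names` for `remove_columns=[text_column_name]`. The sketch below illustrates what that difference likely means for the mapped output. It is a plain-Python stand-in, not the `datasets` library: `map_batched`, the toy whitespace tokenizer, and the sample `raw` data with its `label` column are all hypothetical and exist only for illustration.

```python
def map_batched(columns, fn, remove_columns):
    """Hypothetical stand-in for a batched map over a column dict.

    Applies fn to the whole batch, merges the returned columns into a copy
    of the input, then drops every column named in remove_columns.
    """
    out = dict(columns)
    out.update(fn(columns))
    for name in remove_columns:
        out.pop(name, None)
    return out


def tokenize_function(examples):
    # Toy whitespace "tokenizer" (assumption; the real scripts call a tokenizer object).
    return {"input_ids": [text.split() for text in examples["text"]]}


# Hypothetical dataset with one extra column besides the text.
raw = {"text": ["hello world", "foo bar baz"], "label": [0, 1]}
column_names = list(raw)
text_column_name = "text"

# Before this commit: every original column is dropped after tokenization.
all_removed = map_batched(raw, tokenize_function, remove_columns=column_names)

# After this commit: only the text column is dropped, so extra columns survive.
text_removed = map_batched(raw, tokenize_function, remove_columns=[text_column_name])

print(sorted(all_removed))   # ['input_ids']
print(sorted(text_removed))  # ['input_ids', 'label']
```

Under these assumptions, the reverted form keeps any non-text columns (here `label`) in the tokenized output, which matters to later steps that expect only token fields.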

