
Update run_glue for do_predict with local test data (#9442) #9486

Conversation

forest1988 (Contributor) commented on Jan 8, 2021

What does this PR do?

Currently, run_glue.py cannot use the test set (do_predict) unless we give it a GLUE task name.
This PR will allow us to use the local test dataset.

As commented in #9442, I tried to achieve the functionality with only simple changes.

  • It still works with only the local train and validation files (in other words, this PR does not break the current behavior).
  • If we add --do_predict without adding the required arguments, we get an error message saying that we need either the GLUE task name or the path of a local test file (see the sketch below).

Fixes #9442
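
A minimal sketch of the argument check described above, with illustrative field and function names (the actual check in run_glue.py lives in the script's dataclass, and its exact message may differ):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class DataTrainingArguments:
        task_name: Optional[str] = None  # GLUE task name, e.g. "mrpc"
        test_file: Optional[str] = None  # path to a local csv/json test file

    def check_predict_args(data_args: DataTrainingArguments, do_predict: bool) -> None:
        # With --do_predict we need either a GLUE task (whose test split is
        # downloaded automatically) or a path to a local test file.
        if do_predict and data_args.task_name is None and data_args.test_file is None:
            raise ValueError("Need either a GLUE task name or a `test_file` for `do_predict`.")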

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@sgugger

Thank you for your kind comments on the issue.
I have tried to keep the change simple, and I hope it still works well as an example script.

forest1988 (Contributor, Author) commented on Jan 8, 2021

The CircleCI error messages are:

-- Docs: https://docs.pytest.org/en/stable/warnings.html
=========================== short test summary info ============================
FAILED tests/test_pipelines_conversational.py::SimpleConversationPipelineTests::test_history_cache
FAILED tests/test_pipelines_conversational.py::SimpleConversationPipelineTests::test_integration_torch_conversation
==== 2 failed, 4207 passed, 1744 skipped, 734 warnings in 190.84s (0:03:10) ====
FAILED tests/test_pipelines_conversational.py::SimpleConversationPipelineTests::test_history_cache
==== 1 failed, 4178 passed, 1774 skipped, 735 warnings in 260.31s (0:04:20) ====

Sorry to ask, but is run_glue.py related to the conversation pipeline?

sgugger (Collaborator) left a comment

The failures do seem spurious. I've left a comment about moving the test dataset creation a bit, but overall it looks good to me.

Comment on lines 420 to 428
    if data_args.task_name is None and data_args.test_file is not None:
        extension = data_args.test_file.split(".")[-1]
        assert extension in ["csv", "json"], "`test_file` should be a csv or a json file."
        if data_args.test_file.endswith(".csv"):
            # Loading a dataset from a local csv file
            test_dataset = load_dataset("csv", data_files={"test": data_args.test_file})
        else:
            # Loading a dataset from a local json file
            test_dataset = load_dataset("json", data_files={"test": data_args.test_file})
sgugger (Collaborator):

Can we move those lines earlier, next to the validation dataset? That way the map preprocessing will be applied together with the other splits. I think we can do something nice by creating data_files={"train": data_args.train_file, "validation": data_args.validation_file} and then adding the "test" key if test_file is passed.
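
A minimal sketch of the suggested restructuring, assuming the data_args/training_args fields used elsewhere in this PR and the datasets library's load_dataset:

    from datasets import load_dataset

    # Collect all requested splits into a single data_files dict ...
    data_files = {"train": data_args.train_file, "validation": data_args.validation_file}
    if training_args.do_predict and data_args.test_file is not None:
        data_files["test"] = data_args.test_file

    # ... so one load_dataset call (and the later .map) covers every split.
    extension = data_args.train_file.split(".")[-1]  # "csv" or "json"
    datasets = load_dataset(extension, data_files=data_files)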

forest1988 (Contributor, Author):

Thank you for your comment! I've addressed the review.
The code now has more nested if statements, but I think readability may have improved in terms of "use a GLUE task" versus "use local files". What do you think?

forest1988 (Contributor, Author):

I also added logger output so users can confirm that the local files they specified are loaded correctly. If this is superfluous, please let me know and I will remove it.

sgugger (Collaborator) left a comment

Looking good! Left two more small comments and it should be good to merge!

    if training_args.do_predict:
        if data_args.test_file is not None:
            extension = data_args.test_file.split(".")[-1]
            assert extension in ["csv", "json"], "`test_file` should be a csv or a json file."
sgugger (Collaborator):

The extension will need to be the same as for the training and validation files, so we should adapt this assert to test that.

forest1988 (Contributor, Author):

Following your comment, the assert now checks that the test file has the same extension as the train file (see the sketch below).
I also noticed there was no check that the validation file has the same extension as the train file, so I added one. Is this change OK?
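
A minimal sketch of the extension checks described above; the exact assert messages in the merged script may differ:

    train_extension = data_args.train_file.split(".")[-1]
    assert train_extension in ["csv", "json"], "`train_file` should be a csv or a json file."

    validation_extension = data_args.validation_file.split(".")[-1]
    assert validation_extension == train_extension, "`validation_file` should have the same extension as `train_file`."

    if training_args.do_predict and data_args.test_file is not None:
        test_extension = data_args.test_file.split(".")[-1]
        assert test_extension == train_extension, "`test_file` should have the same extension as `train_file`."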

Comment on lines 234 to 236
    datasets = load_dataset(
        "json", data_files=data_files
    )
sgugger (Collaborator):

Suggested change

    - datasets = load_dataset(
    -     "json", data_files=data_files
    - )
    + datasets = load_dataset("json", data_files=data_files)

Can fit in one line now :-)

forest1988 (Contributor, Author):

Thank you!
It seems the old, pre-auto-format code was left in place.
I have applied the auto-format in b2936c3; could you please check that it now fits on one line?

Comment on lines +224 to +225
    for key in data_files.keys():
        logger.info(f"load a local file for {key}: {data_files[key]}")
sgugger (Collaborator):

This is good info to log, thanks for adding!

sgugger requested a review from LysandreJik on January 11, 2021
LysandreJik (Member) left a comment

Great, thanks for adding!

sgugger merged commit eabad8f into huggingface:master on Jan 13, 2021
forest1988 (Contributor, Author):

@sgugger @LysandreJik
Thank you for reviewing and merging!

Merging this pull request closes issue #9442: [examples/text-classification] do_predict for the test set of local datasets