[examples] add main_process_first context manager to datasets map calls
#12363
Comments
Can I take this? |
Yes, thank you, @bhadreshpsavani |
Hi @stas00 and @sgugger,
transformers/examples/pytorch/question-answering/utils_qa.py Lines 416 to 425 in 9a75459
Shall we use logger.info() instead of print() here, as we did in the code below?
transformers/examples/pytorch/question-answering/utils_qa.py Lines 228 to 237 in 9a75459
Or is it intentionally written like this? Because of this, when we run the
|
good catch, @bhadreshpsavani! Please feel free to make a separate PR if you don't want to mix this with this particular change. |
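To illustrate the print() vs logger.info() point raised above, here is a minimal sketch using only the standard library. The logger name, the `save_predictions` helper, and the file names are hypothetical and are not taken from utils_qa.py. The sketch only shows why a logger call is easier to control than a bare print(): its output follows the configured log level.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("utils_qa_sketch")
logger.propagate = False  # keep the demo output out of the root logger


class ListHandler(logging.Handler):
    """Collects formatted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def save_predictions(path, log):
    # Hypothetical helper: with print() this line would always be shown;
    # with log.info() it is shown only when the level allows INFO.
    log.info("Saving predictions to %s.", path)


handler = ListHandler()
logger.addHandler(handler)

logger.setLevel(logging.INFO)
save_predictions("predictions.json", logger)        # recorded

logger.setLevel(logging.WARNING)
save_predictions("nbest_predictions.json", logger)  # suppressed
```

After running this, `handler.messages` holds only the first message, because the second call was made while the logger level was WARNING.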
Hi @stas00 and @sgugger,
we are getting
The fix is predict_dataset.remove_columns("label"). Shall we change it? It is also present at the line below:
|
Yes, except you now need to assign the return value, since this is no longer an in-place edit. Therefore in both places it'll now be:
with the right x of course. thank you for fixing it. reference: https://huggingface.co/docs/datasets/processing.html#removing-one-or-several-columns-remove-columns |
I have committed changes in the open PR for the fix of this warning! |
We need to replay this addition that has been modelled in run_translation.py in #12351 to all other pytorch examples.
The actual changes for the model example are:
https://github.com/huggingface/transformers/pull/12351/files#diff-09777f56cee1060a535a72ce99a6c96cdb7f330c8cc3f9dcca442b3f7768237a
(just run_translation.py)
Here is a time-saver:
I noticed other scripts may have other datasets.map calls, which get automatically rewritten by the scripts above, so please review the changes to see if the desc needs to be modified. But we want to use the context manager on all of these calls, and it's possible that the perl rewrite scripts didn't catch some.
The template templates/adding_a_new_example_script/{{cookiecutter.directory_name}}/run_{{cookiecutter.example_shortcut}}.py needs the same change; you can do it via perl or manually or whatever other way works for you.
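For readers unfamiliar with the pattern, here is a rough sketch of what a "main process first" context manager does. It is not the transformers implementation: it simulates ranks with threads and a `threading.Barrier` so that it runs without any distributed setup. The names `main_process_first` (as a free function), `run_demo`, and the `cache` dict are assumptions made for this illustration. In the example scripts the equivalent is expected to look like `with training_args.main_process_first(desc="..."):` wrapped around each `datasets.map` call, as modelled in run_translation.py.

```python
import contextlib
import threading


@contextlib.contextmanager
def main_process_first(is_main, barrier):
    """Run the body on the main worker first; the others wait, then run it."""
    if not is_main:
        barrier.wait()  # replicas block here until the main worker is done
    try:
        yield
    finally:
        if is_main:
            barrier.wait()  # main worker finished the body: release replicas


def run_demo(world_size=3):
    barrier = threading.Barrier(world_size)
    lock = threading.Lock()
    cache = {}   # stands in for the on-disk cache written by datasets.map
    order = []   # records which rank entered the body, in order

    def worker(rank):
        with main_process_first(is_main=(rank == 0), barrier=barrier):
            with lock:
                # Only the first worker to arrive does the expensive work;
                # the rest would load the cached result instead.
                if "tokenized" not in cache:
                    cache["tokenized"] = f"computed by rank {rank}"
                order.append(rank)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(world_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order, cache


order, cache = run_demo()
```

In this sketch rank 0 always enters the body first, so the preprocessing is computed once and the remaining workers find it in the cache. That is the reason for wrapping every map call rather than only some of them.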
And please validate that scripts still work, by either running:
or running each script manually as explained in its corresponding README.md file.
This issue is open to all and should be very simple to complete; the main effort is to validate.
And thank you for your contribution!