NLP -> HF Datasets #1137

Merged (5 commits) on Oct 19, 2020

zphang (Collaborator) commented Oct 7, 2020

Switch from nlp to datasets==1.1.2 (the Hugging Face library formerly published as nlp).

In code, try to refer to HF Datasets / hf_datasets rather than just "datasets" to minimize confusion.

Tested with the following command:

BASE_PATH=/path/to/download
STATUS_PATH=${BASE_PATH}/status
mkdir -p ${STATUS_PATH}
python jiant/scripts/download_data/runscript.py \
    download \
    --benchmark GLUE \
    --output_path ${BASE_PATH}/tasks/ && \
    touch ${STATUS_PATH}/glue_done
python jiant/scripts/download_data/runscript.py \
    download \
    --benchmark SUPERGLUE \
    --output_path ${BASE_PATH}/tasks/ && \
    touch ${STATUS_PATH}/superglue_done
# Doesn't include PAN-X because it requires an external file
python jiant/scripts/download_data/runscript.py \
    download \
    --tasks xnli pawsx udpos xquad mlqa tydiqa bucc2018 tatoeba \
    --output_path ${BASE_PATH}/tasks/ && \
    touch ${STATUS_PATH}/xtreme_done
# Tasks fetched through HF Datasets
python jiant/scripts/download_data/runscript.py \
    download \
    --tasks snli commonsenseqa hellaswag cosmosqa socialiqa scitail \
    --output_path ${BASE_PATH}/tasks/ && \
    touch ${STATUS_PATH}/hf_datasets_done
# Tasks fetched as direct file downloads
python jiant/scripts/download_data/runscript.py \
    download \
    --tasks squad_v1 squad_v2 abductive_nli swag qamr qasrl \
    --output_path ${BASE_PATH}/tasks/ && \
    touch ${STATUS_PATH}/dl_datasets_done
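
Each command above touches a marker file only when its download succeeds, so the status directory can be inspected afterwards to see which steps passed. A minimal sketch of such a check follows; the function name check_download_status is hypothetical and not part of jiant, and the marker names are simply the ones used in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of jiant): report which download steps
# left their marker file in the given status directory.
check_download_status() {
    local status_path="$1" marker missing=0
    for marker in glue_done superglue_done xtreme_done hf_datasets_done dl_datasets_done; do
        if [ -f "${status_path}/${marker}" ]; then
            echo "ok      ${marker}"
        else
            echo "missing ${marker}"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every marker file was found.
    [ "${missing}" -eq 0 ]
}
```

After the download commands finish, it could be called as check_download_status "${STATUS_PATH}"; a non-zero exit status means at least one step did not complete.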

pep8speaks commented Oct 7, 2020

Hello @zphang! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 2:101: E501 line too long (118 > 100 characters)
Line 26:101: E501 line too long (101 > 100 characters)
Line 39:101: E501 line too long (106 > 100 characters)
Line 47:101: E501 line too long (126 > 100 characters)

You can repair most issues by installing black and running: black -l 100 ./*. If you contribute often, have a look at the 'Contributing' section of the README for instructions on doing this automatically.

Comment last updated at 2020-10-19 17:50:30 UTC

zphang (Collaborator, Author) commented Oct 7, 2020

(Not a high priority to merge)

zphang closed this Oct 7, 2020
zphang reopened this Oct 7, 2020

jeswan (Collaborator) left a comment

Thanks!

jeswan merged commit c3387a3 into nyu-mll:master on Oct 19, 2020
leo-liuzy pushed a commit to leo-liuzy/dynamic_jiant that referenced this pull request Nov 11, 2020
Co-authored-by: jeswan <57466294+jeswan@users.noreply.github.com>