Text classification datasets with new torchtext dataset abstraction #701

Merged: 48 commits, Apr 21, 2020

Commits on Feb 28, 2020

  1. new dataset design

    Guanheng Zhang committed Feb 28, 2020
    9d8aa61
  2. remove doc

    Guanheng Zhang committed Feb 28, 2020
    1031efb
  3. minor

    Guanheng Zhang committed Feb 28, 2020
    371c23e

Commits on Mar 2, 2020

  1. 29e5d0a
  2. revise build_vocab func in torchtext.experimental.datasets.new_text_classification

    Guanheng Zhang committed Mar 2, 2020
    8187b84
  3. flake8

    Guanheng Zhang committed Mar 2, 2020
    6272f57
  4. docs

    Guanheng Zhang committed Mar 2, 2020
    15e2a94

Commits on Mar 3, 2020

  1. switch transforms to torch.nn.Module

    Guanheng Zhang committed Mar 3, 2020
    6611a09
  2. add default None to tokenizer_name in TokenizerTransform

    Guanheng Zhang committed Mar 3, 2020
    7c31d2e
  3. jit support for Dict[str, int] vocab in VocabTransform

    Guanheng Zhang committed Mar 3, 2020
    c5d487a
  4. remove F821

    Guanheng Zhang committed Mar 3, 2020
    7c9c969
  5. add functional.py file

    Guanheng Zhang committed Mar 3, 2020
    4689b29
  6. minor fix to have split tokenizer scriptable

    Guanheng Zhang committed Mar 3, 2020
    fe90c51
  7. add functional.py file

    Guanheng Zhang committed Mar 3, 2020
    d9ef2ee

Commits on Mar 12, 2020

  1. add a wrapper to support one-command data loading

    Guanheng Zhang committed Mar 12, 2020
    e98ae46
  2. add raw file

    Guanheng Zhang committed Mar 12, 2020
    8ce6779

Commits on Mar 13, 2020

  1. flake8

    Guanheng Zhang committed Mar 13, 2020
    8291ebc

Commits on Mar 19, 2020

  1. update raw text classification dataset docs

    Guanheng Zhang committed Mar 19, 2020
    1864e7d
  2. minor docs

    Guanheng Zhang committed Mar 19, 2020
    64cbde6
  3. add ngrams

    Guanheng Zhang committed Mar 19, 2020
    51d1b8e
  4. add label transform

    Guanheng Zhang committed Mar 19, 2020
    855e701

Commits on Mar 20, 2020

  1. combine imdb and text classification datasets

    Guanheng Zhang committed Mar 20, 2020
    a955579
  2. add more attributes to dataset API

    Guanheng Zhang committed Mar 20, 2020
    fa3565b
  3. update text classification datasets docs

    Guanheng Zhang committed Mar 20, 2020
    94870df
  4. remove two transforms

    Guanheng Zhang committed Mar 20, 2020
    74f50b6
  5. add get_vocab in text_classification

    Guanheng Zhang committed Mar 20, 2020
    55e4848
  6. minor fix

    Guanheng Zhang committed Mar 20, 2020
    db66774
  7. Add TextSequential

    Guanheng Zhang committed Mar 20, 2020
    5a20115
  8. switch text classification to TextSequential

    Guanheng Zhang committed Mar 20, 2020
    650928a

Commits on Mar 23, 2020

  1. fix flake8 error

    Guanheng Zhang committed Mar 23, 2020
    2447837
  2. add vocab to dataset

    Guanheng Zhang committed Mar 23, 2020
    be20884
  3. add docs strings for transforms.

    Guanheng Zhang committed Mar 23, 2020
    a6bc30a
  4. move raw datasets to a separate folder

    Guanheng Zhang committed Mar 23, 2020
    3821282
  5. .flake8 file

    Guanheng Zhang committed Mar 23, 2020
    9b97ac2
  6. move raw text folder

    Guanheng Zhang committed Mar 23, 2020
    b565565
  7. move transforms to experimental.datasets.text_classification

    Guanheng Zhang committed Mar 23, 2020
    ebe87f7
  8. Fix IMDB

    Guanheng Zhang committed Mar 23, 2020
    e382503

Commits on Apr 1, 2020

  1. remove some transforms in experimental text classification

    Guanheng Zhang committed Apr 1, 2020
    c711c34

Commits on Apr 9, 2020

  1. switch raw dataset to iterable style

    Guanheng Zhang committed Apr 9, 2020
    9a0c3ac
  2. add sequential_transforms

    Guanheng Zhang committed Apr 9, 2020
    c6f6a42
  3. Merge branch 'master' into new_dataset_design

    Guanheng Zhang committed Apr 9, 2020
    f1d394c
  4. add get_iterator func

    Guanheng Zhang committed Apr 9, 2020
    7404519
  5. flake8

    Guanheng Zhang committed Apr 9, 2020
    bc2c83a
  6. support partial cache for raw text classification dataset

    Guanheng Zhang committed Apr 9, 2020
    644b759

Commits on Apr 13, 2020

  1. Merge branch 'master' into new_dataset_design

    Guanheng Zhang committed Apr 13, 2020
    2f93dec

Commits on Apr 14, 2020

  1. change None arguments

    Guanheng Zhang committed Apr 14, 2020
    aa15019
  2. change import raw path

    Guanheng Zhang committed Apr 14, 2020
    793349c

Commits on Apr 21, 2020

  1. 33053e8