Enabling SST2 dataset usage in fbcode (#1426)
* include pytorch 1.5.0-rc1 for CI test * bump up the version * Set up ShipIt fbshipit-source-id: bb7d2eb52240c7223b57c3c9624e61d116e77e39 * Re-sync with internal repository (#749) * 20200429 pytorch/text import Summary: [20:45:34: cpuhrsch@devvm3140 pytorch]$ ./fb_build/import_text.sh Reviewed By: pbelevich Differential Revision: D21320577 fbshipit-source-id: ac2148b9f0d58e5538443c879845bfb4f6ca7202 * 20200430 torchtext import script to include additional meta files Summary: ./fb_build/import_text.sh Reviewed By: zhangguanheng66 Differential Revision: D21343124 fbshipit-source-id: c08ecad2cc6f439fa40130aeaf91383be9403fe8 * torchtext flake8, github, travis metafiles Summary: See title Reviewed By: pbelevich Differential Revision: D21344211 fbshipit-source-id: a8bcf7f3ab9bb2c2853e27f612e82caa341d3651 * Import torchtext 20200520 and update build Summary: Import torchtext up to #786 Reviewed By: cpuhrsch Differential Revision: D21483116 fbshipit-source-id: bc8ab38db9dc9ce4a8734ca8ea991c20e4ef0882 * Import torchtext 20200528 Summary: Import up to #798 Addresses T67599333 Reviewed By: zhangguanheng66 Differential Revision: D21764935 fbshipit-source-id: f44d1db637799f2e95f420a8099fbf19545c7cbd * 20200604 torchtext github import Summary: Import from github master Reviewed By: zhangguanheng66 Differential Revision: D21886238 fbshipit-source-id: a8f098e299466dd1701fe7ceb6a97c2a2fc54b9d * Import torchtext 20200605 Summary: Import from github master Reviewed By: zhangguanheng66 Differential Revision: D21907519 fbshipit-source-id: f22370d97796da5f2cb9f76f506c80f18fefea7f * Back out "Import torchtext 20200605" Summary: Original commit changeset: f22370d97796 Reviewed By: zhangguanheng66 Differential Revision: D21964222 fbshipit-source-id: c316836596fc3e232e63abc59e172f237b551cc5 * Import torchtext 2020/06/22 Summary: Import from github torchtext/master Reviewed By: zhangguanheng66, cpuhrsch Differential Revision: D22168183 fbshipit-source-id: 
7d96ade64f18942d9bd19437011be2f65f0b2a5e * Fix torch.testing._internal module not found Reviewed By: Nayef211 Differential Revision: D22315715 fbshipit-source-id: 6b8b8544b0aa458cf5e7e9ca380d0dc85c98189f * Import torchtext 2020/07/07 Summary: Import from github torchtext/master Reviewed By: cpuhrsch Differential Revision: D22420576 fbshipit-source-id: 4d2c19d7f1db8f698894ca406c1c44b2ad8e0506 * remediation of S205607 fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac * remediation of S205607 fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3 * Import torchtext 2020/07/21 Summary: Import from github torchtext/master Reviewed By: zhangguanheng66 Differential Revision: D22641140 fbshipit-source-id: 8190692d059a937e25c5f93506581086f389c291 * Remove .python3 markers Reviewed By: ashwinp-fb Differential Revision: D22955630 fbshipit-source-id: f00ef17a905e4c7cd9196c8924db39f9cdfe8cfa * Import torchtext 2020/08/06 Summary: Import from github torchtext/master Reviewed By: zhangguanheng66 Differential Revision: D22989210 fbshipit-source-id: 083464e188b758a8746123f4dd2197cc7edc4bc4 * Import torchtext 2020/08/18 Summary: Import from github torchtext/master Reviewed By: cpuhrsch Differential Revision: D23190596 fbshipit-source-id: 1568a25a5bd6431bcef3c6539f64a3ab1f5bccd7 * Import torchtext from 8aecbb9 Reviewed By: hudeven Differential Revision: D23451795 fbshipit-source-id: 73e6130c16716919c77862cef4ca4c8048428670 * Import torchtext 9/4/2020 Reviewed By: Nayef211 Differential Revision: D23539397 fbshipit-source-id: 88dce59418a3071cbc9e944cf0a4cf2117d7d9f7 * Import github torchtext on 9/9/2020 Reviewed By: cpuhrsch Differential Revision: D23616189 fbshipit-source-id: 365debc987326145eead7456ed48517fe55cac96 * Add property support for ScriptModules (#42390) Summary: Pull Request resolved: pytorch/pytorch#42390 **Summary** This commit extends support for properties to include ScriptModules. 
**Test Plan** This commit adds a unit test that has a ScriptModule with a user-defined property. `python test/test_jit_py3.py TestScriptPy3.test_module_properties` Test Plan: Imported from OSS Reviewed By: eellison, mannatsingh Differential Revision: D22880298 Pulled By: SplitInfinity fbshipit-source-id: 74f6cb80f716084339e2151ca25092b6341a1560 * sync with OSS torchtext 9/15/20 Reviewed By: cpuhrsch Differential Revision: D23721167 fbshipit-source-id: 13b32091c422a3ed0ae299595d69a7afa7136638 * Import Github torchtext on 9/28/2020 Reviewed By: cpuhrsch Differential Revision: D23962265 fbshipit-source-id: 0d042878fe9119aa725e982ab7d5e96e7c885a59 * Enable @unused syntax for ignoring properties (#45261) Summary: Pull Request resolved: pytorch/pytorch#45261 **Summary** This commit enables `unused` syntax for ignoring properties. Ignoring properties is more intuitive with this feature enabled. `ignore` is not supported because class type properties cannot be executed in Python like an `ignored` function (they exist only as TorchScript types), and module properties that cannot be scripted are not added to the `ScriptModule` wrapper so that they may execute in Python. **Test Plan** This commit updates the existing unit tests for class type and module properties to test properties ignored using `unused`. Test Plan: Imported from OSS Reviewed By: navahgar, Krovatkin, mannatsingh Differential Revision: D23971881 Pulled By: SplitInfinity fbshipit-source-id: 8d3cc1bbede7753d6b6f416619e4660c56311d33 * Import Github torchtext on 10/11/2020 Reviewed By: cpuhrsch Differential Revision: D24242037 fbshipit-source-id: 605d81412c320373f1158c51dbb120e7d70d624d * make duplicate def() calls an error in the dispatcher. Updating all fb operators to use the new dispatcher registration API (#47322) Summary: Pull Request resolved: pytorch/pytorch#47322 Updating all call-sites of the legacy dispatcher registration API in fbcode to the new API.
I migrated all call sites that used the legacy dispatcher registration API (RegisterOperators()) to use the new API (TORCH_LIBRARY...). I found all call-sites by running `fbgs RegisterOperators()`. This includes several places, including other OSS code (nestedtensor, torchtext, torchvision). A few things to call out: For simple ops that only had one registered kernel without a dispatch key, I replaced them with:
```
TORCH_LIBRARY_FRAGMENT(ns, m) {
  m.def("opName", fn_name);
}
```
For ops that registered to a specific dispatch key / had multiple kernels registered, I registered the common kernel (math/cpu) directly inside a `TORCH_LIBRARY_FRAGMENT` block, and registered any additional kernels from other files (e.g. cuda) in a separate `TORCH_LIBRARY_IMPL` block.
```
// cpu file
TORCH_LIBRARY_FRAGMENT(ns, m) {
  m.def("opName(schema_inputs) -> schema_outputs");
  m.impl("opName", torch::dispatch(c10::DispatchKey::CPU, TORCH_FN(cpu_kernel)));
}

// cuda file
TORCH_LIBRARY_IMPL(ns, CUDA, m) {
  m.impl("opName", torch::dispatch(c10::DispatchKey::CUDA, TORCH_FN(cuda_kernel)));
}
```
Special cases: I found a few ops that used a (legacy) `CPUTensorId`/`CUDATensorId` dispatch key. Updated those to use CPU/CUDA- this seems safe because the keys are aliased to one another in `DispatchKey.h` There were a handful of ops that registered a functor (function class) to the legacy API. As far as I could tell we don't allow this case in the new API, mainly because you can accomplish the same thing more cleanly with lambdas. Rather than delete the class I wrote a wrapper function on top of the class, which I passed to the new API. There were a handful of ops that were registered only to a CUDA dispatch key. I put them inside a TORCH_LIBRARY_FRAGMENT block, and used a `def()` and `impl()` call like in case two above.
Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D24714803 Pulled By: bdhirsh fbshipit-source-id: c809aad8a698db3fd0d832f117f833e997b159e1 * Revert D24714803: make duplicate def() calls an error in the dispatcher. Updating all fb operators to use the new dispatcher registration API Differential Revision: D24714803 Original commit changeset: c809aad8a698 fbshipit-source-id: fb2ada65f9fc00d965708d202bd9d050f13ef467 * Import torchtext on Nov 20, 2020 Summary: Import torchtext on the commit of 633548a allow-large-files Reviewed By: cpuhrsch Differential Revision: D25127691 fbshipit-source-id: 3a617f5f4849df452f8a102a77ce11a1bce5af1f * Updating all call-sites of the legacy dispatcher registration API in fbcode to the new API. (#48178) Summary: Pull Request resolved: pytorch/pytorch#48178 I migrated all call sites that used the legacy dispatcher registration API (RegisterOperators()) to use the new API (TORCH_LIBRARY...). I found all call-sites by running `fbgs RegisterOperators()`. This includes several places, including other OSS code (nestedtensor, torchtext, torchvision). A few things to call out: For simple ops that only had one registered kernel without a dispatch key, I replaced them with:
```
TORCH_LIBRARY_FRAGMENT(ns, m) {
  m.def("opName", fn_name);
}
```
For ops that registered to a specific dispatch key / had multiple kernels registered, I registered the common kernel (math/cpu) directly inside a `TORCH_LIBRARY_FRAGMENT` block, and registered any additional kernels from other files (e.g. cuda) in a separate `TORCH_LIBRARY_IMPL` block.
```
// cpu file
TORCH_LIBRARY_FRAGMENT(ns, m) {
  m.def("opName(schema_inputs) -> schema_outputs");
  m.impl("opName", torch::dispatch(c10::DispatchKey::CPU, TORCH_FN(cpu_kernel)));
}

// cuda file
TORCH_LIBRARY_IMPL(ns, CUDA, m) {
  m.impl("opName", torch::dispatch(c10::DispatchKey::CUDA, TORCH_FN(cuda_kernel)));
}
```
Special cases: I found a few ops that used a (legacy) `CPUTensorId`/`CUDATensorId` dispatch key.
Updated those to use CPU/CUDA- this seems safe because the keys are aliased to one another in `DispatchKey.h` There were a handful of ops that registered a functor (function class) to the legacy API. As far as I could tell we don't allow this case in the new API, mainly because you can accomplish the same thing more cleanly with lambdas. Rather than delete the class I wrote a wrapper function on top of the class, which I passed to the new API. There were a handful of ops that were registered only to a CUDA dispatch key. I put them inside a TORCH_LIBRARY_FRAGMENT block, and used a `def()` and `impl()` call like in case two above. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D25056090 Pulled By: bdhirsh fbshipit-source-id: 8f868b45f545e5da2f21924046e786850eba70d9 * Import torchtext from github into fbcode on 1/11/2021 Reviewed By: cpuhrsch Differential Revision: D25873762 fbshipit-source-id: 0d34d36aeb8e7e2ce72fcf345c5e7e713ef3663c * Import torchtext from github #1121 d56fffe Summary: Import torchtext from github #1121 d56fffe Reviewed By: zhangguanheng66 Differential Revision: D25976268 fbshipit-source-id: 81589f8988a54cc12f17f0a6f298a915e829a830 * Import the hidden files in torchtext github repo Reviewed By: mthrok Differential Revision: D26001386 fbshipit-source-id: f822f0f32232d3006ef629937520dee6c0faf414 * add a newline mark to config.yml file (#1128) Reviewed By: zhangguanheng66 Differential Revision: D26369003 fbshipit-source-id: 09ca48f9705d8663b06e6a329a6b64b24f9c148e * Replace model with full name when spacy load is used (#1140) Reviewed By: zhangguanheng66 Differential Revision: D26369005 fbshipit-source-id: b1e6b5d77810bb8f67d14b8a1c7ec0a9f4831cab * Fix the num_lines argument of the setup_iter func in RawTextIterableDataset (#1142) Reviewed By: zhangguanheng66 Differential Revision: D26368999 fbshipit-source-id: 4b50e5d9e5fbdf633e8b3f0072223eed050af793 * Fix broken CI tests due to spacy 3.0 release (#1138) Reviewed By: 
zhangguanheng66 Differential Revision: D26368998 fbshipit-source-id: 84e883562a9a3d0fe47b54823b22f7b2cd82fca4 * Switch data_select in dataset signature to split (#1143) Reviewed By: zhangguanheng66 Differential Revision: D26369006 fbshipit-source-id: 608f42fa180db9ebcfaaeadc6b8cdd29393262af * Add offset arg in the raw text dataset (#1145) Reviewed By: zhangguanheng66 Differential Revision: D26368996 fbshipit-source-id: 52741015139c302b7b0ddf8c8f50ab45a609fd2f * switch to_ivalue to __prepare_scriptable__ (#1080) Reviewed By: zhangguanheng66 Differential Revision: D26368995 fbshipit-source-id: 0352c04e422c835350bd42df35d4054d543fee36 * Pass an embedding layer to the constructor of the BertModel class (#1135) Reviewed By: zhangguanheng66 Differential Revision: D26369001 fbshipit-source-id: f5a67a2a812d568073505ec4d181f6e418eb4a3f * add __next__ method to RawTextIterableDataset (#1141) Reviewed By: zhangguanheng66 Differential Revision: D26368997 fbshipit-source-id: f5ef78f5f4a224db497f47f774eaddedd0498b4b * Add func to count the total number of parameters in a model (#1134) Reviewed By: zhangguanheng66 Differential Revision: D26369000 fbshipit-source-id: c687c0f0c2697dbd9c17a79a1291a2e279bbd1b8 * Retire the legacy code in torchtext library and fix the dependency of the downstream libraries Summary: This diff is doing: 1) move the legacy code in torchtext to the legacy folder; 2) for the downstream libraries in fbcode, if they are using the legacy code, add "legacy" to the path. 
Reviewed By: cpuhrsch Differential Revision: D23718437 fbshipit-source-id: 1660868aaa95ac6555ad6793dda5ce02a9acdc08 * Sync torchtext GH<->fbcode until GH commit 1197514 Summary: Import recent torchtext changes up until GH commit 1197514 Reviewed By: zhangguanheng66 Differential Revision: D26824967 fbshipit-source-id: fc4be4f94a8f748ce2ed5e776e30a42422cbcab9 * 20210304[2] Sync torchtext GH<->fbcode until GH commit 2764143 Summary: Sync up until commit in title Reviewed By: zhangguanheng66 Differential Revision: D26829429 fbshipit-source-id: a059a36d83b3803dfed9198d0e474e0e75f94f17 * 20210308 Sync torchtext GH <-> fbcode Summary: Import latest GH changes Reviewed By: zhangguanheng66 Differential Revision: D26888371 fbshipit-source-id: cc27f51fd89ad86b8bcfb8f286ad874ab01b1fd6 * Re-name raw_datasets.json file with jsonl extension Reviewed By: cpuhrsch Differential Revision: D26923978 fbshipit-source-id: c87c7776445e05d452f6b38244bf4cdaba45bdec * 20210329 Sync torchtext up to GH commit eb5e39d Summary: Sync torchtext up to GH commit eb5e39d Reviewed By: parmeet Differential Revision: D27400885 fbshipit-source-id: 1f8f92ca42ba36d070db6740b3bb4c148f69586b * Import torchtext #1267 93b03e4 Summary: Imported latest from github Master PR#1267 Reviewed By: cpuhrsch Differential Revision: D27503970 fbshipit-source-id: 853ff895ba42b1feb7442abe1c87478e43d62e5b * Import torchtext #1266 ba0bf52 Summary: Import torchtext from github Reviewed By: parmeet Differential Revision: D27803909 fbshipit-source-id: 9cb0f15858b1417cb5868d5651513eb2df998fbe * Import torchtext #1287 fab63ed Reviewed By: parmeet Differential Revision: D27922562 fbshipit-source-id: 3c18cd9e2583e03471461ad8a22ac6b0ceb596a2 * Import torchtext #1293 d2a0776 Summary: Importing torchtext from github for regular sync. 
Reviewed By: cpuhrsch Differential Revision: D27983819 fbshipit-source-id: 5806421d788afaa872f5320b5f4cbcd913e103ea * Import torchtext #1291 0790ce6 Reviewed By: parmeet Differential Revision: D28101664 fbshipit-source-id: a8643b3ecf85de2cb815dcfa5789a4a5d246d80f * adding __contains__ method to experimental vocab (#1297) Reviewed By: cpuhrsch Differential Revision: D28111696 fbshipit-source-id: fef195941492493a399adb37339cfa64795e22a0 * Import torchtext #1292 ede6ce6 Summary: This diff syncs torchtext GH with fbcode Reviewed By: cpuhrsch Differential Revision: D28321356 fbshipit-source-id: 7736f0d100941627b58424911a1329b1ce66c123 * Added APIs for default index and removed unk token (#1302) Reviewed By: parmeet Differential Revision: D28478153 fbshipit-source-id: bfcaffe8fe48e96d8df454f7df0d25ec39d5d4a6 * Swapping experimental Vocab and retiring current Vocab into legacy (#1289) Summary: allow-large-files to commit wikitext103_vocab.pt Reviewed By: cpuhrsch Differential Revision: D28478152 fbshipit-source-id: c2a871439f054024b95c05f7664a84028aacaca3 * Import torchtext #1313 36e33e2 Summary: Importing from Github Reviewed By: cpuhrsch Differential Revision: D28572929 fbshipit-source-id: 2e7b00aadeda6ab0596ef23295f41c5b0fa246e7 * Adding API usage logging Summary: Adding API usage logging for Vocab module Reviewed By: colin2328 Differential Revision: D28585537 fbshipit-source-id: 38975b523fb597412fbcb18ef831bfb4834cb420 * Import torchtext #1314 99557ef Reviewed By: parmeet Differential Revision: D28683381 fbshipit-source-id: 7bfbf445dd512f0ce21c34096cf3f08332d90138 * Import torchtext #1325 57a1df3 Reviewed By: NicolasHug Differential Revision: D28994054 fbshipit-source-id: 4c679f56ef37b18f6d2acaaaed8518facbeaa41c * Import torchtext #1328 ca514f6 Summary: Import torchtext #1328 ca514f6 Reviewed By: NicolasHug Differential Revision: D29120370 fbshipit-source-id: 229586f3470bd61bfb2f6a390d79e45d4eae3b4d * up the priority of numpy array comparisons in self.assertEqual 
(#59067) (#1340) * Re-sync with internal repository (#1343) * up the priority of numpy array comparisons in self.assertEqual (#59067) Summary: Fixes pytorch/pytorch#58988. Pull Request resolved: pytorch/pytorch#59067 Reviewed By: jbschlosser Differential Revision: D28986642 Pulled By: heitorschueroff fbshipit-source-id: 3ef2d26b4010fc3519d0a1a020ea446ffeb46ba0 * Import torchtext #1300 0435df1 Summary: Import torchtext #1300 0435df1 Reviewed By: parmeet Differential Revision: D29371832 fbshipit-source-id: 624280ddfa787a4e7628e60fa673cb9df0a66641 * Import torchtext #1345 8cf471c Summary: Import from github Reviewed By: hudeven Differential Revision: D29441995 fbshipit-source-id: 27731ce2714c16180d11bfb26af5d5a2dba408b1 * Import torchtext #1352 7ab50af Summary: Import from github Reviewed By: NicolasHug Differential Revision: D29537684 fbshipit-source-id: 25b1fc1e6d9f930e83f5f2939788b90b083aeaa2 * Enabling torchtext datasets access via manifold and iopath Summary: We would like to add and access torchtext datasets on manifold. This Diff unifies the dataset download from external links and through manifold for internal access. This is enabled via the iopath package. The main idea is to plug the download hooks into the download_from_url function. The download hooks will delegate the download to the appropriate Path Handler. In OSS we have enabled download via https and Google Drive. Internally, we replace the download hook to download data from manifold. We have created a _download_hooks.py file under the /fb/ folder which will replace the corresponding file in OSS. The file under the /fb/ folder converts the http/https URL paths into the corresponding manifold paths and downloads the data from there.
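The hook mechanism described above can be sketched roughly as follows. This is a minimal, standalone illustration and not torchtext's actual code: `register_hook`, `https_hook`, `internal_hook`, the simplified `download_from_url` signature, and the `internal-store://` prefix are all assumptions made for the example, and the handlers only return strings instead of performing I/O.

```python
from typing import Callable, Dict
from urllib.parse import urlparse

# Hypothetical hook signature: (url, destination) -> description of what was done.
DownloadHook = Callable[[str, str], str]

_HOOKS: Dict[str, DownloadHook] = {}


def register_hook(scheme: str, hook: DownloadHook) -> None:
    """Register (or replace) the handler used for URLs with this scheme."""
    _HOOKS[scheme] = hook


def download_from_url(url: str, dest: str) -> str:
    """Delegate the download to whichever hook is registered for the URL scheme."""
    scheme = urlparse(url).scheme
    if scheme not in _HOOKS:
        raise ValueError(f"no download hook registered for scheme {scheme!r}")
    return _HOOKS[scheme](url, dest)


def https_hook(url: str, dest: str) -> str:
    # Stand-in for a public HTTPS download.
    return f"https-download:{url}->{dest}"


def internal_hook(url: str, dest: str) -> str:
    # Stand-in for an internal handler: rewrite the public URL into an
    # internal storage path first (the prefix below is invented).
    parsed = urlparse(url)
    internal_path = f"internal-store://{parsed.netloc}{parsed.path}"
    return f"internal-download:{internal_path}->{dest}"


register_hook("https", https_hook)
oss_result = download_from_url("https://example.com/data/train.csv", "/tmp/train.csv")

# Replacing the hook changes behaviour without touching download_from_url.
register_hook("https", internal_hook)
internal_result = download_from_url("https://example.com/data/train.csv", "/tmp/train.csv")
```

The point of the design is that callers keep using one entry point while the environment decides which handler serves the request.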
Reviewed By: hudeven Differential Revision: D28892389 fbshipit-source-id: 3b66544dd2345075e2e7c524f344db04aa2a24e3 * Import torchtext #1361 05cb992 Summary: Import from github Reviewed By: hudeven Differential Revision: D29856211 fbshipit-source-id: 6332f9bdf3cf4eef572c5423db15101ea904d825 * Import torchtext #1365 c57b1fb Summary: Import torchtext #1365 c57b1fb Reviewed By: parmeet Differential Revision: D29940816 fbshipit-source-id: 6b2495b550a7e6b6110b0df12de51a87b0d31c1c * Moving Roberta building blocks to torchtext Summary: This is the first step in moving Roberta Model from pytext_lib into PyTorch Text Library. Here we moved the Roberta building blocks into pytorch/text/fb/nn/modules. The code-base is organized according to WIP document https://docs.google.com/document/d/1c0Fs-v97pndLrT3bdfGRGeUeEC38UcDpibvgOXkbS-g/edit#heading=h.3ybcf0ic42yp Reviewed By: hudeven Differential Revision: D29671800 fbshipit-source-id: d01daa99e0a5463716660722381db9a0eeb083f8 * Enabling torchtext availability in @mode/opt Summary: More details on context and solution: D29973934 Note that in this implementation, we rely on over-riding behavior of _init_extention() function. This is in similar spirit where we over-ride behavior of download hooks to accommodate necessary changes needed to enable functionality on fbcode. 
Reviewed By: mthrok Differential Revision: D30494836 fbshipit-source-id: b2b015263fa1bca2ef4d4214909e469df3fbe327 * Import torchtext #1382 aa12e9a Summary: Import torchtext #1382 aa12e9a Reviewed By: parmeet Differential Revision: D30584905 fbshipit-source-id: fba23cd19f31fc7826114dd2eb402c8f7b0553df * Simplify cpp extension initialization process Summary: Simplifying the cpp extension initialization process by following torchaudio's implementation in D30633316 Reviewed By: mthrok Differential Revision: D30652618 fbshipit-source-id: f80ac150fa50b1edc22419b21412f64e77064c5d * fixed bug with incorrect variable name in dataset_utils.py Summary: - ValueError was outputting `fn` instead of `func` - Similar fix done in torchdata https://github.com/facebookexternal/torchdata/pull/167 Reviewed By: ejguan Differential Revision: D31149667 fbshipit-source-id: 2c1228287d513895f8359cb97935252f0087d738 * Import torchtext #1410 0930843 Summary: Import latest from github Reviewed By: Nayef211 Differential Revision: D31745899 fbshipit-source-id: e4ac5c337bcbd1a8809544add7679dd3da242999 * Import torchtext #1406 1fb2aed Summary: Import latest from github Reviewed By: Nayef211 Differential Revision: D31762288 fbshipit-source-id: f439e04f903d640027660cb969d6d9e00e7ed4a0 * Import from github 10/18/21 Summary: Syncing torchtext github main branch to fbcode Reviewed By: parmeet Differential Revision: D31841825 fbshipit-source-id: 9c1a05295e6557ff411e56eb719cb439d5c424ba * Import torchtext #1420 0153ead Summary: Import latest from github Reviewed By: Nayef211 Differential Revision: D31871772 fbshipit-source-id: 989f5a453ef7680592df27e4174f465d11a2fbf8 * Import torchtext #1421 bcc1455 Summary: Syncing torchtext github main branch to fbcode Reviewed By: parmeet Differential Revision: D31873514 fbshipit-source-id: 1a964a67ce7ee73f5acf3a1e3f8118028c2dd46e * Enable OSS torchtext XLMR Base/Large model on fbcode Summary: Enable access to open-source torchtext XLMR base/large implementation by: 1) 
Uploading models/transform weights on manifold 2) Patching public URL with manifold URL (similar to what we have for datasets) Note that we didn't enable model tests since it takes relatively long to download huge model weights from manifold. We would rely on open-source signals when making changes to model implementation, and we need to ensure that any update to the weights on AWS cloud is also replicated on manifold. Reviewed By: hudeven Differential Revision: D31844166 fbshipit-source-id: 62a4e9a3a8580ab93c3beb3af69be7361f1cc937 * enabling SST2 dataset usage in fbcode Summary: Enable access to open-source torchtext SST2 dataset by: - Uploading SST2 dataset on manifold - Swapping public URL with manifold URL in fbcode by implementing a dummy `HTTPReader` wrapper class - The wrapper class does URL mapping and calls `IoPathFileLoaderDataPipe` on the manifold URL - Enabled SST2Dataset unit tests within fbcode Reviewed By: parmeet Differential Revision: D31876606 fbshipit-source-id: fdde14a67cce835da216b296e1a0024e1d1fc7a9 * Fixed importing is_module_available Co-authored-by: Guanheng Zhang <zhangguanheng@devfair0197.h2.fair> Co-authored-by: Christian Puhrsch <cpuhrsch@devfair0129.h2.fair> Co-authored-by: cpuhrsch <cpuhrsch@fb.com> Co-authored-by: Moto Hira <moto@fb.com> Co-authored-by: George Guanheng Zhang <zhangguanheng@fb.com> Co-authored-by: Stanislau Hlebik <stash@fb.com> Co-authored-by: Andres Suarez <asuarez@fb.com> Co-authored-by: Meghan Lele <meghanl@fb.com> Co-authored-by: Brian Hirsh <hirsheybar@fb.com> Co-authored-by: Vasilis Vryniotis <vvryniotis@fb.com> Co-authored-by: Jeff Hwang <jeffhwang@fb.com> Co-authored-by: Parmeet Singh Bhatia <parmeetbhatia@fb.com> Co-authored-by: Artyom Astafurov <asta@fb.com> Co-authored-by: Nicolas Hug <nicolashug@fb.com> Co-authored-by: Heitor Schueroff <heitorschueroff@fb.com> Co-authored-by: Facebook Community Bot <facebook-github-bot@users.noreply.github.com> Co-authored-by: Philip Meier <github.pmeier@posteo.de> 
Co-authored-by: Vincent Quenneville-Belair <vincentqb@fb.com> Co-authored-by: Yao-Yuan Yang <yyyang@fb.com> Co-authored-by: nayef211 <n63ahmed@edu.uwaterloo.ca>
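For the SST2 change in this commit, the wrapper-class idea (map the public URL, then hand the mapped URL to an internal loader) might look something like the sketch below. It is an assumption-laden illustration only: `MappedReader`, `URL_MAP`, `fake_loader`, and both URLs are hypothetical, and the real `HTTPReader` wrapper and `IoPathFileLoaderDataPipe` are not reproduced here.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

# Hypothetical mapping from public URLs to internal locations; both values are invented.
URL_MAP: Dict[str, str] = {
    "https://example.com/sst2.zip": "internal://datasets/sst2.zip",
}


class MappedReader:
    """Reader-like wrapper that loads from mapped URLs instead of public ones.

    `loader` is any callable taking a URL and returning bytes; it stands in
    for whatever internal loader a real wrapper would delegate to.
    """

    def __init__(self, urls: Iterable[str], loader: Callable[[str], bytes]) -> None:
        self.urls = list(urls)
        self.loader = loader

    def __iter__(self) -> Iterator[Tuple[str, bytes]]:
        for url in self.urls:
            # Fall back to the original URL when no mapping exists.
            mapped = URL_MAP.get(url, url)
            yield mapped, self.loader(mapped)


def fake_loader(url: str) -> bytes:
    # Stand-in loader so the sketch runs without network access.
    return f"contents of {url}".encode()


results = list(MappedReader(["https://example.com/sst2.zip"], fake_loader))
```

Because the wrapper keeps the same iteration interface as the reader it replaces, dataset code that consumes `(url, data)` pairs would not need to change.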