A new abstraction was described in the 0.5.0 release notes. We are now working on retiring some legacy code in torchtext over the next few releases. This issue tracks the progress of that work. Here are the steps users can expect:
Step 1: Retire legacy code in `torchtext.data` and `torchtext.datasets`
The following components will soon be retired from the source code. We added deprecation warnings in the 0.7.0 release (link). Users can still find them in `torchtext.legacy`, and the original constructors will raise an error when called.

- `torchtext.data.field` - RawField, Field, ReversibleField, SubwordField, NestedField, LabelField
- `torchtext.data.iterator` - BucketIterator, Iterator, BPTTIterator
- `torchtext.data.dataset` - Dataset, TabularDataset
- `torchtext.data.example` - Example
- `torchtext.data.pipeline` - Pipeline
- `torchtext.data.batch` - Batch

At the same time, the datasets in `torchtext.datasets` are based on the legacy code above, so they will be moved to the legacy folder:

- `LanguageModelingDataset`, `WikiText2`, `WikiText103`, `PennTreebank`
- `SNLI`, `MultiNLI`, `XNLI`
- `SST`
- `TranslationDataset`, `Multi30k`, `IWSLT`, `WMT14`
- `SequenceTaggingDataset`, `UDPOS`, `CoNLL2000Chunking`
- `TREC`
- `IMDB`
- `BABI20`
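
For code that has to keep working across this transition, one option is to look a legacy class up in whichever location exists in the installed version. The following is a minimal sketch, not part of torchtext: `resolve_attr` and `DEFAULT_CANDIDATES` are hypothetical names, and the only torchtext-specific assumption is the pair of module paths named in this issue (`torchtext.legacy.data` and `torchtext.data`). The legacy path is tried first because, as noted above, the original constructors are expected to raise an error.

```python
import importlib

# Candidate locations, in preference order. Both paths come from this issue;
# whether either one exists depends on the installed torchtext version.
DEFAULT_CANDIDATES = ("torchtext.legacy.data", "torchtext.data")


def resolve_attr(name, candidates=DEFAULT_CANDIDATES):
    """Return `name` from the first importable module in `candidates` that defines it."""
    for module_path in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # module is missing in this installation; try the next one
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name!r} not found in any of: {', '.join(candidates)}")


# Hypothetical usage (requires torchtext to be installed):
# Field = resolve_attr("Field")
```

This keeps the version-dependent lookup in one place, so the rest of a project can switch to the new building blocks at its own pace.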
Step 2: Release the new datasets
Several of the legacy datasets above have been rewritten and are currently available in `torchtext.experimental.datasets`. They will be released to the core library:

- `LanguageModelingDataset`, `WikiText2`, `WikiText103`, `PennTreebank`, `WMTNewsCrawl`
- `AG_NEWS`, `SogouNews`, `DBpedia`, `YelpReviewPolarity`, `YelpReviewFull`, `YahooAnswers`, `AmazonReviewPolarity`, `AmazonReviewFull`, `IMDB`
- `UDPOS`, `CoNLL2000Chunking`
- `Multi30k`, `IWSLT`, `WMT14`
- `SQuAD1`, `SQuAD2`
Step 3: Retire legacy vocab/vector and release the new data processing building blocks
We have also rewritten the vocabulary and word vectors as high-performance building blocks with JIT support. We will retire the following components:

- `torchtext.vocab.Vocab`
- `torchtext.vocab.Vectors`, along with `GloVe`, `FastText`, `CharNGram`

After this, the new vocabulary and vector building blocks in the `experimental` folder will be moved to the core library:

- `torchtext.experimental.vectors`
- `torchtext.experimental.vocab`

We also have some transforms that will be released to the core library:

- `torchtext.experimental.transforms`
In general, we understand that this is a special time for the torchtext library, because we have to handle the legacy code and the new building blocks at the same time. We really appreciate the efforts of the OSS community. Users should approach the code in the three categories with the following expectations:

- `legacy` folder - we will accept bug fixes but not new features.
- `torchtext` main folder - officially supported via the stable release, with BC-breaking changes handled carefully.
- `experimental` folder - experimental components available via the nightly release channel. Users might experience BC-breaking changes without warning messages.
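
To see which of these three categories a project depends on, its imports can be audited statically. The sketch below is illustrative only and not part of torchtext: `support_tier` and `audit_imports` are hypothetical helpers, and the tier names simply mirror the folder names above. It parses source text with the standard `ast` module, so torchtext does not need to be installed.

```python
import ast


def support_tier(dotted_name):
    """Map a dotted import path to 'legacy', 'experimental', or 'stable'.

    Returns None for anything outside the torchtext package.
    """
    parts = dotted_name.split(".")
    if parts[0] != "torchtext":
        return None
    if len(parts) > 1 and parts[1] in ("legacy", "experimental"):
        return parts[1]
    return "stable"


def audit_imports(source):
    """Return sorted (imported name, tier) pairs for every torchtext import in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Join module and imported name so `from torchtext import legacy` is caught too.
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            tier = support_tier(name)
            if tier is not None:
                found.add((name, tier))
    return sorted(found)
```

Running `audit_imports` over each file in a project gives a quick list of the places that rely on `legacy` or `experimental` code and may need attention when upgrading. Dynamic imports and relative imports are not detected by this sketch.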