At text classification example #945

atuzhykov · 2020-03-10T10:11:19Z

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

Please review
https://github.com/eclipse/deeplearning4j/blob/master/CONTRIBUTING.md before opening a pull request.

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>


AlexDBlack

Looking good, just a few minor improvements to make.
Can we also make a backup of the branch, then flatten + sign on this branch as described here: https://deeplearning4j.org/eclipse-contributors

Otherwise I'm happy with this 👍

AlexDBlack · 2020-03-11T06:03:21Z

...in/java/org/deeplearning4j/examples/nlp/sentencepiecernnexample/SentencePieceRNNExample.java

+ * Because the model is prone to overfitting, we also add l2 regularization and dropout for certain layers.
+ * To prepare reviews we use BertIterator, which is a MultiDataSetIterator for training BERT (Transformer) models.
+ * We configure BertIterator for supervised sequence classification:
+ * 0. As the tokenizer we use BertWordPieceTokenizerFactory with the provided BERT BASE UNCASED vocabulary.


Maybe let's improve this slightly, add another line under 0.:
BertIterator and BertWordPieceTokenizer implement the Word Piece sub-word tokenization algorithm, with a vocabulary size of 30522 tokens.
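For readers unfamiliar with Word Piece, the sketch below illustrates the greedy longest-match-first splitting usually associated with it. It is not the DL4J tokenizer code: the class name `WordPieceSketch`, the `tokenize` method, the `##` continuation prefix, and the `[UNK]` fallback are assumptions made for illustration, and a real vocabulary (such as the 30522-token one mentioned above) would be loaded from a file rather than passed in as a small set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of greedy longest-match-first sub-word splitting.
// Not the DL4J BertWordPieceTokenizer implementation.
final class WordPieceSketch {

    // Splits a single word into vocabulary pieces; pieces after the first
    // carry a "##" prefix (assumed convention). Returns ["[UNK]"] if some
    // part of the word cannot be matched against the vocabulary.
    static List<String> tokenize(String word, Set<String> vocab) {
        List<String> pieces = new ArrayList<>();
        int start = 0;
        while (start < word.length()) {
            int end = word.length();
            String match = null;
            // Try the longest remaining substring first, then shrink it.
            while (end > start) {
                String candidate = word.substring(start, end);
                if (start > 0) {
                    candidate = "##" + candidate;
                }
                if (vocab.contains(candidate)) {
                    match = candidate;
                    break;
                }
                end--;
            }
            if (match == null) {
                return Collections.singletonList("[UNK]");
            }
            pieces.add(match);
            start = end;
        }
        return pieces;
    }
}
```

With a toy vocabulary containing `un`, `##aff` and `##able`, calling `WordPieceSketch.tokenize("unaffable", vocab)` yields the three pieces `un`, `##aff`, `##able`.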

AlexDBlack · 2020-03-11T06:05:20Z

...in/java/org/deeplearning4j/examples/nlp/sentencepiecernnexample/SentencePieceRNNExample.java

+        int listenerFrequency = 20;
+        net.setListeners(new StatsListener(statsStorage, listenerFrequency), new ScoreIterationListener(50));
+        //Attach the StatsStorage instance to the UI: this allows the contents of the StatsStorage to be visualized
+        uiServer.attach(statsStorage);


Maybe let's comment out the UI by default, as it adds some overhead (slows down training a bit). Users can uncomment it if they want to run it with UI. That would look like this:

/*
//Uncomment this section to run the example with the user interface
UIServer uiServer = UIServer.getInstance();

//Configure where the network information (gradients, activations, score vs. time etc) is to be stored
//Then add the StatsListener to collect this information from the network, as it trains
StatsStorage statsStorage = new FileStatsStorage(new File(System.getProperty("java.io.tmpdir"), "ui-stats-" + System.currentTimeMillis() + ".dl4j"));
int listenerFrequency = 20;
net.setListeners(new StatsListener(statsStorage, listenerFrequency), new ScoreIterationListener(50));
//Attach the StatsStorage instance to the UI: this allows the contents of the StatsStorage to be visualized
uiServer.attach(statsStorage);
*/
net.setListeners(new ScoreIterationListener(50));

AlexDBlack · 2020-03-11T06:05:22Z

...in/java/org/deeplearning4j/examples/nlp/sentencepiecernnexample/SentencePieceRNNExample.java

+            net.fit(train);
+
+            // Get and print accuracy, precision, recall & F1 and confusion matrix
+            Evaluation eval = net.doEvaluation(test, new Evaluation[]{new Evaluation()})[0];


For MultiLayerNetwork, we can use net.evaluate(test)
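As background on the figures named in the code comment (accuracy, precision, recall and F1), here is a minimal sketch of how they follow from confusion-matrix counts. It assumes a two-class problem for simplicity and is not the DL4J `Evaluation` implementation; the class `BinaryMetricsSketch` and its method names are hypothetical.

```java
// Hypothetical helper: derives common metrics from binary confusion counts.
// tp/fp/fn/tn = true positives, false positives, false negatives, true negatives.
final class BinaryMetricsSketch {

    // Fraction of all examples that were classified correctly.
    static double accuracy(long tp, long fp, long fn, long tn) {
        return (double) (tp + tn) / (tp + fp + fn + tn);
    }

    // Fraction of predicted positives that are actually positive.
    static double precision(long tp, long fp) {
        return (double) tp / (tp + fp);
    }

    // Fraction of actual positives that were predicted positive.
    static double recall(long tp, long fn) {
        return (double) tp / (tp + fn);
    }

    // Harmonic mean of precision and recall, written in terms of counts.
    static double f1(long tp, long fp, long fn) {
        return 2.0 * tp / (2.0 * tp + fp + fn);
    }
}
```

Note that each method returns NaN when its denominator is zero (for example, precision when nothing was predicted positive); a production implementation would need to decide how to report that case.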

AlexDBlack · 2020-03-11T06:05:43Z

pom.xml

@@ -28,7 +28,7 @@
    <properties>
        <!-- Change the nd4j.backend property to nd4j-cuda-9.2-platform,nd4j-cuda-10.0-platform or nd4j-cuda-10.1-platform to use CUDA GPUs -->
        <nd4j.backend>nd4j-native-platform</nd4j.backend>
-<!--        <nd4j.backend>nd4j-cuda-10.2-platform</nd4j.backend>-->
+<!--        <nd4j.backend>nd4j-cuda-10.0-platform</nd4j.backend>-->


Leave this commented out with 10.2


atuzhykov added 30 commits February 20, 2020 20:04

examples added + changed nd4j backend in pom.xml to run on DGX1

de5f44c

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

examples added + changed nd4j backend in pom.xml to run on DGX1

d65fa9a

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

other small changes

c2967df

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

small fix to match cuda version with container

9e87835

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

small fix to match cuda version with container

41530e1

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

lr 1e-3 > 4e-3 (as multiplying batchsize*k, lr*sqrt(k))

3198021

l2 1e-6 > 1e-3 Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>
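The commit message above records a square-root scaling heuristic: when the batch size is multiplied by k, the learning rate is multiplied by sqrt(k). A minimal sketch of that arithmetic follows. The factor k = 16 is an assumption inferred from the 1e-3 to 4e-3 change, not something the commit states, and the class and method names are hypothetical.

```java
// Hypothetical helper illustrating square-root learning-rate scaling.
final class LrScalingSketch {

    // Returns the learning rate to use after the batch size is multiplied
    // by batchMultiplier (k), following lr_new = lr_base * sqrt(k).
    static double scaleLearningRate(double baseLr, double batchMultiplier) {
        return baseLr * Math.sqrt(batchMultiplier);
    }
}
```

Under the assumed k = 16, `LrScalingSketch.scaleLearningRate(1e-3, 16)` gives 4e-3, matching the change named in the commit message.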

lr 1e-3 > 4e-3 (as multiplying batchsize*k, lr*sqrt(k))

9a1ba54

l2 1e-3 > 1e-6 Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment0

95aa639

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment1 (notes belong to commit name are here http://tiny.cc/yashkz)

7501626

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment2 (notes belong to commit name are here http://tiny.cc/yashkz)

091c386

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment3 (notes belong to commit name are here http://tiny.cc/yashkz)

86a6518

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment4 (notes belong to commit name are here http://tiny.cc/yashkz)

d669d6f

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment5 (notes belong to commit name are here http://tiny.cc/yashkz)

578a186

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment6 (notes belong to commit name are here http://tiny.cc/yashkz)

fe42967

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment7 (notes belong to commit name are here http://tiny.cc/yashkz)

302f7bb

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment8 (notes belong to commit name are here http://tiny.cc/yashkz)

bb933c5

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment9 (notes belong to commit name are here http://tiny.cc/yashkz)

000f7d8

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment9 (notes belong to commit name are here http://tiny.cc/yashkz)

79014f8

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment10 (notes belong to commit name are here http://tiny.cc/yashkz

4a5cffd

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment10 (notes belong to commit name are here http://tiny.cc/yashkz

8ef7519

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment11 (notes belong to commit name are here http://tiny.cc/yashkz

b99d59a

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

1aabba1

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

dc3f3b3

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

f16d8ac

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

36ae8ee

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment13 (notes belong to commit name are here http://tiny.cc/yashkz

0de07d8

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment14 (notes belong to commit name are here http://tiny.cc/yashkz

f2eece6

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

baseline conf + LengthHandling.FIXED_LENGTH=256

ff91a96

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm

2e99c55

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm+lr1e-4

e107c2c

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

atuzhykov and others added 27 commits March 5, 2020 01:03

experiment10 (notes belong to commit name are here http://tiny.cc/yashkz

b2f6510

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment10 (notes belong to commit name are here http://tiny.cc/yashkz

c5de18f

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment11 (notes belong to commit name are here http://tiny.cc/yashkz

145e1fd

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

2a72f0a

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

3f07a38

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

6256e85

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

da14588

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment13 (notes belong to commit name are here http://tiny.cc/yashkz

cb26a75

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

experiment14 (notes belong to commit name are here http://tiny.cc/yashkz

f0e1241

) Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

baseline conf + LengthHandling.FIXED_LENGTH=256

c4f9d9a

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm

02d47fd

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm+lr1e-4

61b63f8

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm_256

820eda5

Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm_256_l…

2c757c0

…r1e-4 Signed-off-by: atuzhykov <andrewtuzhykov@gmail.com>

base_conf+bidir_LSTM_256_layersize_Adam_lr1e-3_SGD_lr1e-3_for_EmbdLayer

53efeec

base_conf+bidir_LSTM_256_layersize_Adam_lr1e-3_SGD_lr1e-3_for_EmbdLayer

1bbb9e0

base_conf+bidir_LSTM_256_layersize_Nadam_lr1e-3

880cd30

base_conf+bidir_LSTM_256_layersize_Nadam_lr1e-3

9555d55

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

base_conf+bidir_LSTM_256_layersize_Nadam_lr1e-3

8aab2fb

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

base_conf+3x_bidir_LSTM_256_layersize_Nadam_lr1e-3

542db86

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

base_conf+3xbidir_LSTM_256_layersize_Adam_lr1e-3_l21e-5

46dffc0

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

base_conf+3x_bidir_LSTM_256_layersize_Adam_Sheduled_lr

c5a979a

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

base_conf+2x_bidir_LSTM_256_Adam_lr1e-3_lstm_dropout_075

6e390eb

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

prefinal examples

7945508

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

prefinal

74162ff

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

changed package and class name, added trained model URL

13a2392

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

fixed required changes

5b1b710

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

AlexDBlack suggested changes Mar 11, 2020


atuzhykov added 2 commits March 11, 2020 12:00

fixed new round of required changes

a22956d

Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>

small issue belong to match BertIterator and DataSetIterator in Evalu…

5e7df4f

…ation class Signed-off-by: Andrii Tuzhykov <andrewtuzhykov@gmail.com>
