- Stanford 2019 CS224n: Natural Language Processing with Deep Learning: course link
Due Date | Assignment | Done
---|---|---
1/15 | Assignment 1: Exploring Word Vectors | V
1/22 | Assignment 2: Word2vec | V
1/29 | Assignment 3: Dependency Parsing | V
2/7 | Assignment 4: Neural Machine Translation with RNNs | V
2/22 | Assignment 5: NMT system with a convolutional encoder and LSTM decoder (requires Stanford login) | --
(NOTE) 2019/2/9: Passed all tests in sanity_check.py (1d, 1e, 1f); have not finished the VM section on the GPU yet.
(CLOSED) 2019/1/20: The TA has since updated the expected test result for this example. The difference: if you test word2vec.py before running on the Stanford Sentiment Treebank, as I did, you get Loss = 16.15119285363322; once you run `python run.py` and then run `python word2vec.py` again, you get Loss = 14.3018669327. Be careful!
--Original topic--
In assignment2/word2vec.py, my results for the loss, gradCenterVec, and gradOutsideVecs using Skip-Gram with negSamplingLossAndGradient are not close to the expected values the TA gave. Can anyone give me some advice? Is there a small detail I overlooked? Thanks. (A sketch of one common implementation follows the expected output below.)
```
Skip-Gram with negSamplingLossAndGradient
Your Result:
Loss: 16.15119285363322
Gradient wrt Center Vectors (dJ/dV):
[[ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [-4.54650789 -1.85942252  0.76397441]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]]
Gradient wrt Outside Vectors (dJ/dU):
[[-0.69148188  0.31730185  2.41364029]
 [-0.22716495  0.10423969  0.79292674]
 [-0.45528438  0.20891737  1.58918512]
 [-0.31602611  0.14501561  1.10309954]
 [-0.80620296  0.36994417  2.81407799]]

Expected Result: Value should approximate these:
Loss: 14.3018669327
Gradient wrt Center Vectors (dJ/dV):
[[ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [-3.86035429 -2.8660339  -0.9739887 ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]]
Gradient wrt Outside Vectors (dJ/dU):
[[-0.30559455  0.14022886  1.06668785]
 [-0.12708467  0.05831563  0.44359323]
 [-0.45528438  0.20891737  1.58918512]
 [-0.73739425  0.33836976  2.57389893]
 [-0.64496237  0.29595533  2.25126239]]
```
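For reference, here is a minimal NumPy sketch of the standard negative-sampling loss and its gradients, J = -log sigma(u_o^T v_c) - sum_k log sigma(-u_k^T v_c). This is an assumption about what the test expects, not the official solution; the function and variable names (`neg_sampling_loss_and_gradient`, `neg_indices`, etc.) are illustrative and do not match the assignment's exact signature.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_loss_and_gradient(center_vec, outside_idx,
                                   outside_vectors, neg_indices):
    """Sketch of the negative-sampling loss for one (center, outside) pair.

    center_vec      -- v_c, shape (d,)
    outside_idx     -- index o of the true outside word
    outside_vectors -- U, shape (V, d), one row per vocabulary word
    neg_indices     -- indices of the K sampled negative words
    """
    u_o = outside_vectors[outside_idx]      # (d,)   true outside vector
    u_neg = outside_vectors[neg_indices]    # (K, d) negative vectors

    pos = sigmoid(u_o.dot(center_vec))      # sigma(u_o^T v_c)
    neg = sigmoid(-u_neg.dot(center_vec))   # sigma(-u_k^T v_c), shape (K,)

    # J = -log sigma(u_o^T v_c) - sum_k log sigma(-u_k^T v_c)
    loss = -np.log(pos) - np.sum(np.log(neg))

    # dJ/dv_c = (sigma(u_o^T v_c) - 1) u_o + sum_k (1 - sigma(-u_k^T v_c)) u_k
    grad_center = (pos - 1.0) * u_o + (1.0 - neg).dot(u_neg)

    # dJ/dU: only row o and the sampled rows are non-zero.
    grad_outside = np.zeros_like(outside_vectors)
    grad_outside[outside_idx] += (pos - 1.0) * center_vec
    for k, idx in enumerate(neg_indices):
        # += (not =): the same negative word may be sampled more than once
        grad_outside[idx] += (1.0 - neg[k]) * center_vec

    return loss, grad_center, grad_outside
```

One classic bug that can produce a mismatch like the one above is assigning (`=`) instead of accumulating (`+=`) the gradient rows for negative samples, since the same word can be drawn more than once.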
- Furthermore, I looked up the concept of negative sampling in NLP, and I considered the following question: does it make sense that the loss for the sampled negative words is computed as 1 minus the positive probability? Yes, it does (see the check below).
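This works because of the sigmoid identity sigma(-x) = 1 - sigma(x): scoring a negative sample with sigma(-u_k^T v_c) is exactly the same as using 1 minus its probability of being an outside word. A quick numeric check (illustrative code, not part of the assignment):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 1.5])
# sigma(-x) == 1 - sigma(x), so scoring a negative sample with
# sigma(-u_k^T v_c) equals using 1 - sigma(u_k^T v_c).
print(np.allclose(sigmoid(-x), 1.0 - sigmoid(x)))  # True
```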