Model accuracy is very low, I don't know why? #74
Comments
May I ask where you found the zhihu-word2vec-title-desc.bin-100 file?
I got it from the author's Baidu cloud share; we can also use Google word2vec to generate it.
I use the create_voabulary function to generate vocab_label.pik as a substitute for the zhihu-word2vec-title-desc.bin-100 file.
link: https://pan.baidu.com/s/1orPKC0cahrIW0CUvPxts1g
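If the shared embedding file is unreachable, the vocabulary side of it can be rebuilt locally. Below is a minimal sketch of building a word-to-index vocabulary and pickling it, in the spirit of the repo's `create_voabulary()` / `vocab_label.pik` mentioned above; the exact format, special-token conventions (`_PAD`, `_UNK`), and `max_size` default here are assumptions, not the repo's actual ones.

```python
# Sketch: build a word->index vocabulary from a tokenized corpus.
# Index 0 is reserved for padding, index 1 for unknown words (assumed
# conventions -- check the repo's create_voabulary() for the real ones).
import pickle
from collections import Counter

def build_vocabulary(corpus, max_size=50000):
    """corpus: iterable of token lists. Returns (word->index, index->word)."""
    counts = Counter(tok for sent in corpus for tok in sent)
    vocab_word2index = {"_PAD": 0, "_UNK": 1}
    for word, _ in counts.most_common(max_size - len(vocab_word2index)):
        vocab_word2index[word] = len(vocab_word2index)
    vocab_index2word = {i: w for w, i in vocab_word2index.items()}
    return vocab_word2index, vocab_index2word

corpus = [["model", "accuracy", "is", "low"], ["accuracy", "matters"]]
w2i, i2w = build_vocabulary(corpus)
blob = pickle.dumps((w2i, i2w))  # bytes you could write out as vocab_label.pik
```

For the embedding vectors themselves, training word2vec on your own corpus (e.g. with gensim) and saving in binary format gives a drop-in replacement for the shared file.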
Thank you very much for sharing. But after many days of trying, I found the code too hard for me to understand. I have given up learning the project, so I couldn't share my results with you.
Hello, my F1 score is very low on single-label classification, as below:
Hi, which kind of data and what training size did you use?
…________________________________
From: lreaderl <notifications@github.com>
Sent: August 6, 2018, 0:11
To: brightmart/text_classification
Cc: Subscribed
主题: Re: [brightmart/text_classification] model accuracy is very low,i don't know why? (#74)
Hello, my F1 score is very low on single label classification as below:
Epoch 19 Validation Loss:2.709 F1 Score:0.282 Precision:0.169 Recall:0.846
Have you found any solution to that?
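As a sanity check on the numbers in that log line: F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R), and the reported 0.282 is consistent with precision 0.169 and recall 0.846. The large recall/precision gap suggests the model is predicting far too many positives per class, rather than the metric being miscomputed.

```python
# Verify that the logged F1 follows from the logged precision and recall.
precision, recall = 0.169, 0.846
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.282, matching the training log
```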
My dataset has 19 classes, with about 100,000 training samples. The average length of the training samples is about 150.
@f20500909 OK, thanks.
@brightmart Does the length of the training samples affect model accuracy? In one of my datasets, the average sample length is 10, but I pad_sequences them to 20, 50, or 100 when I train the model, and accuracy is low.
If implemented correctly, padding should have minimal impact on performance, as you can mask out the embeddings of pad tokens.
I load zhihu-word2vec-title-desc.bin-100 as the word vector file and train-zhihu4-only-title-all.txt as the training file, and set multi_label_flag=false, use_embedding=true.
a01_FastText
a03_TextRNN
a04_TextRCNN
a05_HierarchicalAttentionNetwork
a06_Seq2seqWithAttention
These models can run, but the accuracy is very low and I don't know why.
When predicting, also with multi_label_flag=false and use_embedding=true, there is more than one prediction label. I need your help, thanks.
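A guess at that last symptom: with multi_label_flag=false the decoder should emit exactly one label via argmax over the logits, whereas a sigmoid-plus-threshold decoder (the multi-label path) can return several labels at once. The NumPy sketch below contrasts the two decodings; the function names and the example logits are illustrative, not the repo's actual predict code.

```python
# Sketch: single-label (argmax) vs multi-label (sigmoid + threshold) decoding.
import numpy as np

logits = np.array([2.1, -0.3, 1.8, -0.2])   # hypothetical per-class scores

def predict_single_label(logits):
    return int(np.argmax(logits))            # always exactly one class id

def predict_multi_label(logits, threshold=0.5):
    probs = 1.0 / (1.0 + np.exp(-logits))    # elementwise sigmoid
    return [i for i, p in enumerate(probs) if p > threshold]

print(predict_single_label(logits))   # 0
print(predict_multi_label(logits))    # [0, 2] -> can be several labels
```

If the prediction script returns multiple labels despite multi_label_flag=false, it is worth checking that the predict path actually switches to argmax decoding rather than reusing the threshold-based multi-label decoder.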