Results are unsatisfactory. Does the vocabulary need to be updated? #14

Open
ahumoon7421 opened this issue Dec 27, 2017 · 4 comments

Comments

@ahumoon7421

Loading model cost 1.286 seconds.
Prefix dict has been built succesfully.
2017-12-27 14:20:24.445937: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

hello
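WARN:词汇不在服务区 (i.e. "word is out of the service area": the input is not in the vocabulary)
你好 ("hello")
WARN:词汇不在服务区
呵呵 ("hehe")

哈哈 ("haha")


WARN:词汇不在服务区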

@HCIS2020

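The question and answer files each contain only 1,000 samples, so the results are fairly limited.

This version uses TF's seq2seq functions, which presumably still have the one-hot problem. When will a version that supports Word2Vector be released?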
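For readers unfamiliar with the one-hot versus word-vector distinction raised here, the following is a minimal pure-Python sketch, not the project's code. The table values, the 4-word vocabulary, and the helper names `one_hot`, `embed`, and `matvec` are all hypothetical. It only illustrates that feeding a one-hot vector through a weight table selects one row, which is what a dense lookup does directly and where pretrained vectors could be substituted.

```python
def one_hot(index, vocab_size):
    """Sparse representation: a vector of length vocab_size with a single 1."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec


def embed(index, table):
    """Dense representation: look up one row of an embedding table."""
    return table[index]


def matvec(vec, table):
    """Multiply a row vector by the table (what a one-hot input layer computes)."""
    dim = len(table[0])
    return [sum(vec[i] * table[i][j] for i in range(len(table))) for j in range(dim)]


# Hypothetical 4-word vocabulary with 2-dimensional vectors. In a Word2Vec
# setup these rows would come from pretrained vectors instead.
table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

word_index = 2
# The one-hot product and the direct row lookup give the same vector.
print(matvec(one_hot(word_index, len(table)), table))
print(embed(word_index, table))
```

The one-hot vector grows with the vocabulary, while the lookup cost does not, which is one reason a dense embedding is usually preferred.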

@cfso2475

cfso2475 commented Apr 5, 2018

I think the problem is this parameter:
min_freq = 10

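The default value of 10 keeps many words out of the vocabulary, so the sequences actually used for training differ considerably from the original question and answer text.
With the existing 1,000 lines of text, word frequencies are all low, so temporarily changing it to 1 might work better.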
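To make the effect concrete, here is a minimal sketch of frequency-based vocabulary filtering. It is not the project's implementation: the function name `build_vocab`, the toy corpus, and the assumption that a word is kept when its count is greater than or equal to `min_freq` are all illustrative.

```python
from collections import Counter


def build_vocab(sentences, min_freq):
    """Return the set of tokens that occur at least min_freq times."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return {tok for tok, n in counts.items() if n >= min_freq}


# Toy tokenized corpus (hypothetical; stands in for the question/answer text).
corpus = [
    ["hello", "there"],
    ["hello", "world"],
    ["good", "morning"],
]

for min_freq in (1, 2, 10):
    vocab = build_vocab(corpus, min_freq)
    # Tokens that would be treated as unknown at this threshold.
    oov = [tok for sent in corpus for tok in sent if tok not in vocab]
    print(f"min_freq={min_freq}: vocab size {len(vocab)}, unknown tokens {len(oov)}")
```

On a small corpus a high threshold leaves the vocabulary nearly empty, which would explain why even common greetings trigger the warning.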

@alige32

alige32 commented Sep 14, 2018

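One factor is the min_freq issue mentioned above: 2 or 3 works better, while 1 lets in too many low-frequency words and actually hurts. In addition, size can be raised according to the total number of words left after filtering. With these 1,000 samples, 10 or 12 both work well.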
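One way to pick between these values is to count how many words survive each threshold before training. The sketch below is an assumption-laden illustration rather than part of the project: `sweep_min_freq` and the sample data are hypothetical, and "coverage" here simply means the share of all tokens that remain in the vocabulary.

```python
from collections import Counter


def sweep_min_freq(sentences, candidates):
    """For each candidate threshold, report vocabulary size and token coverage."""
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    rows = []
    for min_freq in candidates:
        kept = {tok: n for tok, n in counts.items() if n >= min_freq}
        coverage = sum(kept.values()) / total if total else 0.0
        rows.append((min_freq, len(kept), coverage))
    return rows


# Hypothetical tokenized sample; replace with the real question/answer text.
sample = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "e", "f"],
]

for min_freq, vocab_size, coverage in sweep_min_freq(sample, [1, 2, 3, 10]):
    print(f"min_freq={min_freq:>2}  vocab={vocab_size:>3}  coverage={coverage:.0%}")
```

The vocabulary size column is the "total number of words after filtering" that the size setting would be tuned against.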

@Z1hgq

Z1hgq commented Mar 10, 2019

With the 1,000 samples, 3 works fairly well.
