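Training on the author's full dataset works, but training fails when only the first 3000 training records are used as the new training set, with the validation and test sets unchanged. #77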

Open
smartcatdog opened this issue Sep 22, 2019 · 3 comments

@smartcatdog

I ran:

mv example.train example.train.bak
head -2000 example.train.bak >> example.train

Then I started training:
python main.py --train=True --clean=True
which fails with:

Traceback (most recent call last):
  File "main.py", line 219, in main
    train()
  File "main.py", line 145, in train
    test_sentences, char_to_id, tag_to_id, FLAGS.lower
  File "ChineseNER/loader.py", line 110, in prepare_dataset
    tags = [tag_to_id[w[-1]] for w in s]
  File "ChineseNER/loader.py", line 110, in <listcomp>
    tags = [tag_to_id[w[-1]] for w in s]
KeyError: 'S-PER'

Why does this happen?

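@smartcatdog smartcatdog changed the title from "Training fails when the first 3000 training records of the author's dataset are used as the new training set, with the validation and test sets unchanged." to "Training on the author's full dataset works, but training fails when only the first 3000 training records are used as the new training set, with the validation and test sets unchanged." Sep 22, 2019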
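@Zhenguoshen

I have the same problem, although I first hit it with my own dataset. I then modified the author's training set the way you describe and got the same error. Have you solved it?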

@hx-317

hx-317 commented Nov 9, 2020

How can this problem be solved?

@aSmallsheep

aSmallsheep commented Dec 2, 2020

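If you take only the first 3000 training records of the author's dataset as the new training set, the labels may no longer be fully covered.

Because you kept only the first 3000 records of the author's training set, the entity tag "S-PER" never appears in the new training set, while the validation and test sets still contain it. The tag_to_id mapping, which is presumably built from the training data only, then has no entry for "S-PER", and the lookup at loader.py line 110 raises the KeyError.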
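
To confirm this before retraining, a quick check along the following lines may help. It is only a sketch and not part of the repository: it assumes each non-empty line of the data files is whitespace-separated with the tag in the last column and that sentences are separated by blank lines. The helper names read_tags and missing_tags are made up for this example.

```python
from pathlib import Path


def read_tags(path):
    """Collect the set of tags found in the last column of a data file.

    Assumes one token per line, whitespace-separated, tag last;
    blank lines (sentence separators) are skipped.
    """
    tags = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if len(parts) >= 2:  # ignore blank or malformed lines
            tags.add(parts[-1])
    return tags


def missing_tags(train_path, *other_paths):
    """Return, per file, the tags that never occur in the training file."""
    train_tags = read_tags(train_path)
    missing = {}
    for path in other_paths:
        diff = read_tags(path) - train_tags
        if diff:
            missing[str(path)] = sorted(diff)
    return missing


# Hypothetical usage; the dev/test file names are assumptions:
# print(missing_tags("example.train", "example.dev", "example.test"))
```

If the result is non-empty, the truncated training set is missing those tags. One option is to pick a training subset that still contains every tag that occurs in the validation and test sets, for example by sampling whole sentences from across the file rather than taking only its head.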
