
Could you open-source the code for the other subtasks? #11

Open
winder-source opened this issue Oct 26, 2022 · 11 comments
Comments

@winder-source

Hello authors, I am currently working on a related direction and found your paper very interesting. Could you open-source the code for the other subtasks? The current code seems to cover only one downstream task, and I would like to reproduce the rest of the paper.

@lyhuohuo
Collaborator

lyhuohuo commented Oct 27, 2022

Hello. The other subtasks only require small changes to the label format and the test side. For the aspect term extraction subtask, just remove the sentiment from the label sequence. For the sentiment classification subtask, train with the full span-sentiment sequence, and at test time provide all gold spans and generate only the sentiment labels.
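A rough sketch of what those conversions could look like, assuming the targets are flat index sequences of (span start, span end, sentiment) triples as in the generative formulation; the helper names here are illustrative, not from this repository:

# Hypothetical helpers (not the repository's code): convert a full
# span-sentiment target into the two subtask formats described above.

def to_extraction_target(full_target, triple_len=3):
    """Aspect term extraction: drop the sentiment token (the last
    element of every triple), keeping only the span indices."""
    return [tok for i, tok in enumerate(full_target)
            if i % triple_len != triple_len - 1]

def gold_span_prefix(gold_spans):
    """Sentiment classification at test time: flatten the gold
    (start, end) spans so the decoder only has to generate the
    sentiment label for each given span."""
    return [idx for span in gold_spans for idx in span]

# Example: one aspect spanning tokens 3..5 labelled positive.
full = [3, 5, 'POS']
print(to_extraction_target(full))   # [3, 5]
print(gold_span_prefix([(3, 5)]))   # [3, 5]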

@lyhuohuo
Collaborator

The relevant code has been uploaded.

@winder-source
Author

Thank you very much!

@winder-source
Author

Hello, I ran into some problems when running sh 15MASC_pretrain.sh.
1. First, this part:

Traceback (most recent call last):
  File "twitter_sc_training.py", line 454, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 172, in main
    start_idx=args.start_idx)
TypeError: __init__() got an unexpected keyword argument 'is_sample'

To get past this, I commented the arguments out:

    train_dataset = Twitter_Dataset(args.dataset[0][1],
                                    split='train')
                                    # is_sample=args.is_sample,
                                    # sample_num=args.sample_num,
                                    # start_idx=args.start_idx)

I commented these arguments out because Twitter_Dataset's __init__ takes none of is_sample, sample_num, or start_idx. Why were these three arguments added?

class Twitter_Dataset(data.Dataset):
    def __init__(self, infos, split):
        self.infos = json.load(open(infos, 'r'))

2. After this change, another problem came up:

Traceback (most recent call last):
  File "twitter_sc_training.py", line 454, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 205, in main
    res_dev = eval_utils.eval(args, model, dev_loader, metric, device)
  File "/lyldata/VLP-MABSA-2/src/eval_utils.py", line 18, in eval
    for key, value in batch['TWITTER_SC'].items()
KeyError: 'TWITTER_SC'

Looking into it, it seems collation.py and tokenization_new.py do not contain the code for this task. Could this part of the code be added?

@lyhuohuo
Collaborator

These are arguments I added when running the few-shot experiments. I am very sorry the code has not been fully updated; I will update it in the next couple of days.
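In the meantime, a minimal sketch of how Twitter_Dataset could accept these few-shot arguments, assuming self.infos loads as a plain list of examples; the slicing logic is a guess at the intent, not the repository's actual code:

import json
from torch.utils import data

class Twitter_Dataset(data.Dataset):
    def __init__(self, infos, split, is_sample=False,
                 sample_num=None, start_idx=0):
        self.infos = json.load(open(infos, 'r'))
        self.split = split
        # Few-shot setting: keep only a contiguous slice of sample_num
        # examples starting at start_idx. (Assumes self.infos is a list;
        # adjust if the JSON is structured differently.)
        if is_sample and sample_num is not None:
            self.infos = self.infos[start_idx:start_idx + sample_num]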

@winder-source
Author

Great, I'll wait for your update!

@lyhuohuo
Collaborator

lyhuohuo commented Nov 1, 2022

The update is complete.

@winder-source
Author

Traceback (most recent call last):
  File "twitter_sc_training.py", line 450, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 80, in main
    tokenizer = ConditionTokenizer(args=args)
  File "/lyldata/VLP-MABSA-2/src/data/tokenization_new.py", line 43, in __init__
    pretrained_model_name, )
  File "/root/anaconda3/envs/VLP-MABSA-env/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1591, in from_pretrained
    list(cls.vocab_files_names.values()),
OSError: Model name './E2E-MABSA' was not found in tokenizers model name list (facebook/bart-base, facebook/bart-large, facebook/bart-large-mnli, facebook/bart-large-cnn, facebook/bart-large-xsum, yjernite/bart_eli5). We assumed './E2E-MABSA' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.

@lyhuohuo
Collaborator

lyhuohuo commented Nov 2, 2022

Change the path here to facebook/bart-base, or download the bart-base model files from Hugging Face and set the path to the downloaded directory.
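For example, either of these would apply the fix, assuming the underlying call in src/data/tokenization_new.py is a standard transformers from_pretrained (a sketch based on the traceback; the local path is a placeholder):

from transformers import BartTokenizer

# Option 1: load the tokenizer straight from the Hugging Face hub.
tokenizer = BartTokenizer.from_pretrained('facebook/bart-base')

# Option 2: download vocab.json / merges.txt (and the model weights)
# from huggingface.co/facebook/bart-base, then point at that directory.
tokenizer = BartTokenizer.from_pretrained('/path/to/downloaded/bart-base')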

@winder-source
Author

Got it, thanks!

@NanZhang257

Traceback (most recent call last):
  File "twitter_sc_training.py", line 450, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 80, in main
    tokenizer = ConditionTokenizer(args=args)
  File "/lyldata/VLP-MABSA-2/src/data/tokenization_new.py", line 43, in __init__
    pretrained_model_name, )
  File "/root/anaconda3/envs/VLP-MABSA-env/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1591, in from_pretrained
    list(cls.vocab_files_names.values()),
OSError: Model name './E2E-MABSA' was not found in tokenizers model name list (facebook/bart-base, facebook/bart-large, facebook/bart-large-mnli, facebook/bart-large-cnn, facebook/bart-large-xsum, yjernite/bart_eli5). We assumed './E2E-MABSA' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.

Hello, may I ask whether you managed to resolve this problem?
