Bug in decode_rel in tplinker_plus.py #72

Open
ZJL0111 opened this issue Apr 27, 2022 · 0 comments

ZJL0111 commented Apr 27, 2022

Thanks to the author for sharing the code. While using a trained model for pre-annotation, I found a bug in decode_rel in tplinker_plus.py.

    # head link
    for sp in matrix_spots:
        ...........
        # recover the positons in the original text
        for ent in ent_list:
            ent["char_span"] = [ent["char_span"][0] + char_offset, ent["char_span"][1] + char_offset]
            ent["tok_span"] = [ent["tok_span"][0] + tok_offset, ent["tok_span"][1] + tok_offset]

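The entity span recovery should be moved outside the loop above. Otherwise the offsets are added on every iteration and decoding produces wrong spans, as in the following example.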

The text is 2001 characters long in total, yet the output contains an entity char_span such as [2853, 2866]:

'relation_list': [{'subject': 'SAR444245', 'object': 'every 3 weeks', 'subj_tok_span': [405, 410], 'obj_tok_span': [418, 421], 'subj_char_span': [1165, 1174], 'obj_char_span': [1193, 1206], 'predicate': '/Drug/FREQUENCY/Drug-FREQUENCY'}], 

'entity_list': [
{'type': 'Drug', 'text': 'SAR444245', 'tok_span': [663, 668], 'char_span': [2326, 2335]}, 
{'type': 'Drug', 'text': 'pembrolizumab', 'tok_span': [669, 676], 'char_span': [2340, 2353]}, 
{'type': 'Drug', 'text': 'SAR444245', 'tok_span': [705, 710], 'char_span': [2455, 2464]}, 
{'type': 'Drug', 'text': 'pembrolizumab', 'tok_span': [711, 718], 'char_span': [2469, 2482]}, 
{'type': 'FREQUENCY', 'text': 'every 3 weeks', 'tok_span': [718, 721], 'char_span': [2483, 2496]}, 
{'type': 'Drug', 'text': 'SAR444245', 'tok_span': [746, 751], 'char_span': [2583, 2592]}, 
{'type': 'Drug', 'text': 'pembrolizumab', 'tok_span': [752, 759], 'char_span': [2597, 2610]}, 
{'type': 'FREQUENCY', 'text': 'every 3 weeks', 'tok_span': [759, 762], 'char_span': [2611, 2624]}, 
{'type': 'Drug', 'text': 'pembrolizumab', 'tok_span': [793, 800], 'char_span': [2725, 2738]}, 
{'type': 'Drug', 'text': 'pembrolizumab', 'tok_span': [834, 841], 'char_span': [2853, 2866]}]}
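
To illustrate the suspected mechanism, here is a minimal sketch of both placements of the span recovery. It is not the repository's `decode_rel`: the function names `recover_spans_inside_loop` and `recover_spans_after_loop`, the dummy spots, and the sample entity and offsets are all hypothetical, and the real head-link decoding is left out. It only shows that shifting `ent_list` inside the `for sp in matrix_spots` loop applies the offsets once per spot, while shifting after the loop applies them once.

```python
import copy


def recover_spans_inside_loop(ent_list, matrix_spots, tok_offset, char_offset):
    """Reported pattern: offsets are re-applied on every spot iteration."""
    ents = copy.deepcopy(ent_list)  # leave the caller's list untouched
    for _sp in matrix_spots:
        # ... head-link decoding for this spot would go here (omitted) ...
        for ent in ents:
            ent["char_span"] = [ent["char_span"][0] + char_offset, ent["char_span"][1] + char_offset]
            ent["tok_span"] = [ent["tok_span"][0] + tok_offset, ent["tok_span"][1] + tok_offset]
    return ents


def recover_spans_after_loop(ent_list, matrix_spots, tok_offset, char_offset):
    """Proposed pattern: offsets are applied once, after the spot loop."""
    ents = copy.deepcopy(ent_list)
    for _sp in matrix_spots:
        pass  # ... head-link decoding for this spot would go here (omitted) ...
    for ent in ents:
        ent["char_span"] = [ent["char_span"][0] + char_offset, ent["char_span"][1] + char_offset]
        ent["tok_span"] = [ent["tok_span"][0] + tok_offset, ent["tok_span"][1] + tok_offset]
    return ents


# Made-up sample data: one entity, three dummy spots, arbitrary offsets.
ents = [{"type": "Drug", "text": "SAR444245", "tok_span": [3, 8], "char_span": [10, 19]}]
spots = [(0, 1, 0), (2, 3, 0), (4, 5, 0)]

buggy = recover_spans_inside_loop(ents, spots, tok_offset=30, char_offset=100)
fixed = recover_spans_after_loop(ents, spots, tok_offset=30, char_offset=100)

print(buggy[0]["char_span"])  # [310, 319]: shifted three times
print(fixed[0]["char_span"])  # [110, 119]: shifted once
```

With the recovery inside the loop, the size of the error grows with the number of spots, which would be consistent with spans landing beyond the end of the text.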
