Adding RelationExtraction head to layoutLMv2 and layoutXLM models #15451
Hi, that's great to read :) It was a bit unclear to me how to use the model at inference time (the authors only provided a script for training and evaluation, i.e. when labels are available). Can you show how to use the model when you don't have any labels available? More specifically, what are the expected `entities` and `relations` inputs? I assume that the model needs all possible entities, as well as all possible relations, in order to classify them pairwise. In that case, we can add it. There was already an effort to do this (see #15173).
Hey Niels, I've added an inference example to the bottom of this notebook here (please ignore the accuracy, I didn't spend much time finetuning). For running inference we just require an empty relations dict, as we calculate all possible relations based on the entity labels (the current model only links entities with labels 1 (the key) and 2 (the value)). We do, however, require all the entities to be labelled with the start token index, end token index and a label, so we would probably suggest in the docs that users run LayoutLMv2ForTokenClassification first and then run this on its results. I'm not really experienced enough with the library to review the previous effort, but I think there may be a few things missing there. In terms of going forward, would you prefer if I made a new PR from my branch or tried to modify that PR to conform?
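To make the inference-time inputs described above concrete, here is a hedged sketch (the exact field names follow the docstring mentioned in this thread, but the values here are made up for illustration): entities carry start/end token indices plus a label, the relations dict can start out empty, and candidate pairs are enumerated from every key/value label combination.

```python
# Illustrative inputs, not real model output. Labels: 1 = key, 2 = value.
entities = {
    "start": [5, 12],  # index of each entity's first token
    "end":   [8, 15],  # index just past each entity's last token
    "label": [1, 2],
}

# At inference time the relations dict can simply be empty.
relations = {"start_index": [], "end_index": [], "head": [], "tail": []}

# The head then scores every key -> value pair as a candidate relation:
candidates = [
    (h, t)
    for h, hl in enumerate(entities["label"]) if hl == 1
    for t, tl in enumerate(entities["label"]) if tl == 2
]
print(candidates)  # → [(0, 1)]
```

With two keys and two values this enumeration would yield four candidates, which is why only entities (not gold relations) are needed at inference time.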
Also, I forgot to add: the detailed form that entities and relations should be in is documented in the model input and output docstring:
Awesome work!! I'll have to deep dive a bit into this, but looks very nice. Let's add the head model to the library. I guess we can continue our discussion on #15173? Btw, Detectron2 now has support for torch 1.10, you can now easily install it in Colab using:
Ahh, that's so much easier for Detectron, thanks for that :) Also stoked to hear that we can integrate this. There are a few things I should mention, but I'm not sure where to put them, so I thought I'd just comment them here and get some advice from you. Questions:
Notes:
Also, one more general LayoutLMv2/XLM question based on what I saw when writing the dataset. From my understanding, the current processor/feature extractor splits on words, tokenizes, and then returns a flattened list of tokens along with the original bounding boxes, duplicated wherever a word produced multiple tokens. With character-based languages I think this may cause some issues, which is why the original authors did the processing differently in the XFUN dataset code. I believe that if we split by words, most software will split each character on its own; if we pass this result to the processor/feature extractor, the tokenizer can't run correctly, as it can't group multiple characters together into a single token id. And if we pass in a whole line at once, the processor/tokenizer will create the token ids correctly but will just duplicate the bounding box of the entire line over and over. Is my understanding correct? And if so, do you think we could create a different way of using the processor/feature extractor where you can pass in a whole line along with the bounding boxes for each character in that line, and then use the offset mappings from the tokenizer to remap the bounding boxes correctly?
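The remapping idea above can be sketched in plain Python. This is not the library's API, just an assumed shape: a tokenizer's offset mapping gives a `(start_char, end_char)` span per token, and the per-token box is the union of the per-character boxes in that span.

```python
# Sketch: remap per-character [x0, y0, x1, y1] boxes to per-token boxes
# using a tokenizer-style offset mapping (assumed input shapes).
def merge_boxes(boxes):
    """Return the bounding-box union of a list of [x0, y0, x1, y1] boxes."""
    return [
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    ]

def remap_char_boxes(char_boxes, offset_mapping):
    """offset_mapping: list of (start_char, end_char) spans, one per token."""
    token_boxes = []
    for start, end in offset_mapping:
        if start == end:  # special tokens get a dummy box
            token_boxes.append([0, 0, 0, 0])
        else:
            token_boxes.append(merge_boxes(char_boxes[start:end]))
    return token_boxes

# Two adjacent characters that the tokenizer merges into one token:
char_boxes = [[0, 0, 10, 10], [10, 0, 20, 10]]
print(remap_char_boxes(char_boxes, [(0, 2)]))  # → [[0, 0, 20, 10]]
```

This is exactly the case the comment describes: passing a whole line with per-character boxes, then using offsets to avoid duplicating the line-level box.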
I'm experimenting with LayoutLMv2 and LayoutLMForRelationExtraction. I referred to https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/LayoutLMv2/FUNSD/True_inference_with_LayoutLMv2ForTokenClassification_%2B_Gradio_demo.ipynb for entity detection/predictions using LayoutLMv2. Can someone help me convert these predictions from LayoutLMv2 into the entity dict (which is the input to LayoutLMv2ForRelationExtraction)?
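One way to do that conversion, sketched under assumptions (BIO-style label strings as produced by FUNSD-style token classification, and an illustrative label mapping): walk the per-token predictions, group each `B-`/`I-` run into a span, and emit the start/end/label lists the RE head expects.

```python
# Illustrative mapping from token-classification entity types to the RE
# head's labels (1 = key, 2 = value); adjust to your own label set.
LABEL2RE = {"QUESTION": 1, "ANSWER": 2}

def predictions_to_entities(pred_labels):
    """Group BIO-tagged token predictions into an entities dict."""
    entities = {"start": [], "end": [], "label": []}
    i = 0
    while i < len(pred_labels):
        lab = pred_labels[i]
        if lab.startswith("B-") and lab[2:] in LABEL2RE:
            start, kind = i, lab[2:]
            i += 1
            while i < len(pred_labels) and pred_labels[i] == "I-" + kind:
                i += 1
            entities["start"].append(start)
            entities["end"].append(i)
            entities["label"].append(LABEL2RE[kind])
        else:
            i += 1
    return entities

preds = ["O", "B-QUESTION", "I-QUESTION", "B-ANSWER", "O"]
print(predictions_to_entities(preds))
# → {'start': [1, 3], 'end': [3, 4], 'label': [1, 2]}
```

The end index here is exclusive (one past the last token of the span); check against the docstring mentioned earlier in the thread for the exact convention your checkpoint expects.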
This is weird, might be a bug. cc @sgugger
You mean using
This is fine for me!
Yes this seems very useful.
It really requires running with lr warmup (based on my testing I'd recommend linear warmup [0, 5e-5] over about 15% of total steps); otherwise it is even more sensitive, collapsing in just over half the runs. Is there somewhere we can put this as advice for users who may not have dealt with things like this before? Yes, we usually have a tips section on each model's documentation page; e.g. LayoutLMv2's can be found here (right below the abstract of the paper). We can link to additional notebooks for more info.
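The schedule recommended above can be written out as a plain function to make the shape explicit (in practice, with transformers, one would typically use `get_linear_schedule_with_warmup` with `num_warmup_steps = int(0.15 * num_training_steps)` instead; this standalone version is just for illustration).

```python
# Linear warmup from 0 to peak_lr over the first warmup_frac of steps,
# then linear decay back to 0, matching the [0, 5e-5] / 15% advice above.
def linear_warmup_lr(step, total_steps, peak_lr=5e-5, warmup_frac=0.15):
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

print(linear_warmup_lr(0, 1000))    # → 0.0  (start of warmup)
print(linear_warmup_lr(150, 1000))  # → 5e-05 (peak, end of 15% warmup)
print(linear_warmup_lr(1000, 1000)) # → 0.0  (fully decayed)
```

Plugging this into an optimizer loop just means setting `param_group["lr"]` to this value each step.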
We do have data collators in the library, you can find them here. Alternatively, we can include code in the feature extractor as long as we don't break existing code of our users. Maybe the data collator could be the best option here.
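As a sketch of what such a data collator might look like (the function and field names here are assumptions, not the library's actual collator API): pad the variable-length `input_ids`, but pass the entity and relation dicts through untouched, since the RE head consumes them as plain Python structures rather than stacked tensors.

```python
# Hypothetical collator sketch for RE-style features.
def collate_batch(features, pad_token_id=0):
    """Pad input_ids to the batch max length; keep entities/relations as lists."""
    max_len = max(len(f["input_ids"]) for f in features)
    return {
        "input_ids": [
            f["input_ids"] + [pad_token_id] * (max_len - len(f["input_ids"]))
            for f in features
        ],
        "entities": [f["entities"] for f in features],
        "relations": [f["relations"] for f in features],
    }

batch = collate_batch([
    {"input_ids": [1, 2, 3], "entities": {}, "relations": {}},
    {"input_ids": [4], "entities": {}, "relations": {}},
])
print(batch["input_ids"])  # → [[1, 2, 3], [4, 0, 0]]
```

A real collator would also pad `attention_mask`, `bbox`, and the image tensor, but the keep-as-lists treatment of entities/relations is the point being illustrated.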
Yes, sure, lots of people have been asking for this, so let's add a clean notebook with additional documentation such that people really know how the model works. Feel free to open a new PR!
No this is not a public-facing argument, and it's for configurations only anyway. It's not used anywhere in the code for pretrained models, so I don't see why it should be needed. You can check every other model in the library and see for yourself that it's not been added :-)
@sgugger I can reproduce the error with
gives:
Looking into it, thanks for the repro!
The problem should be fixed on master. We'll make a patch release on Monday with the fix.
@R0bk Thank you for the great work. I had a lot of missing points about RE inference, now mostly clarified. But I'm still having difficulty understanding the 'entities' and 'relations' fields (such as 'start_index' and 'end_index'). Could you give an example of what they represent in a given sentence? I couldn't find a clear answer in the original paper or in the other reference papers the authors mention. You added this docstring, but it would be great if you could exemplify those fields. Here is the only info from the paper that mentions the RE process:
and
Thank you
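To make those fields concrete, here is a hedged toy example (the token indices and the exact relation fields are illustrative, following the conventions discussed in this thread): for a form fragment like "Name: John", suppose "Name" occupies tokens 1–2 and "John" tokens 3–4 after tokenization.

```python
# Toy example for "Name: John" (indices are made up for illustration).
entities = {
    "start": [1, 3],  # token index where each entity begins
    "end":   [3, 5],  # token index just past each entity's last token
    "label": [1, 2],  # 1 = key ("Name"), 2 = value ("John")
}

# One gold relation linking entity 0 (the key) to entity 1 (the value).
# head/tail index into the entities lists above; start_index/end_index
# are token indices delimiting the linked pair.
relations = {
    "head": [0],
    "tail": [1],
    "start_index": [1],
    "end_index": [5],
}

print(entities["label"][relations["head"][0]])  # → 1 (the key side)
print(entities["label"][relations["tail"][0]])  # → 2 (the value side)
```

So 'start_index'/'end_index' on an entity are token positions, while 'head'/'tail' on a relation are entity positions; mixing the two up is a common source of confusion.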
So you mean that we need to train 2 models: one for token classification, and one that uses the results of the previous model to do the relation extraction?
Did you get the answer to your question?
I'm pretty sure the answer to this question is yes ;)
Not yet @NielsRogge, can you please help here
hi, I saw your amazing work https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/LayoutLMv2/FUNSD/True_inference_with_LayoutLMv2ForTokenClassification_%2B_Gradio_demo.ipynb#scrollTo=AttFR_dMNVEL
I've implemented @NielsRogge's comments in #15173 in my own fork. I'm happy to open a PR, or to let someone else take it from here.
@quasimik Great work! Could you provide a step-by-step of how we use your new class
my aim is also to predict key-value pairs; according to the colab notebook, we have to train both the token classification and entity detection models first, and then use that output as input, am I right?
Hello guys, any update on this new component?
Hi @R0bk, thanks for the amazing work! I was able to train an RE component with custom data using your fork and the colab notebook that you provided, and the results look very promising! Though at the moment I'm only able to train the model with entities of types 1 & 2; if I set other types of entities inside the "label" field of the "entities" key lists, I get an error. I tried commenting out the line that you suggested: https://github.com/R0bk/transformers/blob/d9fe818083017d49487a3a45ca99f52123d68628/src/transformers/models/layoutlmv2/modeling_layoutlmv2.py#L1431 but it didn't work. Can you please point me in some direction on this? Kind regards.
hi, I'm having trouble understanding how to use this... In other words, how do I use this notebook step by step?
Hello, on my side I do not see how we get the ids of the tokens where entities start/end for the inference part. Could you please share more details on this part? Thanks!
Hi @R0bk, thanks for this work. It helped me train on my data for different use cases and get better results, until recently, when I updated the transformers module in my environment by mistake; after getting back to your version I'm now getting RuntimeError: CUDA out of memory, even with a batch_size of 1. For the same data, I was able to train it for RE before. I'm not sure how to fix the problem; I tried creating a fresh environment but the problem still persists. Environment details: Kindly suggest what could be the problem, or whether I've possibly missed something in the new environment. Thanks!
Hi @R0bk, @NielsRogge, thanks for the amazing work
Any updates on the model head addition for inference? The output of LayoutLMv2 is not in line with the input for RE. Can these 2 heads be combined for the RE task?
Hi @Isha09Garg, were you able to use LayoutLMv2 for the RE task (on FUNSD or other datasets)?
Hi, has anyone tried to implement the RE head on LayoutLM v1?
Is the relation extraction module only created for LayoutXLM, or can I also use it for LayoutLM v2 and v3?
🌟 New model head addition
Relation Extraction Head for LayoutLMv2/XLM
Addition description
Hey all,
I've seen a bunch of different requests across huggingface issues [0], unilm issues [0][1], and @NielsRogge's Transformers Tutorials issues [0][1] about adding the relation extraction head from layoutlmv2 to the huggingface library. As the model is quite difficult to use in its current state, I was going to write my own layer on top, but I saw in this issue that it may be a good idea to add it to transformers as a separate layoutlmv2/xlm head, and I thought it would be a good way to contribute back to a library I use so much.
I've gone ahead and added it under my own branch and got it working successfully with the library. Here is a colab using my branch of transformers if you want to test it yourself.
Before I add tests/write more docs, I just wanted to post here first to see if there's interest in potentially merging this in. If there is, I have a few questions it would be helpful to get some info on, to ensure that I've done the integration correctly.