LayoutLM Token Classification not learning #8524
Comments
Is there any update on this issue?
Hi there! I have been investigating the model by writing integration tests, and it turns out it outputs the same tensors as the original repository on the same input data, so there are no issues (I tested this both for the base model and for LayoutLMForTokenClassification). However, the model is poorly documented in my opinion; I needed to look at the original repository first to understand everything. I made a demo notebook that showcases how to fine-tune HuggingFace's LayoutLM implementation. Let me know if this helps you!
I have experienced the same issue. I realized that the model files hosted here are different from the weights in the original repo. I was using weights from the original repo and the model couldn't load them at the start of training, so I was starting from a randomly initialized model instead of a pre-trained one. That's why it was not learning much on the downstream task. I solved the issue by using the model files from HuggingFace.
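A quick way to check for this weight mismatch is to load the checkpoint from the hub and read the warning that from_pretrained prints about weights it could not load or had to newly initialize. The sketch below assumes the microsoft/layoutlm-base-uncased hub checkpoint and a hypothetical label count:

```python
# Minimal sketch: loading the hub checkpoint so the weights match the transformers
# implementation. from_pretrained warns about any checkpoint weights it could not use;
# with the hub checkpoint, only the token classification head should be newly initialized.
from transformers import LayoutLMForTokenClassification

model = LayoutLMForTokenClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased",
    num_labels=13,  # hypothetical: set to the number of NER labels in your dataset
)
```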
This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions. If you think this still needs to be addressed, please comment on this thread.
Environment info
transformers version: 3.4.0
Information
Model I am using (Bert, XLNet ...): LayoutLMForTokenClassification
The problem arises when using: my own scripts
The task I am working on is: my own task
NER task. I've reproduced the implementation of the Dataset, compute_metrics, and the other helper functions from the original microsoft/layoutlm repo.
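For reference, a compute_metrics along the lines of the original repo might look roughly like the sketch below; the seqeval-based scores and the -100 padding label follow the original microsoft/layoutlm example, while the label list shown is a hypothetical FUNSD-style placeholder:

```python
# Sketch of a seqeval-based compute_metrics, assuming labels were encoded with the
# original layoutlm preprocessing (padding/subword positions carry the label id -100).
import numpy as np
from seqeval.metrics import f1_score, precision_score, recall_score

# Hypothetical FUNSD-style label list; replace with the labels of your dataset.
label_list = ["O", "B-HEADER", "I-HEADER", "B-QUESTION", "I-QUESTION", "B-ANSWER", "I-ANSWER"]

def compute_metrics(p):
    # p.predictions has shape (batch, seq_len, num_labels); p.label_ids has shape (batch, seq_len)
    predictions = np.argmax(p.predictions, axis=2)
    true_predictions = [
        [label_list[pred] for pred, lab in zip(pred_row, lab_row) if lab != -100]
        for pred_row, lab_row in zip(predictions, p.label_ids)
    ]
    true_labels = [
        [label_list[lab] for pred, lab in zip(pred_row, lab_row) if lab != -100]
        for pred_row, lab_row in zip(predictions, p.label_ids)
    ]
    return {
        "precision": precision_score(true_labels, true_predictions),
        "recall": recall_score(true_labels, true_predictions),
        "f1": f1_score(true_labels, true_predictions),
    }
```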
When I initially tried the original repo and its training script, the model managed to learn and gave reasonable results after very few epochs. After reimplementing with HuggingFace's transformers, the model doesn't learn at all, even after a much higher number of epochs.
To reproduce
Model loading and trainer configuration:
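(The original snippet is not reproduced here; a minimal sketch of the kind of model loading and Trainer configuration described might look like the following, where the label count, hyperparameters, and datasets are hypothetical placeholders and compute_metrics is the function sketched above.)

```python
# Minimal sketch of loading LayoutLM and configuring the Trainer for token classification.
# Hyperparameters, paths, and num_labels are hypothetical; train_dataset / eval_dataset are
# assumed to be torch Datasets built as in the original microsoft/layoutlm repo (not shown).
from transformers import (
    LayoutLMForTokenClassification,
    LayoutLMTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForTokenClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased",
    num_labels=13,  # hypothetical label count
)

training_args = TrainingArguments(
    output_dir="./layoutlm-ner",     # hypothetical output path
    num_train_epochs=5,              # hypothetical
    per_device_train_batch_size=16,  # hypothetical
    learning_rate=5e-5,              # hypothetical
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,      # assumed: features with input_ids, bbox, attention_mask, labels
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,  # seqeval-based metrics as sketched earlier
)
trainer.train()
```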
Expected behavior
Similar results to the original repo, given that the same parameters are passed to the trainer and the Dataset is identical after processing the data.
Is this due to the ongoing integration of this model? Is the setup wrong?