Fine-tune DeBERTa v3 language model, worthwhile endeavour? #151
Comments
Given that you have a significantly large amount of training data, I believe this could be a really worthwhile endeavour, as the DeBERTa-v3 architecture and training procedure are excellent. A good hyperparameter search and careful continual pretraining should give great results. Do let me know how it goes.
Would I use the deberta-v3-X-continue option in rtd.sh, or pretrain a model from scratch using my dataset?
Do continual pretraining, i.e. use the deberta-v3-X-continue option. All medical-domain LMs are the result of continual pretraining.
Hi all, I am in the exact same boat here. What is the rtd.sh that is mentioned? I know it is a bash file, but where is it? It would be nice to see a Python script that shows how the domain adaptation should be run and how to save the model.
Hi, @StephennFernandes. How are you doing? Have you managed to successfully pretrain or continue pretraining a DeBERTa-v3 model in another language? Back when we were talking, my discriminator couldn't improve. Best regards, Fabio.
Hey all, I'm going down a similar path of continual pre-training. Any insight on how to make the resulting model compatible with the Hugging Face transformers library?
I attempted to continue pretraining mDeBERTa-v3 with the MLM task, in the same manner as other BERT-like models. From the results, MLM continual pretraining seems to achieve effects similar to those seen with other BERT-like models. So regardless of whether further pretraining with the RTD task could yield better outcomes, the MLM task might be a safe choice to fall back on. Give it a try!
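For anyone who wants to try the same MLM route, below is a minimal sketch of what that continued pretraining could look like with the Hugging Face Trainer. The corpus file, sequence length, and hyperparameters are placeholders for your own setup, and note that transformers will likely warn that the MLM head weights are newly initialized, since the v3 checkpoints were trained with RTD rather than MLM.

```python
# Minimal sketch: continue pretraining mDeBERTa-v3 with the MLM objective
# using Hugging Face transformers. Paths and hyperparameters are placeholders;
# adjust them for your own corpus and hardware.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "microsoft/mdeberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)  # requires sentencepiece
model = AutoModelForMaskedLM.from_pretrained(model_name)  # MLM head is newly initialized

# Assumes a plain-text file with one document per line (hypothetical path).
raw = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard 15% random masking, as in BERT-style MLM pretraining.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="mdeberta-v3-domain-mlm",
    per_device_train_batch_size=8,
    num_train_epochs=1,
    learning_rate=5e-5,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
trainer.save_model("mdeberta-v3-domain-mlm")  # reloadable later via from_pretrained
```

The saved directory can then be loaded with from_pretrained for downstream fine-tuning, which also answers the compatibility question above: staying inside transformers avoids any checkpoint conversion.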
Hey everyone, I've been using RoBERTa for the past year or so but have been looking into DeBERTa as well. My typical workflow with RoBERTa is to fine-tune the MLM on ~3 million medical reports to domain-adapt before training on downstream tasks. I've found that this greatly improves the performance of the downstream models.
With DeBERTa, I presume I can't reuse my existing MLM fine-tuning code, since DeBERTa v3 isn't trained with MLM but with RTD (replaced token detection). The pre-training scripts here seem to be for training a model from scratch, which I don't think I have enough data or compute power/time to do efficiently.
I presume that if I wanted to fine-tune the RTD language model, I would use the "deberta-v3-X-continue" option in rtd.sh? If so, do you think this would be worth my time? Or should I just fine-tune my downstream tasks on the supplied pre-trained models?
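For comparison, the "just fine-tune the supplied pre-trained models on the downstream task" option needs nothing from rtd.sh. Here is a rough sketch using the Hugging Face transformers library; the CSV files, label count, and hyperparameters are stand-ins for your own task, not anything prescribed by this repo.

```python
# Rough sketch: fine-tune a released DeBERTa-v3 checkpoint directly on a
# downstream classification task with Hugging Face transformers.
# Dataset files, number of labels, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "microsoft/deberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Hypothetical CSV files with "text" and "label" columns.
dataset = load_dataset(
    "csv",
    data_files={"train": "reports_train.csv", "validation": "reports_val.csv"},
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

encoded = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="deberta-v3-downstream",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
print(trainer.evaluate())  # held-out metrics after fine-tuning
```

If domain adaptation (either RTD continual pretraining via rtd.sh or the MLM route discussed above) pays off, the only change to this sketch is pointing model_name at the adapted checkpoint instead of the public one.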