add to_static for electra #7575

Merged 3 commits on Dec 11, 2023

MayYouBeProsperous (Contributor) commented Dec 4, 2023

PR types

Others

PR changes

Others

Description

Add dynamic-to-static (to_static) support for the electra model.
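As a rough illustration of what wiring a `--to_static` switch into a training script can look like, here is a minimal sketch. It is not the code from this PR: `str_to_bool`, `build_parser`, and `maybe_to_static` are hypothetical names chosen for the example, and the only Paddle API it relies on is `paddle.jit.to_static`.

```python
import argparse


def str_to_bool(value):
    """Parse 'True'/'False'-style strings, as in `--to_static True`."""
    normalized = value.strip().lower()
    if normalized in ("true", "1", "yes", "y"):
        return True
    if normalized in ("false", "0", "no", "n"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser exposing only the flag discussed here."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--to_static",
        type=str_to_bool,
        default=False,
        help="Convert the model with paddle.jit.to_static before training.",
    )
    return parser


def maybe_to_static(model, enabled):
    """Return the model unchanged, or wrapped by paddle.jit.to_static."""
    if not enabled:
        return model
    # Imported lazily so the argument parsing works without Paddle installed.
    import paddle

    return paddle.jit.to_static(model)
```

With this shape, the dynamic-graph path stays untouched unless the flag is passed, which is why the three runs in this description can share one entry script.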

Training environment

cd model_zoo/electra
export CUDA_VISIBLE_DEVICES="0"
export DATA_DIR=./BookCorpus/

Training results

  • Dynamic graph

    Command

    python -u ./run_pretrain.py \
        --model_type electra \
        --model_name_or_path electra-small \
        --input_dir $DATA_DIR \
        --output_dir ./pretrain_model/ \
        --train_batch_size 8 \
        --learning_rate 5e-4 \
        --max_seq_length 128 \
        --weight_decay 1e-2 \
        --adam_epsilon 1e-6 \
        --warmup_steps 10000 \
        --num_train_epochs 4 \
        --logging_steps 100 \
        --save_steps 10000 \
        --max_steps -1 \
        --device gpu
    

    Result
    image

  • Static graph, SOT mode
    Command

    export GRAPH_SIZE=0 COST_MODEL=False ENABLE_FALL_BACK=True
    python -u ./run_pretrain.py \
        --model_type electra \
        --model_name_or_path electra-small \
        --input_dir $DATA_DIR \
        --output_dir ./pretrain_model/ \
        --train_batch_size 8 \
        --learning_rate 5e-4 \
        --max_seq_length 128 \
        --weight_decay 1e-2 \
        --adam_epsilon 1e-6 \
        --warmup_steps 10000 \
        --num_train_epochs 4 \
        --logging_steps 100 \
        --save_steps 10000 \
        --max_steps -1 \
        --device gpu \
        --to_static True

    Result
    image

  • Static graph, AST mode
    Command

    export COST_MODEL=False ENABLE_FALL_BACK=False
    python -u ./run_pretrain.py \
        --model_type electra \
        --model_name_or_path electra-small \
        --input_dir $DATA_DIR \
        --output_dir ./pretrain_model/ \
        --train_batch_size 8 \
        --learning_rate 5e-4 \
        --max_seq_length 128 \
        --weight_decay 1e-2 \
        --adam_epsilon 1e-6 \
        --warmup_steps 10000 \
        --num_train_epochs 4 \
        --logging_steps 100 \
        --save_steps 10000 \
        --max_steps -1 \
        --device gpu \
        --to_static True

    Result
    image
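The three runs above differ only in their environment variables and in the `--to_static` flag. The sketch below makes that difference explicit by generating each command from one shared argument list. `MODE_ENV` and `build_run` are hypothetical names that are not part of this PR; the argument values and environment variables are copied from the commands above, and the script only prints the commands rather than launching training.

```python
import shlex

# Arguments shared by all three runs, copied from the commands above.
SHARED_ARGS = [
    "--model_type", "electra",
    "--model_name_or_path", "electra-small",
    "--output_dir", "./pretrain_model/",
    "--train_batch_size", "8",
    "--learning_rate", "5e-4",
    "--max_seq_length", "128",
    "--weight_decay", "1e-2",
    "--adam_epsilon", "1e-6",
    "--warmup_steps", "10000",
    "--num_train_epochs", "4",
    "--logging_steps", "100",
    "--save_steps", "10000",
    "--max_steps", "-1",
    "--device", "gpu",
]

# Environment variables exported before each run in this PR description.
MODE_ENV = {
    "dynamic": {},
    "sot": {"GRAPH_SIZE": "0", "COST_MODEL": "False", "ENABLE_FALL_BACK": "True"},
    "ast": {"COST_MODEL": "False", "ENABLE_FALL_BACK": "False"},
}


def build_run(mode, data_dir="./BookCorpus/"):
    """Return (extra_env, argv) for one of: 'dynamic', 'sot', 'ast'."""
    if mode not in MODE_ENV:
        raise ValueError(f"unknown mode: {mode!r}")
    argv = ["python", "-u", "./run_pretrain.py", "--input_dir", data_dir]
    argv += SHARED_ARGS
    if mode != "dynamic":
        # Both static-graph modes pass the same flag; only the env differs.
        argv += ["--to_static", "True"]
    return dict(MODE_ENV[mode]), argv


if __name__ == "__main__":
    for mode in MODE_ENV:
        env, argv = build_run(mode)
        print(f"# {mode}")
        if env:
            print("export " + " ".join(f"{k}={v}" for k, v in env.items()))
        print(shlex.join(argv))
```

Keeping the shared arguments in one place makes it easier to confirm that a difference in results between modes comes from the mode itself and not from a mistyped hyperparameter.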

paddle-bot bot commented Dec 4, 2023

Thanks for your contribution!

codecov bot commented Dec 4, 2023

Codecov Report

Attention: 154 lines in your changes are missing coverage. Please review.

Comparison is base (10ff94a) 58.23% compared to head (ded8a2d) 57.85%.
Report is 61 commits behind head on develop.

❗ Current head ded8a2d differs from pull request most recent head 434d280. Consider uploading reports for the commit 434d280 to get more accurate results

Files                                            Patch %   Lines
paddlenlp/trainer/training_args.py               3.48%     83 Missing ⚠️
paddlenlp/transformers/llama/modeling_auto.py    8.97%     71 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7575      +/-   ##
===========================================
- Coverage    58.23%   57.85%   -0.38%     
===========================================
  Files          579      582       +3     
  Lines        85819    86480     +661     
===========================================
+ Hits         49973    50032      +59     
- Misses       35846    36448     +602     


2742195759 (Contributor) left a comment

Is electra the only model that gets dynamic-to-static support here, or is this the training entry point for all models?

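MayYouBeProsperous (Contributor, Author) replied:

Quoting 2742195759: "Is electra the only model that gets dynamic-to-static support here, or is this the training entry point for all models?"

This PR covers electra only. I plan to submit one PR per model; is that OK?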

2742195759 (Contributor) replied:

That works.

2742195759 (Contributor) left a comment

LGTM

w5688414 (Contributor) left a comment

LGTM

@w5688414 w5688414 merged commit e6acb0e into PaddlePaddle:develop Dec 11, 2023
2 checks passed