
[Driver] Torchbench training performance on Rolling is slower than LTS #1164

Open

mengfei25 opened this issue Dec 12, 2024 · 0 comments
🐛 Describe the bug

Comparing torchbench training performance between the Rolling and LTS drivers shows that Rolling is slower than LTS, with an overall gap of ~20%. The table below lists the models whose Rolling-vs-LTS performance ratio is below 50%.

| Category | Name | Eager | Inductor |
|---|---|---|---|
| torchbench_amp_bf16_training | functorch_dp_cifar10 | 0.440394435 | 0.376553014 |
| torchbench_amp_bf16_training | lennard_jones | 0.410815577 | 0.406666522 |
| torchbench_amp_bf16_training | resnet18 | 0.569801695 | 0.414078304 |
| torchbench_amp_bf16_training | drq | 0.419031071 | 0.416605194 |
| torchbench_amp_bf16_training | dcgan | 0.473041413 | 0.45275431 |
| torchbench_amp_bf16_training | mobilenet_v3_large | 0.601874092 | 0.458255372 |
| torchbench_amp_bf16_training | phlippe_resnet | 0.565532503 | 0.474469922 |
| torchbench_amp_bf16_training | basic_gnn_gcn | 0.837236487 | 0.474493124 |
| torchbench_amp_bf16_training | basic_gnn_gin | 0.67058437 | 0.475682344 |
| torchbench_amp_bf16_training | speech_transformer | 0.524068457 | 0.497168054 |
| torchbench_amp_bf16_training | soft_actor_critic | 0.497953362 | 0.498692661 |
| torchbench_amp_fp16_training | drq | 0.40930045 | 0.399965199 |
| torchbench_amp_fp16_training | phlippe_resnet | 0.546009712 | 0.432983013 |
| torchbench_amp_fp16_training | nanogpt | 0.461748452 | 0.448994252 |
| torchbench_amp_fp16_training | lennard_jones | 0.463453479 | 0.455919189 |
| torchbench_amp_fp16_training | resnet18 | 0.58230154 | 0.46118426 |
| torchbench_amp_fp16_training | dcgan | 0.46315022 | 0.468631482 |
| torchbench_amp_fp16_training | mnasnet1_0 | 0.696290041 | 0.489389997 |
| torchbench_amp_fp16_training | LearningToPaint | 0.761002479 | 0.4967825 |
| torchbench_bfloat16_training | lennard_jones | 0.3521912 | 0.363774254 |
| torchbench_bfloat16_training | phlippe_resnet | 0.534909223 | 0.408322118 |
| torchbench_bfloat16_training | drq | 0.430409868 | 0.429119883 |
| torchbench_bfloat16_training | nanogpt | 0.440041578 | 0.430549223 |
| torchbench_bfloat16_training | mobilenet_v3_large | 0.598219707 | 0.450442757 |
| torchbench_bfloat16_training | soft_actor_critic | 0.46347127 | 0.452779549 |
| torchbench_bfloat16_training | phlippe_densenet | 0.569334382 | 0.476015257 |
| torchbench_bfloat16_training | resnet18 | 0.617031303 | 0.490115099 |
| torchbench_bfloat16_training | functorch_dp_cifar10 | 0.609626518 | 0.498543827 |
| torchbench_float16_training | lennard_jones | 0.37388514 | 0.35987522 |
| torchbench_float16_training | phlippe_resnet | 0.468794127 | 0.400084215 |
| torchbench_float16_training | dcgan | 0.382458895 | 0.416399596 |
| torchbench_float16_training | functorch_dp_cifar10 | 0.540425171 | 0.417395998 |
| torchbench_float16_training | mobilenet_v3_large | 0.583812604 | 0.434171476 |
| torchbench_float16_training | resnet18 | 0.570007512 | 0.441501998 |
| torchbench_float16_training | drq | 0.457905383 | 0.453155682 |
| torchbench_float16_training | squeezenet1_1 | 0.699713952 | 0.455786032 |
| torchbench_float16_training | nanogpt | 0.424120811 | 0.469937885 |
| torchbench_float16_training | soft_actor_critic | 0.473865774 | 0.476388316 |
| torchbench_float16_training | timm_efficientnet | 0.711824661 | 0.490151596 |
| torchbench_float32_training | functorch_dp_cifar10 | 0.517967004 | 0.411734615 |
| torchbench_float32_training | nanogpt | 0.439565001 | 0.41192107 |
| torchbench_float32_training | drq | 0.414325806 | 0.43835296 |
| torchbench_float32_training | phlippe_resnet | 0.54522133 | 0.45738061 |
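For reference, a minimal sketch of how the "< 50%" filtering above could be reproduced from raw ratio data. This is not the actual tooling used for this report; the row structure, the interpretation of the ratios as Rolling-vs-LTS performance ratios, and the `below_threshold` helper are all assumptions for illustration (using a small subset of the rows from the table).

```python
# Hypothetical sketch: filter benchmark rows whose Rolling-vs-LTS ratio
# falls below a threshold. Each row is assumed to be
# (category, model, eager_ratio, inductor_ratio), where a ratio < 1.0
# means Rolling is slower than LTS for that model.

ROWS = [
    # A few sample rows taken from the table above.
    ("torchbench_amp_bf16_training", "resnet18", 0.569801695, 0.414078304),
    ("torchbench_amp_bf16_training", "basic_gnn_gcn", 0.837236487, 0.474493124),
    ("torchbench_float32_training", "nanogpt", 0.439565001, 0.41192107),
]


def below_threshold(rows, threshold=0.5):
    """Return (category, model) pairs whose Inductor ratio is below threshold."""
    return [
        (category, model)
        for category, model, _eager, inductor in rows
        if inductor < threshold
    ]


if __name__ == "__main__":
    # All three sample rows have Inductor ratios under 0.5, so all are reported.
    for category, model in below_threshold(ROWS):
        print(f"{category}: {model}")
```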

Versions

Device: PVC 1100
Driver: Rolling 24.39.31294 / LTS 23.43.27642.52
PyTorch: 20241202 nightly wheel
