Triton recently changed its kernel launch API. This issue tracks adapting the inductor-side call site so it works with both the old and the new Triton APIs. We merged 88abff6 in #825, which causes benchmark failures:
$ ./inductor_xpu_test.sh huggingface float32 inference accuracy xpu 0 static 1 0 AlbertForMaskedLM
Testing model AlbertForMaskedLM
loading model: 0it [01:22, ?it/s]
xpu eval AlbertForMaskedLM
skipping cudagraphs for unknown reason
ERROR:common:function takes exactly 18 arguments (23 given)
Traceback (most recent call last):
File "/cache/pytorch-3.10-22ce6c6508d1d13b263d4c8b1fd6b98505983e92-4/benchmarks/dynamo/common.py", line 2144, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 328, in _fn
return fn(*args, **kwargs)
File "/cache/pytorch-3.10-22ce6c6508d1d13b263d4c8b1fd6b98505983e92-4/benchmarks/dynamo/common.py", line 1908, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/cache/pytorch-3.10-22ce6c6508d1d13b263d4c8b1fd6b98505983e92-4/benchmarks/dynamo/huggingface.py", line 550, in forward_pass
def forward_pass(self, mod, inputs, collect_outputs=True):
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 328, in _fn
return fn(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 17, in inner
return fn(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 3905, in forward
return compiled_fn(full_args)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1482, in g
return f(*args)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 2533, in runtime_wrapper
all_outs = call_func_with_args(
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1506, in call_func_with_args
out = normalize_as_list(f(args))
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1594, in rng_functionalization_wrapper
return compiled_fw(args)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 944, in wrapper
return optimized_function(args_new)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_inductor/codecache.py", line 378, in __call__
return self.get_current_callable()(inputs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_inductor/codecache.py", line 405, in _run_from_cache
return compiled_graph.compiled_artifact(inputs)
File "/tmp/torchinductor_runner/73/c73tbo7eozox67v4sg7jsajlqbh5f4quu4bemhxsp6piqsmow2cv.py", line 659, in call
triton_per_fused_add_embedding_native_layer_norm_0.run(arg31_1, constant0, constant7, constant8, constant1, constant2, buf3, 512, 128, grid=grid(512), stream=stream0)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py", line 513, in run
return launcher(
File "<string>", line 8, in launcher
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/triton/backends/intel/driver.py", line 386, in __call__
self.launch(*args, **kwargs)
TypeError: function takes exactly 18 arguments (23 given)
libcudart.so.12: cannot open shared object file: No such file or directory
libcudart.so.12: cannot open shared object file: No such file or directory
TorchDynamo optimized model failed to run because of following error
fail_to_run
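
The `TypeError: function takes exactly 18 arguments (23 given)` indicates the inductor-generated launcher is still passing the older, longer argument list to a driver launcher built for the new, shorter launch signature. Below is a minimal sketch of one way a single call site could tolerate both arities; `launch_compat` and the toy launchers are hypothetical names for illustration only, not the actual inductor or Triton code.

```python
import inspect

def launch_compat(launcher, required_args, optional_args):
    """Hypothetical helper: call a launcher that may predate or postdate the
    Triton launch-API change. `optional_args` holds trailing positional values
    (hooks, metadata, ...) that newer launchers may no longer accept; they are
    dropped when the launcher's arity is smaller. Illustrative sketch only."""
    try:
        n_params = len(inspect.signature(launcher).parameters)
    except (TypeError, ValueError):
        # C-extension launchers may not expose a signature; in this sketch we
        # assume the new, shorter form in that case.
        n_params = len(required_args)
    n_optional = max(0, min(len(optional_args), n_params - len(required_args)))
    return launcher(*required_args, *optional_args[:n_optional])


# Toy stand-ins for the old (extra trailing arguments) and new launchers.
def old_launcher(x, y, enter_hook, exit_hook):
    return ("old", x + y, enter_hook, exit_hook)

def new_launcher(x, y):
    return ("new", x + y)

print(launch_compat(old_launcher, (1, 2), (None, None)))  # ('old', 3, None, None)
print(launch_compat(new_launcher, (1, 2), (None, None)))  # ('new', 3)
```

In the real code base the adaptation would presumably live at the inductor call site shown in the traceback (`torch/_inductor/triton_heuristics.py`, where the generated `launcher` is invoked), keyed off the installed Triton's launch signature or version rather than the toy arity check above.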