Runtime JIT execution of the same kernel incurs high overhead (binder function) #6064

### Describe the issue

As mentioned in https://github.com/triton-lang/triton/pull/3503, calling `binder()` and packing the call arguments is very expensive:
`bound_args, specialization, options = binder(*args, **kwargs)`
Even when an already compiled kernel is called repeatedly, the JIT still calls this function on every launch to generate the key used for the cache comparison, which adds overhead to each call. Can this be simplified?
https://github.com/triton-lang/triton/blob/64a07f85ff3738438028d71c8429b5e44e83903f/python/triton/runtime/jit.py#L282
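To make the request concrete, here is a minimal sketch of the kind of fast path I have in mind. It does not use Triton's real code: `make_binder`, `CachedLauncher`, `_cheap_key`, and the toy specialization rule (type name plus divisibility by 16 for ints) are all hypothetical stand-ins. The assumption is that a cheap key computed directly from the raw arguments can capture everything the specialization depends on, so the full binder only runs the first time a given key is seen.

```python
import inspect


def make_binder(fn):
    """Toy stand-in for the per-launch binder (not Triton's implementation)."""
    sig = inspect.signature(fn)

    def binder(*args, **kwargs):
        # Slow part: full signature binding on every call.
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        # Toy specialization: type name, plus divisibility by 16 for plain ints.
        specialization = tuple(
            (name, type(v).__name__, (v % 16 == 0) if type(v) is int else None)
            for name, v in bound.arguments.items()
        )
        return bound.arguments, specialization

    return binder


class CachedLauncher:
    """Hypothetical wrapper that skips the binder when a cheap key repeats."""

    def __init__(self, fn):
        self.binder = make_binder(fn)
        self.binder_calls = 0  # counts how often the slow path runs
        self._fast = {}  # cheap key -> cached specialization

    @staticmethod
    def _feature(v):
        # Must capture everything the specialization depends on for this value.
        return (type(v), (v % 16 == 0) if type(v) is int else None)

    def _cheap_key(self, args, kwargs):
        positional = tuple(self._feature(v) for v in args)
        keyword = tuple((k, self._feature(v)) for k, v in sorted(kwargs.items()))
        return positional, keyword

    def specialization(self, *args, **kwargs):
        key = self._cheap_key(args, kwargs)
        cached = self._fast.get(key)
        if cached is None:
            # First time this key is seen: fall back to the full binder.
            self.binder_calls += 1
            _, cached = self.binder(*args, **kwargs)
            self._fast[key] = cached
        return cached


def kernel(x, n, BLOCK=128):
    """Placeholder kernel signature used only for the demonstration."""


launcher = CachedLauncher(kernel)
launcher.specialization(1.0, 32)
launcher.specialization(2.0, 64)  # same cheap key, so the binder is skipped
```

The trade-off in this sketch is that the cheap key has to stay in sync with the real specialization rules; if it misses a property the binder looks at, a stale entry would be reused.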



### Environment details

Hardware-independent
The latest version of Triton
