Closed
Labels
good first issue
Description
🐛 Describe the bug
I hit this issue while debugging the gemma-3 error in optimum-executorch (CI link).
On inspecting the op_overload named in the error, the function object turns out to be sym_max, which is introduced by these lines
if self.is_sliding[layer_idx]:
    query_length = cache_position.shape[0]
    ...
    local_mask_kv_length = max(query_length, self.sliding_window)
    return local_mask_kv_length, local_mask_kv_offset
in the HybridCache class in upstream transformers. This is expected, since the cache_position dim is now set to be dynamic in huggingface/optimum-executorch#73.
Export itself works and the exported graph contains sym_max (test_hybrid_cache_exportability passes in transformers), but further lowering the graph to ExecuTorch fails in the to_executorch call.
After chatting with @kimishpatel, it seems we are missing an implementation of sym_max as an op.
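To illustrate why a dynamic cache_position dim turns the plain max(...) into an op that the runtime has to implement, here is a toy sketch in pure Python. It does not use any PyTorch or ExecuTorch API: Sym, SymMax, toy_sym_max and evaluate are hypothetical names made up for this illustration, and the sliding window of 4 is an arbitrary value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """Toy stand-in for a symbolic (dynamic) dimension. NOT torch.SymInt."""
    name: str


@dataclass(frozen=True)
class SymMax:
    """Toy graph node that records a deferred max(lhs, rhs)."""
    lhs: object
    rhs: object


def toy_sym_max(a, b):
    # With two concrete ints the max can be folded to a constant at trace time.
    if isinstance(a, int) and isinstance(b, int):
        return max(a, b)
    # Otherwise the result depends on a runtime shape, so a node is recorded.
    return SymMax(a, b)


def evaluate(expr, env):
    """Toy 'runtime': resolves symbols from env and executes SymMax nodes."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, Sym):
        return env[expr.name]
    if isinstance(expr, SymMax):
        return max(evaluate(expr.lhs, env), evaluate(expr.rhs, env))
    raise TypeError(f"no implementation for node type {type(expr).__name__}")


SLIDING_WINDOW = 4  # arbitrary value for the illustration

# Static query length: the max disappears into a constant.
static_len = toy_sym_max(1, SLIDING_WINDOW)

# Dynamic query length: the max survives as a node in the "graph".
dynamic_len = toy_sym_max(Sym("query_length"), SLIDING_WINDOW)

prefill = evaluate(dynamic_len, {"query_length": 8})  # longer than the window
decode = evaluate(dynamic_len, {"query_length": 1})   # shorter than the window
```

In this toy model, removing the SymMax branch from evaluate makes the dynamic case fail with "no implementation for node type", which is roughly the situation described above: the node is valid in the exported graph, but the lowering step has nothing to map it to.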
Versions
trunk