It looks like your codebase has its own multi-head attention implementation (the MHA module), its own KV-cache implementation, and a generation function that differs from Hugging Face's `generate`.
However, when you load HF models you rely on the HF implementations. Could this introduce discrepancies in the benchmarks?
Is it possible to build a Transformer model using only your codebase, i.e. relying on the local KV-cache and MHA implementations? I sketch below roughly what I have in mind.
`mamba/benchmarks/benchmark_generation_mamba_simple.py`, line 41 at commit `442fab4`
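For concreteness, here is an untested sketch of what I have in mind, assuming `MambaConfig` still exposes the `attn_layer_idx` / `attn_cfg` fields used for the hybrid models and that `MambaLMHeadModel` picks up the repo's own generation loop rather than HF's (the hyperparameters below are arbitrary):

```python
# Untested sketch: an attention-only model built from the repo's local MHA,
# KV cache, and generation code. Assumes MambaConfig exposes
# attn_layer_idx / attn_cfg and that MambaLMHeadModel uses the repo's own
# GenerationMixin rather than HF's generate().
import torch

from mamba_ssm.models.config_mamba import MambaConfig
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

n_layer = 24
config = MambaConfig(
    d_model=1024,
    n_layer=n_layer,
    vocab_size=50277,
    # Put the local MHA module in every layer so no SSM blocks are used.
    attn_layer_idx=list(range(n_layer)),
    attn_cfg={"num_heads": 16, "causal": True},
)
model = MambaLMHeadModel(config, device="cuda", dtype=torch.bfloat16)

input_ids = torch.randint(0, config.vocab_size, (1, 64), device="cuda")
# This should go through the repo's own decoding loop and its
# InferenceParams-based KV cache, not HF's generation utilities.
out = model.generate(input_ids, max_length=128, temperature=1.0, top_k=1)
```

If something along these lines is supported, it would allow benchmarking a pure-attention baseline against the local KV cache and generation loop instead of HF's.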