Issues: mit-han-lab/llm-awq
- [BUG] GPU memory used is much more in v0.2.7 than v0.2.5 while quantizing models. (#247, opened Dec 18, 2024 by GodHforever)
- AWQ quantization doesn't work in many opensource LLM in terms of inference efficiency (#243, opened Dec 10, 2024 by loulianzhang)
- Inquiry about GPU memory usage of VILA 1.5-3b AWQ model for 12 frames video. (#240, opened Nov 18, 2024 by gj-raza)
- RuntimeError: CUDA error: no kernel image is available for execution on the device (#238, opened Nov 15, 2024 by new-Sunset-shimmer)
- Could you explain me how can I change the percentage of kept salient weights in FP16? (#237, opened Nov 15, 2024 by akylbekmaxutov)
- Cannot clone from Efficient-Large-Model/VILA.git, Dependency Issues with alternative (#236, opened Nov 14, 2024 by rossgreer)
- [QST] Why does awq write its own int3/int4 GEMM kernels instead of using CUTLASS (#235, opened Nov 11, 2024 by SimpleTheoryOfTypes)
- Unable to run Gradio demo: VILA with TinyChat on a local GPU server (#234, opened Nov 4, 2024 by mitraavi)
- How to convert the AWQ model after the quantization into safetensors (#232, opened Oct 31, 2024 by vladimiralbrekhtccr)
- Regarding the issues encountered with w_bit 3 quantification (#231, opened Oct 30, 2024 by langxinspieder)
- AttributeError: 'LlamaConfig' object has no attribute 'rope_theta' (#222, opened Sep 30, 2024 by lvtao65535)
- Unsupported NVHPC compiler found. nvc++ is the only NVHPC compiler (#220, opened Sep 17, 2024 by SimWangArizona)
- "Expected all tensors to be on the same device" when running "Perform AWQ search" on Llama3 (#219, opened Sep 10, 2024 by charlesyju)
- Batch Processing not implemented for LlavaStreamGenerator (#216, opened Aug 12, 2024 by rahulthakur319)
- NotImplementedError: <class 'transformers_modules.modeling_chatglm.ChatGLMForConditionalGeneration'> (#214, opened Aug 8, 2024 by lihaofd)