GPTQModel v1.4.2
What's Changed
⚡ macOS GPU (MPS) + CPU inference and quantization support (usage sketch below)
⚡ Added Cohere 2 model support (quantization sketch below)
- Build Changes by @Qubitium in #855
- Fix MacOS support by @Qubitium in #861
- check device_map on from_quantized() by @ZX-ModelCloud in #865
- call patch for TestTransformersIntegration by @CSY-ModelCloud in #867
- Add MacOS gpu acceleration via MPS by @Qubitium in #864
- [MODEL] add cohere2 support by @CL-ModelCloud in #869
- check device_map by @ZX-ModelCloud in #872
- set PYTORCH_ENABLE_MPS_FALLBACK for macos by @CSY-ModelCloud in #873
- check device_map int value by @ZX-ModelCloud in #876
- Simplify by @Qubitium in #877
- [FIX] device_map={"":None} by @ZX-ModelCloud in #878
- set torch_dtype to float16 for XPU by @CSY-ModelCloud in #875
- remove IPEX device check by @ZX-ModelCloud in #879
- [FIX] call normalize_device() by @ZX-ModelCloud in #881
- [FIX] get_best_device() wrong usage by @ZX-ModelCloud in #882
Full Changelog: v1.4.1...v1.4.2