Closed
Description
I recently came across this repo, which does fp8 inference: https://github.com/aredden/flux-fp8-api/blob/main/float8_quantize.py
It's getting popular enough that we should consider making fp8 a setting for autoquant.
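
For context, here is a rough sketch of the kind of per-tensor scaled fp8 weight quantization this is about. It is not the linked repo's code and not an existing autoquant API: the helper names (`round_to_e4m3`, `quantize_per_tensor`, `dequantize`) are made up for illustration, and the e4m3 rounding is simulated in plain Python rather than using a real fp8 dtype.

```python
import math

E4M3_MAX = 448.0  # largest finite value of the float8 e4m3fn format


def round_to_e4m3(x):
    """Round x to the nearest e4m3-representable value, saturating at +/-448.

    Hypothetical helper: simulates fp8 rounding in plain Python (NaN not handled).
    """
    if x == 0.0:
        return 0.0
    x = max(-E4M3_MAX, min(E4M3_MAX, x))
    _, e = math.frexp(x)   # |x| = m * 2**e with 0.5 <= m < 1
    e = max(e, -5)         # below 2**-6 the format is subnormal (fixed spacing)
    step = 2.0 ** (e - 4)  # 1 implicit + 3 stored mantissa bits
    return round(x / step) * step  # round() breaks ties to even


def quantize_per_tensor(weights):
    """Scale so the largest magnitude maps to E4M3_MAX, then round to fp8."""
    amax = max(abs(w) for w in weights)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    return [round_to_e4m3(w / scale) for w in weights], scale


def dequantize(q, scale):
    """Map fp8 values back to the original range."""
    return [v * scale for v in q]


if __name__ == "__main__":
    w = [0.5, -1.0, 0.25, 2.0, 0.3]
    q, scale = quantize_per_tensor(w)
    print(q, scale)
    print(dequantize(q, scale))
```

The scale is a single float per tensor here; whether an autoquant setting would use per-tensor, per-row, or dynamic activation scaling is an open design question and not something this sketch decides.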