Float8 was moved to torchao in #551, and currently the CI that we have for float8 is running on:
a. CPU nightly (skips all CUDA-related tests)
b. CUDA nightly (skips all CUDA-related tests which require torch._scaled_mm, because the default machines used for this do not have a high enough CUDA capability version)
We should enable float8 CI on sm89 machines, which have CUDA capability 8.9. The performance will not be representative, but we can at least test correctness.
Pointers: the is_H100 check in ao/test/float8/test_base.py (line 57 at 00b76c4); we should update that check everywhere to test for capability 8.9, something like below:
# old
is_H100 = torch.cuda.is_available() and torch.cuda.get_device_capability() >= (9, 0)
# new
is_cuda_8_9 = torch.cuda.is_available() and torch.cuda.get_device_capability() >= (8, 9)
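For reference, a minimal sketch of how this gate could be consumed in a test file (the test class and method names here are illustrative placeholders, not the actual torchao tests, which may wire the skip differently):

import unittest

import torch

# sm89 (CUDA capability 8.9, e.g. L4 / RTX 4090) is the minimum for float8 support
# in torch._scaled_mm; H100 reports (9, 0), so a >= (8, 9) check covers both.
is_cuda_8_9 = torch.cuda.is_available() and torch.cuda.get_device_capability() >= (8, 9)

class TestFloat8CapabilityGate(unittest.TestCase):
    # Skip (rather than fail) on machines without an sm89+ GPU, e.g. the current
    # CUDA nightly runners, so CPU and older-GPU CI stays green.
    @unittest.skipIf(not is_cuda_8_9, "requires a GPU with CUDA capability >= 8.9")
    def test_runs_only_on_sm89_or_newer(self):
        self.assertGreaterEqual(torch.cuda.get_device_capability(), (8, 9))

if __name__ == "__main__":
    unittest.main()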
@seemethere we could also alternatively move our CI jobs to use L4 instances, which are cheaper than A10G and also support fp8. Last I tried to move our CI to use L4, I got timeouts while looking for runners, so I suspect the L4 pool isn't big enough, but this feels like a free efficiency win. wdyt?