fix missing sync
ngc92 committed Feb 1, 2025
1 parent e65cfac commit 471411d
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions torchao/prototype/low_bit_optim/cpu_offload.py
@@ -107,6 +107,8 @@ def step(self, closure=None):
             with getattr(torch, self.device).stream(self.stream):
                 p_device.copy_(p_host, non_blocking=True)

+        # make sure param H2D finishes before the next forward pass
+        self.stream.synchronize()
         self.queue.clear()
         return loss
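
The reason for the added `synchronize()` can be sketched outside the optimizer. A `copy_(..., non_blocking=True)` issued on a side stream is only enqueued; unless the caller waits on that stream, later work on the default stream may read the destination tensor before the host-to-device copy has landed. The snippet below is a minimal sketch under stated assumptions, not the torchao implementation: `copy_params_to_device` and the tensor lists are hypothetical names, the `wait_stream` call is added here only to keep the standalone example race-free, and it falls back to a plain synchronous copy when no CUDA device is present.

```python
import torch


def copy_params_to_device(host_params, device_params, stream=None):
    """Copy each host tensor into its device twin, then wait for the copies.

    Hypothetical helper for illustration; not part of torchao.
    """
    if stream is None:
        # CPU-only fallback: copy_ is synchronous, so there is nothing to wait for.
        for p_host, p_device in zip(host_params, device_params):
            p_device.copy_(p_host)
        return

    # Order the side stream after work already queued on the current stream
    # (assumption of this sketch, so the destination tensors are initialized).
    stream.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(stream):
        for p_host, p_device in zip(host_params, device_params):
            # Asynchronous with respect to the host only if p_host is pinned.
            p_device.copy_(p_host, non_blocking=True)

    # Without this wait, a forward pass launched on the default stream could
    # read p_device before the H2D copy has finished.
    stream.synchronize()


use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"

host = [torch.randn(4, 4) for _ in range(3)]
if use_cuda:
    host = [p.pin_memory() for p in host]
dev = [torch.zeros(4, 4, device=device) for _ in range(3)]

copy_params_to_device(host, dev, torch.cuda.Stream() if use_cuda else None)

# After the helper returns, every device tensor holds its host value.
copied_ok = all(torch.equal(d.cpu(), h) for h, d in zip(host, dev))
```

Blocking the host with `stream.synchronize()` is the simplest fix; having the default stream wait on the side stream instead would avoid stalling the host, at the cost of more careful stream bookkeeping.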
