Update Float8Tensor for GRPO training in unsloth #3158
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3158
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit 092ca75 with merge base f3fc5e7. The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed ed3c237 to 19500bf (Compare)
Force-pushed 3d4cb8d to 7e0749d (Compare)
Force-pushed 7e0749d to c0f4b4e (Compare)
Force-pushed 345bb63 to 9d27057 (Compare)
output_tensor = torch.matmul(input_tensor, weight_tensor.t())
output_tensor_fp8 = torch.matmul(input_tensor_fp8, weight_tensor_fp8.t())
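The fp8 output is presumably compared against the high-precision reference with an error metric such as SQNR, as is common in torchao tests. The helper below, `sqnr_db`, is a hypothetical pure-Python sketch of that metric, not the actual torchao utility:

```python
import math

def sqnr_db(ref, actual):
    """Signal-to-quantization-noise ratio in dB between a reference
    output and its quantized counterpart; higher means closer."""
    signal = sum(r * r for r in ref)
    noise = sum((r - a) ** 2 for r, a in zip(ref, actual))
    if noise == 0:
        return float("inf")
    return 10 * math.log10(signal / noise)

# A quantized result close to the reference yields a high SQNR,
# which a test can assert against a threshold (e.g. > 20 dB).
ref = [1.0, -2.0, 3.0, 0.5]
fp8_like = [1.01, -1.98, 3.02, 0.49]
assert sqnr_db(ref, fp8_like) > 20
```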
this is not used through the quantize_ API?
if this can be accessed through quantize_ API then we can merge the test with test_linear_variants I think
I don't think this can be accessed through the quantize_ API, unfortunately; nn.Linear will dispatch to F.linear first
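The dispatch issue above can be illustrated with a toy op table (pure Python, all names hypothetical; the real code registers implementations against aten ops on the tensor subclass): nn.Linear forwards through F.linear, so only the linear override is reachable from a quantize_-converted module, and the matmul override has to be exercised by calling torch.matmul directly, hence the standalone test.

```python
# Toy dispatch table mimicking how a tensor subclass overrides ops.
OP_TABLE = {}

def register(op_name):
    def decorator(fn):
        OP_TABLE[op_name] = fn
        return fn
    return decorator

@register("aten.linear")
def _linear_impl(x, w):
    return ("aten.linear", x, w)

@register("aten.matmul")
def _matmul_impl(x, w):
    return ("aten.matmul", x, w)

def nn_linear_forward(x, w):
    # nn.Linear.forward calls F.linear, which lowers to the linear op;
    # the matmul override is never hit on this path.
    return OP_TABLE["aten.linear"](x, w)

def direct_matmul(x, w):
    # Only a direct torch.matmul call reaches the matmul override,
    # so this test cannot be merged into test_linear_variants.
    return OP_TABLE["aten.matmul"](x, w)

assert nn_linear_forward("x", "w")[0] == "aten.linear"
assert direct_matmul("x", "w")[0] == "aten.matmul"
```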
output_tensor, input_tensor, weight_tensor = (
    args[0],
    args[1],
    args[2] if len(args) > 2 else None,
)
why is weight tensor optional?
should we add asserts for the last 2 kwargs as well? https://github.com/pytorch/pytorch/blob/82ff07c7884d478ddd5d638bebbb938e55c9bebf/aten/src/ATen/native/native_functions.yaml#L7214
also I thought one of mat1 and mat2 should be bias_tensor?
yeah the first tensor is the bias, also added the asserts
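Per aten.addmm's schema, `addmm(self, mat1, mat2, *, beta=1, alpha=1)`, the first positional argument is the bias and the two scaling kwargs default to 1. A pure-Python sketch of the agreed unpacking plus asserts (the helper name `unpack_addmm_args` is hypothetical):

```python
def unpack_addmm_args(args, kwargs):
    """Unpack aten.addmm arguments: addmm(self, mat1, mat2, *, beta=1, alpha=1).
    The first positional arg is the bias ("self"), not the output tensor."""
    assert len(args) == 3, f"expected (bias, mat1, mat2), got {len(args)} args"
    bias_tensor, input_tensor, weight_tensor = args
    # Assuming the fp8 path does not apply alpha/beta scaling, insist on defaults.
    assert kwargs.get("beta", 1) == 1, "non-default beta not supported"
    assert kwargs.get("alpha", 1) == 1, "non-default alpha not supported"
    return bias_tensor, input_tensor, weight_tensor

bias, x, w = unpack_addmm_args(("b", "x", "w"), {})
assert (bias, x, w) == ("b", "x", "w")
```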
        
          
torchao/quantization/quantize_/workflows/float8/float8_tensor.py: 4 outdated review threads (resolved)
if is_transposed:
    return _float8_linear_impl(input_tensor, weight_tensor.t())
else:
    return torch.matmul(input_tensor, weight_tensor.dequantize())
going from mm to matmul also redirects to a higher-level op; better to call torch.mm here
_float8_mm_impl seems confusing; IMO this should be refactored to cleanly override individual torch or aten ops and ensure that the logic of when to do weight-only vs dynamic quant is consistent everywhere
please clean up _float8_mm_impl
Force-pushed 060b217 to 092ca75 (Compare)
    
Summary: Support a few extra ops called during the GRPO loop in unsloth/vllm for Float8Tensor.
Test Plan: