Conversation

Contributor

@mxyng mxyng commented Oct 16, 2025

No description provided.

@mxyng mxyng changed the base branch from main to mxyng/convert October 16, 2025 19:27
@mxyng mxyng force-pushed the mxyng/qwen3vl branch 2 times, most recently from eddb168 to cc6ed87 October 16, 2025 20:24
@jmorganca jmorganca requested review from dhiltgen and pdevine October 16, 2025 21:11
@dhiltgen
Collaborator

Can you add it to the list of image models we test here?

@mxyng mxyng force-pushed the mxyng/convert branch 2 times, most recently from 2bc23ea to 5c51e3e October 16, 2025 22:59
@mxyng mxyng force-pushed the mxyng/convert branch 2 times, most recently from e59fb68 to 13b5d3a October 16, 2025 23:58
}

func (m *VisionModel) positions(ctx ml.Context, grid *Grid) (_, _ ml.Tensor) {
indices := ctx.Input().FromIntSlice(slices.Collect(func(yield func(int32) bool) {
Contributor

Why not just return the slice here? Aren't you yielding and then immediately collecting?

Contributor Author

This is cleaner. The alternative is to build up a slice by appending instead of simply generating one.
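
For illustration, here is a standalone sketch of the two shapes being debated. The helper names are hypothetical and the code is not from this PR; it only contrasts the iterator-plus-Collect form with the reviewer's suggested append form. Both return the same []int32.

package main

import (
	"fmt"
	"slices"
)

// Iterator form (as in the PR): values are generated by an iterator and
// collected into a slice in a single expression.
func positionsIter(h, w int) []int32 {
	return slices.Collect(func(yield func(int32) bool) {
		for y := range h {
			for x := range w {
				if !yield(int32(y*w + x)) {
					return
				}
			}
		}
	})
}

// Append form (the reviewer's suggestion): allocate once and append directly.
func positionsAppend(h, w int) []int32 {
	out := make([]int32, 0, h*w)
	for y := range h {
		for x := range w {
			out = append(out, int32(y*w+x))
		}
	}
	return out
}

func main() {
	fmt.Println(positionsIter(2, 3))   // [0 1 2 3 4 5]
	fmt.Println(positionsAppend(2, 3)) // [0 1 2 3 4 5]
}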


halfDim := m.headDim() / 2
maxGrid := max(grid.Height, grid.Width)
frequencies := ctx.Input().FromFloatSlice(slices.Collect(func(yield func(float32) bool) {
Contributor

Same comment here. Why not just return the slice instead of the complexity of the iterator?

spatialMergeSize: int(c.Uint("vision.spatial_merge_size", 2)),
temporalPatchSize: int(c.Uint("vision.temporal_patch_size", 2)),
gridPerSide: int(math.Sqrt(float64(c.Uint("vision.num_positional_embeddings", 2304)))),
mropeSections: slices.Collect(func(yield func(int) bool) {
Contributor

Same comment as above.

@mxyng mxyng force-pushed the mxyng/convert branch 4 times, most recently from 2bd4b6e to 2b26dc7 October 20, 2025 21:19
@mxyng mxyng changed the base branch from mxyng/convert to mxyng/server-tests October 20, 2025 21:21
@mxyng mxyng force-pushed the mxyng/server-tests branch from b5535ec to 05bc209 October 20, 2025 23:42
@mxyng mxyng force-pushed the mxyng/qwen3vl branch 5 times, most recently from b51c8bb to c9b37ec October 22, 2025 18:47
@mxyng mxyng changed the base branch from mxyng/server-tests to main October 27, 2025 22:35
@mxyng mxyng force-pushed the mxyng/qwen3vl branch 3 times, most recently from 7d5d232 to 9d60e9b October 28, 2025 03:27
Comment on lines +116 to +124
func makeSlice2D[T int32 | float32](n0, n1 int) iter.Seq[[]T] {
return func(yield func([]T) bool) {
for range n0 {
if !yield(make([]T, n1)) {
return
}
}
}
}
Member

Are iterators required here? (For readability.)
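
For comparison, a plain-loop version of the same helper (a sketch of the non-iterator alternative, not code from this PR; the name makeSlice2DPlain is assumed) would allocate the [][]T eagerly, so callers would use its result directly instead of wrapping it in slices.Collect:

// makeSlice2DPlain allocates all n0 inner slices up front instead of
// yielding them one at a time through an iterator.
func makeSlice2DPlain[T int32 | float32](n0, n1 int) [][]T {
	out := make([][]T, n0)
	for i := range out {
		out[i] = make([]T, n1)
	}
	return out
}

The difference is mostly stylistic: the iterator defers allocation until collection, while the plain loop is arguably easier to scan.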

Member

@jmorganca jmorganca left a comment

LGTM, @pdevine should take another look (and has some outstanding comments)

var discard int32
for discard < max(targetFree-currentFree, 0) {
if sameBatch := inputs[numKeep+discard].SameBatch; sameBatch > 0 {
discard += int32(sameBatch)
Contributor

SameBatch is the number of tokens following the current one that need to be in the same batch, so I believe this should be discard += 1 + int32(sameBatch). You actually should not need to special-case it - if SameBatch is 0 then it is the same as the current non-SameBatch case.

The behavior of this loop is a little bit different from how we do truncation in NewSequence, which is SameBatch-aware. That one will keep extending discard if there are overlapping SameBatch ranges. That scenario is sort of undefined behavior, but it's better to be consistent about it.

Contributor Author

SameBatch is the number of tokens following the current one

That's not how it's being used right now. Models are setting SameBatch to a count that includes the token that sets SameBatch.

Contributor

It looks like some models probably use SameBatch that way but others use the original definition. Regardless, the runner has always executed batches as described above, so that's how these models are being run. It doesn't help things for shifting to have a different interpretation from the rest of the runner.
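
For concreteness, a sketch of the shape being suggested, assuming the same surrounding variables as the quoted loop (this is not the code that was merged):

// Advance past the current token plus however many following tokens are
// pinned to its batch; when SameBatch is 0 this reduces to discard++.
var discard int32
for discard < max(targetFree-currentFree, 0) {
	discard += 1 + int32(inputs[numKeep+discard].SameBatch)
}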

// PostTokenize arranges Qwen 3 VL's inputs for the forward pass
func (m *Model) PostTokenize(inputs []*input.Input) ([]*input.Input, error) {
m.positionCache = m.positionCache[:0]
return slices.Collect(func(yield func(*input.Input) bool) {
Contributor

I don't think the Collect() / yield iterator pattern is adding anything here.

}

func (m *VisionPositionEmbedding) Forward(ctx ml.Context, hiddenStates ml.Tensor, grid *Grid, opts VisionOptions) ml.Tensor {
indexSlice := slices.Collect(makeSlice2D[int32](4, grid.Height*grid.Width))
Contributor

This would be easier to read as:

indexSlice := make([][]int32, 4)
weightSlice := make([][]float32, 4)

and then just appending the ints/float32s inside of the nested loop below.
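
As a sketch of that suggestion (not code from this PR; the loop bounds and appended values are placeholders standing in for the computation done in the PR's nested loop):

indexSlice := make([][]int32, 4)
weightSlice := make([][]float32, 4)
for y := 0; y < grid.Height; y++ {
	for x := 0; x < grid.Width; x++ {
		for i := range indexSlice {
			// Placeholder zeros; the actual index and weight values are
			// computed in the PR's loop body.
			indexSlice[i] = append(indexSlice[i], 0)
			weightSlice[i] = append(weightSlice[i], 0)
		}
	}
}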

@mxyng mxyng force-pushed the mxyng/qwen3vl branch 6 times, most recently from 6be5624 to 77ea19a October 28, 2025 20:42
Contributor

@jessegross jessegross left a comment

The runner/cache/GGML changes look good to me. I didn't review anything specific to the model itself.

@mxyng mxyng force-pushed the mxyng/qwen3vl branch 2 times, most recently from d655b0c to 26a8bb7 October 28, 2025 22:47
Contributor

@pdevine pdevine left a comment

Ship it!

@mxyng mxyng merged commit 7d25b9e into main Oct 29, 2025
9 checks passed
@mxyng mxyng deleted the mxyng/qwen3vl branch October 29, 2025 00:39