feat(model): add qwen3vl #12665
eddb168 to cc6ed87
Can you add it to the list of image models we test here?
2bc23ea to 5c51e3e
e59fb68 to 13b5d3a
model/models/qwen3vl/model_vision.go
Outdated
}

func (m *VisionModel) positions(ctx ml.Context, grid *Grid) (_, _ ml.Tensor) {
    indices := ctx.Input().FromIntSlice(slices.Collect(func(yield func(int32) bool) {
Why not just return the slice here? Aren't you yielding and then immediately collecting?
This is cleaner. The alternative is to build up a slice by appending instead of simply generating one.
model/models/qwen3vl/model_vision.go
Outdated
    halfDim := m.headDim() / 2
    maxGrid := max(grid.Height, grid.Width)
    frequencies := ctx.Input().FromFloatSlice(slices.Collect(func(yield func(float32) bool) {
Same comment here. Why not just return the slice instead of adding the complexity of the iterator?
    spatialMergeSize: int(c.Uint("vision.spatial_merge_size", 2)),
    temporalPatchSize: int(c.Uint("vision.temporal_patch_size", 2)),
    gridPerSide: int(math.Sqrt(float64(c.Uint("vision.num_positional_embeddings", 2304)))),
    mropeSections: slices.Collect(func(yield func(int) bool) {
Same comment as above.
2bd4b6e to 2b26dc7
b5535ec to 05bc209
b51c8bb to c9b37ec
7d5d232 to 9d60e9b
func makeSlice2D[T int32 | float32](n0, n1 int) iter.Seq[[]T] {
    return func(yield func([]T) bool) {
        for range n0 {
            if !yield(make([]T, n1)) {
                return
            }
        }
    }
}
are iterators required here? (For readability)
jmorganca left a comment
LGTM, @pdevine should take another look (and has some outstanding comments)
runner/ollamarunner/cache.go
Outdated
    var discard int32
    for discard < max(targetFree-currentFree, 0) {
        if sameBatch := inputs[numKeep+discard].SameBatch; sameBatch > 0 {
            discard += int32(sameBatch)
SameBatch is the number of tokens following the current one that need to be in the same batch, so I believe that this should be discard += 1 + int32(sameBatch). You actually should not need to special-case it: if SameBatch is 0 then it is the same as the current non-SameBatch case.
The behavior of this loop is a little bit different from how we do truncation in NewSequence, which is SameBatch aware. That one will keep extending discard if there are overlapping SameBatch ranges. That scenario is sort of undefined behavior, but it's better to be consistent about it.
"SameBatch is the number of tokens following the current one"
That's not how it's being used right now. Models are setting SameBatch to include the token that sets SameBatch.
It looks like some models probably use SameBatch that way but others use the original definition. Regardless, the runner has always executed batches as described above, so that's how these models are being run. It doesn't help things for shifting to have a different interpretation from the rest of the runner.
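As a rough illustration of the interpretation argued for in this thread, the sketch below advances the discard count by 1 + SameBatch, so a token and the tokens that must follow it in the same batch are always dropped together. The `Input` type, the `discardCount` function, and the bounds clamp are hypothetical simplifications and not the runner's actual code; overlapping SameBatch ranges are not handled here.

```go
package main

import "fmt"

// Input is a hypothetical stand-in for the runner's input type; only
// SameBatch matters for this sketch.
type Input struct {
	SameBatch int
}

// discardCount returns how many inputs after numKeep to drop so that at
// least `needed` slots are freed without splitting a SameBatch group.
// Assumption: SameBatch counts the tokens FOLLOWING the current one.
func discardCount(inputs []Input, numKeep, needed int) int {
	discard := 0
	for discard < needed && numKeep+discard < len(inputs) {
		// SameBatch == 0 reduces to the ordinary one-token step,
		// so no special case is needed.
		discard += 1 + inputs[numKeep+discard].SameBatch
	}
	// Clamp in case a group claims more tokens than remain (added for safety).
	return min(discard, len(inputs)-numKeep)
}

func main() {
	inputs := []Input{
		{SameBatch: 0}, {SameBatch: 0},
		{SameBatch: 2}, {SameBatch: 0}, {SameBatch: 0},
		{SameBatch: 0},
	}
	// Asking for 2 free slots after keeping 1 input lands inside the group
	// that starts at index 2, so the whole group is discarded as well.
	fmt.Println(discardCount(inputs, 1, 2))
}
```

Under the other interpretation mentioned above, where SameBatch includes the token that sets it, the step would be `max(1, SameBatch)` instead; the point of the review comment is that shifting should use whichever definition the rest of the runner uses.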
// PostTokenize arranges Qwen 3 VL's inputs for the forward pass
func (m *Model) PostTokenize(inputs []*input.Input) ([]*input.Input, error) {
    m.positionCache = m.positionCache[:0]
    return slices.Collect(func(yield func(*input.Input) bool) {
I don't think the Collect() / yield iterator pattern is adding anything here.
}

func (m *VisionPositionEmbedding) Forward(ctx ml.Context, hiddenStates ml.Tensor, grid *Grid, opts VisionOptions) ml.Tensor {
    indexSlice := slices.Collect(makeSlice2D[int32](4, grid.Height*grid.Width))
This would be easier to read as:
indexSlice := make([][]int32, 4)
weightSlice := make([][]float32, 4)
and then just appending the ints/float32s inside of the nested loop below.
6be5624 to 77ea19a
jessegross left a comment
The runner/cache/GGML changes look good to me. I didn't review anything specific to the model itself.
d655b0c to 26a8bb7
pdevine left a comment
Ship it!
No description provided.