Model: Granite docling + Idefics3 preprocessing (SmolVLM) #16206
Merged
17 commits
64e10f5 feat: Add granite-docling conversion using trillion pretokenizer (gabe-l-hart)
428db16 feat: Add granite-docling vocab pre enum (gabe-l-hart)
c2202d2 fix: Use granite-docling pre (gabe-l-hart)
4ef3128 feat: Add clip_is_idefics3 (gabe-l-hart)
0aef5e9 feat: Allow multi-token boundary sequences for image templating (gabe-l-hart)
8819c96 feat: Add tiling support for idefices3 in clip.cpp (gabe-l-hart)
e172313 feat: Partial support for full templating for idefics3 in mtmd (gabe-l-hart)
64cef62 feat: Fully working image preprocessing for idefics3 w/ resize and sl… (gabe-l-hart)
e1ba793 feat: Parse the preprocessor config's longest side and add it to the … (gabe-l-hart)
f5a7f4d fix: Use the longest side instead of size * scale_factor (gabe-l-hart)
cb51d4e fix: Allow batch encoding and remove clip_is_idefics3 (gabe-l-hart)
08f3055 Merge remote-tracking branch 'origin/master' into GraniteDocling (gabe-l-hart)
64fc676 Merge remote-tracking branch 'origin/master' into GraniteDocling (gabe-l-hart)
899b48a refactor: Remove unnecessary conditionals for empty token vectors (gabe-l-hart)
a966110 refactor: Use image_manipulation util (gabe-l-hart)
4be2ce9 add test model (ngxson)
72c6e67 Merge branch 'master' into GraniteDocling (ngxson)
@@ -31,6 +31,7 @@

 // vision-specific
 #define KEY_IMAGE_SIZE "clip.vision.image_size"
+#define KEY_PREPROC_IMAGE_SIZE "clip.vision.preproc_image_size"

I wasn't totally sure the right name for this one since it comes from

 #define KEY_PATCH_SIZE "clip.vision.patch_size"
 #define KEY_IMAGE_MEAN "clip.vision.image_mean"
 #define KEY_IMAGE_STD "clip.vision.image_std"
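As a rough illustration of how a loader might consume the new key, here is a minimal sketch. It is not the code from this PR. The `metadata_map` type and the `get_u32` and `resolve_preproc_image_size` helpers are hypothetical stand-ins for real GGUF metadata access, and the fallback to `clip.vision.image_size` when the new key is absent is an assumption made for the example, not behavior confirmed by the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Key names copied from the hunk above.
#define KEY_IMAGE_SIZE         "clip.vision.image_size"
#define KEY_PREPROC_IMAGE_SIZE "clip.vision.preproc_image_size"

// Hypothetical stand-in for GGUF metadata: a flat map from key name to a u32 value.
using metadata_map = std::unordered_map<std::string, uint32_t>;

// Look up a u32 value by key; returns std::nullopt when the key is absent.
static std::optional<uint32_t> get_u32(const metadata_map & meta, const std::string & key) {
    const auto it = meta.find(key);
    if (it == meta.end()) {
        return std::nullopt;
    }
    return it->second;
}

// Assumed policy (not taken from the PR): prefer the preprocessing size when the
// model file provides it, otherwise fall back to the regular image size, and
// return 0 when neither key is present.
static uint32_t resolve_preproc_image_size(const metadata_map & meta) {
    if (const auto preproc = get_u32(meta, KEY_PREPROC_IMAGE_SIZE)) {
        return *preproc;
    }
    return get_u32(meta, KEY_IMAGE_SIZE).value_or(0);
}
```

Treating the key as optional in this way would let model files converted before the key existed keep loading, which is one reason a separate key can be preferable to changing what `clip.vision.image_size` means.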
Us and our long names 😞. I assume vertical alignment is worth preserving, but happy to not touch the other lines if preferred.