* Create llava-survery-v2.py
* Update convert-image-encoder-to-gguf.py
* Update convert-image-encoder-to-gguf.py
* Rename llava-survery-v2.py to llava-surgery-v2.py
* Update convert-image-encoder-to-gguf.py
will now search for projector
* Update convert-image-encoder-to-gguf.py
whoops
* Update llava-surgery-v2.py
* Clip: Bugfix for normalization (it did not load the 3 std and mean values)
Clip: bicubic resize function
Clip: added save-to-bmp/pil for debugging and for conversion between 32-bit and 8-bit images
Clip: added normalization with FP16 precision simulation (image tensors match the HF implementation; can be switched off, only used for llava-1.6; a sketch follows the change list)
Clip: added newline tensor, mergetype kv, image-grid kv, and a new resize-pad function with the resolution taken from the grid points
Clip: clip_image_preprocess now returns a float * vector instead of a float, so both llava-1.5 and llava-1.6 are supported
llava: added a ggml CPU graph for embedding patching, preliminary spatial_unpad support, and many comments that need to be cleaned up once everything is final
convert-image-encoder: fixed image-grid flattening
* whitespace corrections
* ws
* Tensors are now properly permuted.
Before, the embeddings were inserted 1:1; now they are split into the 24x24 patch grid as in the reference implementation (a sketch follows the change list).
* ws
* added verbose_prompt support to the CLI
added stopwords for llava-1.6 to the CLI
* moved llava functions to llava.cpp, made clip.h a C-compatible API, replaced vector-style functions with pointers, and added a debug define to exclude functions from compilation while they are not needed
* ws
* convert : skip unknown tensors (needed for LLaVA)
* llava : update readme
* llava : fix compile warnings
* llava : style
* convert : add --skip-unknown CLI arg
* server : remove clip structs
* bugfix for non-llava-1.6 models
It should now work with llava-1.5 as well
* clip : minor code rearrange
* llava : update readme a bit
---------
Co-authored-by: John <cmt-nct@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
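For reference, the FP16 precision simulation mentioned in the change list can be sketched in a few lines of NumPy: normalize with the CLIP mean/std and round-trip the result through float16 so the tensors match what the HF preprocessor produces. This is a minimal sketch, assuming the three mean/std values referred to above are the standard OpenAI CLIP constants; the helper name is hypothetical and the real code lives in clip.cpp.

import numpy as np

# Assumption: these are the 3 mean and 3 std values the normalization bugfix
# refers to (the standard OpenAI CLIP preprocessing constants).
CLIP_MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
CLIP_STD  = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)

def normalize_fp16_sim(image: np.ndarray) -> np.ndarray:
    """Normalize an HxWx3 float32 image in [0, 1], simulating FP16 precision.

    Hypothetical helper: the float16 round trip mimics the reduced precision
    of the HF implementation so the resulting tensors compare exactly.
    """
    x = (image - CLIP_MEAN) / CLIP_STD
    # Simulate FP16 storage, then widen back to float32 for the rest of the graph.
    return x.astype(np.float16).astype(np.float32)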
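The patch permutation noted in the change list (embeddings split into the 24x24 grid) amounts to reshaping the flat sequence of vision-tower embeddings back into their spatial layout before they are patched into the prompt. A minimal NumPy sketch follows; the helper name is hypothetical and the actual permutation is built into the ggml graph in llava.cpp.

import numpy as np

def embeddings_to_grid(embeddings: np.ndarray, grid: int = 24) -> np.ndarray:
    """Reshape a flat (grid*grid, dim) sequence of patch embeddings into a
    (grid, grid, dim) spatial layout, i.e. the 24x24 arrangement above.
    Illustrative helper only."""
    n, dim = embeddings.shape
    assert n == grid * grid, "expected a complete patch grid"
    return embeddings.reshape(grid, grid, dim)

# Example: 576 patch embeddings of width 1024 -> a (24, 24, 1024) grid
patches = np.random.rand(576, 1024).astype(np.float32)
spatial = embeddings_to_grid(patches)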
 parser = argparse.ArgumentParser(description="Convert a LLaMa model to a GGML compatible file")
 parser.add_argument("--awq-path", type=Path, help="Path to scale awq cache file", default=None)
 parser.add_argument("--dump", action="store_true", help="don't convert, just show what's in the model")
 parser.add_argument("--dump-single", action="store_true", help="don't convert, just show what's in a single model file")
 parser.add_argument("--vocab-only", action="store_true", help="extract only the vocab")
 parser.add_argument("--outtype", choices=output_choices, help="output format - note: q8_0 may be very slow (default: f16 or f32 based on input)")
 parser.add_argument("--vocab-dir", type=Path, help="directory containing tokenizer.model, if separate from model file")
 parser.add_argument("--vocab-type", choices=vocab_types, help="The vocabulary format used to define the tokenizer model (default: spm)", default="spm")
 parser.add_argument("--outfile", type=Path, help="path to write to; default: based on input")
 parser.add_argument("model", type=Path, help="directory containing model file, or model file itself (*.pth, *.pt, *.bin)")
 parser.add_argument("--ctx", type=int, help="model training context (default: based on input)")
 parser.add_argument("--concurrency", type=int, help=f"concurrency used for conversion (default: {DEFAULT_CONCURRENCY})", default=DEFAULT_CONCURRENCY)
 parser.add_argument("--big-endian", action="store_true", help="model is executed on big endian machine")
 parser.add_argument("--pad-vocab", action="store_true", help="add pad tokens when model vocab expects more than tokenizer metadata provides")
+parser.add_argument("--skip-unknown", action="store_true", help="skip unknown tensor names instead of failing")