llama : expose model's rope_freq_scale in the API #3418

Merged: 1 commit, Oct 3, 2023

grencez commented Sep 30, 2023

I think this is necessary for automatic implementations of extended context size (https://github.com/ggerganov/llama.cpp/tree/master/examples/main#extended-context-size) when the model's RoPE scaling factor isn't 1.0. We want to scale the model's value further rather than overwrite it, right?

Commit message: llama : expose model's rope_freq_scale in the API, so it can be scaled further before creating a context.
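
The difference between scaling the model's value further and overwriting it can be sketched as follows. This is a minimal illustration, not code from the PR: `further_scaled_rope_freq` and `overwritten_rope_freq` are hypothetical helpers, and the sketch assumes that linear RoPE scaling by a factor N corresponds to a frequency scale of 1/N. The model's frequency scale is passed in as a plain float; in practice it would come from whatever accessor this PR exposes.

```cpp
#include <cassert>

// Hypothetical helper (not part of llama.cpp): combine the model's own RoPE
// frequency scale with an extra context-extension factor.
// Assumption: a linear scaling factor N maps to a frequency scale of 1/N,
// so extending the context a further `extra_factor` times divides the scale.
float further_scaled_rope_freq(float model_freq_scale, float extra_factor) {
    assert(model_freq_scale > 0.0f && extra_factor > 0.0f);
    return model_freq_scale / extra_factor;
}

// For contrast: overwriting ignores the model's own scale entirely.
float overwritten_rope_freq(float extra_factor) {
    assert(extra_factor > 0.0f);
    return 1.0f / extra_factor;
}
```

For a model whose own frequency scale is 0.25, asking for a further 2x extension gives 0.125 with the first helper, while overwriting would give 0.5 and discard the scaling the model already expects. The result would then be set on the context parameters before the context is created.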
@ggerganov ggerganov merged commit 48be797 into ggerganov:master Oct 3, 2023
32 checks passed
@grencez grencez deleted the model_rope branch October 3, 2023 18:20
grencez added a commit to rendezqueue/rendezllama that referenced this pull request Oct 4, 2023
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 5, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp: (24 commits)
  convert : fix Baichuan2 models by using vocab size in config.json (ggerganov#3299)
  readme : add project status link
  ggml : fix build after ggerganov#3329
  llm : add Refact model (ggerganov#3329)
  sync : ggml (conv 1d + 2d updates, UB fixes) (ggerganov#3468)
  finetune : readme fix typo (ggerganov#3465)
  ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggerganov#3453)
  main : consistent prefix/suffix coloring (ggerganov#3425)
  llama : fix session saving/loading (ggerganov#3400)
  llama : expose model's rope_freq_scale in the API (ggerganov#3418)
  metal : alibi for arbitrary number of heads (ggerganov#3426)
  cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggerganov#3273)
  Work on the BPE tokenizer (ggerganov#3252)
  convert : fix vocab size when not defined in hparams (ggerganov#3421)
  cmake : increase minimum version for add_link_options (ggerganov#3444)
  CLBlast: Add broadcast support for matrix multiplication (ggerganov#3402)
  gguf : add BERT, MPT, and GPT-J arch info (ggerganov#3408)
  gguf : general usability improvements (ggerganov#3409)
  cmake : make CUDA flags more similar to the Makefile (ggerganov#3420)
  finetune : fix ggerganov#3404 (ggerganov#3437)
  ...
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023
so it can be scaled further before creating a context.
grencez added a commit to rendezqueue/rendezllama that referenced this pull request Oct 12, 2023
2 participants