configs/newfile results in vLLM error #3

Closed
sethahrenbach opened this issue Aug 27, 2024 · 1 comment

sethahrenbach commented Aug 27, 2024

I made a new configs/ file (configs/RMaxTSbin.py):

from prover.utils import AttrDict
from prover.algorithms import RMaxTS


# dataset
data_path = 'datasets/bin_addition.jsonl'
data_split = 'valid'
data_repeat = 16  # run 16 * 6400

# verifier
lean_max_concurrent_requests = 64
lean_memory_limit = 10
lean_timeout = 300

# model
batch_size = 512
model_path = 'deepseek-ai/DeepSeek-Prover-V1.5-RL'
model_args = AttrDict(
    mode='cot', # `cot` or `non-cot`
    temperature=1,
    max_tokens=2048,
    top_p=0.95,
)

# algorithm
n_search_procs = 256
sampler = dict(
    algorithm=RMaxTS,
    gamma=0.99,
    sample_num=6400,
    concurrent_num=32,
    tactic_state_comment=True,
    ckpt_interval=128,
    log_interval=32,
)

It uses the same settings as the configs/RMaxTS.py file, but I changed data_path to point to a jsonl file with a single simple theorem. When I run `python -m prover.launch --config=configs/RMaxTSbin.py --log_dir=logs/RMaxTSbin_results`, it results in the following error:

ValueError: The model's max seq len (4096) is larger than the maximum number of tokens that can be stored in KV cache (2112). Try increasing `gpu_memory_utilization` or decreasing `max_model_len` when initializing the engine.
INFO 08-27 11:09:45 model_runner.py:976] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 08-27 11:09:45 model_runner.py:980] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing `gpu_memory_utilization` or enforcing eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage.
INFO 08-27 11:09:47 model_runner.py:1057] Graph capturing finished in 2 secs.
All 2 LLMProcesses stopped

which I believe is related to vllm-project/vllm#2418
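
For reference, a minimal sketch of the two knobs the error message points at, passed through vLLM's LLM constructor (the 0.95 and 2048 values are illustrative assumptions, not something this repo sets):

from vllm import LLM

# Sketch only: either give the engine a larger share of GPU memory for the
# KV cache, or cap the context length so the cache it can allocate is enough.
llm = LLM(
    model='deepseek-ai/DeepSeek-Prover-V1.5-RL',
    trust_remote_code=True,
    gpu_memory_utilization=0.95,  # vLLM's default is 0.9
    # max_model_len=2048,         # alternative: shrink the context below the model's 4096
)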

My new dataset file (datasets/bin_addition.jsonl) is:

{"name": "binaryAdditionLength", "split": "valid", "informal_prefix": "/-- Let `binaryAddition` be a function that takes two lists of booleans and returns a list of booleans representing their sum in binary. Prove that the length of the output is always positive (greater than zero). -/\n", "formal_statement": "theorem binaryAdditionLength (s t : List Bool) :\n  (binaryAddition s t).length > 0 := by\n", "goal": "s t : List Bool\n⊢ (binaryAddition s t).length > 0", "header": "import Mathlib\n\nopen List\n\ndef binaryAddition (s t : List Bool) : List Bool :=\n  let rec loop : Nat → List Bool → List Bool\n    | i, acc =>\n      if i ≥ 10 then\n        acc.reverse\n      else\n        match s.get? i, t.get? i with\n        | some c, some d =>\n          let sum := (if c then 1 else 0) + (if d then 1 else 0) + (if acc.head? = some true then 1 else 0)\n          loop (i+1) ((sum % 2 = 1) :: acc)\n        | _, _ => acc.reverse\n  termination_by i _ => 10 - i\n  loop 0 []\n\n"}
sethahrenbach changed the title from "configs/newfile does not alter settings, results in vLLM error" to "configs/newfile results in vLLM error" on Aug 27, 2024

sethahrenbach commented Aug 27, 2024

In prover/workers/generator.py I changed line 32 to

llm = LLM(model=self.model_path, max_num_batched_tokens=8192, seed=seed, trust_remote_code=True, max_model_len=1096)

adding max_model_len, which got me past the vLLM error.
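
Note that max_model_len=1096 caps prompt plus completion at 1096 tokens, while the config above asks for max_tokens=2048, so longer generations will be cut short. An alternative sketch that keeps the model's full 4096-token context, following the suggestions in the log (the exact values are assumptions, not tested against this repo):

# prover/workers/generator.py, sketched alternative to hardcoding max_model_len=1096:
# free up GPU memory for the KV cache instead of shrinking the context.
llm = LLM(model=self.model_path, max_num_batched_tokens=8192, seed=seed,
          trust_remote_code=True,
          gpu_memory_utilization=0.95,  # vLLM's default is 0.9
          enforce_eager=True)           # skip CUDA graphs to save the extra 1~3 GiB per GPU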

soloice closed this as completed Sep 24, 2024