### Name and Version
### Bug description
There is a severe memory leak / infinite graph rebuild in llama-server when using LoRA adapters.
The issue is 100% reproducible and was bisected to a very narrow commit range.
**Working vs broken versions**
- ✅ Commit <= 7692 — works correctly
- ❌ Commit >= 7792 — broken
Between these commits, llama-server starts to:
- repeatedly rebuild execution graphs
- continuously reserve memory
- increase RAM usage without bound
- eventually exhaust system memory
This happens even with:
- CPU-only mode
- single request
- `--parallel 1`
- a small context size
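
To make the unbounded growth visible, one minimal check (a sketch, assuming the default `llama-server.exe` process name): send a single completion request, then poll the idle server's memory usage; the working set climbs on every poll.

```console
REM run repeatedly while the server sits idle after one request;
REM the "Mem Usage" column keeps growing without bound
tasklist /FI "IMAGENAME eq llama-server.exe"
```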
**Environment**
- OS: Windows 11
- llama.cpp built from source
- GPU: RTX 3060 (also reproduced with `-ngl 0`, CPU-only)
- CUDA: 12.4 (but the issue reproduces without CUDA)
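
The build itself was the standard CMake flow; a sketch of the CPU-only configuration (the CUDA flag for the GPU runs is an assumption):

```console
REM CPU-only build; for the CUDA runs add -DGGML_CUDA=ON (assumed)
cmake -B build
cmake --build build --config Release
```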
**Model / LoRA setup**
- Base model: `Meta-Llama-3.1-8B-Instruct.Q8_0.gguf`
- LoRA: converted to GGUF with `convert_lora_to_gguf.py`
- The LoRA was trained on the same (non-quantized) base model
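
For reference, the conversion looked roughly like this (a sketch: the adapter and base-model paths are placeholders, and the flags assume the current CLI of `convert_lora_to_gguf.py`):

```console
python convert_lora_to_gguf.py path\to\lora_adapter ^
    --base path\to\Meta-Llama-3.1-8B-Instruct ^
    --outfile _gestalt-adapter.gguf
```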
**Command used to reproduce (CPU-only)**
```console
llama-server.exe ^
  -m Meta-Llama-3.1-8B-Instruct.Q8_0.gguf ^
  --lora _gestalt-adapter.gguf ^
  -ngl 0 ^
  -c 1024 ^
  --parallel 1 ^
  --host 0.0.0.0 ^
  --port 8080
```
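
A single request is enough to trigger the rebuild/reserve loop, e.g. against the server's standard `/completion` endpoint (the prompt is arbitrary):

```console
curl http://localhost:8080/completion -H "Content-Type: application/json" ^
    -d "{\"prompt\": \"Hello\", \"n_predict\": 16}"
```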
### Operating systems
Windows
### GGML backends
CPU, CUDA
### Hardware
RTX 5060, RTX 3060
### Models
_No response_
### Problem description & steps to reproduce
See the bug description and the reproduce command above.
### First Bad Commit
Not pinpointed; bisected to somewhere between 7692 (works) and 7792 (broken), see above.
### Relevant log output
_No response_