
Commit ee45b44

Simplify README + Propagate Relay Provider args (#38)
1 parent ec4e471 commit ee45b44

3 files changed: 32 additions, 26 deletions

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -117,6 +117,7 @@ triton_kernel_logs/
 *.log
 session_*/
 worker_*/
+.fuse/
 
 # Generated kernels
 kernel.py
@@ -139,6 +140,6 @@ CLAUDE.md
 .Spotlight-V100
 .Trashes
 ehthumbs.db
-Thumbs.db
+Thumbs.db
 # Local batch runner
 scripts/run_kernelbench_batch.py

README.md

Lines changed: 27 additions & 23 deletions
@@ -18,42 +18,46 @@ Every stage writes artifacts to a run directory under `.fuse/<run_id>/`, including
 ## Quickstart
 
 ### Requirements
+- Python 3.8 – 3.12
 - Linux or macOS; CUDA‑capable GPU for Triton execution
-- Python 3.8–3.12
-- Triton (install separately: `pip install triton` or nightly from source)
-- At least one LLM provider:
-  - OpenAI (`OPENAI_API_KEY`, models like `o4-mini`, `gpt-5`)
-  - Anthropic (`ANTHROPIC_API_KEY`; default fallback model is `claude-sonnet-4-20250514` when `OPENAI_MODEL` is unset)
-  - Any OpenAI‑compatible relay endpoint (`LLM_RELAY_URL`, optional `LLM_RELAY_API_KEY`; see `triton_kernel_agent/providers/relay_provider.py`)
-- Gradio (UI dependencies; installed as part of the core package)
+- Triton (installed separately: `pip install triton` or nightly from source)
 - PyTorch (https://pytorch.org/get-started/locally/)
+- LLM provider ([OpenAI](https://openai.com/api/), [Anthropic](https://www.anthropic.com/), or a self-hosted relay)
 
-### Installation
+### Install
 ```bash
-git clone https://github.com/pytorch-labs/KernelAgent.git
-cd KernelAgent
-python -m venv .venv && source .venv/bin/activate  # choose your own env manager
-pip install -e .[dev]  # project + tooling deps
-pip install triton  # not part of extras; install the version you need
+pip install -e .
+```
 
-# (optional) Install KernelBench for problem examples
+#### (Optional) Install KernelBench for problem examples
+```bash
 git clone https://github.com/ScalingIntelligence/KernelBench.git
 ```
+Note: By default, the KernelAgent UI searches for KernelBench at the same level as `KernelAgent` (i.e. `../KernelBench`).
 
-### Configure credentials
-You can export keys directly or use an `.env` file that the CLIs load automatically:
+### Configure
+You can export keys directly or use an `.env` file that the CLIs load automatically.
 
 ```bash
-OPENAI_API_KEY=sk-...
-OPENAI_MODEL=gpt-5  # override default fallback (claude-sonnet-4-20250514)
+OPENAI_MODEL=gpt-5  # default model for extraction
 NUM_KERNEL_SEEDS=4  # parallel workers per kernel
 MAX_REFINEMENT_ROUNDS=10  # retry budget per worker
-LOG_LEVEL=INFO
+LOG_LEVEL=INFO  # logging level
+```
+
+#### LLM Providers
+KernelAgent currently supports OpenAI and Anthropic out of the box. You can also use a custom OpenAI-compatible endpoint.
+These can be configured in `.env` or via environment variables.
+```bash
+# OpenAI (models like `o4-mini`, `gpt-5`)
+OPENAI_API_KEY=sk-...
+
+# Anthropic (default; `claude-sonnet-4-20250514` is used when `OPENAI_MODEL` is unset)
+ANTHROPIC_API_KEY=sk-ant-...
 
-# Optional relay configuration for self-hosted gateways
-# LLM_RELAY_URL=http://127.0.0.1:11434
-# LLM_RELAY_API_KEY=your-relay-token
-# LLM_RELAY_TIMEOUT_S=120
+# Relay configuration for self-hosted gateways
+LLM_RELAY_URL=http://127.0.0.1:11434
+LLM_RELAY_TIMEOUT_S=120
 ```
 
 More knobs live in `triton_kernel_agent/agent.py` and `Fuser/config.py`.
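
As an aside on the new `### Configure` section: below is a minimal sketch of how a CLI could auto-load an `.env` file like the one documented above. It assumes a `python-dotenv`-style loader and illustrative fallback values; the actual KernelAgent CLIs may wire this up differently.

```python
import os

from dotenv import load_dotenv  # assumption: a python-dotenv-style loader

# Pull KEY=VALUE pairs from ./.env into os.environ
# (real environment variables take precedence by default).
load_dotenv()

# The knobs documented in the README's Configure section;
# the fallback values here are illustrative, not confirmed defaults.
model = os.environ.get("OPENAI_MODEL", "claude-sonnet-4-20250514")
num_seeds = int(os.environ.get("NUM_KERNEL_SEEDS", "4"))
max_rounds = int(os.environ.get("MAX_REFINEMENT_ROUNDS", "10"))
log_level = os.environ.get("LOG_LEVEL", "INFO")
```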

triton_kernel_agent/providers/relay_provider.py

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@
1818

1919
import requests
2020
import logging
21+
import os
2122

2223
from .base import BaseProvider, LLMResponse
2324

@@ -34,7 +35,7 @@ class RelayProvider(BaseProvider):
3435
"""
3536

3637
def __init__(self):
37-
self.server_url = "http://127.0.0.1:11434"
38+
self.server_url = os.environ.get("LLM_RELAY_URL", "http://127.0.0.1:11434")
3839
self.is_available_flag = False
3940
super().__init__()
4041

@@ -68,7 +69,7 @@ def get_response(
6869
self.server_url,
6970
json=request_data,
7071
headers={"Content-Type": "application/json"},
71-
timeout=120.0,
72+
timeout=int(os.environ.get("LLM_RELAY_TIMEOUT_S", 120)),
7273
)
7374

7475
if response.status_code != 200:
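
Taken together, the relay changes reduce to a small, self-contained pattern, sketched below. This is illustrative only: the payload shape is a placeholder, and the real request logic lives in `RelayProvider.get_response`.

```python
import os

import requests

# Pattern introduced by this commit: environment-driven relay settings,
# with the previous hard-coded literals kept as defaults so behavior is
# unchanged when the variables are unset.
server_url = os.environ.get("LLM_RELAY_URL", "http://127.0.0.1:11434")
timeout_s = int(os.environ.get("LLM_RELAY_TIMEOUT_S", 120))

response = requests.post(
    server_url,
    json={"prompt": "vector add kernel"},  # placeholder payload, not the real wire format
    headers={"Content-Type": "application/json"},
    timeout=timeout_s,
)
response.raise_for_status()
```

Using `os.environ.get` with the old literals as defaults means existing deployments need no changes, while `LLM_RELAY_URL` and `LLM_RELAY_TIMEOUT_S` now flow through as the README documents.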
