KPI Search

Semantic search for Swedish KPIs from Kolada, using vector embeddings.

Quick Start

Requires the uv package manager.

uv sync                                       # Install dependencies
uv run python -m kpisearch.sync               # Download KPIs + build embeddings
uv run python -m kpisearch.search build-all   # Build embeddings for all models
uv run uvicorn kpisearch.app:app              # Start server

Open http://localhost:8000 in your browser.

Data Pipeline

The sync command is the easiest way to get started. It fetches KPIs from the Kolada API, detects changes using content hashes, and only recomputes embeddings for what changed.

uv run python -m kpisearch.sync
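
The change detection described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `content_hash` and `changed_ids`, the KPI field names (`id`, `title`, `description`), and the choice of SHA-256 are all assumptions.

```python
import hashlib
import json


def content_hash(kpi: dict) -> str:
    """Return a stable SHA-256 digest of the fields that feed the embeddings."""
    # Assumption: only title and description influence the embeddings.
    payload = json.dumps(
        {"title": kpi.get("title", ""), "description": kpi.get("description", "")},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_ids(fetched: list[dict], stored_hashes: dict[str, str]) -> list[str]:
    """Return IDs of KPIs that are new or whose content hash differs."""
    return [
        kpi["id"]
        for kpi in fetched
        if stored_hashes.get(kpi["id"]) != content_hash(kpi)
    ]
```

Only the KPIs returned by `changed_ids` would need new embeddings; everything else can reuse what was computed on a previous run.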

For a full rebuild instead (e.g. after switching models):

uv run python -m kpisearch.download_kpis     # Re-download all KPIs
uv run python -m kpisearch.search build       # Rebuild embeddings for current model
uv run python -m kpisearch.search build-all   # Rebuild embeddings for all models

Search Methods

The frontend offers three search methods (toggleable via checkboxes):

| Method | How it works |
| --- | --- |
| Semantisk | Pure vector similarity search |
| Hybrid | Semantic search with an additive keyword boost for title matches |
| Kolada API | Proxied title search via Kolada's own API |
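
To make "pure vector similarity search" concrete, here is a small sketch that ranks KPIs by cosine similarity against a query vector. It uses plain Python lists so it runs without dependencies; the function names and the in-memory `kpi_vecs` dictionary are hypothetical, and the real project presumably works on model-generated embeddings. The `limit` and `min_score` defaults mirror the `/api/search` parameters.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(
    query_vec: list[float],
    kpi_vecs: dict[str, list[float]],
    limit: int = 10,
    min_score: float = 0.4,
) -> list[tuple[str, float]]:
    """Return up to `limit` (kpi_id, score) pairs with score >= min_score."""
    scored = [(kpi_id, cosine(query_vec, vec)) for kpi_id, vec in kpi_vecs.items()]
    kept = [item for item in scored if item[1] >= min_score]
    # Highest similarity first.
    return sorted(kept, key=lambda item: item[1], reverse=True)[:limit]
```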

Hybrid Algorithm

The hybrid search scores each KPI in three stages:

1. Semantic similarity: The query is embedded and compared against pre-computed title and description embeddings using cosine similarity. The two scores are combined with a configurable weight (default 60% title, 40% description):
```
semantic = title_weight * title_sim + (1 - title_weight) * desc_sim
```
2. Keyword boost: Each query word is checked for literal substring presence in the KPI title (case-insensitive). The boost is proportional to the fraction of query words matched, scaled by a brevity factor that favors shorter titles (where the query covers more of the title):
```
match_ratio   = matched_words / total_words
brevity       = clamp(query_char_len / title_char_len, 0, 1)
keyword_boost = match_ratio * (1 + 0.5 * brevity)
```
3. Standard KPI bonus: KPIs with IDs starting with N (Kolada's standard/national indicators) get a 15% multiplicative boost to the combined score:
```
score = (semantic + 0.25 * keyword_boost) * (1 + 0.15 * is_standard)
```

The top-k results are returned sorted by final score.
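
The three stages can be combined into one function. The sketch below follows the formulas above term by term, but it is an illustration rather than the project's implementation: the function name `hybrid_score`, splitting the query on whitespace, and measuring lengths on the raw strings are assumptions.

```python
def hybrid_score(
    query: str,
    title: str,
    kpi_id: str,
    title_sim: float,
    desc_sim: float,
    title_weight: float = 0.6,
) -> float:
    """Score one KPI given its precomputed title/description similarities."""
    # Stage 1: weighted semantic similarity.
    semantic = title_weight * title_sim + (1 - title_weight) * desc_sim

    # Stage 2: keyword boost from case-insensitive substring matches.
    words = query.lower().split()  # assumption: whitespace tokenization
    title_lower = title.lower()
    matched = sum(1 for word in words if word in title_lower)
    match_ratio = matched / len(words) if words else 0.0
    brevity = min(max(len(query) / len(title), 0.0), 1.0) if title else 0.0
    keyword_boost = match_ratio * (1 + 0.5 * brevity)

    # Stage 3: 15% bonus for standard KPIs (IDs starting with "N").
    is_standard = 1 if kpi_id.startswith("N") else 0
    return (semantic + 0.25 * keyword_boost) * (1 + 0.15 * is_standard)
```

For a query that exactly matches a standard KPI's title with both similarities at 0.5, the stages give `semantic = 0.5`, `keyword_boost = 1.5`, and a final score of `(0.5 + 0.375) * 1.15 = 1.00625`, so hybrid scores are not confined to the 0-1 range.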

API

GET /api/search

Semantic search. Parameters: q (required), limit (default 10), min_score (default 0.4), title_weight (0-1).

GET /api/hybrid-search

Hybrid search. Parameters: q (required), limit (default 10), title_weight (0-1).

GET /api/kolada-search

Proxied Kolada API search. Parameters: q (required), limit (default 15).

curl "http://localhost:8000/api/search?q=skolresultat&limit=5"
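
The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names are made up for illustration, and the response is assumed to be JSON, which this README does not state.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"


def build_search_url(q, limit=10, min_score=None, title_weight=None, base_url=BASE_URL):
    """Build a /api/search URL, omitting optional parameters left as None."""
    params = {"q": q, "limit": limit}
    if min_score is not None:
        params["min_score"] = min_score
    if title_weight is not None:
        params["title_weight"] = title_weight
    # urlencode percent-encodes non-ASCII characters such as å, ä, ö.
    return f"{base_url}/api/search?{urlencode(params)}"


def search(q, **kwargs):
    """Call the running server; assumes the endpoint returns a JSON body."""
    with urlopen(build_search_url(q, **kwargs)) as resp:
        return json.load(resp)
```

`build_search_url("skolresultat", limit=5)` produces the same URL as the curl command above.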

Models

Three embedding models, switchable at runtime via the admin panel:

| Model | Notes |
| --- | --- |
| KBLab/sentence-bert-swedish-cased | Swedish-specific (default) |
| intfloat/multilingual-e5-small | High-quality multilingual |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | Lightweight multilingual |

Admin

Admin panel at /admin, protected with HTTP Basic Auth.

Default password: change_this_now_really! (forced change on first login).

uv run python -m kpisearch.auth set <new-password>   # Set password
uv run python -m kpisearch.auth reset                 # Reset to default
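
For scripted access to the admin panel, an HTTP Basic Auth header can be built by hand. The sketch below only constructs the request; the username `admin` is a placeholder assumption, since this README documents only the password.

```python
import base64
from urllib.request import Request


def admin_request(password, username="admin", url="http://localhost:8000/admin"):
    """Build a request carrying an HTTP Basic Auth header.

    The default username is a hypothetical placeholder.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})
```

Passing the returned object to `urllib.request.urlopen` would send the request. Basic Auth transmits credentials in a reversible encoding, so use HTTPS for anything beyond local testing.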

Development

uv run uvicorn kpisearch.app:app --reload   # Dev server with auto-reload
uv run ruff check kpisearch/                # Lint
uv run ruff format kpisearch/               # Format
uv run ty check                             # Type check
