Commit

Merge pull request #228 from RamiAwar/add-basic-auth-support
Add basic auth support
RamiAwar authored Jun 30, 2024
2 parents d3cc864 + 2c3ecf7 commit 9dc0fe9
Showing 27 changed files with 677 additions and 165 deletions.
51 changes: 22 additions & 29 deletions Dockerfile
@@ -23,6 +23,9 @@ RUN npm install
COPY frontend/ .

# Temporary setup - need local env as the 'production' build is landing page only
ARG API_URL="http://localhost:7377"

ENV VITE_API_URL=$API_URL
ENV NODE_ENV=local
RUN npm run build
# -------------------------------
@@ -31,7 +34,7 @@ RUN npm run build
# -------------------------------
# BASE-BUILD IMAGE WITH BACKEND
# Build backend dependencies and install them
# ------------------------------
# -------------------------------
FROM python:3.11.6-slim-bookworm as temp-backend

# Set working directory
@@ -49,7 +52,7 @@ ENV PYTHONDONTWRITEBYTECODE=1
RUN pip install --no-cache-dir poetry

# Install build dependencies, install dependencies, remove build dependencies
RUN apt update && \
RUN apt-get clean && apt update --fix-missing && \
apt upgrade -y && \
apt-get install git libpq-dev build-essential -y

@@ -78,6 +81,7 @@ COPY backend/*.py .
COPY backend/samples ./samples
COPY backend/dataline ./dataline
COPY backend/alembic ./alembic
COPY backend/templates ./templates
COPY backend/alembic.ini .

WORKDIR /home/dataline
@@ -89,37 +93,26 @@ ENV SQLITE_PATH="/home/.dataline/db.sqlite3"
ENV DATA_DIRECTORY="/home/.dataline/data"

# -------------------------------
# DEV BUILD WITH MINIMAL DEPS
# SPA BUILD WITH MINIMAL DEPS
# -------------------------------
FROM base as dev
FROM base as spa

WORKDIR /home/dataline/backend

# Running alembic and uvicorn without combining them in a bash -c command won't work
CMD ["bash", "-c", "python -m alembic upgrade head && python -m uvicorn dataline.main:app --port=7377 --host=0.0.0.0 --reload"]


# -------------------------------
# PROD BUILD WITH MINIMAL DEPS
# -------------------------------
# FROM python:3.11.8-alpine as prod
FROM base as prod

# Setup supervisor and caddy
WORKDIR /home/dataline

# Install supervisor to manage be/fe processes
RUN pip install --no-cache-dir supervisor

# Install Caddy server
# RUN apk update && apk add caddy
RUN apt update && apt install caddy -y


# Copy in supervisor config, frontend build, backend source
COPY supervisord.conf .
# Copy in frontend build so we can serve it from FastAPI
COPY --from=temp-frontend /home/dataline/frontend/dist /home/dataline/frontend/dist
COPY frontend/Caddyfile /home/dataline/frontend/Caddyfile
RUN \
cp -r /home/dataline/frontend/dist/assets /home/dataline/backend && \
cp /home/dataline/frontend/dist/favicon.ico /home/dataline/backend/assets && \
cp /home/dataline/frontend/dist/manifest.json /home/dataline/backend/assets

# This stage is meant to be used as an SPA server with FastAPI serving a React build
ENV SPA_MODE=1
ARG AUTH_USERNAME
ENV AUTH_USERNAME=$AUTH_USERNAME
ARG AUTH_PASSWORD
ENV AUTH_PASSWORD=$AUTH_PASSWORD

# Running alembic and uvicorn without combining them in a bash -c command won't work
CMD ["bash", "-c", "python -m dataline.main"]

CMD ["supervisord", "-n"]
59 changes: 43 additions & 16 deletions README.md
@@ -3,10 +3,10 @@
</p>

<p align="center">
<strong>Chat with your data using natural language</strong>
<strong>💬 Chat with your data using natural language 📊</strong>
</p>
<p align="center">
<em>Gone are the days of time-consuming querying! Generate charts, tables, reports in seconds.</em>
<em>Gone are the days of time-consuming querying!</em>⚡️<em>Generate charts, tables, reports in seconds with DataLine: An AI-driven data analysis and visualization tool</em>🤓
</p>
<div align="center">
<img src="https://img.shields.io/github/downloads/ramiawar/dataline/total?style=flat&color=%2322c55e">
@@ -25,43 +25,54 @@
- [Linux](#linux)
- [Docker](#docker)
- [Running manually](#running-manually)
- [Startup Quest](#startup-quest)
- [Authentication](#authentication)
- [Startup Quest](#startup-quest)


## Who is this for?

Technical or non-technical people who want to explore data, fast.
It also works for backend developers to speed up drafting SQL queries and explore new DBs.
It's specially well-suited for businesses given it's security-first and open-source nature.
Technical or non-technical people who want to explore data, fast. ⚡️⚡️

It also works for backend developers to speed up drafting queries and explore new DBs with ease. 😎

It's especially well-suited for businesses given its security-first 🔒 and open-source 📖 nature.

## What is it?

DataLine is a simple tool for chatting with your data. It's privacy-focused, running only locally and storing everything on your device. It hides your data from the LLMs used by default, but this can be disabled if the data is not deemed sensitive.
DataLine is an AI-driven data analysis and visualization tool.

It's privacy-focused, storing everything on your device. No ☁️, only ☀️!

It hides your data from the LLMs used by default, but this can be disabled if the data is not deemed sensitive.

It can connect to a variety of data sources (Postgres, MySQL, SQLite, CSV, and more), execute queries, generate charts, and allow for copying the results to build reports quickly.
It can connect to a variety of data sources (Postgres, Snowflake, MySQL, SQLite, CSV, and more), execute queries, generate charts, and allow for copying the results to build reports quickly.

## Where is it going?

For now, we're trying to help people get insights out of their data, fast.

This is meant to enable non-technical folks to query data and aid data analysts in getting their jobs done 10x as fast.

But you can still influence the direction we go in. We're building this for you, so you have the biggest say.

## Feature Support
- [x] Connecting to Postgres, MySQL databases
- [x] Broad DB support: Postgres, MySQL, Snowflake, CSV, SQLite, and more
- [x] Generating and executing SQL from natural language
- [x] Ability to modify SQL results, save them, and re-run
- [x] Better support for explorative questions
- [x] Querying data files like CSV, SQLite (more connection types)
- [x] Charting via natural language
- [x] Modifying chart queries and re-rendering/refreshing charts
- [ ] Reporting tools (copy tables, copy charts)
- [ ] Storing copies of queries and labelling and searching them
- [ ] Creating dashboards
- [ ] Increasing connection support (NoSQL, Elasticsearch, ...)

With a lot more coming soon. You can still influence what we build, so if you're a user and you're down for it, we'd love to interview you! Book some time with one of us here:
- [Rami](https://calendly.com/ramiawar/quick)


## Getting started

### Setup
There are multiple ways of setting up DataLine, the simplest being a binary executable: download a file and run it to get started.

A more flexible option is our hosted Docker image, which lets you set up authentication and other features if you need them.

#### Windows

@@ -95,7 +106,7 @@ You may also wish to use the binary instead, to do so, follow the instructions i
You can also use our official docker image and get started in one command. This is more suitable for business use:

```bash
docker run -p 2222:2222 -p 7377:7377 -v dataline:/home/.dataline --name dataline ramiawar/dataline:latest
docker run -p 7377:7377 -v dataline:/home/.dataline --name dataline ramiawar/dataline:latest
```

You can manage this as you would any other container. `docker start dataline`, `docker stop dataline`
@@ -104,7 +115,7 @@ For updating to a new version, just remove the container and rerun the command.

```bash
docker rm dataline
docker run -p 2222:2222 -p 7377:7377 -v dataline:/home/.dataline --name dataline ramiawar/dataline:latest
docker run -p 7377:7377 -v dataline:/home/.dataline --name dataline ramiawar/dataline:latest
```

To connect to the frontend, you can then visit:
@@ -114,6 +125,22 @@ To connect to the frontend, you can then visit:

Check the [backend](./backend/README.md) and [frontend](./frontend/README.md) readmes.

## Authentication

DataLine also supports basic auth 🔒 in self-hosted mode 🥳 in case you're hosting it and would like to secure it with a username/password.

Auth is NOT supported ❌ when running the DataLine executable.

To enable authentication on the self-hosted version, add the environment variables AUTH_USERNAME and AUTH_PASSWORD while launching the service. ✅

### With Docker

Inject the env vars with the docker run command as follows:

```bash
docker run -p 7377:7377 -v dataline:/home/.dataline --name dataline -e AUTH_USERNAME=admin -e AUTH_PASSWORD=admin ramiawar/dataline:latest
```

We plan on supporting multi-user auth in the future, but for now only a single user is supported.
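
Once auth is enabled, API clients need to send the credentials with each request. Below is a minimal sketch of building an HTTP Basic `Authorization` header using only the Python standard library. It assumes the `admin`/`admin` example credentials and the port from the command above; the helper names and the URL path are hypothetical placeholders, not part of DataLine.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries Basic credentials."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Hypothetical URL: host and port match the docker command above,
# the path is a placeholder and not a documented DataLine route.
request = build_request("http://localhost:7377/some/protected/route", "admin", "admin")
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would send it; a 401 response surfaces as `urllib.error.HTTPError`.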


### Startup Quest

Go through the following checklist to explore DataLine's features!
22 changes: 22 additions & 0 deletions backend/dataline/api/auth/router.py
@@ -0,0 +1,22 @@
from typing import Annotated

import fastapi

from dataline.auth import validate_credentials

router = fastapi.APIRouter(
    prefix="/auth",
    tags=["auth"],
    responses={401: {"description": "Incorrect username or password"}},
)


@router.post("/login")
async def login(username: Annotated[str, fastapi.Body()], password: Annotated[str, fastapi.Body()]) -> fastapi.Response:
    validate_credentials(username, password)
    return fastapi.Response(status_code=200)


@router.head("/login")
async def login_head() -> fastapi.Response:
    return fastapi.Response(status_code=200)
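
Because `login` declares two `Body()` parameters, FastAPI reads them as keys of a single JSON object. The sketch below shows what a matching client request could look like, using only the standard library. The base URL and the `build_login_request` helper are assumptions for illustration; only the `/auth/login` path and the `username`/`password` keys come from the router above.

```python
import json
import urllib.request


def build_login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /auth/login with a JSON body."""
    payload = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/auth/login",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed base URL: the default port used elsewhere in this commit.
request = build_login_request("http://localhost:7377", "admin", "admin")
print(request.get_method(), request.full_url)
```

Sending it with `urllib.request.urlopen(request)` should return 200 on valid credentials; a 401 is raised as `urllib.error.HTTPError`.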
20 changes: 15 additions & 5 deletions backend/dataline/app.py
@@ -2,14 +2,17 @@
from typing import Any, AsyncContextManager, Callable, Mapping, Self

import fastapi
from fastapi import Request, status
from fastapi import Depends, Request, status
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse

from dataline.api.auth.router import router as auth_router
from dataline.api.connection.router import router as connection_router
from dataline.api.conversation.router import router as conversation_router
from dataline.api.result.router import router as result_router
from dataline.api.settings.router import router as settings_router
from dataline.auth import authenticate
from dataline.config import config
from dataline.errors import UserFacingError, ValidationError
from dataline.repositories.base import NotFoundError, NotUniqueError

@@ -44,10 +47,17 @@ def __init__( # type: ignore[misc]
            allow_headers=["*"],
        )

        self.include_router(settings_router)
        self.include_router(connection_router)
        self.include_router(conversation_router)
        self.include_router(result_router)
        common_dependencies = []
        if config.has_auth:
            common_dependencies = [Depends(authenticate)]

        # Add route for login
        self.include_router(auth_router)

        self.include_router(settings_router, dependencies=common_dependencies)
        self.include_router(connection_router, dependencies=common_dependencies)
        self.include_router(conversation_router, dependencies=common_dependencies)
        self.include_router(result_router, dependencies=common_dependencies)

        # Handle 500s separately to play well with TestClient and allow re-raising in tests
        self.add_exception_handler(NotFoundError, handle_exceptions)
60 changes: 60 additions & 0 deletions backend/dataline/auth.py
@@ -0,0 +1,60 @@
import binascii
import secrets
from base64 import b64decode
from typing import Optional

from fastapi import Depends, HTTPException, status
from fastapi.security import HTTPBasic, HTTPBasicCredentials
from fastapi.security.utils import get_authorization_scheme_param
from starlette.requests import Request
from starlette.status import HTTP_401_UNAUTHORIZED

from dataline.config import config


class HTTPBasicCustomized(HTTPBasic):
    # Override __call__ method to not send www-authenticate header back
    async def __call__(self, request: Request) -> Optional[HTTPBasicCredentials]:  # type: ignore
        authorization = request.headers.get("Authorization")
        scheme, param = get_authorization_scheme_param(authorization)
        if not authorization or scheme.lower() != "basic":
            if self.auto_error:
                raise HTTPException(
                    status_code=HTTP_401_UNAUTHORIZED,
                    detail="Not authenticated",
                )
            else:
                return None
        invalid_user_credentials_exc = HTTPException(
            status_code=HTTP_401_UNAUTHORIZED,
            detail="Invalid authentication credentials",
        )
        try:
            data = b64decode(param).decode("ascii")
        except (ValueError, UnicodeDecodeError, binascii.Error):
            raise invalid_user_credentials_exc  # noqa: B904
        username, separator, password = data.partition(":")
        if not separator:
            raise invalid_user_credentials_exc
        return HTTPBasicCredentials(username=username, password=password)


security = HTTPBasicCustomized()


def validate_credentials(username: str, password: str) -> bool:
    correct_username = secrets.compare_digest(username, str(config.auth_username))
    correct_password = secrets.compare_digest(password, str(config.auth_password))
    if not (correct_username and correct_password):
        # Do not send www-authenticate header back
        # as we do not want the browser to show a popup
        # FE will handle authentication
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Incorrect email or password",
        )
    return True


def authenticate(credentials: HTTPBasicCredentials = Depends(security)) -> None:
    validate_credentials(credentials.username, credentials.password)
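
The parsing and comparison steps above can be sketched without FastAPI, which may make the flow easier to follow. This is a standalone illustration, not the project's code: `parse_basic_header` and `credentials_match` are hypothetical names, and the sketch deliberately differs in two places. It decodes as UTF-8 rather than ASCII, and it compares bytes rather than `str`, since `secrets.compare_digest` raises `TypeError` when given non-ASCII strings.

```python
import base64
import secrets
from typing import Optional, Tuple


def parse_basic_header(authorization: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (username, password) from a Basic header, or None if it is malformed."""
    if not authorization:
        return None
    scheme, _, param = authorization.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        # binascii.Error (bad base64) is a subclass of ValueError.
        decoded = base64.b64decode(param.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password


def credentials_match(username: str, password: str, expected_username: str, expected_password: str) -> bool:
    """Constant-time comparison of both fields, done on bytes to allow non-ASCII input."""
    username_ok = secrets.compare_digest(username.encode("utf-8"), expected_username.encode("utf-8"))
    password_ok = secrets.compare_digest(password.encode("utf-8"), expected_password.encode("utf-8"))
    # Both comparisons run before the results are combined.
    return username_ok and password_ok


parsed = parse_basic_header("Basic YWRtaW46YWRtaW4=")
print(parsed)
```

Running both comparisons before combining them mirrors `validate_credentials`, so a wrong username does not short-circuit the password check.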
13 changes: 11 additions & 2 deletions backend/dataline/config.py
@@ -1,9 +1,8 @@
import sys
from pathlib import Path

from pydantic_settings import BaseSettings

from dataline.utils.appdirs import user_data_dir
from pydantic_settings import BaseSettings

# https://pyinstaller.org/en/v6.6.0/runtime-information.html
IS_BUNDLED = bool(getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"))
@@ -37,5 +36,15 @@ class Config(BaseSettings):
    environment: str = EnvironmentType.development if not IS_BUNDLED else EnvironmentType.production
    release: str | None = None

    # HTTP Basic Authentication
    auth_username: str | None = None
    auth_password: str | None = None

    spa_mode: bool = False

    @property
    def has_auth(self) -> bool:
        return bool(self.auth_username and self.auth_password)


config = Config()
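
The `has_auth` rule means auth is only active when both values are set and non-empty. The following is a rough stdlib-only sketch of that behavior, assuming the settings are read from the `AUTH_USERNAME` and `AUTH_PASSWORD` environment variables. `AuthSettings` and `from_env` are hypothetical stand-ins for the pydantic-settings `Config` class, which does the environment lookup itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AuthSettings:
    auth_username: Optional[str] = None
    auth_password: Optional[str] = None

    @property
    def has_auth(self) -> bool:
        # True only when both values are present and non-empty.
        return bool(self.auth_username and self.auth_password)

    @classmethod
    def from_env(cls, environ: Mapping[str, str] = os.environ) -> "AuthSettings":
        """Read the two settings from an environment-like mapping."""
        return cls(environ.get("AUTH_USERNAME"), environ.get("AUTH_PASSWORD"))


settings = AuthSettings.from_env({"AUTH_USERNAME": "admin", "AUTH_PASSWORD": "admin"})
print(settings.has_auth)
```

Setting only one of the two variables, or setting either to an empty string, leaves `has_auth` false, so the routers would be registered without the `authenticate` dependency.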