Issue #10: Move DB.md here for future reference #23

Open
wants to merge 2 commits into
base: main
87 changes: 87 additions & 0 deletions DB.md
@@ -0,0 +1,87 @@
# Notes on interacting with a PostgreSQL database

## postgresql extensions

```
pip install pgxnclient
pgxn install vector
```
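
`pgxn install` only puts the extension files on the server host; each database that uses pgvector still has to enable it with `CREATE EXTENSION`. A minimal sketch, assuming a database named `louis` (a hypothetical name, substitute your own) and a `psql` that can already connect:

```shell
# Hypothetical database name; replace with your own.
DB_NAME="${DB_NAME:-louis}"

# Emit the SQL that enables pgvector in the current database and then
# lists installed extensions, so you can confirm "vector" is present.
enable_vector_sql() {
  printf '%s\n' \
    'CREATE EXTENSION IF NOT EXISTS vector;' \
    'SELECT extname, extversion FROM pg_extension ORDER BY extname;'
}

# Usage (needs a running server and sufficient privileges):
#   enable_vector_sql | psql -d "$DB_NAME"
```

Printing the SQL rather than calling `psql` inside the function keeps the snippet safe to source and lets you review the statements first.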

## workaround for psycopg2 not finding the socket file

either build psycopg2 from source:

```
pip install --no-binary psycopg2 psycopg2
```

or symlink the socket from /tmp to the directory where psycopg2 looks for it:

```
sudo mkdir -p /var/run/postgresql
sudo ln -s /tmp/.s.PGSQL.5432 /var/run/postgresql/.s.PGSQL.5432
```
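
Before picking a workaround, it can help to confirm where the server actually created its socket. The sketch below checks a list of candidate directories; the function name is illustrative, and port 5432 is taken from the symlink command above.

```shell
# Print the first directory that contains a PostgreSQL Unix socket
# for the given port. Returns non-zero if none is found.
# Usage: find_pg_socket_dir PORT DIR...
find_pg_socket_dir() {
  port="$1"
  shift
  for dir in "$@"; do
    # -S is true only if the path exists and is a socket
    if [ -S "$dir/.s.PGSQL.$port" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Example:
#   find_pg_socket_dir 5432 /var/run/postgresql /tmp
```

If this prints `/tmp`, libpq-based clients such as psycopg2 can usually be pointed there without a symlink by setting `PGHOST=/tmp`, since libpq treats a host value starting with a slash as a socket directory.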

see the extensions available through pgxn: https://pgxn.org/


## configuration

in postgresql.conf, log every statement that runs for at least 40 ms:

```
log_min_duration_statement = 40
```
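
With this setting, each slow statement appears in the server log as a line containing `duration: <ms> ms  statement: ...`. A rough sketch for ranking those lines, slowest first; the function name is made up here, and the log file location depends on your installation:

```shell
# Read PostgreSQL log lines on stdin and print "<ms> <rest of line>"
# for every logged duration, slowest first.
slowest_statements() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "duration:") {
        ms = $(i + 1)
        # Keep everything after "duration: <ms> ms" as the statement text.
        rest = ""
        for (j = i + 3; j <= NF; j++) {
          rest = (rest == "" ? $j : rest " " $j)
        }
        print ms, rest
        break
      }
    }
  }' | LC_ALL=C sort -rn
}

# Usage (the path is a placeholder for your server log):
#   slowest_statements < /path/to/postgresql.log | head
```

Lines without a `duration:` field, such as checkpoint messages, are skipped.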

## testing impact of indexes by flushing cache first

stop database:

```
pg_ctl stop
```

on the host OS (not inside the container), as root:

```
sync
echo 3 > /proc/sys/vm/drop_caches
```

start database:

```
pg_ctl start
```

test query:

```
time curl -X POST http://localhost:5000/search --data '{"query": "is e.coli a virus or bacteria?"}' -H "Content-Type: application/json"
```

result:

```
real 0m4.791s
user 0m0.003s
sys 0m0.016s
```

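create the index, then repeat the steps above (stop the database, drop the OS cache, start the database) and rerun the test query to compare its timing against the cold-cache result.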
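
The cold-cache procedure can be wrapped in one function so the before and after runs are identical. This is a sketch under assumptions: PostgreSQL runs directly on the host (with a container, run the `pg_ctl` steps inside it instead), `pg_ctl` can find the data directory via `PGDATA`, and `sudo` is available. It uses curl's own `--write-out` timer in place of the shell's `time`; `run`, `cold_cache_query` and `DRY_RUN` are names introduced here for illustration.

```shell
# Print each command instead of running it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Restart PostgreSQL with an empty OS page cache, then time the test query.
cold_cache_query() {
  run pg_ctl stop
  # sync first so dirty pages are written out before the cache is dropped
  run sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
  run pg_ctl start
  run curl -s -o /dev/null -w 'total: %{time_total}s\n' \
    -X POST http://localhost:5000/search \
    -H 'Content-Type: application/json' \
    --data '{"query": "is e.coli a virus or bacteria?"}'
}

# Preview the commands without touching the server:
#   DRY_RUN=1 cold_cache_query
```

Run it once before creating the index and once after, and compare the two `total:` lines.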

## database client

Suggested: https://dbeaver.io/download/

## References

* [text-embedding-ada-002](https://platform.openai.com/docs/guides/embeddings)
* [pgvector](https://github.com/pgvector/pgvector)
* [Switch postgresql to utf-8](https://tutorials.technology/tutorials/How-to-change-postgresql-database-encoding-to-UTF8-from-SQL_ASCII.html?utm_content=cmp-true)
* [Tutorial: Explore Azure OpenAI Service embeddings and document search](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/tutorials/embeddings)
* [How to optimize performance when using pgvector on Azure Cosmos DB for PostgreSQL](https://learn.microsoft.com/en-us/azure/cosmos-db/postgresql/howto-optimize-performance-pgvector)
* [Building a custom connector](https://docs.elastic.co/search-ui/guides/building-a-custom-connector)
* [How to change PostgreSQL database encoding to UTF-8](https://www.shubhamdipt.com/blog/how-to-change-postgresql-database-encoding-to-utf8/)
8 changes: 7 additions & 1 deletion DEVELOPER.md
@@ -1,4 +1,10 @@
# Development guidelines for louis-db
# louis-db

## layers

* louis.db: all interaction with the PostgreSQL database happens here
* louis.models: interactions with LLMs
* openai.py: OpenAI API interactions

## Making changes to the database schema

9 changes: 7 additions & 2 deletions README.md
@@ -5,7 +5,7 @@
If you need to interface with the database, use this to install:

```
pip install git+https://github.com/ai-cfia/louis-db@v0.0.5-alpha3
pip install git+https://github.com/ai-cfia/louis-db@main
```

You'll often want to add, move or modify existing database layer functions found in louis-db from a client repository.
@@ -18,4 +18,9 @@ pip install -e git+https://github.com/ai-cfia/louis-db#egg=louis_db

This checks out the latest source into a local git clone in src/louis-db, so edits in that directory are immediately available to louis-crawler.

Don't forget to create a PR with your changes once you're done!

## More documentation

* [Developer documentation](DEVELOPER.md)
* [Working with Postgresql](DB.md)