Merge pull request #1 from weni-ai/products-index
Products index
rasoro authored Oct 4, 2023
2 parents 2c81226 + 3c2c71b commit c04aa9c
Showing 23 changed files with 3,050 additions and 1 deletion.
2 changes: 2 additions & 0 deletions .coveragerc
@@ -0,0 +1,2 @@
[report]
exclude_lines = pass
38 changes: 38 additions & 0 deletions .github/workflows/ci.yaml
@@ -0,0 +1,38 @@
name: CI

on:
  push:
    branches:
      - '*'
  pull_request:
    branches:
      - '*'

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.10'

      - name: Install project dependencies
        run: |
          pip install poetry
          poetry install
        working-directory: ${{ github.workspace }}

      - name: Run tests
        run: |
          poetry run coverage run -m unittest discover ./app/tests/
          poetry run coverage report
          poetry run coverage xml
        working-directory: ${{ github.workspace }}

      - name: Upload coverage report
        uses: codecov/codecov-action@v2
5 changes: 5 additions & 0 deletions .gitignore
@@ -158,3 +158,8 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

.vscode
*cache*
.coverage
htmlcov
20 changes: 20 additions & 0 deletions Dockerfile
@@ -0,0 +1,20 @@
FROM python:3.10-slim

WORKDIR /app

RUN pip install poetry

COPY pyproject.toml poetry.lock ./

RUN poetry config virtualenvs.create false && \
    poetry install --no-dev

COPY . .

EXPOSE 8000

COPY entrypoint.sh /entrypoint.sh

RUN chmod +x /entrypoint.sh

CMD ["/entrypoint.sh"]
189 changes: 188 additions & 1 deletion README.md
@@ -1 +1,188 @@
# microservice-ia
[![CI](https://github.com/weni-ai/SentenX/actions/workflows/ci.yaml/badge.svg)](https://github.com/weni-ai/SentenX/actions/workflows/ci.yaml)

# SentenX

microservice that uses a sentence transformer model to index and search records.

## Table of Contents

1. [Requirements](#requirements)
2. [Quickstart](#quickstart)
3. [Usage](#usage)
4. [Test](#test)

## Requirements

* Python 3.10
* Elasticsearch 8.9.1

## Quickstart
from the root directory of this project, run the following commands:

set up the environment variables for the SageMaker access keys and the Elasticsearch URL

```
export AWS_ACCESS_KEY_ID=YOUR_SAGEMAKER_AWS_ACCESS_KEY
export AWS_SECRET_ACCESS_KEY=YOUR_SAGEMAKER_AWS_SECRET_ACCESS_KEY
export ELASTICSEARCH_URL=YOUR_ELASTICSEARCH_URL
```

install poetry
```
pip install poetry
```

create a python 3.10 virtual environment
```
poetry env use 3.10
```

activate the environment
```
poetry shell
```

install dependencies
```
poetry install
```

start the microservice
```
uvicorn app.main:main_app.api --reload
```

### Docker compose

to start SentenX together with Elasticsearch using docker compose:

set `AWS_SECRET_ACCESS_KEY` and `AWS_ACCESS_KEY_ID` in `docker-compose.yml`, then run:
```
docker compose up -d
```

to stop:
```
docker compose down
```

to rebuild and start after any change to the source:
```
docker compose up -d --build
```


## Usage

### To index a product

request:
```bash
curl -X PUT http://localhost:8000/products/index \
  -H 'Content-Type: application/json' \
  -d '{
    "catalog_id": "cat1",
    "product": {
      "facebook_id": "123456789",
      "title": "massa para bolo de baunilha",
      "org_id": "1",
      "channel_id": "5",
      "catalog_id": "cat1",
      "product_retailer_id": "pp1"
    }
  }
'
```
response:
```json
status: 200
{
  "catalog_id": "cat1",
  "documents": [
    "cac65148-8c1d-423c-a022-2a52cdedcd3c"
  ]
}
```
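
The same request can also be built from Python. The block below is a minimal sketch using only the standard library; `BASE_URL` and the helper name `build_index_request` are assumptions made for illustration and are not part of SentenX. It only constructs the request object, so it can be inspected without a running service.

```python
import json
import urllib.request

# Assumption: the service is running locally on port 8000, as in the curl example.
BASE_URL = "http://localhost:8000"


def build_index_request(catalog_id: str, product: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT request for /products/index.

    Hypothetical helper: it mirrors the JSON body of the curl example above.
    """
    body = json.dumps({"catalog_id": catalog_id, "product": product}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/products/index",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_index_request(
    "cat1",
    {
        "facebook_id": "123456789",
        "title": "massa para bolo de baunilha",
        "org_id": "1",
        "channel_id": "5",
        "catalog_id": "cat1",
        "product_retailer_id": "pp1",
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it; with the service running, the response body should have the shape shown above.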

### To index products in batch

request:
```bash

curl -X PUT http://localhost:8000/products/batch \
  -H 'Content-Type: application/json' \
  -d '{
    "catalog_id": "asdfgh",
    "products": [
      {
        "facebook_id": "1234567891",
        "title": "banana prata 1kg",
        "org_id": "1",
        "channel_id": "5",
        "catalog_id": "asdfgh",
        "product_retailer_id": "p1"
      },
      {
        "facebook_id": "1234567892",
        "title": "doce de banana 250g",
        "org_id": "1",
        "channel_id": "5",
        "catalog_id": "asdfgh",
        "product_retailer_id": "p2"
      }
    ]
  }'
```

response:
```json
status: 200
{
  "catalog_id": "asdfgh",
  "documents": [
    "f5b8d394-eb62-4c92-9501-51a8ebcf1380",
    "bcb551e8-0bd1-4ca7-825b-cf8aa8a3f0e0"
  ]
}
```

### To search for products

request:
```bash
curl http://localhost:8000/products/search \
  -H 'Content-Type: application/json' \
  -d '{
    "search": "massa",
    "filter": {
      "catalog_id": "cat1"
    },
    "threshold": 1.6
  }
'
```
response:
```json
status: 200
{
  "products": [
    {
      "facebook_id": "1",
      "title": "massa para bolo de baunilha",
      "org_id": "1",
      "channel_id": "5",
      "catalog_id": "asdfgh4321",
      "product_retailer_id": "abc321"
    }
  ]
}
```

## Test

we use `unittest` test discovery to run the tests in `./app/tests`:
```
coverage run -m unittest discover -s app/tests
```

Empty file added app/__init__.py
Empty file.
26 changes: 26 additions & 0 deletions app/config.py
@@ -0,0 +1,26 @@
import os


class AppConfig:
    def __init__(self):
        self.product_index_name = os.environ.get(
            "INDEX_PRODUCTS_NAME", "catalog_products"
        )
        self.es_url = os.environ.get("ELASTICSEARCH_URL", "http://localhost:9200")
        self.embedding_type = os.environ.get("EMBEDDING_TYPE", "sagemaker")
        self.sagemaker = {
            "endpoint_name": os.environ.get(
                "SAGEMAKER_ENDPOINT_NAME",
                "huggingface-pytorch-inference-2023-07-28-21-01-20-147",
            ),
            "region_name": os.environ.get("SAGEMAKER_REGION_NAME", "us-east-1"),
        }
        self.huggingfacehub = {
            "repo_id": os.environ.get(
                "HUGGINGFACE_REPO_ID", "sentence-transformers/all-MiniLM-L6-v2"
            ),
            "task": os.environ.get("HUGGINGFACE_TASK", "feature-extraction"),
            "huggingfacehub_api_token": os.environ.get(
                "HUGGINGFACE_API_TOKEN", "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"
            ),
        }
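
Because `AppConfig` reads `os.environ` inside `__init__`, the values are resolved each time an instance is created, not when the module is imported. The sketch below illustrates that behaviour with a trimmed stand-in class; `AppConfigSketch` and the URL `http://elasticsearch:9200` are hypothetical names used only for this example.

```python
import os


class AppConfigSketch:
    """Trimmed stand-in for AppConfig (assumption: same lookup pattern, one field)."""

    def __init__(self):
        # Same pattern as app/config.py: environment value first, then a default.
        self.es_url = os.environ.get("ELASTICSEARCH_URL", "http://localhost:9200")


# With the variable unset, the default from the code is used.
os.environ.pop("ELASTICSEARCH_URL", None)
default_config = AppConfigSketch()

# Setting the variable before instantiation overrides the default.
# The host name below is a hypothetical example value.
os.environ["ELASTICSEARCH_URL"] = "http://elasticsearch:9200"
overridden_config = AppConfigSketch()

print(default_config.es_url)
print(overridden_config.es_url)
```

An instance created earlier keeps the values it read at construction time, so changing the environment afterwards does not affect it.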
23 changes: 23 additions & 0 deletions app/handlers/__init__.py
@@ -0,0 +1,23 @@
from abc import ABC, abstractmethod


class IDocumentHandler(ABC):
    @abstractmethod
    def index(self):
        pass

    @abstractmethod
    def batch_index(self):
        pass

    @abstractmethod
    def search(self):
        pass

    @abstractmethod
    def delete(self):
        pass

    @abstractmethod
    def delete_batch(self):
        pass
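
A concrete handler has to override all five abstract methods before it can be instantiated. The following is a minimal sketch of that contract, not the handler shipped in this commit: `InMemoryProductHandler`, its method parameters, and the substring matching are assumptions made for illustration (the real service searches with sentence embeddings in Elasticsearch).

```python
import uuid
from abc import ABC, abstractmethod


class IDocumentHandler(ABC):
    # Interface reproduced from app/handlers/__init__.py so the sketch is self-contained.
    @abstractmethod
    def index(self):
        pass

    @abstractmethod
    def batch_index(self):
        pass

    @abstractmethod
    def search(self):
        pass

    @abstractmethod
    def delete(self):
        pass

    @abstractmethod
    def delete_batch(self):
        pass


class InMemoryProductHandler(IDocumentHandler):
    """Hypothetical handler that keeps documents in a dict (illustration only)."""

    def __init__(self):
        self._documents = {}

    def index(self, catalog_id, product):
        # Store one product under a generated ID and return the ID in a list.
        doc_id = str(uuid.uuid4())
        self._documents[doc_id] = {**product, "catalog_id": catalog_id}
        return [doc_id]

    def batch_index(self, catalog_id, products):
        doc_ids = []
        for product in products:
            doc_ids.extend(self.index(catalog_id, product))
        return doc_ids

    def search(self, search, catalog_id):
        # Assumption: plain substring match stands in for embedding similarity.
        return [
            doc
            for doc in self._documents.values()
            if doc["catalog_id"] == catalog_id
            and search.lower() in doc["title"].lower()
        ]

    def delete(self, doc_id):
        # True if the document existed and was removed.
        return self._documents.pop(doc_id, None) is not None

    def delete_batch(self, doc_ids):
        # Return the IDs that were actually removed.
        return [doc_id for doc_id in doc_ids if self.delete(doc_id)]


handler = InMemoryProductHandler()
doc_ids = handler.batch_index(
    "asdfgh",
    [{"title": "banana prata 1kg"}, {"title": "doce de banana 250g"}],
)
matches = handler.search("banana", "asdfgh")
```

Instantiating `IDocumentHandler` directly, or a subclass that leaves any of the five methods unimplemented, raises `TypeError`.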