## Idea
The idea is to have an API wrapper that:
- Receives the query (and the model version?)
- Calls the tokenizer to tokenize the query
- Calls the serving API for inference/prediction
- Returns the serving API's response (see the sketch below)
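
A minimal sketch of that flow, assuming FastAPI and httpx; the `/score` route, `TOKENIZER_URL`, `SERVING_URL`, and the request/response payload shapes are all assumptions for illustration, not a fixed contract:

```python
# Sketch of the wrapper's happy path: tokenize, then score.
# TOKENIZER_URL, SERVING_URL, and the payload shapes are assumptions.
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

TOKENIZER_URL = "http://tokenizer:8000/tokenize"                  # assumption
SERVING_URL = "http://serving:8501/v1/models/injection:predict"   # assumption

app = FastAPI()

class Query(BaseModel):
    text: str
    model_version: str | None = None  # the open question from the list above

@app.post("/score")
async def score(query: Query) -> dict:
    async with httpx.AsyncClient() as client:
        # Tokenize the query via the Tokenizer API.
        tok = await client.post(TOKENIZER_URL, json={"text": query.text})
        tok.raise_for_status()
        tokens = tok.json()["tokens"]

        # Ask the Serving API for an injection score.
        pred = await client.post(SERVING_URL, json={"instances": [tokens]})
        pred.raise_for_status()
        injection_score = pred.json()["predictions"][0]

    # Return the Serving API's response to the caller.
    return {"score": injection_score, "model_version": query.model_version}
```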
## Features
- User input validation
- Rate limiting (see the sketch after this list)
- Authentication (API keys?)
- Caching
- Provisioning and deployment
- O11y (observability):
  - Metrics
  - Logging
  - Tracing?
- Usage metering
- Analytics
- Extensibility to other APIs, libraries (e.g. libinjection), and model versions
- Model warm-up (to load models after deployment)
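
To make the first few features concrete, here is a hand-rolled sketch of API-key authentication, per-key rate limiting, and a query/score cache as FastAPI dependencies. Everything here is an assumption for illustration (in-memory stores, a fixed `demo-key`, a 10-requests-per-minute window); a real deployment would more likely lean on Redis or an API gateway:

```python
# In-memory sketches of auth, rate limiting, and caching; assumptions only.
import time

from fastapi import Depends, Header, HTTPException

API_KEYS = {"demo-key"}                    # assumption: keys come from a secret store
RATE_LIMIT = 10                            # assumption: requests per minute, per key
_request_log: dict[str, list[float]] = {}  # api key -> recent request timestamps
_score_cache: dict[str, float] = {}        # query text -> cached score

def authenticate(x_api_key: str = Header(...)) -> str:
    # FastAPI maps the x_api_key parameter to the X-API-Key request header.
    if x_api_key not in API_KEYS:
        raise HTTPException(status_code=401, detail="invalid API key")
    return x_api_key

def check_rate_limit(api_key: str = Depends(authenticate)) -> str:
    # Sliding one-minute window over this key's recent requests.
    now = time.monotonic()
    recent = [t for t in _request_log.get(api_key, []) if now - t < 60]
    if len(recent) >= RATE_LIMIT:
        raise HTTPException(status_code=429, detail="rate limit exceeded")
    _request_log[api_key] = recent + [now]
    return api_key

def get_cached_score(text: str) -> float | None:
    # Exact-match cache keyed on the raw query text (assumption).
    return _score_cache.get(text)

def cache_score(text: str, score: float) -> None:
    _score_cache[text] = score
```

Wired in, the route from the earlier sketch would declare `api_key: str = Depends(check_rate_limit)`, consult `get_cached_score` before calling the tokenizer, and call `cache_score` after scoring, matching the cache step in the workflow below.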
## Workflow
```mermaid
sequenceDiagram
    participant Client
    participant Wrapper as API Wrapper
    participant Tokenizer as Tokenizer API
    participant Serving as Serving API
    Client->>Wrapper: send query
    Wrapper->>Wrapper: authenticate user
    Wrapper->>Wrapper: validate user input
    Wrapper->>Wrapper: check rate limit
    Wrapper->>Tokenizer: tokenize the query
    Tokenizer-->>Wrapper: return tokens
    Wrapper->>Serving: predict whether tokens are an injection or not
    Serving-->>Wrapper: return score
    Wrapper->>Wrapper: cache query and score
    Wrapper-->>Client: return score
```
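
For completeness, a hypothetical client-side call that exercises the whole sequence above; the address, key, and payload follow the assumptions from the earlier sketches:

```python
import httpx

resp = httpx.post(
    "http://localhost:8080/score",      # assumption: where the wrapper listens
    headers={"X-API-Key": "demo-key"},  # key from the auth sketch
    json={"text": "' OR 1=1; --"},      # a classic SQL injection probe
)
resp.raise_for_status()
print(resp.json())  # e.g. {"score": 0.97, "model_version": None}
```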