feat: exposing onnx backend to JS land #436

Merged

21 commits merged on Nov 12, 2024

Changes from all commits (21 commits)
5ab8724
feat: integrate `transformers.js` with rust backend
kallebysantos Oct 31, 2024
2f26320
stamp: refactoring tensors ser/de to try zero-copy
kallebysantos Sep 24, 2024
bdaff49
stamp: refactoring to use `serde_v8`
kallebysantos Sep 25, 2024
0078625
fix(ai): seq2seq models causing null pointer error
kallebysantos Oct 1, 2024
754c750
test(sb_ai): implementing tests for ort backend
kallebysantos Oct 16, 2024
9f78148
stamp(sb_ai): example for generate image embeddings
kallebysantos Oct 31, 2024
116a996
test(sb_ai): implementing computer vision tests for ort backend
kallebysantos Nov 2, 2024
2d5d462
stamp: clippy
kallebysantos Nov 2, 2024
4765a3f
fix(ci): share common env vars from dotenv file
nyannyacha Nov 7, 2024
57d3065
fix(ci): update `ORT_DYLIB_PATH`
kallebysantos Nov 7, 2024
6bef30c
fix(ci): share common env vars from dotenv file
nyannyacha Nov 7, 2024
c7ba875
chore(sb_ai): update dependencies
nyannyacha Nov 7, 2024
4f620fc
chore(event_worker): add a dependency
nyannyacha Nov 7, 2024
c6c067c
chore(event_worker): install a tracing macro
nyannyacha Nov 7, 2024
4e88715
chore(base): update `Cargo.toml`
nyannyacha Nov 7, 2024
f4bc4bb
chore(base): trace `malloced_mb` more precisely
nyannyacha Nov 7, 2024
9cd4943
chore: update an integration test case script
nyannyacha Nov 7, 2024
79ac8f4
chore: install tracing subscriber when `base/tracing` feature is enabled
nyannyacha Nov 7, 2024
9f26703
chore: update `Cargo.lock`
nyannyacha Nov 7, 2024
ff43c6f
stamp: add `docker build` script with shared envs
kallebysantos Nov 9, 2024
a9931df
fix(devcontainer): shared `.env` file path
kallebysantos Nov 9, 2024
18 changes: 4 additions & 14 deletions .devcontainer/Dockerfile
@@ -1,28 +1,18 @@
FROM mcr.microsoft.com/devcontainers/rust:dev-1-bookworm

ARG TARGETPLATFORM
ARG ONNXRUNTIME_VERSION
ARG DENO_VERSION

RUN apt-get update && apt-get install -y build-essential cmake libclang-dev lldb \
nodejs npm hyperfine

COPY .env /tmp/.env
COPY .devcontainer/install.sh /tmp/install.sh
COPY scripts/install_onnx.sh /tmp/install_onnx.sh
COPY scripts/download_models.sh /tmp/download_models.sh

WORKDIR /tmp
RUN ./install_onnx.sh $ONNXRUNTIME_VERSION $TARGETPLATFORM /usr/local/bin/libonnxruntime.so
RUN ./download_models.sh
RUN mkdir -p /etc/sb_ai && cp -r /tmp/models /etc/sb_ai/models

ENV ORT_DYLIB_PATH=/usr/local/bin/libonnxruntime.so
ENV SB_AI_MODELS_DIR=/etc/sb_ai/models

# Ollama
RUN curl -fsSL https://ollama.com/install.sh | sh

# Deno
ENV DENO_INSTALL=/deno
RUN mkdir -p /deno \
&& curl -fsSL https://deno.land/install.sh | bash -s -- v$DENO_VERSION \
&& chown -R vscode /deno

RUN /tmp/install.sh $TARGETPLATFORM
8 changes: 3 additions & 5 deletions .devcontainer/devcontainer.json
@@ -2,11 +2,7 @@
"name": "Rust",
"build": {
"dockerfile": "Dockerfile",
"context": "..",
"args": {
"ONNXRUNTIME_VERSION": "1.19.2",
"DENO_VERSION": "1.45.2"
}
"context": ".."
},
"containerEnv": {
"PATH": "${localEnv:PATH}:/deno/bin"
@@ -16,6 +12,8 @@
"ghcr.io/jungaretti/features/make:1": {}
},
"runArgs": [
"--env-file",
".env",
"--rm",
"--privileged",
"--security-opt",
20 changes: 20 additions & 0 deletions .devcontainer/install.sh
@@ -0,0 +1,20 @@
#!/usr/bin/env bash
set -e

TARGETPLATFORM=$1

export $(grep -v '^#' /tmp/.env | xargs)

# ONNX Runtime
/tmp/install_onnx.sh $ONNXRUNTIME_VERSION $TARGETPLATFORM /tmp/onnxruntime
mv /tmp/onnxruntime/lib/libonnxruntime.so* /usr/lib
/tmp/download_models.sh
mkdir -p /etc/sb_ai && cp -r /tmp/models /etc/sb_ai/models

# Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Deno
mkdir -p /deno
curl -fsSL https://deno.land/install.sh | bash -s -- v$DENO_VERSION
chown -R vscode /deno
4 changes: 4 additions & 0 deletions .env
@@ -0,0 +1,4 @@
ONNXRUNTIME_VERSION=1.19.2
DENO_VERSION=1.45.2
EDGE_RUNTIME_PORT=9998
AI_INFERENCE_API_HOST=http://localhost:11434
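
These values are shared by the devcontainer, CI, and Docker builds, and the new test entrypoint below forwards the whole process environment into user workers. A minimal sketch (not part of this PR) of reading them from a Deno worker, assuming the variables are forwarded as-is:

```ts
// Sketch only: read the shared configuration from the environment.
// Assumes these variables were forwarded to the worker (see main/index.ts below).
const inferenceHost = Deno.env.get("AI_INFERENCE_API_HOST") ?? "http://localhost:11434";
const port = Number(Deno.env.get("EDGE_RUNTIME_PORT") ?? "9998");

console.log(`edge runtime port: ${port}, inference host: ${inferenceHost}`);
```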
6 changes: 6 additions & 0 deletions .github/workflows/ci.yml
@@ -12,6 +12,7 @@ env:
CARGO_NET_RETRY: 10
CARGO_TERM_COLOR: always
RUSTUP_MAX_RETRIES: 10
ORT_DYLIB_PATH: /tmp/onnxruntime/lib/libonnxruntime.so

jobs:
cargo-fmt:
@@ -49,4 +50,9 @@ jobs:
- uses: actions/checkout@v4
- run: rustup show
- uses: Swatinem/rust-cache@v2
- uses: cardinalby/export-env-action@v2
with:
envFile: ".env"
- name: Install ONNX Runtime Library
run: ./scripts/install_onnx.sh ${{ env.ONNXRUNTIME_VERSION }} x64 /tmp/onnxruntime
- run: ./scripts/test.sh
12 changes: 10 additions & 2 deletions .github/workflows/release.yml
@@ -52,6 +52,9 @@ jobs:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}

- uses: cardinalby/export-env-action@v2
with:
envFile: ".env"
- id: build
uses: docker/build-push-action@v3
with:
@@ -61,7 +64,8 @@
cache-from: type=gha
cache-to: type=gha,mode=max
build-args: |
GIT_V_VERSION=${{ needs.release.outputs.version }}
GIT_V_TAG=${{ needs.release.outputs.version }}
ONNXRUNTIME_VERSION=${{ env.ONNXRUNTIME_VERSION }}

publish_arm:
needs:
@@ -95,6 +99,9 @@
image=moby/buildkit:master
network=host

- uses: cardinalby/export-env-action@v2
with:
envFile: ".env"
- id: build
uses: docker/build-push-action@v3
with:
@@ -104,7 +111,8 @@
tags: ${{ steps.meta.outputs.tags }}
no-cache: true
build-args: |
GIT_V_VERSION=${{ needs.release.outputs.version }}
GIT_V_TAG=${{ needs.release.outputs.version }}
ONNXRUNTIME_VERSION=${{ env.ONNXRUNTIME_VERSION }}

merge_manifest:
needs: [release, publish_x86, publish_arm]
12 changes: 8 additions & 4 deletions Cargo.lock

Some generated files are not rendered by default.

13 changes: 5 additions & 8 deletions Dockerfile
@@ -3,8 +3,8 @@
FROM rust:1.79.0-bookworm as builder

ARG TARGETPLATFORM
ARG GIT_V_VERSION
ARG ONNXRUNTIME_VERSION=1.19.2
ARG ONNXRUNTIME_VERSION
ARG GIT_V_TAG
ARG PROFILE=release
ARG FEATURES

@@ -15,7 +15,7 @@ WORKDIR /usr/src/edge-runtime
COPY . .

RUN --mount=type=cache,target=/usr/local/cargo/registry,id=${TARGETPLATFORM} --mount=type=cache,target=/usr/src/edge-runtime/target,id=${TARGETPLATFORM} \
GIT_V_TAG=${GIT_V_VERSION} cargo build --profile ${PROFILE} --features "${FEATURES}" && \
${GIT_V_TAG} cargo build --profile ${PROFILE} --features "${FEATURES}" && \
mv /usr/src/edge-runtime/target/${PROFILE}/edge-runtime /root

RUN objcopy --compress-debug-sections \
@@ -36,8 +36,6 @@ RUN apt-get remove -y perl && apt-get autoremove -y
COPY --from=builder /root/edge-runtime /usr/local/bin/edge-runtime
COPY --from=builder /root/edge-runtime.debug /usr/local/bin/edge-runtime.debug

ENV ORT_DYLIB_PATH=/usr/local/bin/onnxruntime/lib/libonnxruntime.so


# ONNX Runtime provider
# Application runtime with ONNX
@@ -60,10 +58,9 @@ FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04 as edge-runtime-cuda

COPY --from=edge-runtime-base /usr/local/bin/edge-runtime /usr/local/bin/edge-runtime
COPY --from=builder /root/edge-runtime.debug /usr/local/bin/edge-runtime.debug
COPY --from=ort-cuda /root/onnxruntime /usr/local/bin/onnxruntime
COPY --from=ort-cuda /root/onnxruntime/lib/libonnxruntime.so* /usr/lib
COPY --from=preload-models /usr/src/edge-runtime/models /etc/sb_ai/models

ENV ORT_DYLIB_PATH=/usr/local/bin/onnxruntime/lib/libonnxruntime.so
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

@@ -72,7 +69,7 @@ ENTRYPOINT ["edge-runtime"]

# Base
FROM edge-runtime-base as edge-runtime
COPY --from=ort /root/onnxruntime /usr/local/bin/onnxruntime
COPY --from=ort /root/onnxruntime/lib/libonnxruntime.so* /usr/lib
COPY --from=preload-models /usr/src/edge-runtime/models /etc/sb_ai/models

ENTRYPOINT ["edge-runtime"]
2 changes: 2 additions & 0 deletions crates/base/Cargo.toml
@@ -74,6 +74,7 @@ notify.workspace = true
pin-project.workspace = true
rustls-pemfile.workspace = true
tracing.workspace = true
tracing-subscriber = { workspace = true, optional = true, features = ["env-filter", "tracing-log"] }

reqwest_v011 = { package = "reqwest", version = "0.11", features = ["stream", "json", "multipart"] }
tls-listener = { version = "0.10", features = ["rustls"] }
@@ -129,4 +130,5 @@ tokio.workspace = true
url.workspace = true

[features]
tracing = ["dep:tracing-subscriber"]
termination-signal-ext = []
5 changes: 2 additions & 3 deletions crates/base/src/deno_runtime.rs
@@ -190,6 +190,7 @@ impl MemCheck {
}
}

trace!(malloced_mb = bytes_to_display(total_bytes as u64));
total_bytes
}
}
@@ -965,11 +966,9 @@

if is_user_worker {
let mem_state = mem_check_state.as_ref().unwrap();
let total_malloced_bytes = mem_state.check(js_runtime.v8_isolate().as_mut());

mem_state.check(js_runtime.v8_isolate().as_mut());
mem_state.waker.register(waker);

trace!(malloced_mb = bytes_to_display(total_malloced_bytes as u64));
}

// NOTE(Nyannyacha): If tasks are empty or V8 is not evaluating the
2 changes: 1 addition & 1 deletion crates/base/src/lib.rs
@@ -15,5 +15,5 @@ pub use inspector_server::InspectorOption;
pub use sb_core::cache::CacheSetting;
pub use sb_graph::DecoratorType;

#[cfg(test)]
#[cfg(any(test, feature = "tracing"))]
mod tracing_subscriber;
57 changes: 57 additions & 0 deletions crates/base/test_cases/ai-ort-rust-backend/main/index.ts
@@ -0,0 +1,57 @@
import * as path from "jsr:@std/path";

Deno.serve(async (req: Request) => {
console.log(req.url);
const url = new URL(req.url);
const { pathname } = url;
const service_name = pathname;

if (!service_name || service_name === "") {
const error = { msg: "missing function name in request" }
return new Response(
JSON.stringify(error),
{ status: 400, headers: { "Content-Type": "application/json" } },
)
}

const servicePath = path.join("test_cases/ai-ort-rust-backend", pathname);

const createWorker = async () => {
const memoryLimitMb = 1500;
const workerTimeoutMs = 10 * 60 * 1000;
const cpuTimeSoftLimitMs = 10 * 60 * 1000;
const cpuTimeHardLimitMs = 10 * 60 * 1000;
const noModuleCache = false;
const importMapPath = null;
const envVarsObj = Deno.env.toObject();
const envVars = Object.keys(envVarsObj).map(k => [k, envVarsObj[k]]);

return await EdgeRuntime.userWorkers.create({
servicePath,
memoryLimitMb,
workerTimeoutMs,
cpuTimeSoftLimitMs,
cpuTimeHardLimitMs,
noModuleCache,
importMapPath,
envVars
});
}

const callWorker = async () => {
try {
const worker = await createWorker();
return await worker.fetch(req);
} catch (e) {
console.error(e);

const error = { msg: e.toString() }
return new Response(
JSON.stringify(error),
{ status: 500, headers: { "Content-Type": "application/json" } },
);
}
}

return await callWorker();
})
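
For reference, a minimal sketch (not part of this PR) of driving this orchestrator once the runtime serves it, assuming it listens on the EDGE_RUNTIME_PORT from the shared .env and that the request path names one of the test-case directories; the path below is purely illustrative:

```ts
// Sketch only: call the main service, which spawns a user worker for the
// matching directory under test_cases/ai-ort-rust-backend/ and proxies the request.
const port = Deno.env.get("EDGE_RUNTIME_PORT") ?? "9998";
const res = await fetch(`http://localhost:${port}/some-test-case`);

console.log(res.status, await res.text());
```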
@@ -0,0 +1,32 @@
import { assertEquals, assertAlmostEquals } from 'jsr:@std/assert';
import {
env,
pipeline,
} from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.1';

// Ensure we do not use browser cache
env.useBrowserCache = false;
env.allowLocalModels = false;

const pipe = await pipeline('feature-extraction', 'supabase/gte-small', { device: 'auto' }); // 384 dims model

Deno.serve(async () => {
const input = [
'This framework generates embeddings for each input sentence',
'Sentences are passed as a list of string.',
'The quick brown fox jumps over the lazy dog.',
];

const output = await pipe(input, { pooling: 'mean', normalize: true });

assertEquals(output.size, 3 * 384);
assertEquals(output.dims.length, 2);

// Comparing first 3 predictions
[-0.050660304725170135, -0.006694655399769545, 0.003071750048547983]
.map((expected, idx) => {
assertAlmostEquals(output.data[idx], expected);
});

return new Response();
});
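
With `pooling: 'mean'` and `normalize: true`, `output.data` holds three unit-length 384-dimension embeddings laid out row-major. A minimal sketch (not part of the test) of comparing two of them; since the rows are already normalized, cosine similarity reduces to a dot product:

```ts
// Sketch only: cosine similarity of two already-normalized embeddings stored
// row-major in a flat Float32Array (dims = 384 for supabase/gte-small).
function cosineOfNormalized(data: Float32Array, dims: number, i: number, j: number): number {
  let sum = 0;
  for (let k = 0; k < dims; k++) {
    sum += data[i * dims + k] * data[j * dims + k];
  }
  return sum;
}

// e.g. inside the handler above: cosineOfNormalized(output.data, 384, 0, 1)
```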