Consistently running into exit code 139 on Sonoma #7016

Closed
ryan-serpico opened this issue Oct 6, 2023 · 7 comments

Comments

@ryan-serpico

Description

Hey y'all,

Ever since upgrading my M1 Pro MacBook Pro to macOS Sonoma, I haven't been able to generate any embeddings using Chroma in my Docker container. Every time I run docker-compose up --build embedding-test with the basic code below, the container reports exited with code 139.
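For context, exit code 139 follows the common shell convention of 128 plus a signal number, so it corresponds to signal 11 (SIGSEGV), a segmentation fault in native code rather than a Python exception. The snippet below is a minimal sketch of that decoding; the helper name `describe_exit_code` is hypothetical and not part of Docker or Chroma.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a container or shell exit code into a readable description."""
    if code > 128:
        # By convention, codes above 128 mean "terminated by signal (code - 128)".
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            return f"killed by unknown signal {signum}"
    return f"exited normally with status {code}"


print(describe_exit_code(139))
```

A segfault like this usually points at a compiled extension (here, most likely something in the embedding stack) rather than at the Python script itself.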

What I don't understand is that I can run this same Docker container on another Mac running Monterey and on an EC2 server without any issue. I can also run the script below on my main machine if I run it outside of Docker, in a simple venv.

This is only the second GitHub issue I've ever submitted (the first being earlier this morning), so apologies in advance if this posting is misplaced or incomplete. I'm also attaching my Dockerfile. If I can supply any other information that would help, let me know. I would appreciate any help on this issue.

I believe this may have something to do with #7006

embedding_test.py

import chromadb
from chromadb.utils import embedding_functions

# Let's define the embedding function
embedding = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="intfloat/e5-base-v2",
)

persist_directory = "/code/scripts/testDB"

client = chromadb.PersistentClient(path=persist_directory)

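# Start from a clean collection; chromadb 0.4.x raises ValueError if it does not exist yet
try:
    client.delete_collection(name="Students")
except ValueError:
    pass

# Attach the embedding function so query_texts is embedded with the same model
collection = client.create_collection(name="Students", embedding_function=embedding)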

student_info = """
Alexandra Thompson, a 19-year-old computer science sophomore with a 3.7 GPA,
is a member of the programming and chess clubs who enjoys pizza, swimming, and hiking
in her free time in hopes of working at a tech company after graduating from the University of Washington.
"""

club_info = """
The university chess club provides an outlet for students to come together and enjoy playing
the classic strategy game of chess. Members of all skill levels are welcome, from beginners learning
the rules to experienced tournament players. The club typically meets a few times per week to play casual games,
participate in tournaments, analyze famous chess matches, and improve members' skills.
"""

university_info = """
The University of Washington, founded in 1861 in Seattle, is a public research university
with over 45,000 students across three campuses in Seattle, Tacoma, and Bothell.
As the flagship institution of the six public universities in Washington state,
UW encompasses over 500 buildings and 20 million square feet of space,
including one of the largest library systems in the world.
"""

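documents = [student_info, club_info, university_info]

# Compute the embeddings up front with the SentenceTransformer model
embeddings = embedding(documents)

# Store the raw texts as documents and the vectors as embeddings
collection.add(
    documents=documents,
    embeddings=embeddings,
    metadatas=[
        {"source": "student info"},
        {"source": "club info"},
        {"source": "university info"},
    ],
    ids=["id1", "id2", "id3"],
)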

results = collection.query(query_texts=["What is the student name?"], n_results=2)

print(results)
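Because a segfault leaves no Python traceback, one way to find out which call in the script above crashes is to run it with the standard-library faulthandler enabled, which dumps the Python stack to stderr when a fatal signal arrives. The wrapper below is only a sketch: `run_with_faulthandler` is a hypothetical helper, and the script path in the usage comment is an assumption based on the Dockerfile layout.

```python
import signal
import subprocess
import sys


def run_with_faulthandler(python_args):
    """Run a child Python process with faulthandler on and report how it ended."""
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", *python_args],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
    )
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        outcome = f"killed by {signal.Signals(-result.returncode).name}"
    else:
        outcome = f"exited with status {result.returncode}"
    return outcome, result.stdout, result.stderr


# Hypothetical usage inside the container:
#   outcome, out, err = run_with_faulthandler(["/code/scripts/embedding_test.py"])
#   print(outcome)   # how the child ended
#   print(err)       # faulthandler's stack dump, if the child crashed
```

Running `python -X faulthandler embedding_test.py` directly gives the same stack dump without the wrapper; the wrapper just makes the signal explicit.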

Dockerfile

FROM python:3.11.4-slim-bookworm
ADD requirements.txt /code/requirements.txt
WORKDIR /code

RUN apt-get update
RUN apt-get install build-essential -y
RUN apt-get install -y gdal-bin libgdal-dev
RUN pip install --upgrade pip
RUN pip install -r "requirements.txt"
RUN python -m spacy download en_core_web_sm

# Add the scripts from the local 'scripts' folder
ADD scripts/embedding_test.py /code/scripts/

WORKDIR /code/scripts

requirements.txt

beautifulsoup4==4.12.2
chromadb==0.4.13
geopandas==0.14.0
gspread==5.11.3
gspread_dataframe==3.3.1
langchain==0.0.306
openai==0.27.9
pandas==2.0.3
python-dotenv==1.0.0
pytz==2023.3
Requests==2.31.0
slack_bolt==1.18.0
slack_sdk==3.21.3
spacy==3.6.1
tenacity==8.2.3
tiktoken==0.4.0
sentence_transformers==2.2.2
lark==1.1.7

Versions

Chroma v0.4.13, macOS 14.0 Sonoma, Docker v4.24.0, python:3.11.4-slim-bookworm image

Reproduce

  1. docker-compose up --build embedding-test

Expected behavior

docker-compose up --build embedding-test runs and generates embeddings successfully.

docker version

Client:
 Cloud integration: v1.0.35+desktop.5
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.20.7
 Git commit:        ed223bc
 Built:             Mon Sep  4 12:28:49 2023
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.24.0 (122432)
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.7
  Git commit:       1a79695
  Built:            Mon Sep  4 12:31:36 2023
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.22
  GitCommit:        8165feabfdfe38c65b599c4993d227328c231fca
 runc:
  Version:          1.1.8
  GitCommit:        v1.1.8-0-g82f18fe
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Version:    24.0.6
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2-desktop.5
    Path:     /Users/ryan/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.22.0-desktop.2
    Path:     /Users/ryan/.docker/cli-plugins/docker-compose
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /Users/ryan/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.20
    Path:     /Users/ryan/.docker/cli-plugins/docker-extension
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.8
    Path:     /Users/ryan/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/ryan/.docker/cli-plugins/docker-sbom
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     /Users/ryan/.docker/cli-plugins/docker-scan
  scout: Docker Scout (Docker Inc.)
    Version:  v1.0.7
    Path:     /Users/ryan/.docker/cli-plugins/docker-scout

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 2
 Server Version: 24.0.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
 runc version: v1.1.8-0-g82f18fe
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.4.16-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 9
 Total Memory: 15.61GiB
 Name: docker-desktop
 ID: 5edac7f3-7a82-4e53-b391-ca3f50bf24f5
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

Diagnostics ID

78FCD04C-5FB1-457D-97C2-F7F899584238/20231006183318

Additional Info

docker logs 8c38e53b2186
Downloading (…)a20e8/.gitattributes: 100%|██████████| 1.48k/1.48k [00:00<00:00, 16.0MB/s]
Downloading (…)_Pooling/config.json: 100%|██████████| 200/200 [00:00<00:00, 2.78MB/s]
Downloading (…)16616a20e8/README.md: 100%|██████████| 67.6k/67.6k [00:00<00:00, 12.3MB/s]
Downloading (…)616a20e8/config.json: 100%|██████████| 650/650 [00:00<00:00, 10.2MB/s]
Downloading model.safetensors: 100%|██████████| 438M/438M [00:51<00:00, 8.58MB/s]
Downloading (…)0e8/onnx/config.json: 100%|██████████| 632/632 [00:00<00:00, 929kB/s]
Downloading model.onnx: 100%|██████████| 436M/436M [00:50<00:00, 8.55MB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 125/125 [00:00<00:00, 223kB/s]
Downloading (…)/onnx/tokenizer.json: 100%|██████████| 711k/711k [00:00<00:00, 3.54MB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 314/314 [00:00<00:00, 1.29MB/s]
Downloading (…)a20e8/onnx/vocab.txt: 100%|██████████| 232k/232k [00:00<00:00, 3.20MB/s]
Downloading pytorch_model.bin: 100%|██████████| 438M/438M [00:51<00:00, 8.56MB/s]
Downloading (…)nce_bert_config.json: 100%|██████████| 57.0/57.0 [00:00<00:00, 74.2kB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 125/125 [00:00<00:00, 579kB/s]
Downloading (…)a20e8/tokenizer.json: 100%|██████████| 711k/711k [00:00<00:00, 9.27MB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 314/314 [00:00<00:00, 1.38MB/s]
Downloading (…)16616a20e8/vocab.txt: 100%|██████████| 232k/232k [00:00<00:00, 3.39MB/s]
Downloading (…)16a20e8/modules.json: 100%|██████████| 387/387 [00:00<00:00, 1.87MB/s]

@Yemeen

Yemeen commented Oct 6, 2023

Same issue here on an M2 Max MacBook Pro running macOS Sonoma. I'm not using Chroma, but my setup is similar; my requirements.txt is:

openai~=0.27.2
pydantic~=1.8.2
pydantic[dotenv]
uvicorn~=0.17.6
fastapi~=0.78.0
pandas~=1.4.2
tqdm~=4.64.0
requests>=2.27.1
beautifulsoup4~=4.11.1
transformers~=4.19.1
numpy~=1.24.1
scipy~=1.8.0
sentence_transformers==2.2.2
pyyaml~=6.0
scikit-learn~=1.1.0
unidecode~=1.3.4
tiktoken~=0.3.3
streamlit~=1.20.0
trafilatura>=1.6.0
backoff~=2.2.1
pydub~=0.25.1
simple_salesforce~=1.12.5
selenium~=4.10.0

and getting

84 INFO: Use pytorch device: cpu
Insert data to database:   0%|          | 0/25 [00:00<Insert data to database: 100%|██████████| 25/25 [00:00<00:00, 485451.85it/s]
Batches:   0%|          | 0/1 [00:00<?, ?it/s]
xxxx-api-service-1 exited with code 139

What's interesting is that I re-cloned my repo and the container worked once, but every attempt to use it since then leads to the same exit code.

@ryan-serpico
Author

@Yemeen Whew, OK, I thought I was the only one experiencing this. So your script was functional before the Sonoma update as well?

@Yemeen

Yemeen commented Oct 9, 2023

Not immediately. I've been on the Sonoma beta for about 6 weeks now, so I'm not sure why it only started now.

@ryan-serpico
Author

@Yemeen I figured it out last night, at least for me. I added torch==2.0.1 to my requirements.txt, and boom, everything works. It seems like whatever PyTorch did in the update they released on Oct. 4 broke everything on M1 chips.
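Neither requirements.txt in this thread pins torch, so a fresh image build presumably pulled in the newest release as a dependency of sentence_transformers (the Oct. 4 update mentioned above is most likely PyTorch 2.1.0). After adding the pin and rebuilding, a quick check inside the container can confirm which version was actually installed. This is a minimal sketch; `installed_version` and `matches_pin` are hypothetical helper names.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(found: Optional[str], pinned: str) -> bool:
    """Compare an installed version against a pin, ignoring local tags like '+cpu'."""
    if found is None:
        return False
    return found.split("+", 1)[0] == pinned


torch_version = installed_version("torch")
print("torch installed:", torch_version)
print("matches torch==2.0.1:", matches_pin(torch_version, "2.0.1"))
```

If the check reports a different version after the pin was added, the image was probably built from a cached layer and needs a rebuild with --build or --no-cache.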

@Yemeen

Yemeen commented Oct 9, 2023

you're a lifesaver!!

@bsousaa bsousaa removed the area/osx label Oct 11, 2023
@bsousaa
Contributor

bsousaa commented Oct 11, 2023

Closing the issue, as this was solved by pinning PyTorch.

@bsousaa bsousaa closed this as not planned Oct 11, 2023
@sabaimran

Amazing! Had the same issue, and pinning to 2.0.1 solved it for me.
