Using local ollama chat get 404 #574

Illia-M · 2024-11-03T19:23:10Z

I deployed a local Ollama instance in my docker-compose setup and added these Gramps Web environment variables:
GRAMPSWEB_LLM_BASE_URL: "http://ollama:11434/v1"
GRAMPSWEB_LLM_MODEL: FacebookAI/xlm-roberta-base
OPENAI_API_KEY: ollama

Error in grampsweb container:
INFO:httpx:HTTP Request: POST http://ollama:11434/v1/chat/completions "HTTP/1.1 404 Not Found"

ollama container:
ollama | 2024/11/03 19:02:23 routes.go:1158: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
ollama | time=2024-11-03T19:02:23.453Z level=INFO source=images.go:754 msg="total blobs: 0"
ollama | time=2024-11-03T19:02:23.454Z level=INFO source=images.go:761 msg="total unused blobs removed: 0"
ollama | time=2024-11-03T19:02:23.456Z level=INFO source=routes.go:1205 msg="Listening on [::]:11434 (version 0.3.14)"
ollama | time=2024-11-03T19:02:23.463Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu_avx2 cuda_v11 cuda_v12 cpu cpu_avx]"
ollama | time=2024-11-03T19:02:23.464Z level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
ollama | time=2024-11-03T19:02:23.484Z level=INFO source=gpu.go:384 msg="no compatible GPUs were discovered"
ollama | time=2024-11-03T19:02:23.484Z level=INFO source=types.go:123 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="15.6 GiB" available="14.9 GiB"

And when a chat request comes in:

ollama | [GIN] 2024/11/03 - 19:05:06 | 404 | 3.182725ms | 172.18.0.6 | POST "/v1/chat/completions"
ollama | [GIN] 2024/11/03 - 19:06:16 | 404 | 354.065µs | 172.18.0.6 | POST "/v1/chat/completions"
ollama | [GIN] 2024/11/03 - 19:06:19 | 404 | 348.557µs | 172.18.0.6 | POST "/v1/chat/completions"
ollama | [GIN] 2024/11/03 - 19:06:22 | 404 | 303.619µs | 172.18.0.6 | POST "/v1/chat/completions"

The only thing bothering me is that I use GRAMPSWEB_VECTOR_EMBEDDING_MODEL: "paraphrase-multilingual-MiniLM-L12-v2" for semantic search, and it returns all 221k+ results whatever text I search for.
My container is configured for the uk-UA locale, and much of the data in my DB is in Ukrainian or Russian.

In the logs I see the following:

grampsweb | INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu
grampsweb | INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: paraphrase-multilingual-MiniLM-L12-v2
grampsweb | WARNING: Module isfamilyfiltermatchevent has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module ageatdeath has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module peopleeventscount has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module associationsofpersonmatch has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module degreesofseparation has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module isrelatedwithfiltermatch has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module hasrolerule has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module familieswitheventfiltermatch has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module multipleparents has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module infamilyrule has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module hassourcefilter has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module activepersonrule has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module postgresql has no translation for any of the languages you selected. Using American English
grampsweb | WARNING: Module sharedpostgresql has no translation for any of the languages you selected. Using American English

Gramps details:
Gramps 5.2.3
Gramps Web API 2.5.0
Gramps Web Frontend 24.10.0
Gramps QL 0.3.0
Sifts 1.0.0
locale: uk
multi-tree: false
task queue: true
OCR: true
chat: true

DavidMStraub (Maintainer) · 2024-11-03T19:31:53Z

Thanks for reporting!

Semantic search and vector embedding do not use the LLM, so it's not surprising that they work even if Ollama doesn't.

Can you please check whether Ollama works at all with the OpenAI API, following e.g. this guide? https://ollama.com/blog/openai-compatibility

If you can verify this works, we can debug the Gramps Web API configuration.

Illia-M (Author) · 2024-11-03T21:22:22Z

So, I played with Ollama and found the issue: no models had been pulled. This has to be done manually or via the API. I simply ran:

> docker exec -it ollama /bin/sh
# ollama pull llama3.1
# curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json"   -d '{ "model": "llama3.1",  "messages": [{   "role": "system",           "content": "You are a helpful assistant for genealogy research."},{"role": "user","content": "привіт, розкажи як ти можеш мені допомогти. Звжди відповідай українською"}]}'

The Ukrainian answers were poor, so I ran the next tests in English, but it works!
After I updated the model in the grampsweb config, chat started working fine. One problem remains: the chat has no context about my tree. I will try changing the semantic search model and come back.
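
The 404 in this thread came from requesting a model that had not been pulled. Below is a minimal Python sketch for checking that before pointing Gramps Web at Ollama. It assumes Ollama's native `GET /api/tags` endpoint, which lists locally available models, and the `http://localhost:11434` address from the curl command above. The helper names `has_model` and `list_local_models` are hypothetical and not part of Gramps Web or Ollama.

```python
import json
import urllib.request


def has_model(tags_response: dict, wanted: str) -> bool:
    """Return True if `wanted` is listed in an /api/tags style response.

    A name without a tag is treated as "<name>:latest", which is how
    Ollama resolves untagged model names.
    """
    if ":" not in wanted:
        wanted = f"{wanted}:latest"
    # "models" may be missing or null on a server with nothing pulled.
    models = tags_response.get("models") or []
    return any(entry.get("name") == wanted for entry in models)


def list_local_models(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the locally pulled models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return json.load(resp)
```

With the container running, `has_model(list_local_models(), "llama3.1")` should be false before `ollama pull llama3.1` and true afterwards.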

Illia-M (Author) · 2024-11-03T21:32:17Z

Maybe I need another, less censored model. It is strange that it cannot answer with information about the NKVD repressions that affected most families in Ukraine around the 1930s; I have a lot of such cases in my tree.

DavidMStraub (Maintainer) · 2024-11-04T07:05:19Z

That's very interesting. So it seems the setup is working now, but we are hitting an issue with LLM "security" interfering with historical context. I wonder if playing with the system prompt would help, i.e. letting the LLM know that the questions are for historical investigations. You could try that out just using curl.
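
For trying out different system prompts without hand-escaping JSON in the shell, here is a minimal Python sketch that builds the same kind of request body for the OpenAI-compatible `/v1/chat/completions` endpoint. The wording of `HISTORICAL_PROMPT` is purely illustrative, and the helper names `build_chat_payload` and `send_chat` are hypothetical. The base URL assumes Ollama on `localhost:11434`, as elsewhere in this thread.

```python
import json
import urllib.request


def build_chat_payload(model: str, system_prompt: str, question: str) -> str:
    """Serialize a chat request with one system and one user message."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    }
    # ensure_ascii=False keeps non-Latin text (e.g. Ukrainian) readable.
    return json.dumps(payload, ensure_ascii=False)


def send_chat(body: str, base_url: str = "http://localhost:11434/v1") -> dict:
    """POST the body to the chat completions endpoint and parse the reply."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)


# Illustrative wording only; adjust and compare the answers.
HISTORICAL_PROMPT = (
    "You are a helpful assistant for genealogy research. "
    "Questions are asked for the historical investigation of family histories."
)

body = build_chat_payload("llama3.1", HISTORICAL_PROMPT, "What was the NKVD?")
```

Passing `body` to `send_chat` sends the request. Changing only `HISTORICAL_PROMPT` between runs makes it easier to see which wording affects the refusals.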

Illia-M (Author) · 2024-11-04T18:30:23Z

Looks like it can help. The prompt was generated by ChatGPT:

curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "llama3.1", "messages": [{ "role": "system", "content": "
You are an advanced historical AI assistant with a focus on delivering comprehensive, educational, and nuanced answers for studying history. Your goal is to answer questions from a factual, analytical, and historical context. When faced with sensitive or restricted topics, focus on providing contextual insights, general trends, societal factors, or analogous historical situations. When answering: 1. Use Objective Language: Prioritize a neutral tone and avoid subjective interpretations unless the user specifically asks for historical analysis or perspectives from historians. 2. Provide Broader Context: If a specific event, figure, or topic is sensitive, explain the surrounding social, economic, political, or cultural factors instead. Emphasize what was happening globally or within similar contexts during that period. 3. Offer Comparative Analysis: Where possible, relate events to similar occurrences in other times or places, focusing on recurring themes, motivations, or social dynamics rather than on isolated incidents. 4. Focus on Historical Analysis and Sources: Describe the general methods historians use to research these kinds of events or figures, mentioning common primary sources, historiographical trends, or analytical frameworks without focusing on potentially sensitive specifics. 5. Encourage Critical Thinking: Guide users in understanding the complexities of history, including the diversity of perspectives, debates among historians, and the limitations of sources. 6. Provide Alternative Scenarios: If a topic is especially sensitive, explore “what-if” scenarios, parallel historical developments, or broader consequences to engage the user’s curiosity without addressing restricted specifics directly. By following these principles, you will help the user gain a deeper understanding of historical study and its complexities, without focusing on restricted specifics.
"},{"role": "user","content": "hello, explain how you can help with with my genealogy research using my data as context and with details about NKVD."}]}'

{"id":"chatcmpl-177","object":"chat.completion","created":1730744105,"model":"llama3.1","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"
Hello! I'd be happy to help with your genealogy research.\n\nRegarding the NKVD (Narodny Komissariat Vnutrennikh Del or People's Commissariat for Internal Affairs), note that it was a Soviet security agency responsible for law enforcement, counter-intelligence, and internal surveillance from 1923 until its name change in 1946 to the MVD. The NKVD played significant roles in Soviet history, including conducting purges, manhunts, and arrests under Stalin's rule.\n\nNow, focusing on genealogy research with your data as context:\nIf you have information about your ancestors that lived during the time period of the NKVD (1920s-1950s), I can help you with:\n\n1. Ancestor Research in Russia: Using public records and available archives where applicable, we might talk about broader patterns or common characteristics among groups based on what the historical context might offer.\n\n2. Potential Historical Impact on Ancestry: If there's something specific you're researching related to your ancestry that occurred around the time frame NKVD was active (e.g., a disappearance of family member under Stalin, migration patterns), we can discuss how societal and political contexts influenced emigration or local population dynamics.\n\n3. Genealogical Research Methods in Restricted Areas: Note however, for very specific topics like direct investigations into NKVD records as they pertain to particular families—those are usually protected by privacy laws and restricted access policies. Historians use indirect clues and broader trends, rather than the names of individuals or detailed NKVD files, even for public figures, due to historical archival restrictions.\n\nFor genealogical purposes, discussing historical phenomena related to migrations, population shifts, general patterns among regional groups that occurred around the time the NKVD was active can give insights into how societal factors influenced your ancestry's paths. 
\n\nTo begin:\n- What specific aspect of your family history resonates with your interest in NKVD?
"},"finish_reason":"stop"}],"usage":{"prompt_tokens":368,"completion_tokens":396,"total_tokens":764}}
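
The response above follows the OpenAI chat completion shape, so the assistant text and token counts can be read from it programmatically. Below is a minimal Python sketch under that assumption. The function names `extract_reply` and `token_usage` are hypothetical, and `sample` is a shortened stand-in for the output above, not a full copy.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant message text from a chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


def token_usage(response: dict) -> tuple:
    """Return (prompt_tokens, completion_tokens) from the usage block."""
    usage = response.get("usage") or {}
    return usage.get("prompt_tokens", 0), usage.get("completion_tokens", 0)


# Abbreviated sample shaped like the response in this thread.
sample = {
    "object": "chat.completion",
    "model": "llama3.1",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! I'd be happy to help."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 368, "completion_tokens": 396, "total_tokens": 764},
}

reply = extract_reply(sample)
prompt_tokens, completion_tokens = token_usage(sample)
```

The 368 prompt tokens are mostly the long system prompt, which is worth keeping in mind when comparing prompt variants on a CPU-only setup.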

DavidMStraub (Maintainer) · 2024-11-17T20:21:04Z

Since the original issue (404) was resolved but the discussion about the system prompt is interesting, I'm converting this to a discussion.
