CaseConnect

This project is a semantic search engine for the NamUs missing persons database. It contains tools to scrape every missing person case and download the images associated with each case, along with tools to embed the text and images into a vector space. The goal is to search the database by text or image and get back similar cases. This could help law enforcement find cases similar to a new one, and could make it easier for the public to check whether someone is in the database.
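
As a rough illustration of searching a vector space for similar cases, the sketch below ranks stored vectors by cosine similarity to a query vector. It is not the code in this repo: the function names `cosine_similarity` and `top_k` are hypothetical, and the 3-dimensional vectors are made-up stand-ins for real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, case_vectors, k=3):
    """Return the k case ids whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), case_id)
              for case_id, vec in case_vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [case_id for _, case_id in scored[:k]]

# Toy vectors standing in for real embeddings (made-up values).
case_vectors = {
    "case_a": [0.9, 0.1, 0.0],
    "case_b": [0.0, 1.0, 0.0],
    "case_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
print(top_k(query, case_vectors, k=2))  # ['case_a', 'case_c']
```

A brute-force scan like this is fine for a sketch; a nearest-neighbor index or embedding database does the same ranking more efficiently at scale.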

TODO

write a scraper for the database
embed all the text
embed the images
build the front end
switch from sklearn nearest neighbors to an embedding db
remove extra json data to reduce embedding cost and improve relevance
test search by image
move cali db scraper to a separate file, as there are SSL issues with their db
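
One way the "remove extra json data" item could look is sketched below: keep only the descriptive fields of a case before sending it to the embedding model, so fewer tokens are billed and less noise ends up in the vector. The field names in `KEEP_FIELDS` and the sample case are hypothetical; the real NamUs JSON keys may differ.

```python
import json

# Hypothetical field names; the real NamUs JSON keys may differ.
KEEP_FIELDS = ("description", "circumstances", "physical_features", "clothing")

def prune_case(case, keep=KEEP_FIELDS):
    """Keep only non-empty descriptive fields so fewer tokens are embedded."""
    return {key: case[key] for key in keep if case.get(key)}

def case_to_text(case):
    """Flatten a pruned case into one compact string for the embedding call."""
    return json.dumps(prune_case(case), separators=(",", ":"), sort_keys=True)

# Made-up example case, for illustration only.
raw_case = {
    "id": 12345,
    "description": "Last seen wearing a red jacket",
    "clothing": "",
    "internal_flags": {"reviewed": True},
    "circumstances": "Left home on foot",
}
print(case_to_text(raw_case))
```

Empty fields and bookkeeping keys are dropped, and the compact separators avoid paying for whitespace.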

stretch goals

use ControlNet to convert sketches to images, then run semantic image search on the results
generate images live from sketches and run semantic search on them, so you can see matching people as you draw
add chat to prompt user for more details if the input is not very descriptive

cost

$0.0004 / 1K tokens × 34,229,931 tokens = $13.6919724 (about $13.69)
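
The cost figure above is the token count divided by 1,000 and multiplied by the per-1K price. A small sketch of that arithmetic, with a hypothetical helper name `embedding_cost`:

```python
def embedding_cost(total_tokens, price_per_1k_tokens):
    """Cost in dollars for embedding total_tokens at a per-1K-token price."""
    return total_tokens / 1000 * price_per_1k_tokens

# Figures from the cost section above.
cost = embedding_cost(34_229_931, 0.0004)
print(f"${cost:.2f}")  # $13.69
```

The same helper can be rerun with a smaller token count to estimate savings after trimming the JSON.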

data

data	filetype	description	embedding model
json_cases	json	raw data
case_images	jpg	raw data
text_embeddings	json	embedded json_cases	text-embedding-ada-002
image_embeddings	json	embedded case_images	ViT-bigG-14
search_text	user input	user input text	text-embedding-ada-002 and ViT-bigG-14
search_image	user input	user input image	ViT-bigG-14
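
The table suggests that a text query is embedded with both models while an image query uses only ViT-bigG-14. The sketch below encodes one plausible reading of that: each query kind maps to the model used to embed it and the stored embedding set it is compared against. The model and dataset names come from the table; the pairing and the names `ROUTES` and `plan_search` are assumptions, not the repo's actual code.

```python
# Names are taken from the data table; the routing itself is an assumption.
ROUTES = {
    "search_text": [
        ("text-embedding-ada-002", "text_embeddings"),
        ("ViT-bigG-14", "image_embeddings"),
    ],
    "search_image": [
        ("ViT-bigG-14", "image_embeddings"),
    ],
}

def plan_search(query_kind):
    """Return (embedding model, stored embedding set) pairs for a query kind."""
    if query_kind not in ROUTES:
        raise ValueError(f"unknown query kind: {query_kind!r}")
    return list(ROUTES[query_kind])

for model, dataset in plan_search("search_text"):
    print(f"embed query with {model}, compare against {dataset}")
```

Query and stored vectors are only comparable when they come from the same model, which is why each route pairs a model with the set it produced.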
