Search Functionality for an E-commerce Site

Try it out here: https://ecommerce-retrieval-search.streamlit.app/
View the doc and approach for this project here: https://1drv.ms/w/s!AnO5FdGErMSuh7VXK1wgpzwajPXOGw?e=fAh5Qh

Overview

This project aims to enhance the search experience for an e-commerce platform by implementing an efficient and accurate product retrieval system. It uses natural language processing techniques and vector similarity search to match user queries with relevant products.

Features

Text-based product search
Image retrieval based on text queries
Utilizes product metadata for improved search accuracy
Fast similarity search using FAISS (Facebook AI Similarity Search)
Simple and intuitive web interface

How it works

Data Preprocessing:
- Cleans and combines relevant product information (name, description, category, specifications, brand)
- Performs text preprocessing (removing HTML tags, lowercasing, removing punctuation and stopwords)
Embedding Generation:
- Uses the 'gte-base-1.5' embedding model (768 dimensions)
- Converts preprocessed text into dense vector representations
Similarity Search:
- Utilizes FAISS for efficient storage and querying of embeddings
- Implements L2 distance (Euclidean) for similarity measurement
Query Processing:
- Applies the same preprocessing to user queries
- Generates embeddings for the query
- Retrieves top K most similar products using FAISS
Result Display:
- Fetches and displays product images based on the retrieved results
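
The steps above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the project's actual code: the stopword set is a tiny stand-in for the NLTK list, the 2-dimensional vectors are made-up placeholders for the 768-dimensional model embeddings, and `preprocess` and `top_k_l2` are hypothetical helper names. The brute-force loop shows what an exact L2 search computes; FAISS performs the same kind of nearest-neighbour lookup far more efficiently.

```python
import html
import math
import re
import string

# Stand-in stopword list (assumption); the project lists nltk for this step.
STOPWORDS = {"a", "an", "the", "for", "and", "of", "in", "with", "is"}

# Translation table that turns every punctuation character into a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text: str) -> str:
    """Strip HTML tags, lowercase, and drop punctuation and stopwords."""
    text = re.sub(r"<[^>]+>", " ", text)   # remove HTML tags
    text = html.unescape(text).lower()     # decode entities, lowercase
    text = text.translate(PUNCT_TABLE)     # punctuation -> spaces
    return " ".join(w for w in text.split() if w not in STOPWORDS)


def top_k_l2(query_vec, vectors, k):
    """Return (index, distance) pairs for the k vectors nearest to query_vec."""
    dists = [(i, math.dist(query_vec, v)) for i, v in enumerate(vectors)]
    return sorted(dists, key=lambda pair: pair[1])[:k]


# Toy data: placeholder "embeddings" for three products and one query.
product_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query_vec = [1.0, 0.0]

cleaned = preprocess("<b>Red</b> Skirt for Women!")
hits = top_k_l2(query_vec, product_vecs, k=2)
print(cleaned)
print([i for i, _ in hits])
```

In the real pipeline, the same `preprocess` step would be applied to both product text and user queries before embedding, so that both sides land in the same vector space.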

Tech Stack

Python: Primary programming language
pandas: Data manipulation and CSV handling
matplotlib: Data visualization
transformers: Embedding model
PyTorch: Tensor operations and GPU support
nltk: Text preprocessing (stopwords, stemming)
faiss-cpu: Vector store for similarity search
streamlit: GUI and deployment

Installation and Usage

Clone the repository.
Install the dependencies listed in requirements.txt by running pip install -r requirements.txt in the terminal.
Run streamlit run app.py to start the app on localhost.

Future Improvements

Implement multimodal search using vision-language models like CLIP
Enable image-to-image and image-to-text queries
Fine-tune image models (e.g., ResNet, ViT) on the product dataset
Implement re-ranking for multimodal queries
Create a manually curated test set for evaluation (using metrics like Recall@K)
Integrate an LLM for handling malformed and multilingual queries
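
As a rough illustration of the Recall@K metric mentioned above, the sketch below computes it for a single query. The function name `recall_at_k` and the product IDs are hypothetical; a real evaluation would average this value over every query in the curated test set.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear among the top-k retrieved items."""
    relevant = set(relevant)
    if not relevant:
        return 0.0  # convention chosen here: no relevant items means recall 0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Made-up example: ranked search results and the hand-labelled relevant products.
retrieved = ["p7", "p2", "p9", "p4"]
relevant = {"p2", "p4", "p5"}

r3 = recall_at_k(retrieved, relevant, k=3)  # only p2 is in the top 3
r4 = recall_at_k(retrieved, relevant, k=4)  # p2 and p4 are in the top 4
print(r3, r4)
```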

Results

Query: A Red skirt
Query: Football shoes
Query: Running shoes
Query: Superhero t-shirt

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Vamsi K

LinkedIn: Vamsi K

Twitter: @VamsiK76294
