A scalable backend API built with Python, utilizing FastAPI (or Flask) for high-performance routing and SQLAlchemy to manage data in a robust PostgreSQL database. Implements full CRUD and follows a clean, layered architecture.

shubhamkoti/AI-Search-Engine

InternMatch - AI-Powered Internship Recommendation Engine

An intelligent internship matching system that uses Machine Learning to recommend the best PM internships based on student profiles. Built with Streamlit and powered by content-based filtering using cosine similarity.

🎯 Features

  1. Student Profile Input

    • Name, skills, and location preferences
    • Internship type selection
    • Easy-to-use sidebar interface
  2. Sample Internship Dataset

    • 10 curated PM internship opportunities
    • Diverse companies and locations
    • Detailed job descriptions and requirements
  3. ML-Based Matching Algorithm

    • Content-based filtering using TF-IDF vectorization
    • Cosine similarity for matching student profiles to internships
    • Match scores (0-100%) for each recommendation
  4. Top 3 Recommendations Display

    • Visual match score indicators (🟢 70%+, 🟡 50-70%, 🔴 <50%)
    • Detailed internship information
    • Company, location, duration, and required skills
  5. Feedback System

    • Accept/Reject buttons for each recommendation
    • Feedback history tracking
    • Downloadable JSON feedback data
    • Analytics dashboard with statistics

🚀 Quick Start

Prerequisites

  • Python 3.11 (already installed)
  • Required packages (already installed):
    • streamlit
    • pandas
    • scikit-learn
    • numpy

Running the Application

Option 1: Using the run script (Recommended)

./run_streamlit.sh

Option 2: Direct command

streamlit run app.py --server.port 5000 --server.address 0.0.0.0

Option 3: Default Streamlit port

streamlit run app.py

Streamlit prints the app's URL in the terminal on startup (port 5000 with Option 2, Streamlit's default port with Option 3).

📁 Project Structure

.
├── app.py                      # Main Streamlit application
├── internship_data.py          # Sample internship dataset
├── recommendation_engine.py    # ML recommendation engine (cosine similarity)
├── run_streamlit.sh           # Convenience script to run the app
└── README.md                   # This file

🧠 How It Works

1. Data Loading

The system loads a curated dataset of 10 PM internship opportunities with:

  • Job titles and descriptions
  • Required skills
  • Location and duration
  • Company information
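As a rough illustration of what one record in such a dataset could look like, here is a minimal sketch. The `Internship` class, its field names, and the sample entries are all hypothetical; they are not taken from `internship_data.py`.

```python
from dataclasses import dataclass


@dataclass
class Internship:
    # Field names are illustrative assumptions, not the project's actual schema.
    title: str
    company: str
    location: str
    duration: str
    skills: list[str]
    description: str


def load_internships() -> list[Internship]:
    """Return a small in-memory dataset (placeholder entries only)."""
    return [
        Internship(
            title="Product Management Intern",
            company="ExampleCo",
            location="Remote",
            duration="3 months",
            skills=["product strategy", "analytics", "SQL"],
            description="Support roadmap planning and analyze user metrics.",
        ),
        Internship(
            title="Growth Product Intern",
            company="SampleLabs",
            location="New York",
            duration="6 months",
            skills=["experimentation", "marketing", "analytics"],
            description="Design and evaluate growth experiments.",
        ),
    ]


internships = load_internships()
print(f"Loaded {len(internships)} internships")
```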

2. Feature Extraction

  • Combines internship description, skills, and location into a single content string
  • Uses TF-IDF (Term Frequency-Inverse Document Frequency) to vectorize text
  • Creates numerical representations of internships and student profiles
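The feature-extraction step above can be sketched as follows. The records, the `build_content` helper, and the `stop_words="english"` setting are assumptions made for illustration; the project's own vectorizer settings may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical internship records; the real dataset lives in internship_data.py.
internships = [
    {
        "description": "Define product roadmap and analyze user metrics",
        "skills": "product strategy, analytics, SQL",
        "location": "Remote",
    },
    {
        "description": "Run growth experiments and measure campaign results",
        "skills": "experimentation, analytics, marketing",
        "location": "New York",
    },
]


def build_content(record: dict) -> str:
    """Join description, skills, and location into one content string."""
    return " ".join([record["description"], record["skills"], record["location"]])


corpus = [build_content(record) for record in internships]

# stop_words="english" is an assumption, not a confirmed project setting.
vectorizer = TfidfVectorizer(stop_words="english")

# Sparse matrix with one row per internship and one column per vocabulary term.
internship_matrix = vectorizer.fit_transform(corpus)

print(internship_matrix.shape)
```

Fitting the vectorizer once on the internship corpus fixes the vocabulary, so any student profile vectorized later lands in the same feature space.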

3. Similarity Calculation

  • Transforms student profile (skills + location) into TF-IDF vector
  • Calculates cosine similarity between student and all internships
  • Cosine similarity is the cosine of the angle between two vectors; because TF-IDF weights are non-negative, scores fall between 0 and 1
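A minimal sketch of this step, assuming two made-up internship content strings and a made-up student profile. The key point is that the student profile is transformed with the vectorizer already fitted on the internships, not with a new one.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical content strings (description + skills + location).
internship_texts = [
    "product roadmap user research product strategy analytics SQL Remote",
    "growth experiments marketing campaigns copywriting New York",
]

vectorizer = TfidfVectorizer()
internship_matrix = vectorizer.fit_transform(internship_texts)

# Student profile = skills + location, mapped onto the SAME fitted vocabulary.
student_profile = "product strategy, analytics, SQL, user research Remote"
student_vector = vectorizer.transform([student_profile])

# cosine_similarity returns shape (1, n_internships); take the single row.
scores = cosine_similarity(student_vector, internship_matrix)[0]

for text, score in zip(internship_texts, scores):
    print(f"{score:.3f}  {text}")
```

In this toy data the second internship shares no terms with the profile, so its score is exactly 0.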

4. Recommendation Generation

  • Ranks internships by similarity score
  • Returns top 3 matches with percentage scores (0-100%)
  • Higher scores indicate better matches
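The ranking step can be sketched in plain Python. The `top_matches` helper, the titles, and the scores below are hypothetical stand-ins for the project's own data.

```python
def top_matches(titles, scores, k=3):
    """Return the k best (title, percent) pairs, highest score first."""
    ranked = sorted(zip(titles, scores), key=lambda pair: pair[1], reverse=True)
    # Convert 0-1 similarity scores to percentages with one decimal place.
    return [(title, round(score * 100, 1)) for title, score in ranked[:k]]


# Illustrative similarity scores on a 0-1 scale.
titles = ["Intern A", "Intern B", "Intern C", "Intern D"]
scores = [0.42, 0.91, 0.15, 0.67]

recommendations = top_matches(titles, scores)
for title, percent in recommendations:
    print(f"{title}: {percent}%")
```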

💡 Usage Example

  1. Enter Your Profile:

    • Name: "Sarah Johnson"
    • Skills: "product strategy, analytics, SQL, user research"
    • Location: "Remote"
  2. Click "Get Recommendations"

  3. View Results: See your top 3 matches with scores like:

    • 🟢 Product Management Intern @ TechCorp - 92.5%
    • 🟢 Growth Product Intern @ GrowthLabs - 85.3%
    • 🟢 Product Strategy Intern @ ConsultCo - 78.9%
  4. Provide Feedback: Click ✅ Accept or ❌ Reject for each recommendation

  5. Review Feedback: View feedback history and download data

🔧 Technical Details

ML Algorithm

  • Type: Content-Based Filtering
  • Vectorization: TF-IDF (sklearn.feature_extraction.text.TfidfVectorizer)
  • Similarity Metric: Cosine Similarity (sklearn.metrics.pairwise.cosine_similarity)
  • Features Used: Job description, required skills, location

Why Cosine Similarity?

  • Measures lexical overlap between text documents: TF-IDF vectors match on shared, weighted terms rather than on meaning
  • Scale-invariant (works well with different text lengths)
  • Values range from 0 (no similarity) to 1 (identical)
  • A widely used baseline for content-based recommendation systems
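The scale-invariance property can be checked with a small hand-written cosine function. The vectors below are made-up term weights, not output from the app.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention: an empty document matches nothing
    return dot / (norm_u * norm_v)


short_doc = [1.0, 2.0, 0.0]
long_doc = [10.0, 20.0, 0.0]   # same term proportions, ten times the weight
unrelated = [0.0, 0.0, 5.0]    # no terms in common with the others

print(cosine(short_doc, long_doc))   # close to 1: same direction
print(cosine(short_doc, unrelated))  # 0: no shared terms
```

Only the direction of the vectors matters, so a long description and a short skills list can still score highly when their term proportions line up.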

📊 Feedback System

The app tracks user feedback to help improve recommendations:

  • Accept/Reject: Simple binary feedback
  • Timestamp: When feedback was given
  • Match Score: AI-calculated match percentage
  • Export: Download feedback as JSON for analysis
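A feedback record with these fields, and its JSON export, might look like the sketch below. The `make_feedback` helper and the key names are illustrative assumptions; the app's actual record layout is not shown in this README.

```python
import json
from datetime import datetime, timezone


def make_feedback(internship_title, accepted, match_score):
    """Build one feedback record; key names are illustrative assumptions."""
    return {
        "internship": internship_title,
        "feedback": "accept" if accepted else "reject",
        "match_score": match_score,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


history = [
    make_feedback("Product Management Intern", True, 92.5),
    make_feedback("Growth Product Intern", False, 85.3),
]

# Serialize the whole history to a JSON string for download or analysis.
export = json.dumps(history, indent=2)
print(export)
```

In a Streamlit app, a string like `export` could be passed as the `data` argument of `st.download_button`; whether the project does exactly this is an assumption.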

🎨 UI Features

  • Clean, modern Streamlit interface
  • Responsive sidebar for profile input
  • Color-coded match scores (green/yellow/red)
  • Expandable feedback history
  • Real-time statistics dashboard
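The color coding can be expressed as a small mapping function. The thresholds follow the ones listed under Features (🟢 70%+, 🟡 50-70%, 🔴 <50%); the function name `score_indicator` is hypothetical.

```python
def score_indicator(percent: float) -> str:
    """Map a 0-100 match score to a traffic-light emoji."""
    if percent >= 70:
        return "🟢"
    if percent >= 50:
        return "🟡"
    return "🔴"


for score in (92.5, 65.0, 31.2):
    print(f"{score_indicator(score)} {score}%")
```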

🔮 Future Enhancements

Potential improvements for the next phase:

  • Collaborative filtering using historical feedback data
  • Hybrid recommendation (content + collaborative)
  • Resume parsing with NLP
  • Skill gap analysis
  • Email notifications for new matches
  • A/B testing for algorithm improvements

📝 License

This project is for educational and demonstration purposes.


Built with ❤️ using Streamlit, scikit-learn, and Python
