An internship matching system that uses machine learning to recommend PM (Product Management) internships based on student profiles. Built with Streamlit and powered by content-based filtering: TF-IDF vectorization with cosine similarity.
- Student Profile Input
  - Name, skills, and location preferences
  - Internship type selection
  - Easy-to-use sidebar interface
- Sample Internship Dataset
  - 10 curated PM internship opportunities
  - Diverse companies and locations
  - Detailed job descriptions and requirements
- ML-Based Matching Algorithm
  - Content-based filtering using TF-IDF vectorization
  - Cosine similarity for matching student profiles to internships
  - Match scores (0-100%) for each recommendation
- Top 3 Recommendations Display
  - Visual match score indicators (🟢 70%+, 🟡 50-70%, 🔴 <50%)
  - Detailed internship information
  - Company, location, duration, and required skills
- Feedback System
  - Accept/Reject buttons for each recommendation
  - Feedback history tracking
  - Downloadable JSON feedback data
  - Analytics dashboard with statistics
- Python 3.11 (already installed)
- Required packages (already installed):
  - streamlit
  - pandas
  - scikit-learn
  - numpy
Option 1: Using the run script (Recommended)
./run_streamlit.sh
Option 2: Direct command
streamlit run app.py --server.port 5000 --server.address 0.0.0.0
Option 3: Default Streamlit port
streamlit run app.py
The app will be available at:
- Local: http://localhost:5000 (or http://localhost:8501 for default)
- Replit: Will be accessible via the Replit preview URL
.
├── app.py # Main Streamlit application
├── internship_data.py # Sample internship dataset
├── recommendation_engine.py # ML recommendation engine (cosine similarity)
├── run_streamlit.sh # Convenience script to run the app
└── README.md # This file
The system loads a curated dataset of 10 PM internship opportunities with:
- Job titles and descriptions
- Required skills
- Location and duration
- Company information
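To make the fields above concrete, here is a minimal sketch of what a single dataset record might look like. The `Internship` class, its field names, and the sample values are illustrative assumptions; the actual layout in `internship_data.py` may differ.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Internship:
    """Hypothetical record shape; field names are assumptions, not the project's schema."""
    title: str
    company: str
    location: str
    duration: str
    skills: tuple[str, ...]
    description: str


# Placeholder records for illustration only, not the project's real data.
SAMPLE_INTERNSHIPS = [
    Internship(
        title="Product Management Intern",
        company="ExampleCo",
        location="Remote",
        duration="3 months",
        skills=("product strategy", "analytics", "SQL"),
        description="Support the product team with roadmap research and metrics analysis.",
    ),
    Internship(
        title="Growth Product Intern",
        company="SampleLabs",
        location="New York",
        duration="6 months",
        skills=("experimentation", "funnels", "marketing"),
        description="Run growth experiments and report on funnel performance.",
    ),
]

# Each record converts cleanly to a plain dict, e.g. for building a pandas DataFrame.
first_record = asdict(SAMPLE_INTERNSHIPS[0])
print(first_record["title"], "@", first_record["company"])
```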
- Combines internship description, skills, and location into a single content string
- Uses TF-IDF (Term Frequency-Inverse Document Frequency) to vectorize text
- Creates numerical representations of internships and student profiles
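The vectorization step described above can be sketched roughly as follows. The dictionary keys, the sample records, and the `build_content` helper are assumptions made for illustration, and `TfidfVectorizer` is used with its default settings; the real `recommendation_engine.py` may configure it differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder records; the key names are assumptions, not the project's schema.
internships = [
    {
        "description": "Support roadmap research and metrics analysis",
        "skills": "product strategy, analytics, SQL",
        "location": "Remote",
    },
    {
        "description": "Run growth experiments and report on funnels",
        "skills": "experimentation, marketing",
        "location": "New York",
    },
]


def build_content(item: dict) -> str:
    """Join description, skills, and location into one content string."""
    return " ".join([item["description"], item["skills"], item["location"]])


corpus = [build_content(item) for item in internships]

# Fit TF-IDF on the internship corpus; each row of the matrix is one internship.
vectorizer = TfidfVectorizer()
internship_matrix = vectorizer.fit_transform(corpus)

print(internship_matrix.shape)  # (number of internships, vocabulary size)
```

Fitting the vectorizer on the internship corpus only means that words appearing solely in a student profile are ignored at match time, which is one common design choice for small content-based systems.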
- Transforms student profile (skills + location) into TF-IDF vector
- Calculates cosine similarity between student and all internships
- Cosine similarity is the cosine of the angle between two vectors; for non-negative TF-IDF vectors it falls on a 0-1 scale
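A minimal sketch of the matching step, assuming the student's skills and location are joined into one string and transformed with the same fitted vectorizer. The sample texts and variable names are placeholders, not the project's data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder internship content strings (description + skills + location).
internship_texts = [
    "product strategy analytics SQL roadmap Remote",
    "growth experiments marketing funnels New York",
    "hardware supply chain logistics Austin",
]

vectorizer = TfidfVectorizer()
internship_matrix = vectorizer.fit_transform(internship_texts)

# The student profile must go through transform(), not fit_transform(),
# so that it lands in the same vector space as the internships.
student_text = "product strategy analytics SQL user research Remote"
student_vector = vectorizer.transform([student_text])

# One similarity score per internship, as a flat array.
scores = cosine_similarity(student_vector, internship_matrix).ravel()
best_index = int(scores.argmax())

print(internship_texts[best_index], round(float(scores[best_index]), 3))
```

In this toy corpus the second and third internships share no terms with the student profile, so their scores are exactly zero.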
- Ranks internships by similarity score
- Returns top 3 matches with percentage scores (0-100%)
- Higher scores indicate better matches
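The ranking step can be sketched in plain Python as below. The `top_matches` function, its parameters, and the sample scores are hypothetical and only illustrate sorting by similarity and converting to a percentage.

```python
def top_matches(titles, scores, k=3):
    """Return the k highest-scoring titles with scores as percentages (one decimal)."""
    ranked = sorted(zip(titles, scores), key=lambda pair: pair[1], reverse=True)
    return [(title, round(score * 100, 1)) for title, score in ranked[:k]]


# Made-up similarity scores on the 0-1 scale, for illustration only.
titles = ["Internship A", "Internship B", "Internship C", "Internship D"]
scores = [0.12, 0.81, 0.47, 0.66]

results = top_matches(titles, scores)
for title, pct in results:
    print(f"{title}: {pct}%")
```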
- Enter Your Profile:
  - Name: "Sarah Johnson"
  - Skills: "product strategy, analytics, SQL, user research"
  - Location: "Remote"
- Click "Get Recommendations"
- View Results: See your top 3 matches with scores like:
  - 🟢 Product Management Intern @ TechCorp - 92.5%
  - 🟢 Growth Product Intern @ GrowthLabs - 85.3%
  - 🟢 Product Strategy Intern @ ConsultCo - 78.9%
- Provide Feedback: Click ✅ Accept or ❌ Reject for each recommendation
- Review Feedback: View feedback history and download data
- Type: Content-Based Filtering
- Vectorization: TF-IDF (sklearn.feature_extraction.text.TfidfVectorizer)
- Similarity Metric: Cosine Similarity (sklearn.metrics.pairwise.cosine_similarity)
- Features Used: Job description, required skills, location
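- Measures how closely two text documents overlap in their TF-IDF-weighted terms (lexical rather than deeper semantic similarity)
- Insensitive to vector magnitude, so texts of different lengths can be compared fairly
- For non-negative TF-IDF vectors, values range from 0 (no shared terms) to 1 (identical term proportions)
- A widely used baseline for content-based recommendation systems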
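To illustrate the scale-invariance point, here is a small dependency-free sketch of the formula cos(θ) = (a · b) / (‖a‖ ‖b‖). The `cosine` helper and the toy vectors are illustrative; the app itself relies on scikit-learn's implementation.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


short_doc = [1, 2, 0]    # term weights for a short text
long_doc = [10, 20, 0]   # same term proportions, ten times the magnitude
unrelated = [0, 0, 5]    # shares no terms with the others

same_direction = cosine(short_doc, long_doc)
no_overlap = cosine(short_doc, unrelated)
print(same_direction, no_overlap)
```

Scaling a vector changes its length but not its direction, so `short_doc` and `long_doc` score as a near-perfect match, while vectors with no shared non-zero terms score zero.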
The app tracks user feedback to help improve recommendations:
- Accept/Reject: Simple binary feedback
- Timestamp: When feedback was given
- Match Score: The cosine-similarity match percentage shown with the recommendation
- Export: Download feedback as JSON for analysis
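A feedback entry with the fields listed above might be built and serialized roughly like this. The `make_feedback` helper and the JSON key names are assumptions for illustration; the app's actual export format may differ.

```python
import json
from datetime import datetime, timezone


def make_feedback(internship_title, match_score, accepted):
    """Build one feedback entry; key names are hypothetical."""
    return {
        "internship": internship_title,
        "match_score": match_score,
        "decision": "accept" if accepted else "reject",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Example history using the titles and scores from the usage walkthrough.
history = [
    make_feedback("Product Management Intern", 92.5, True),
    make_feedback("Product Strategy Intern", 78.9, False),
]

# Serialize the whole history as a JSON string ready for download.
payload = json.dumps(history, indent=2)
print(payload)
```

In a Streamlit app, a string like `payload` could be handed to `st.download_button` (with `mime="application/json"`) to offer the export as a file.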
- Clean, modern Streamlit interface
- Responsive sidebar for profile input
- Color-coded match scores (green/yellow/red)
- Expandable feedback history
- Real-time statistics dashboard
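The color coding can be expressed as a small helper, sketched here with the thresholds from the feature list (🟢 70%+, 🟡 50-70%, 🔴 <50%). The `score_badge` name is hypothetical, and treating exactly 70% as green and exactly 50% as yellow is an assumption, since the stated ranges do not settle the boundaries.

```python
def score_badge(score_pct: float) -> str:
    """Map a 0-100 match percentage to a colored indicator."""
    if score_pct >= 70:
        return "🟢"
    if score_pct >= 50:
        return "🟡"
    return "🔴"


for pct in (92.5, 70.0, 55.0, 12.0):
    print(score_badge(pct), f"{pct}%")
```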
Potential improvements for the next phase:
- Collaborative filtering using historical feedback data
- Hybrid recommendation (content + collaborative)
- Resume parsing with NLP
- Skill gap analysis
- Email notifications for new matches
- A/B testing for algorithm improvements
This project is for educational and demonstration purposes.
Built with ❤️ using Streamlit, scikit-learn, and Python