A versatile and user-friendly visual analysis interface powered by Moondream VLM, built with Python and Streamlit.
Features:
- 📝 Intelligent Image Captioning
- 🎯 Precise Object Detection with Bounding Boxes
- 📍 Object Pointing Capabilities
- 🔍 Natural Language Visual Querying
- 🎨 Clean, Tab-based User Interface
- 💾 Download Options for Analyzed Images
- 🔐 Secure API Key Management
Prerequisites:
- Python 3.11 or higher
- Web Browser
- A Moondream API key from the Moondream Console, or a locally downloaded Moondream model file
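
The prerequisites above can be checked before launching the app. The following is a minimal sketch, not part of the project: it assumes the API key is exported in an environment variable called `MOONDREAM_API_KEY`, which is a hypothetical name chosen for illustration (the app itself may collect the key through its interface instead), and `preflight` is an illustrative helper.

```python
import os
import sys
from typing import List, Mapping, Tuple

MIN_PYTHON = (3, 11)                 # from the prerequisites list
API_KEY_VAR = "MOONDREAM_API_KEY"    # hypothetical variable name, not confirmed by the project


def preflight(env: Mapping[str, str] = os.environ,
              version: Tuple[int, int] = sys.version_info[:2]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append(
            "Python %d.%d or higher is required, found %d.%d"
            % (MIN_PYTHON[0], MIN_PYTHON[1], version[0], version[1])
        )
    if not env.get(API_KEY_VAR, "").strip():
        # Only relevant when using the hosted API rather than a local model file.
        problems.append("%s is not set or is empty" % API_KEY_VAR)
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
```

A missing key is reported as a warning rather than a hard failure, since a local model file is listed as an alternative to the hosted API.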
Installation:
- Clone the repository:
git clone https://github.com/smaranjitghose/lunarsightai.git
cd lunarsightai
- Create and activate virtual environment:
# Windows
python -m venv env
.\env\Scripts\activate
# Linux/Mac
python3 -m venv env
source env/bin/activate
- Install required packages:
pip install -r requirements.txt
- Start the application:
streamlit run app.py
- Open your browser and navigate to:
http://localhost:8501
Image captioning:
- Get detailed descriptions of any image
- Perfect for accessibility features
- Useful for content indexing
Example object detection prompts:
- "Detect all people in the image"
- "Find books on the shelf"
- "Locate electronic devices"
Example pointing prompts:
- "Point to the main subject"
- "Identify the location of logos"
- "Mark all faces in the image"
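
Detection and pointing results have to be mapped onto the image before boxes or markers can be drawn. The sketch below assumes the results arrive as normalized coordinates in the range 0 to 1, with boxes given as `x_min`, `y_min`, `x_max`, `y_max` and points as `x`, `y`. Those field names are an assumption about the response shape, and `box_to_pixels` and `point_to_pixels` are illustrative helpers rather than functions from this project.

```python
from typing import Dict, Tuple


def _clamp(value: float) -> float:
    """Keep a normalized coordinate inside the 0..1 range."""
    return min(max(value, 0.0), 1.0)


def box_to_pixels(box: Dict[str, float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a normalized bounding box to (left, top, right, bottom) in pixels."""
    left = round(_clamp(box["x_min"]) * width)
    top = round(_clamp(box["y_min"]) * height)
    right = round(_clamp(box["x_max"]) * width)
    bottom = round(_clamp(box["y_max"]) * height)
    return left, top, right, bottom


def point_to_pixels(point: Dict[str, float], width: int, height: int) -> Tuple[int, int]:
    """Scale a normalized point to (x, y) in pixels."""
    return round(_clamp(point["x"]) * width), round(_clamp(point["y"]) * height)


if __name__ == "__main__":
    # Assumed response shapes, for illustration only.
    sample_box = {"x_min": 0.25, "y_min": 0.5, "x_max": 0.75, "y_max": 1.0}
    sample_point = {"x": 0.5, "y": 0.25}
    print(box_to_pixels(sample_box, 640, 480))
    print(point_to_pixels(sample_point, 200, 100))
```

Clamping guards against values slightly outside the image, which would otherwise produce coordinates a drawing library might reject.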
Example visual queries:
- "What colors are dominant in this image?"
- "How many people are wearing glasses?"
- "Describe the environment in the image"
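
Several questions like the ones above can be sent for the same image in one pass. This is a rough sketch under stated assumptions: it expects a client object with a `query(image, question)` method that returns a dictionary containing an `"answer"` key, which is how the `moondream` Python client is understood to behave, but treat that shape as an assumption. `ask_questions` and `StubModel` are hypothetical names used only for illustration.

```python
from typing import Any, Dict, Iterable


def ask_questions(model: Any, image: Any, questions: Iterable[str]) -> Dict[str, str]:
    """Ask each non-empty question about one image and collect the answers."""
    answers: Dict[str, str] = {}
    for question in questions:
        question = question.strip()
        if not question:
            continue  # skip blank prompts instead of spending an API call on them
        result = model.query(image, question)  # assumed to return {"answer": "..."}
        answers[question] = result["answer"]
    return answers


class StubModel:
    """Stand-in for a real client so the sketch runs without network access."""

    def query(self, image: Any, question: str) -> Dict[str, str]:
        return {"answer": "stub answer to: " + question}


if __name__ == "__main__":
    prompts = [
        "What colors are dominant in this image?",
        "How many people are wearing glasses?",
    ]
    for prompt, answer in ask_questions(StubModel(), image=None, questions=prompts).items():
        print(prompt, "->", answer)
```

To try this against the real service, replace `StubModel()` with an actual client instance and pass a loaded image instead of `None`.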
Common Issues:
- API Key Error
  - Verify API key is entered correctly
  - Check if API key has necessary permissions
  - Ensure API key is active
- Image Upload Issues
  - Check if image format is supported (JPG, JPEG, PNG)
  - Ensure image size is reasonable
  - Verify image is not corrupted
- Analysis Failures
  - Check internet connection
  - Verify API quota hasn't been exceeded
  - Ensure prompts are clear and specific
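
The image upload checks above can be partly automated before a file is sent for analysis. The sketch below inspects the leading bytes of a file for the standard JPEG and PNG signatures and applies a size limit. The 10 MB limit is an assumed value, not one taken from this project, and `check_upload` is an illustrative helper. A matching signature only shows that the file starts like a JPEG or PNG; it does not prove the rest of the file is intact.

```python
from typing import List, Optional

MAX_BYTES = 10 * 1024 * 1024  # assumed limit for illustration; adjust as needed

# Standard file signatures (magic bytes) for the supported formats.
SIGNATURES = {
    "JPEG": b"\xff\xd8\xff",
    "PNG": b"\x89PNG\r\n\x1a\n",
}


def sniff_format(data: bytes) -> Optional[str]:
    """Return "JPEG" or "PNG" if the data starts with a known signature, else None."""
    for name, signature in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def check_upload(data: bytes, max_bytes: int = MAX_BYTES) -> List[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    if not data:
        return ["file is empty"]
    problems = []
    if len(data) > max_bytes:
        problems.append("file is larger than %d bytes" % max_bytes)
    if sniff_format(data) is None:
        problems.append("unsupported format: expected JPG, JPEG, or PNG")
    return problems


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            issues = check_upload(handle.read())
        print(path, "OK" if not issues else "; ".join(issues))
```

Checking the content rather than the file extension catches files that were renamed to `.jpg` or `.png` without being converted.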
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the project
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by Smaranjit Ghose