⭐ DreamSketch - Turning Dreams to Reality ⭐

Author: Sayuj Gupta

Date: 13-2-2025

Overview

This project allows users to draw an image on a web-based whiteboard and send it to a backend, where a deep learning model processes the drawing to generate a high-quality image using Stable Diffusion XL (SDXL) and BLIP captioning.

Reason for Creation

This project was created for an event at my college, IIT Jammu (hence the logo).

Requirements

1. Hugging Face CLI Login

To access Hugging Face models, authenticate using the CLI:

huggingface-cli login

You will need an access token from Hugging Face.

2. Hardware Requirements

A CUDA-enabled GPU with at least 8GB VRAM is required to run Stable Diffusion efficiently. NVIDIA RTX 3060 or better is recommended.
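
As a quick way to confirm the hardware requirement before starting the server, the following sketch checks the VRAM of the first CUDA device. The helper names and the 5% tolerance are assumptions made for illustration (drivers often report slightly less than the nominal card size); they are not part of this project's code.

```python
from typing import Optional

MIN_VRAM_GB = 8  # threshold stated in the hardware requirements above


def meets_vram_requirement(total_bytes: int,
                           min_gb: float = MIN_VRAM_GB,
                           tolerance: float = 0.05) -> bool:
    """Return True if `total_bytes` is at least `min_gb` GiB, minus a small tolerance.

    The tolerance is an assumption: an "8 GB" card may report a little
    less than 8 GiB of total memory.
    """
    return total_bytes >= min_gb * (1 - tolerance) * 1024 ** 3


def detect_gpu_memory() -> Optional[int]:
    """Return the total memory of GPU 0 in bytes, or None if no CUDA GPU is usable."""
    try:
        import torch  # imported lazily so the helper above works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    total = detect_gpu_memory()
    if total is None:
        print("No CUDA-enabled GPU detected.")
    elif meets_vram_requirement(total):
        print(f"GPU reports {total / 1024 ** 3:.1f} GiB of VRAM: OK")
    else:
        print(f"GPU reports {total / 1024 ** 3:.1f} GiB of VRAM; "
              f"at least {MIN_VRAM_GB} GB is required")
```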

3. Prerequisites

Ensure you have Python 3.8.10 installed. If not, install it from Python's official site.

Install the required dependencies:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install transformers diffusers bitsandbytes flask flask-cors accelerate sentencepiece
pip install opencv-python numpy pillow
pip install huggingface_hub
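
To verify that the installation succeeded, a small script along these lines can report which packages are still missing. It is only a sketch: the mapping from pip package names to import names reflects the packages listed above, and the function name is hypothetical.

```python
import importlib.util
from typing import Dict, List

# pip package name -> module name used in an `import` statement
REQUIRED_MODULES: Dict[str, str] = {
    "torch": "torch",
    "torchvision": "torchvision",
    "torchaudio": "torchaudio",
    "transformers": "transformers",
    "diffusers": "diffusers",
    "bitsandbytes": "bitsandbytes",
    "flask": "flask",
    "flask-cors": "flask_cors",
    "accelerate": "accelerate",
    "sentencepiece": "sentencepiece",
    "opencv-python": "cv2",
    "numpy": "numpy",
    "pillow": "PIL",
    "huggingface_hub": "huggingface_hub",
}


def missing_packages(required: Dict[str, str] = REQUIRED_MODULES) -> List[str]:
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```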

Usage

1. Running the Backend Server

Start the Flask-based backend that processes images:

python server.py

This will launch the server, allowing the frontend to communicate with the deep learning model.

2. Running the Frontend

The frontend consists of a whiteboard for sketching and sending images to the backend. Deploy it using a simple server:

python -m http.server 8000

Now visit http://localhost:8000 in your browser and open new_index.html.

How It Works

1. User Sketches an Image: The frontend provides a whiteboard where the user draws a rough sketch and presses Generate.
2. Image Sent to Backend: The sketch is sent to the Flask backend via an API request.
3. BLIP Captioning: The BLIP model captions the sketch, producing a text description of what was drawn.
4. Stable Diffusion XL (SDXL) Processing: The generated caption is fed into SDXL as a prompt, and SDXL generates a high-quality image from it.
5. Image Display: The generated image is sent back to the frontend and displayed to the user.
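
The caption-then-generate flow described above can be sketched as follows. This is not the code in server.py: the function names are hypothetical, and the two checkpoint IDs are assumptions (commonly used public BLIP and SDXL checkpoints), since this README does not say which ones the project loads. The heavy imports are kept inside load_models so the pipeline function itself can be tried with stand-in callables.

```python
from typing import Any, Callable, Tuple


def sketch_to_image(sketch: Any,
                    caption_fn: Callable[[Any], str],
                    generate_fn: Callable[[str], Any]) -> Tuple[str, Any]:
    """Caption the sketch, then generate an image from that caption."""
    caption = caption_fn(sketch)
    return caption, generate_fn(caption)


def load_models(device: str = "cuda"):
    """Build (caption_fn, generate_fn) from Hugging Face models.

    Downloads model weights on first use. Checkpoint IDs are assumptions.
    """
    import torch
    from diffusers import StableDiffusionXLPipeline
    from transformers import BlipForConditionalGeneration, BlipProcessor

    blip_id = "Salesforce/blip-image-captioning-base"     # assumed checkpoint
    sdxl_id = "stabilityai/stable-diffusion-xl-base-1.0"  # assumed checkpoint

    processor = BlipProcessor.from_pretrained(blip_id)
    blip = BlipForConditionalGeneration.from_pretrained(blip_id).to(device)
    pipe = StableDiffusionXLPipeline.from_pretrained(
        sdxl_id, torch_dtype=torch.float16
    ).to(device)

    def caption_fn(sketch):
        # `sketch` is expected to be an RGB PIL image
        inputs = processor(images=sketch, return_tensors="pt").to(device)
        output_ids = blip.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    def generate_fn(prompt):
        # The pipeline returns a list of PIL images; take the first
        return pipe(prompt=prompt).images[0]

    return caption_fn, generate_fn
```

With a GPU available, usage would look like `caption_fn, generate_fn = load_models()` followed by `caption, image = sketch_to_image(pil_sketch, caption_fn, generate_fn)`, where pil_sketch is an RGB PIL image of the drawing.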

API Endpoints

1. Upload Sketch & Generate Image

URL: /generate
Method: POST
Payload: Sketch image (base64-encoded or multipart/form-data)
Response: AI-generated image
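
A client call to this endpoint might be built as in the sketch below, using only the standard library. Several details are assumptions, because this README does not specify them: the JSON field name "image", the base URL http://localhost:5000 (Flask's default port), and the helper name. Check server.py for the actual request format.

```python
import base64
import json
import urllib.request


def build_generate_request(image_bytes: bytes,
                           base_url: str = "http://localhost:5000"
                           ) -> urllib.request.Request:
    """Build a POST request for /generate with a base64-encoded sketch.

    The "image" field name and the default base URL are assumptions.
    """
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, read the sketch with `open("sketch.png", "rb").read()`, pass the bytes to build_generate_request, and call `urllib.request.urlopen(request, timeout=600)`. A generous timeout is advisable because generation can take a few minutes.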

2. Health Check

URL: /health
Method: GET
Response: { "status": "running" }
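
Because the backend can take several minutes to load its models, a script can poll this endpoint until it reports that it is running. The following is a minimal sketch; the base URL, timeout values, and function names are assumptions for illustration.

```python
import json
import time
import urllib.error
import urllib.request


def is_running(body: bytes) -> bool:
    """Return True if `body` is JSON of the form {"status": "running"}."""
    try:
        data = json.loads(body)
    except ValueError:
        return False
    return isinstance(data, dict) and data.get("status") == "running"


def wait_for_backend(base_url: str = "http://localhost:5000",
                     timeout_s: float = 600.0,
                     interval_s: float = 5.0) -> bool:
    """Poll /health until the backend is running or `timeout_s` elapses."""
    url = base_url.rstrip("/") + "/health"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if is_running(response.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not reachable yet; retry after a pause
        time.sleep(interval_s)
    return False
```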

Notes

When 'server.py' is run for the first time, it takes around 5 minutes to start.
Image generation takes around 2-3 minutes, depending on your GPU.

Future Improvements

Add support for multiple sketch styles.
Improve captioning accuracy with fine-tuned BLIP models.
Optimize model inference for better performance on lower-end GPUs.

License

This project is released under the MIT License.

Contributing

If you’d like to contribute, fork the repository and submit a pull request. Suggestions and improvements are welcome!

Contact

For any issues or questions, feel free to reach out at sayujgupta2005@gmail.com or create an issue in the repository.
