We use vanishing points and camera intrinsics to extract the dominant Manhattan frame of an image and test whether it satisfies the orthogonality constraints expected of real-world architecture. The method achieves an F1-score of 0.75 on a hand-made benchmark of mixed real and generated images. It also shows signs of robustness to resampling, a common weakness of learning-based detectors, although more work is needed to confirm this.
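The core geometric test can be sketched as follows. This is a minimal illustration of the idea, not the project's actual code; the function name and the example intrinsics/vanishing points are ours:

```python
import numpy as np

def manhattan_residual(K: np.ndarray, vps: np.ndarray) -> float:
    """Largest |cos(angle)| between the back-projected directions of three
    vanishing points; close to 0 when they form a valid Manhattan frame."""
    K_inv = np.linalg.inv(K)
    dirs = []
    for u, v in vps:
        d = K_inv @ np.array([u, v, 1.0])  # back-project: d ~ K^-1 [u, v, 1]^T
        dirs.append(d / np.linalg.norm(d))
    # A true Manhattan frame has mutually orthogonal axes, so every
    # pairwise dot product of the directions should be near zero.
    return max(abs(np.dot(dirs[i], dirs[j]))
               for i in range(3) for j in range(i + 1, 3))

# Illustrative intrinsics and vanishing points (not from the benchmark).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
vps = np.array([[1500.0, 240.0], [-900.0, 250.0], [320.0, -2000.0]])
print(manhattan_residual(K, vps))
```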
Tested with:
- Ubuntu 24.04
- Python 3.12
- CUDA 12.6
Clone the repository and set up the environment:

```bash
git clone --recurse-submodules git@github.com:Tetchki/cs413-project.git
cd cs413-project
```
1. Create a Python 3.12 virtual environment and activate it (e.g. `python3.12 -m venv .venv && source .venv/bin/activate`).

2. Install pip-tools. This project uses `pip-tools` to manage dependencies in a reproducible way:

   ```bash
   pip install pip-tools
   ```

3. Compile `requirements.txt` from `requirements.in`. This resolves all dependencies from `requirements.in` into a fully pinned `requirements.txt`. Note: this may take a while.

   ```bash
   pip-compile --verbose requirements.in
   ```

4. Install dependencies. This installs all dependencies at the exact versions specified in `requirements.txt`. Note: this may take a while.

   ```bash
   pip install -r requirements.txt
   ```

5. Install the DeepLSD submodule. Some code in this project requires pre-trained weights for DeepLSD; download them with:

   ```bash
   python3 download_deeplsd_weights.py
   ```
If you prefer to download the weights manually instead of using `python3 download_deeplsd_weights.py`, follow these steps:
1. Ensure the DeepLSD submodule is initialized: confirm that the folder `ext/DeepLSD` exists. If it doesn't, run:

   ```bash
   git submodule update --init --recursive
   ```

2. Create the weights folder: inside the `ext/DeepLSD` directory, create a subfolder named `weights`. The folder structure should look like this:

   ```
   ext/
   └── DeepLSD/
       └── weights/
           └── ...
   ```

3. Download the required model files.

4. Move the downloaded files into the `weights/` folder: place both `.tar` files into `ext/DeepLSD/weights/`.
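To verify the manual setup, a quick sanity check along these lines can confirm everything is in place (a sketch; it only assumes the layout described above, with both weight archives stored as `.tar` files):

```python
from pathlib import Path

weights_dir = Path("ext/DeepLSD/weights")

# The submodule must be initialized and the weights folder populated
# with both pre-trained archives before running the pipeline.
assert weights_dir.is_dir(), "missing ext/DeepLSD/weights - see steps above"
tars = sorted(weights_dir.glob("*.tar"))
assert len(tars) == 2, f"expected two .tar weight files, found {len(tars)}"
print("DeepLSD weights found:", [t.name for t in tars])
```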
The demo is provided in the Jupyter notebook `playground.ipynb`. It runs on two example images located in the `data/` folder.

To run the demo:

- Open `playground.ipynb` in Jupyter.
- Run the cells sequentially to execute the full pipeline on the sample images.

You can pass `verbose=True` to the main pipeline function if you want to see intermediate results and detailed information about each processing step (see the sketch below). Make sure all dependencies are installed and the `data/` folder contains the required images before running the notebook.
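Called from a script, this might look like the following. It is only a sketch: the module name, function name, image path, and return value are hypothetical stand-ins for the actual entry point defined in the notebook:

```python
# Hypothetical usage sketch; the real entry point is defined in playground.ipynb.
from pipeline import run_pipeline  # assumed module and function names

# verbose=True prints intermediate results (detected line segments,
# vanishing points, estimated intrinsics) for each processing step.
is_generated = run_pipeline("data/example.jpg", verbose=True)  # path illustrative
print("classified as generated:", is_generated)
```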
- The `data/` directory contains the datasets used for finetuning and evaluation. You can download the full data folder from here.
- The `data/synthbuster/keep` folder contains the 100 hand-picked building examples from the Synthbuster dataset that fit our scene geometry assumptions (the snippet after this list shows the expected layout). If you want the full Synthbuster dataset, you can download it from here and put its contents inside the `data/synthbuster/` folder.
- Use the `extract_dataset.ipynb` notebook to hand-pick building examples from the Synthbuster dataset that fit our scene geometry assumptions.
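For illustration, the kept examples can be enumerated like this (a sketch; the image extensions are an assumption, only the directory path comes from the description above):

```python
from pathlib import Path

keep_dir = Path("data/synthbuster/keep")

# Enumerate the hand-picked Synthbuster building examples
# (image extensions assumed; only the path comes from the README).
images = sorted(p for p in keep_dir.iterdir()
                if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
print(f"{len(images)} kept examples")  # should be 100
```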
- Use `intrinsics_experiments.ipynb` to compare the performance of the GeoCalib and Perspective Fields methods for camera intrinsics estimation.
- `pipeline.ipynb` is the main notebook to run the full geometry-based detection pipeline and generate benchmark results.
- `playground.ipynb` provides a minimal demo to test the system on one real and one generated image for quick validation.
- This project focuses on geometric priors such as vanishing points, line segment distributions, and camera intrinsics to identify generated images.
- The DeepLSD and GeoCalib submodules are required for full functionality.
*Figure: a real image and a generated image side by side, with the corresponding pipeline outputs: a correct Manhattan frame for the real image and an incorrect one for the generated image.*