- 2025-07-23 – Accepted to ICCV Workshops 2025 (CV4DC, Honolulu, Oct 19–23). Camera-ready coming soon.
Road safety assessments are costly and data-hungry, especially in LMICs. V-RoAst is a zero-shot Visual Question Answering framework that uses general-purpose VLMs (e.g., Gemini-1.5-Flash, GPT-4o-mini) to classify 52 iRAP attributes from street-level imagery.
We provide:
- ThaiRAP dataset: 2,037 images (519 segments) with expert-coded iRAP labels.
- Prompt templates & code for attribute classification with VLMs.
- CNN baselines (VGG/ResNet) for comparison.
- Open VLM benchmark for iRAP-style road safety attributes.
- Prompt engineering framework (system/user prompts + local context).
- Zero-shot evaluation, incl. unseen classes.
- Automatic star-rating demo using crowdsourced Mapillary imagery.
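The system/user prompt design above can be sketched as follows. This is a minimal illustration, not the repo's actual templates: the helper name `build_vqa_prompt`, the exact wording, and the message structure are assumptions.

```python
def build_vqa_prompt(attribute: str, options: list[str],
                     context: str = "Thailand") -> list[dict]:
    """Assemble a zero-shot VQA prompt for one iRAP attribute.

    Hypothetical helper: the repo's real templates may phrase the
    system/user roles and local context differently.
    """
    system = (
        "You are a road safety assessor following the iRAP coding standard. "
        f"The street-level images were taken in {context}."
    )
    user = (
        f"Classify the attribute '{attribute}' for the road in the image. "
        "Answer with exactly one of the following options: "
        + ", ".join(options) + "."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

The same message list can then be passed to any chat-style VLM API (OpenAI or Gemini) without retraining, which is what makes the zero-shot setup flexible across the 52 attributes.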
Road safety assessments are critical yet costly, especially in Low- and Middle-Income Countries (LMICs), where most roads remain unrated. Traditional methods require expert annotation and training data, while supervised learning-based approaches struggle to generalise across regions. In this paper, we introduce V-RoAst, a zero-shot Visual Question Answering (VQA) framework using Vision-Language Models (VLMs) to classify road safety attributes defined by the iRAP standard. We introduce the first open-source dataset from ThaiRAP, consisting of over 2,000 curated street-level images from Thailand annotated for this task. We evaluate Gemini-1.5-Flash and GPT-4o-mini on this dataset and benchmark their performance against VGGNet and ResNet baselines. While VLMs underperform on spatial awareness, they generalise well to unseen classes and offer flexible prompt-based reasoning without retraining. Our results show that VLMs can serve as automatic road assessment tools when integrated with complementary data. This work is the first to explore VLMs for zero-shot infrastructure risk assessment and opens new directions for automatic, low-cost road safety mapping. Code and dataset: https://github.com/PongNJ/V-RoAst.
- OpenAI: we used `openai` version 1.40.3; see the official documentation.
- Google Gemini: we used `google-generativeai` version 0.7.2; see the official documentation.
- Mapillary API: see the official documentation.
```shell
git clone https://github.com/PongNJ/V-RoAst.git
```
Please download the ThaiRAP dataset from (google drive) or (ucl rdr) and place all images in the `./image/ThaiRAP/` directory.
The ThaiRAP dataset combines street images with road attributes, stored in a CSV file, as shown in the structure below:
```
├─V-RoAst
│ ├─image
│ │ ├─ThaiRAP
│ │ │ ├─1.jpg
│ │ │ ├─2.jpg
│ │ │ ├─...
│ │ │ └─2037.jpg
│ └─Validation.csv
```
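To iterate over the dataset programmatically, a minimal sketch is below. The helper name `thairap_image_path` and the assumption that filenames are simply `<id>.jpg` follow the directory tree above; the column names inside `Validation.csv` are not documented here, so inspect them before use.

```python
from pathlib import Path

def thairap_image_path(image_id: int, root: str = "./image/ThaiRAP") -> Path:
    """Map a ThaiRAP image id (1..2037) to its file path.

    Hypothetical helper based on the directory layout shown in the README:
    images are assumed to be named '<id>.jpg' under ./image/ThaiRAP/.
    """
    return Path(root) / f"{image_id}.jpg"

# The attribute labels live in Validation.csv at the repo root. The exact
# column schema is not shown in this README, so check it first, e.g.:
#   import pandas as pd
#   print(pd.read_csv("Validation.csv").columns)
```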
If you use this dataset or refer to our work, please cite:
```bibtex
@misc{jongwiriyanurak2025vroastvisualroadassessment,
  title={V-RoAst: Visual Road Assessment. Can VLM be a Road Safety Assessor Using the iRAP Standard?},
  author={Natchapon Jongwiriyanurak and Zichao Zeng and June Moh Goo and Xinglei Wang and Ilya Ilyankou and Kerkritt Sriroongvikrai and Nicola Christie and Meihui Wang and Huanfa Chen and James Haworth},
  year={2025},
  eprint={2408.10872},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2408.10872}
}
```