Skip to content

SpaceTimeLab/V-RoAst

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

64 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

V-RoAst: Visual Road Assessment. Can VLM be a Road Safety Assessor Using the iRAP Standard?

ICCVW 2025 – CV4DC (Accepted) arXiv

🚨News

  • 2025-07-23 – Accepted to ICCV Workshops 2025 (CV4DC, Honolulu, Oct 19–23). Camera-ready coming soon.

Overview

Road safety assessments are costly and data-hungry, especially in LMICs. V-RoAst is a zero-shot Visual Question Answering framework that uses general-purpose VLMs (e.g., Gemini-1.5-Flash, GPT-4o-mini) to classify 52 iRAP attributes from street-level imagery.

We provide:

  • ThaiRAP dataset: 2,037 images (519 segments) with expert-coded iRAP labels.
  • Prompt templates & code for attribute classification with VLMs.
  • CNN baselines (VGG/ResNet) for comparison.

Key Contributions

  • Open VLM benchmark for iRAP-style road safety attributes.
  • Prompt engineering framework (system/user prompts + local context).
  • Zero-shot evaluation, incl. unseen classes.
  • Automatic star-rating demo using crowdsourced Mapillary imagery.

Abstract

Road safety assessments are critical yet costly, especially in Low- and Middle-Income Countries (LMICs), where most roads remain unrated. Traditional methods require expert annotation and training data, while supervised learning-based approaches struggle to generalise across regions. In this paper, we introduce V-RoAst, a zero-shot Visual Question Answering (VQA) framework using VisionLanguage Models (VLMs) to classify road safety attributes defined by the iRAP standard. We introduce the first opensource dataset from ThaiRAP, consisting of over 2,000 curated street-level images from Thailand annotated for this task. We evaluate Gemini-1.5-flash and GPT-4o-mini on this dataset and benchmark their performance against VGGNet and ResNet baselines. While VLMs underperform on spatial awareness, they generalise well to unseen classes and offer flexible prompt-based reasoning without retraining. Our results show that VLMs can serve as automatic road assessment tools when integrated with complementary data. This work is the first to explore VLMs for zero-shot infrastructure risk assessment and opens new directions for automatic, low-cost road safety mapping. Code and dataset:https://github.com/PongNJ/V-RoAst.

Installation

Step 1: Experimental Platform 🛠️

  • OpenAI: We used OpenAI version 1.40.3. Find the documentation here.

  • Google Gemini: We used google-generativeai version 0.7.2. Find the documentation here.

  • Mapillary API: Access the documentation here.

Step 2: V-RoAst Installation (This will be available later)

git clone https://github.com/PongNJ/V-RoAst.git

ThaiRAP Dataset 📂

Please download ThaiRAP dataset from (google drive) or (ucl rdr) and upload all images to the ./image/ThaiRAP/ directory.

The ThaiRAP dataset combines street images with road attributes, stored in a CSV file, as shown in the structure below:

ThaiRAP Structure:

├─V-RoAst
│  ├─image
│  │  ├─ThaiRAP
│  │  │  ├─1.jpg
│  │  │  ├─2.jpg
│  │  │  ├─...
│  │  │  └─2037.jpg
│  └─Validation.csv
│

ThaiRAP Location and Sample of Images in a 100-m road segment

ThaiRAP location and samples

ThaiRAP Attribute Distribution

ThaiRAP Attribute Distribution

Framework of V-RoAst for visual road assessment

Framework of V-RoAst for visual road assessment

ext Prompts from Framework of V-RoAst

Citation

If you use this dataset or refer to our work, please cite:

@misc{jongwiriyanurak2025vroastvisualroadassessment,
      title={V-RoAst: Visual Road Assessment. Can VLM be a Road Safety Assessor Using the iRAP Standard?}, 
      author={Natchapon Jongwiriyanurak and Zichao Zeng and June Moh Goo and Xinglei Wang and Ilya Ilyankou and Kerkritt Sriroongvikrai and Nicola Christie and Meihui Wang and Huanfa Chen and James Haworth},
      year={2025},
      eprint={2408.10872},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.10872}, 
}

About

V-RoAst: A New Dataset for Visual Road Assessment

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 99.6%
  • Other 0.4%