# LabelAId: AI Interventions for Improving Human Labeling Quality and Domain Knowledge in Crowdsourcing Systems

LabelAId is an inference model that combines Programmatic Weak Supervision (PWS) with the Feature Tokenizer + Transformer (FT-Transformer) to infer label correctness from user behavior and domain knowledge. Check out our paper and video for more details.


This repository contains the data, model code, and evaluation code for LabelAId: AI Interventions for Improving Human Labeling Quality and Domain Knowledge in Crowdsourcing Systems by Chu Li*, Zhihan Zhang*, Michael Saugstad, Esteban Safranchik, Minchu Kulkarni, Xiaoyu Huang, Shwetak Patel, Vikram Iyer, Tim Althoff, and Jon E. Froehlich. The paper was published in the Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24).

## LabelAId Pipeline

The LabelAId machine learning pipeline consists of two phases: programmatic weak supervision for annotating raw data, and a discriminative machine learning model for downstream tasks.

*(Figure: the LabelAId pipeline)*
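
To make the first phase concrete, here is a minimal sketch of PWS-style annotation using Snorkel-style labeling functions. The heuristics, feature names, and file path below are hypothetical placeholders for illustration, not the labeling functions used in the paper; see our PWS code for the actual implementation.

```python
import pandas as pd
from snorkel.labeling import PandasLFApplier, labeling_function
from snorkel.labeling.model import LabelModel

# Binary task: is a crowdsourced label correct or not?
CORRECT, INCORRECT, ABSTAIN = 1, 0, -1

@labeling_function()
def lf_low_zoom(x):
    # Hypothetical heuristic: labels placed at very low zoom are often mistakes.
    return INCORRECT if x.zoom_level < 2 else ABSTAIN

@labeling_function()
def lf_has_description(x):
    # Hypothetical heuristic: labels with a free-text description tend to be correct.
    return CORRECT if x.has_description else ABSTAIN

df = pd.read_csv("unannotated_labels.csv")  # hypothetical file of raw label features
lfs = [lf_low_zoom, lf_has_description]

# Apply the labeling functions, then let a label model denoise their
# overlapping, conflicting votes into probabilistic training labels.
L_train = PandasLFApplier(lfs).apply(df)
label_model = LabelModel(cardinality=2)
label_model.fit(L_train, n_epochs=500, seed=42)
soft_labels = label_model.predict_proba(L_train)  # used to pre-train the downstream model
```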

To study LabelAId in a real-world context, we instrumented the open-source crowdsourcing tool Project Sidewalk. When integrating LabelAId with Project Sidewalk, we chose the FT-Transformer as the discriminative model: it is designed to handle tabular data with mixed numerical and categorical features, which aligns with the heterogeneous nature of the Project Sidewalk dataset.

*(Figure: FT-Transformer)*

See our code for PWS and FT-Transformer for more details.
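As a sketch of the second phase, the snippet below instantiates an FT-Transformer on a dummy batch of tabular data using the `rtdl` PyTorch package, one common implementation of the architecture; the feature counts and cardinalities are placeholders, and the code in this repository may differ.

```python
import torch
import rtdl

n_num_features = 8          # hypothetical count of numerical behavior features
cat_cardinalities = [5, 3]  # hypothetical categorical features (e.g., label type)

# Default FT-Transformer configuration from the rtdl package.
model = rtdl.FTTransformer.make_default(
    n_num_features=n_num_features,
    cat_cardinalities=cat_cardinalities,
    d_out=2,  # two classes: label correct vs. incorrect
)

# A dummy batch of mixed numerical and categorical features.
x_num = torch.randn(32, n_num_features)
x_cat = torch.stack(
    [torch.randint(0, c, (32,)) for c in cat_cardinalities], dim=1
)
logits = model(x_num, x_cat)  # shape: (32, 2)
```

The feature tokenizer embeds each numerical and categorical column as a token, so the Transformer can attend across heterogeneous features uniformly.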

## Datasets

Our datasets come from Project Sidewalk labels in Seattle, WA; Chicago, IL; and Oradell, NJ. The unannotated set, labeled through our PWS annotation process, is used to pre-train the model. The expert-validated set, created from labels manually validated by the Project Sidewalk research team, is used to fine-tune and evaluate the inference model (a sketch of this two-stage flow follows the table below).

| Label Type | Seattle Unannotated | Seattle Expert-Validated | Chicago Unannotated | Chicago Expert-Validated | Oradell Unannotated | Oradell Expert-Validated | Total |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| Curb Ramp | 70,690 | 5,333 | 5,710 | 2,386 | 660 | 859 | 85,638 |
| Missing Curb Ramp | 32,968 | 4,239 | 463 | 1,294 | 325 | 396 | 39,685 |
| No Sidewalk | 36,021 | 3,460 | 2,211 | 48 | 3,949 | 1,217 | 46,906 |
| Surface Problem | 26,912 | 2,909 | 2,136 | 1,651 | 2,544 | 1,222 | 37,374 |
| Obstacle | 10,103 | 407 | 1,254 | 320 | 106 | 158 | 12,348 |
| All Label Types | 176,694 | 16,348 | 11,774 | 5,699 | 7,584 | 3,852 | 221,951 |
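
As a sketch of how these two dataset types feed the two-stage training described above, the snippet below pre-trains on PWS soft labels and then fine-tunes on a small expert-validated set. All data here is synthetic, and the sizes, losses, and hyperparameters are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import rtdl

# Synthetic stand-ins: a large PWS-annotated set with soft labels and a
# small expert-validated set with hard labels (sizes are placeholders).
n_num, cards = 8, [5, 3]
pretrain_set = TensorDataset(
    torch.randn(1024, n_num),
    torch.stack([torch.randint(0, c, (1024,)) for c in cards], dim=1),
    torch.softmax(torch.randn(1024, 2), dim=1),  # label-model probabilities
)
finetune_set = TensorDataset(
    torch.randn(50, n_num),
    torch.stack([torch.randint(0, c, (50,)) for c in cards], dim=1),
    torch.randint(0, 2, (50,)),  # expert-validated hard labels
)

model = rtdl.FTTransformer.make_default(
    n_num_features=n_num, cat_cardinalities=cards, d_out=2)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train(dataset, n_epochs, soft):
    model.train()
    for _ in range(n_epochs):
        for x_num, x_cat, y in DataLoader(dataset, batch_size=64, shuffle=True):
            logits = model(x_num, x_cat)
            # KL against the label model's soft targets when pre-training;
            # plain cross-entropy on expert labels when fine-tuning.
            loss = (F.kl_div(F.log_softmax(logits, dim=1), y, reduction="batchmean")
                    if soft else F.cross_entropy(logits, y))
            opt.zero_grad()
            loss.backward()
            opt.step()

train(pretrain_set, n_epochs=5, soft=True)    # Stage 1: PWS annotations
train(finetune_set, n_epochs=20, soft=False)  # Stage 2: expert-validated labels
```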

## Technical Evaluation

LabelAId consistently outperforms baseline models and improves mistake-inference accuracy by up to 37% with just 50 downstream samples. See our evaluation code and paper for more details.

*(Figure: technical evaluation results)*
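
Continuing the hypothetical training sketch from the Datasets section, mistake-inference accuracy on a held-out expert-validated split could be measured as below; this is synthetic illustration, not our evaluation code.

```python
# Hypothetical held-out expert-validated split (reuses `model`, `n_num`,
# and `cards` from the training sketch above).
x_num_test = torch.randn(200, n_num)
x_cat_test = torch.stack([torch.randint(0, c, (200,)) for c in cards], dim=1)
y_test = torch.randint(0, 2, (200,))

model.eval()
with torch.no_grad():
    preds = model(x_num_test, x_cat_test).argmax(dim=1)
accuracy = (preds == y_test).float().mean().item()
print(f"mistake inference accuracy: {accuracy:.3f}")
```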

## User Study

After demonstrating the technical efficacy of LabelAId in inferring label correctness, we implemented the LabelAId inference model in Project Sidewalk and evaluated the user experience and performance of the end-to-end system with users in the loop.

*(Figure: user flow)*

The code for integrating the LabelAId pipeline with the Project Sidewalk labeling interface can be found here.

## Cite LabelAId

```bibtex
@inproceedings{10.1145/3613904.3642089,
  author = {Li, Chu and Zhang, Zhihan and Saugstad, Michael and Safranchik, Esteban and Kulkarni, Chaitanyashareef and Huang, Xiaoyu and Patel, Shwetak and Iyer, Vikram and Althoff, Tim and Froehlich, Jon E.},
  title = {LabelAId: Just-in-time AI Interventions for Improving Human Labeling Quality and Domain Knowledge in Crowdsourcing Systems},
  year = {2024},
  isbn = {9798400703300},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3613904.3642089},
  doi = {10.1145/3613904.3642089},
  booktitle = {Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems},
  articleno = {643},
  numpages = {21},
  keywords = {community science, crowdsourcing, human-ai collaboration, machine learning, programmatic weak supervision (pws), quality control, urban accessibility},
  location = {Honolulu, HI, USA},
  series = {CHI '24}
}
```