Unified discrete-continuous actions for free-form drag computer use.
📑 Paper | 🌐 Project Page
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands
Siyuan Hu†, Kevin Qinghong Lin†, Mike Zheng Shou*
Show Lab @ National University of Singapore
† Equal contribution * Corresponding author
- [2026.1.10] Released the code and project page.
Demo video: showui-pi-demo.mp4
ShowUI-π is a 450M-parameter flow-based vision-language-action model that treats GUI actions as continuous trajectories, generating smooth clicks and drags directly from screen observations. Because it unifies discrete and continuous actions in one representation, it can handle precise drawing, rotation, sorting, and captcha solving without tokenized coordinates.
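To make the "actions as continuous trajectories" idea concrete, here is a minimal sketch in plain Python. It is not the ShowUI-π implementation. The `(x, y, pressed)` point format, the helpers `click`, `drag`, and `flatten`, and the closed-form `toy_velocity` field are all assumptions made for illustration. In a real flow-based model the velocity field would be a learned network conditioned on the screenshot and instruction.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical action format: (x, y, pressed), with x and y normalized to [0, 1].
Point = Tuple[float, float, float]
Velocity = Callable[[Sequence[float], float], List[float]]


def click(x: float, y: float) -> List[Point]:
    """A click as a degenerate trajectory: press and release at one location."""
    return [(x, y, 1.0), (x, y, 0.0)]


def drag(start: Tuple[float, float], end: Tuple[float, float], steps: int = 8) -> List[Point]:
    """A straight-line drag (steps >= 2): button held along the path, then released."""
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(steps):
        a = i / (steps - 1)  # interpolation weight, 0.0 at start and 1.0 at end
        points.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0), 1.0))
    points.append((x1, y1, 0.0))  # release at the end point
    return points


def flatten(trajectory: List[Point]) -> List[float]:
    """Flatten a trajectory into one vector so a flow can act on it."""
    return [value for point in trajectory for value in point]


def toy_velocity(target: Sequence[float]) -> Velocity:
    """Stand-in for a learned velocity field (assumption, not the paper's model).

    For the straight path x_t = (1 - t) * noise + t * target, the velocity
    that reaches `target` at t = 1 is (target - x_t) / (1 - t).
    """
    def velocity(x: Sequence[float], t: float) -> List[float]:
        return [(ti - xi) / (1.0 - t) for xi, ti in zip(x, target)]
    return velocity


def sample_by_flow(velocity: Velocity, dim: int, n_steps: int = 10, seed: int = 0) -> List[float]:
    """Euler-integrate dx/dt = velocity(x, t) from Gaussian noise (t=0) toward t=1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t stays below 1, so (1 - t) is never zero
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    target = flatten(drag((0.2, 0.3), (0.8, 0.3), steps=4))
    action = sample_by_flow(toy_velocity(target), dim=len(target))
    print([round(v, 3) for v in action])
```

In this sketch a click and a drag share one format and differ only in how far the pointer moves while pressed, which is one way to read "unified discrete-continuous actions". The sampler produces real-valued coordinates directly instead of predicting coordinate tokens.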
If you find our work helpful, please consider citing our paper.
@misc{hu2025showuipi,
title={ShowUI-$\pi$: Flow-based Generative Models as GUI Dexterous Hands},
author={Siyuan Hu and Kevin Qinghong Lin and Mike Zheng Shou},
year={2025},
eprint={2512.24965},
archivePrefix={arXiv},
primaryClass={cs.CV},
doi={10.48550/arXiv.2512.24965},
url={https://arxiv.org/abs/2512.24965},
}

This project is licensed under the Apache License, Version 2.0.
See LICENSE for details.