OmniParser is a visual prompting method that combines a fine-tuned interactable icon detection model, a fine-tuned icon description model, and an OCR module. It should be more accurate than GroundingDino.
dandansamax changed the title from "OmniParser for Pure Vision Based GUI Agent: https://arxiv.org/abs/2408.00203" to "[Feature Request] OmniParser visual prompt" on Sep 5, 2024
https://arxiv.org/abs/2408.00203