Some questions about the detection ability #47

1benwu1 · 2024-11-20T07:24:04Z

I want to input a screenshot of a webpage or app and have the model detect UI components in it, such as icons and textboxes, but it is largely unable to do so. The only successful attempt was with prompt="icon", where the model detected a number of icons/images. With other prompts, it detects almost nothing. Has the model not been trained on this kind of data, or is my prompt not written well enough? There is also a special case: with prompt="icon. textbox", adding just one more word (textbox) causes almost nothing to be detected.
