This repository is an open-source project for building datasets of panoptic annotations (COCO format) for user-defined class labels, using Google Image Search in combination with the recent Grounding Dino and Meta's segment-anything models. The idea grew out of Language Segment-Anything, a useful tool that combines the two models in an easy-to-use API.
git clone git@github.com:ub216/panoptic_dataset_collector.git && cd panoptic_dataset_collector
pip install torch torchvision && pip install -e . && pip install gradio==3.37.0 psutil croniter arrow inquirer deepdiff backoff lightning-cloud
You might get an error message:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lang-sam 0.1.0 requires huggingface-hub<0.14.0,>=0.13.4, but you have huggingface-hub 0.17.2 which is incompatible.
Successfully installed gradio-3.37.0 huggingface-hub-0.17.2
This is a known issue in lang-segment-anything. Please ignore this for now.
To collect the data you need to:
- Create a Google Programmable Search Engine that will allow this tool to perform Google image searches.
- Create a Custom Search JSON API key, which returns the search results in JSON format.
Both of these features are free with your Google account. You can search up to 100 pages per day for free. Each page gives you 10 results, for up to 1,000 free annotated images per day! By paying $5 to Google you can even boost this to 10,000-100,000 annotations per day.
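As a rough illustration of how pages translate into images, the sketch below builds the paged request URLs for the Custom Search JSON API. The endpoint and the `key`, `cx`, `q`, `searchType`, `num` and `start` parameters come from Google's public API documentation; the helper name `build_image_search_urls` is hypothetical, and this is not necessarily how this repository issues its queries.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"
RESULTS_PER_PAGE = 10  # one page of results, as described above


def build_image_search_urls(query, engine_id, api_key, pages=10):
    """Return one request URL per result page (hypothetical helper)."""
    urls = []
    for page in range(pages):
        params = {
            "key": api_key,        # Custom Search JSON API key
            "cx": engine_id,       # Programmable Search Engine ID
            "q": query,
            "searchType": "image",
            "num": RESULTS_PER_PAGE,
            "start": page * RESULTS_PER_PAGE + 1,  # 1-based index of first result
        }
        urls.append(f"{API_ENDPOINT}?{urlencode(params)}")
    return urls


urls = build_image_search_urls("safari in india", "<engine_id>", "<api_key>", pages=3)
# Each URL counts as one page against the daily quota.
print(len(urls), "pages, at most", len(urls) * RESULTS_PER_PAGE, "images")
```

Each generated URL consumes one page of the daily quota, so the number of pages per keyword directly determines how many keywords fit into a day.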
with gui:
lightning run app panoptic_dataset_collector/google_search_gui.py
command line:
python3 panoptic_dataset_collector/google_search.py --search=<google_search_key> --label_file=<path_to_class_label_file> --engine_id=<google_search_engine_id> --api_key=<custom_json_api_key>
python3 panoptic_dataset_collector/google_search.py --search="safari in india" --label_file=panoptic_dataset_collector/safari_in_india.yaml --engine_id=<google_search_engine_id> --api_key=<custom_json_api_key>
A sample label file is provided and can be modified as needed. It's best to provide a search key that returns mostly images containing the required class labels.
- Change the number of pages to search per keyword using the --search_pages flag (defaults to 10).
- Perform a deeper search by crawling the URLs of the returned images to look for more images. This can be done with the --deep_search option. Note that this will take longer.
- Restrict the tool to only return images with a commercial license using the --commercial_only flag. Note that only the images would be commercially licensed. The annotations require models that may have restrictive licenses. Please refer to the links in the description.
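The optional flags above can be combined in a single invocation. The following is a minimal sketch of a wrapper that assembles such a command from Python; `build_command` is a hypothetical helper, and it assumes that `--search_pages` takes its value in `--flag=value` form and that `--deep_search` and `--commercial_only` are value-less switches, which the text above suggests but does not confirm.

```python
import shlex


def build_command(search, label_file, engine_id, api_key,
                  search_pages=None, deep_search=False, commercial_only=False):
    """Assemble the argument list for google_search.py (hypothetical helper)."""
    cmd = [
        "python3", "panoptic_dataset_collector/google_search.py",
        f"--search={search}",
        f"--label_file={label_file}",
        f"--engine_id={engine_id}",
        f"--api_key={api_key}",
    ]
    if search_pages is not None:
        # Assumption: the value is passed in --flag=value form.
        cmd.append(f"--search_pages={search_pages}")
    if deep_search:
        cmd.append("--deep_search")      # assumed to be a value-less switch
    if commercial_only:
        cmd.append("--commercial_only")  # assumed to be a value-less switch
    return cmd


cmd = build_command(
    "safari in india",
    "panoptic_dataset_collector/safari_in_india.yaml",
    "<google_search_engine_id>",
    "<custom_json_api_key>",
    search_pages=5,
    commercial_only=True,
)
print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal
```

The returned list can also be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell-quoting problems with multi-word search keys.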
- Add an easy-to-use GUI
- Handle multiple detections of the same object
This project is based on the following repositories:
The code is provided under the Apache License 2.0. For the collected images, refer to their individual licenses. For annotations, refer to the model licenses of Grounding Dino and segment-anything.