This repository contains files that complement existing well-known datasets so that people can construct fine-grained geolocation datasets. The following sections describe each dataset in detail.
If you use data from this repository in a publication, please cite:
@article{fg_geo_networks,
author = {Grace Chu and Brian Potetz and Weijun Wang and Andrew Howard and Yang Song and Fernando Brucher and Thomas Leung and Hartwig Adam},
title = {Geo-Aware Networks for Fine Grained Recognition},
journal = {arXiv preprint arXiv:1906.01737},
year = {2019},
}
iNaturalist with Geolocation (download)
The metadata, which contains geolocation as well as date information, has been released on the original iNaturalist competition GitHub page. Specifically, the data contains:
- id: image id in iNaturalist competition dataset,
- lat: latitude of where the image was taken,
- lon: longitude of where the image was taken,
- date: date of when the image was taken,
- user_id: user id of who owns the image.
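As a minimal sketch of how these fields might be consumed, the snippet below indexes records by image id and drops those without coordinates. It assumes the metadata is a JSON list of objects keyed by the field names above; the actual file layout, the date format, and the helper name `index_by_image_id` are assumptions for illustration, not something this repository specifies.

```python
import json

# Hypothetical sample records; the real file's layout and value formats may differ.
SAMPLE = """[
  {"id": 1, "lat": 37.77, "lon": -122.42, "date": "2016-05-01", "user_id": 11},
  {"id": 2, "lat": null, "lon": null, "date": null, "user_id": 12}
]"""


def index_by_image_id(json_text):
    """Map image id -> record, keeping only records that have both coordinates."""
    records = json.loads(json_text)
    return {
        r["id"]: r
        for r in records
        if r.get("lat") is not None and r.get("lon") is not None
    }


geo = index_by_image_id(SAMPLE)
for image_id, rec in geo.items():
    print(image_id, rec["lat"], rec["lon"], rec["date"], rec["user_id"])
```

The resulting dictionary can then be joined against the image list of the competition dataset through the shared `id` field.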
YFCC100M with Geolocation (download)
The YFCC100M dataset contains 100 million Flickr images and videos with Creative Commons licenses. We release a set of 36,146 YFCC100M images whose Flickr tags we identified as corresponding to one of the labels in the iNaturalist 2017 competition data above. The 36,146 images were sampled from a larger pool of candidate images as described in the paper above. Specifically:
- the image must have geolocation available,
- the image must have at most one iNaturalist label,
- at most ten examples were retained for each label.
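The three criteria above can be sketched as a small filter. This is an illustration only, not the procedure used in the paper: the candidate record layout (`id`, `labels`, `lat`, `lon`), the function name `select_images`, and the use of seeded uniform random sampling to pick the retained examples are all assumptions. Since a candidate needs a matching tag to be in the pool at all, "at most one label" is treated here as exactly one.

```python
import random
from collections import defaultdict


def select_images(candidates, max_per_label=10, seed=0):
    """Filter candidates and keep at most `max_per_label` images per label.

    `candidates` is an iterable of dicts with keys 'id', 'labels' (list of
    matched iNaturalist labels), 'lat' and 'lon' (hypothetical layout).
    """
    by_label = defaultdict(list)
    for c in candidates:
        has_geo = c.get("lat") is not None and c.get("lon") is not None
        # Keep only geotagged images that matched a single label.
        if has_geo and len(c["labels"]) == 1:
            by_label[c["labels"][0]].append(c)

    rng = random.Random(seed)  # seeded so the selection is reproducible
    selected = []
    for label in sorted(by_label):
        pool = by_label[label]
        selected.extend(rng.sample(pool, min(max_per_label, len(pool))))
    return selected


demo = [{"id": i, "labels": ["monarch"], "lat": 36.6, "lon": -121.9} for i in range(12)]
demo.append({"id": 99, "labels": ["monarch", "viceroy"], "lat": 36.6, "lon": -121.9})
print(len(select_images(demo)))
```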
The .csv file contains columns indicating:
- The YFCC100M ID (based on an image hash)
- The YFCC100M Line Number (the index of the image into the YFCC100M dataset, ranging from 0 to 99,999,999).
- iNaturalist label
- latitude
- longitude
- Flickr URL (URL of the image on Flickr).
- Multimedia Commons URL (URL of the image on mmcommons.org).
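A minimal sketch of reading such a file with Python's standard `csv` module follows. It assumes the seven columns appear in the order listed above and that the label is stored as text; whether the real file has a header row, the type name `YfccRow`, the function `read_rows`, and the sample values (including the placeholder `example.com` URLs) are assumptions made for illustration.

```python
import csv
import io
from typing import NamedTuple


class YfccRow(NamedTuple):
    yfcc_id: str
    line_number: int
    label: str
    latitude: float
    longitude: float
    flickr_url: str
    mmcommons_url: str


def read_rows(fileobj, has_header=False):
    """Yield one YfccRow per well-formed line; column order is assumed."""
    reader = csv.reader(fileobj)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    for rec in reader:
        if len(rec) != 7:
            continue  # skip blank or malformed lines
        yield YfccRow(rec[0], int(rec[1]), rec[2],
                      float(rec[3]), float(rec[4]), rec[5], rec[6])


# Hypothetical sample line with placeholder values.
SAMPLE_CSV = (
    "abc123,42,Danaus plexippus,36.6,-121.9,"
    "https://example.com/flickr/1,https://example.com/mmcommons/1\n"
)

rows = list(read_rows(io.StringIO(SAMPLE_CSV)))
print(rows[0].label, rows[0].latitude, rows[0].longitude)
```

For the released file, `read_rows(open(path, newline=""))` would be used in place of the in-memory sample, with `has_header` set according to the file's actual contents.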