As CITES has released a more complete, larger, shipment-level database, we've moved development to the citesdb package, which uses a better out-of-memory framework for larger data.
Authors: Noam Ross
The cites package provides a complete extract of the CITES wildlife trade database.
Install the cites package with this command:
source("https://install-github.me/ecohealthalliance/cites")
The main function in cites is cites_data(). This returns the main
CITES database as a dplyr tibble.
cites makes use of datastorr to manage data download. The first time
you run cites_data(), the package will download the most recent
version of the database (~32 MB). Subsequent calls will load the
database from storage on your computer.
The CITES database is stored as an efficiently compressed .fst file,
and loading it returns a remote dplyr source. This means that the data
is not read fully into memory on load, but some limited operations
(column selection) can be performed on disk. If you wish to manipulate
it as a data frame, call dplyr::collect() to load it fully into
memory, like so:
library(cites)
library(dplyr)  # provides %>% and collect()
all_cites <- cites_data() %>%
  collect()
Note that the full database will be approximately 270 MB in memory.
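Because column selection can happen on disk, it may be worth selecting only the columns you need before collecting, so that less than the full 270 MB ends up in memory. The following is a minimal sketch, not part of the package: collect_columns() is a hypothetical helper, the column names in the commented usage line are placeholders (check cites_metadata() for the real field names), and it assumes the remote source accepts dplyr's all_of() selection helper.

```r
library(dplyr)

# Hypothetical helper (not part of cites): keep only the named columns,
# then pull the result fully into memory.
collect_columns <- function(src, cols) {
  src %>%
    select(all_of(cols)) %>%
    collect()
}

# Usage with the package. The column names here are placeholders:
# trade_subset <- collect_columns(cites::cites_data(), c("Year", "Taxon"))
```

On an ordinary in-memory data frame, collect() returns the data unchanged, so the same helper can be tried out on a small tibble first.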
cites_codes() returns a data frame with descriptions of the codes in
the various columns of cites_data(). This is useful for lookup or
joining with the main data for more descriptive outputs. The
?cites_codes help file also has a searchable table of these codes.
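As an illustration of the kind of join described above, here is a small sketch using toy stand-ins for the two tables. The column names (taxon, purpose, field, code, description) are assumptions made for this example only and may not match the real output of cites_data() and cites_codes(); inspect those tables before adapting it.

```r
library(dplyr)

# Toy stand-ins: the real tables come from cites_data() and cites_codes().
# All column names below are hypothetical.
trade <- tibble(
  taxon   = c("Panthera leo", "Loxodonta africana"),
  purpose = c("T", "S")
)
codes <- tibble(
  field       = c("purpose", "purpose"),
  code        = c("T", "S"),
  description = c("Commercial", "Scientific")
)

# Keep only the codes for one field, then attach their descriptions
# to the trade records by matching the coded column.
purpose_codes <- codes %>%
  filter(field == "purpose") %>%
  select(code, purpose_description = description)

labelled <- trade %>%
  left_join(purpose_codes, by = c("purpose" = "code"))
```

A left join keeps every trade record, including any whose code has no match in the lookup table; those rows get NA in the description column.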
cites_metadata() provides field descriptions and cites_parties()
lists the CITES party countries and the date they joined the treaty.
See the developer README for more on the data-cleaning process.
Please give us feedback or ask questions by filing issues.
cites is developed at EcoHealth Alliance. Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.