Merge pull request #301 from ropensci/version
Add version argument
agila5 authored Jan 20, 2025
2 parents 35f1434 + 37704e5 commit 13380e6
Showing 10 changed files with 135 additions and 11 deletions.
4 changes: 3 additions & 1 deletion NEWS.md
@@ -1,12 +1,14 @@
# osmextract (development version)

* Added a `version` argument to `oe_match` to simplify the download of old extracts from the Geofabrik provider ([#295](https://github.com/ropensci/osmextract/issues/295))

# osmextract 0.5.2

### MAJOR CHANGES

* Bump minimum R version from 3.5.0 to 3.6.0 since that's a requirement for one of our indirect dependencies (i.e. [evaluate](https://cran.r-project.org/package=evaluate)).
* Adjusted the SQL syntax used inside `oe_get_network` so that the queries are compatible with GDAL 3.10 ([#298](https://github.com/ropensci/osmextract/issues/291)).
* The output of `oe_get_network` does not drop elements tagged as `access = 'no'` as long as the `foot`/`bicycle`/`motor_vehicle` key (according to the chosen mode of transport) is equal to `yes`, `permissive`, or `designated` ([#289](https://github.com/ropensci/osmextract/issues/289)).
* The output of `oe_get_network` does not drop elements tagged as `access = 'no'` as long as the `foot`/`bicycle`/`motor_vehicle` (according to the chosen mode of transport) key is equal to `yes`, `permissive`, or `designated` ([#289](https://github.com/ropensci/osmextract/issues/289)).

### MINOR CHANGES

3 changes: 3 additions & 0 deletions R/get-network.R
@@ -79,6 +79,9 @@
#' modifications to the current filters or propose new values for alternative
#' modes of transport.
#'
#' Starting from version 0.5.2, the `version` argument (see [oe_get()]) can be
#' used to download historical OSM extracts from the Geofabrik provider.
#'
#' @seealso [oe_get()]
#'
#' @examples
8 changes: 8 additions & 0 deletions R/get.R
@@ -44,6 +44,12 @@
#' say that smaller administrative units correspond to bigger levels. If
#' `NULL`, the default, the `oe_*` functions will select the highest available
#' level. See Details and Examples in [oe_match()].
#' @param version The version of the OSM extract to download. The default is
#' "latest". Other possible values are typically specified using the format
#' YYMMDD (e.g. "200101"). The complete list of all available historic files
#' for a given extract can be browsed from the Geofabrik website (e.g.
#' <https://download.geofabrik.de/europe/italy.html> and then click on 'raw
#' directory index').
#' @param download_directory Directory where the file containing OSM data is stored.
#' @param force_download Should the `.osm.pbf` file be updated even if it has
#' already been downloaded? `FALSE` by default. This parameter is used to
@@ -216,6 +222,7 @@ oe_get = function(
match_by = "name",
max_string_dist = 1,
level = NULL,
version = "latest",
download_directory = oe_download_directory(),
force_download = FALSE,
max_file_size = 5e+8,
@@ -246,6 +253,7 @@ oe_get = function(
match_by = match_by,
max_string_dist = max_string_dist,
level = level,
version = version,
quiet = quiet
)

35 changes: 26 additions & 9 deletions R/match.R
@@ -152,11 +152,13 @@ oe_match.sfc = function(
place,
provider = "geofabrik",
level = NULL,
version = "latest",
quiet = FALSE,
...
) {
# Load the data associated with the chosen provider.
provider_data = load_provider_data(provider)
version <- check_version(version, provider)

# Check if place has no CRS (i.e. NA_crs_, see ?st_crs) and, in that case, set
# 4326 + raise a warning message.
@@ -216,7 +218,6 @@ oe_match.sfc = function(
# If, again, there are multiple matches with the same "level", we will select
# only the area closest to the input place.
if (nrow(matched_zones) > 1L) {

nearest_id_centroid = sf::st_nearest_feature(
place,
sf::st_centroid(sf::st_geometry(matched_zones))
@@ -231,13 +232,19 @@ oe_match.sfc = function(
.subclass = "oe_match_sfcInputMatchedWith"
)

url <- matched_zones[["pbf"]]
url <- adjust_version_in_url(version, url)
file_size <- matched_zones[["pbf_file_size"]]
if (version != "latest") {
file_size <- NA # The file size is not available for older versions
}

# Return a list with the URL and the file_size of the matched place
result = list(
url = matched_zones[["pbf"]],
file_size = matched_zones[["pbf_file_size"]]
url = url,
file_size = file_size
)
result

}

#' @inheritParams oe_get
@@ -277,6 +284,7 @@ oe_match.character = function(
quiet = FALSE,
match_by = "name",
max_string_dist = 1,
version = "latest",
...
) {
# For the moment we support only length-one character vectors
@@ -290,6 +298,7 @@ oe_match.character = function(
)
)
}
version <- check_version(version, provider)

# See https://github.com/ropensci/osmextract/pull/125
if (place == "ITS Leeds") {
@@ -339,7 +348,6 @@ oe_match.character = function(
# If the approximate string distance between the best match is greater than
# the max_string_dist threshold, then:
if (isTRUE(high_distance)) {

# 1. Raise a message
oe_message(
"No exact match found for place = ", place,
@@ -389,7 +397,8 @@ oe_match.character = function(
provider = other_provider,
match_by = match_by,
quiet = TRUE,
max_string_dist = max_string_dist
max_string_dist = max_string_dist,
version = version
)
)
}
@@ -410,7 +419,8 @@ oe_match.character = function(
oe_match(
place = sf::st_geometry(place_online),
provider = provider,
quiet = quiet
quiet = quiet,
version = version
)
)
}
@@ -434,9 +444,16 @@ oe_match.character = function(
.subclass = "oe_match_characterinputmatchedWith"
)

url <- best_matched_place[["pbf"]]
url <- adjust_version_in_url(version, url)
file_size <- best_matched_place[["pbf_file_size"]]
if (version != "latest") {
file_size <- NA # The file size is not available for older versions
}

result = list(
url = best_matched_place[["pbf"]],
file_size = best_matched_place[["pbf_file_size"]]
url = url,
file_size = file_size
)
result
}
21 changes: 21 additions & 0 deletions R/utils.R
@@ -23,6 +23,27 @@ check_layer_provider = function(layer, provider) {
invisible(0)
}

check_version <- function(version, provider) {
# Currently, the only provider that includes historic data for the OSM
# extracts is geofabrik.
if (version != "latest" && provider != "geofabrik") {
warning(
"version != 'latest' is only supported for 'geofabrik' provider. ",
"Overriding it to 'latest'.",
call. = FALSE
)
return("latest")
}
version
}
adjust_version_in_url <- function(version, url) {
if (version == "latest") {
return(url)
}
gsub("latest(?=\\.osm\\.pbf$)", version, url, perl = TRUE)
}


# Starting from sf 1.0.2, sf::st_read raises a warning message when both layer
# and query arguments are set, while it raises a warning in sf < 1.0.2 when
# there are multiple layers and the layer argument is not set. See also
8 changes: 8 additions & 0 deletions man/oe_get.Rd

3 changes: 3 additions & 0 deletions man/oe_get_network.Rd

17 changes: 16 additions & 1 deletion man/oe_match.Rd

19 changes: 19 additions & 0 deletions tests/testthat/test-match.R
@@ -214,3 +214,22 @@ test_that("oe_match_pattern: test spatial combine", {
MI_PA = sf::st_sfc(milan, palermo, crs = 4326)
expect_identical(oe_match_pattern(MI_PA)$geofabrik, c("Europe", "Italy"))
})

test_that("oe-match: detecting version works", {
latest_match <- oe_match("Italy", quiet = TRUE)
expect_true(grepl("latest", latest_match$url))

version2020_match <- oe_match("Italy", quiet = TRUE, version = "200101")
expect_true(grepl("200101", version2020_match$url))
})

test_that("oe-match: warning with version and provider", {
expect_warning(
oe_match("Leeds", provider = "bbbike", version = "2", quiet = TRUE),
regexp = "version != 'latest' is only supported for 'geofabrik' provider."
)
expect_warning(
oe_match("Lombardia", version = "ABC", quiet = TRUE),
regexp = "version != 'latest' is only supported for 'geofabrik' provider."
)
})
28 changes: 28 additions & 0 deletions vignettes/osmextract.Rmd
@@ -268,6 +268,34 @@ Finally, to reduce unnecessary computational resources and save bandwidth/electr
(its_details = oe_match("ITS Leeds"))
```

### Matching historical OSM extracts

Starting from `osmextract` v0.5.2, the `version` argument can be used to match historical OSM extracts stored by the Geofabrik provider. The default value is `"latest"`, which corresponds to the most recent OSM extract. Other values can be specified using the format `"YYMMDD"`. The extracts available for each zone can be browsed from the Geofabrik [website](https://download.geofabrik.de/).

For example:

```{r}
oe_match("Italy", quiet = TRUE)$url
oe_match("Italy", version = "200101", quiet = TRUE)$url # OSM data up to January 1st 2020
oe_match(c(9.1916, 45.4650), version = "210101", quiet = TRUE)$url
```
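
The `"YYMMDD"` strings do not need to be typed by hand. The following is a minimal sketch, assuming you start from a calendar date; `version_from_date()` is a hypothetical helper written for this example and is not part of `osmextract`. Note that Geofabrik does not necessarily store an extract for every date, so the resulting string should still be checked against the list of available files.

```r
# Hypothetical helper (not exported by osmextract): convert a date into the
# "YYMMDD" string expected by the `version` argument.
version_from_date <- function(date) {
  # %y = two-digit year, %m = two-digit month, %d = two-digit day
  format(as.Date(date), "%y%m%d")
}

version_from_date("2020-01-01")
version_from_date(as.Date("2021-01-01"))
```

The returned string can then be passed as `version` to `oe_match()`.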

Unfortunately, Geofabrik is the only provider which currently stores historical OSM extracts. Therefore, `version != "latest"` is ignored (with a warning message) if you select a different provider

```{r}
oe_match("Leeds", provider = "bbbike", version = "200101")
```

or if the input `place` is not matched with the Geofabrik provider.

```{r}
oe_match("Leeds", version = "200101")
```
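
The fallback rule described above can be illustrated in isolation. The sketch below is a simplified stand-in modelled on the internal check; `check_version_sketch()` is a name introduced here for illustration only, and the real function may differ in detail. It also shows one way to capture the warning with `withCallingHandlers()` so that the fallback value can be inspected programmatically.

```r
# Simplified stand-in for the internal check (illustration only).
check_version_sketch <- function(version, provider) {
  if (version != "latest" && provider != "geofabrik") {
    warning(
      "version != 'latest' is only supported for 'geofabrik' provider. ",
      "Overriding it to 'latest'.",
      call. = FALSE
    )
    return("latest")
  }
  version
}

# Intercept the warning, report it as a message, and keep the returned value.
fallback <- withCallingHandlers(
  check_version_sketch("200101", provider = "bbbike"),
  warning = function(w) {
    message("Caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
fallback

# With the Geofabrik provider the requested version is returned unchanged.
unchanged <- check_version_sketch("200101", provider = "geofabrik")
unchanged
```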

<!-- TODO: Get in contact with Geofabrik to check the status of historical OSM extract. -->

Beware that the default value, i.e. `"latest"`, selects an _always evolving_ OSM extract. Historical OSM extracts, on the other hand, are static and may therefore be preferable for reproducibility purposes.
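
To make the difference between the two kinds of URL concrete, the following sketch reproduces the substitution that turns a `latest` URL into a versioned one. It is modelled on the `adjust_version_in_url()` helper added in `R/utils.R`; the function name `adjust_version_sketch()` and the example URL are assumptions used only for illustration.

```r
# Stand-in for the internal URL adjustment (illustration only).
adjust_version_sketch <- function(version, url) {
  if (version == "latest") {
    return(url)
  }
  # Replace "latest" only when it immediately precedes the final ".osm.pbf",
  # so that any other occurrence of "latest" in the URL is left untouched.
  gsub("latest(?=\\.osm\\.pbf$)", version, url, perl = TRUE)
}

latest_url <- "https://download.geofabrik.de/europe/italy-latest.osm.pbf"
pinned_url <- adjust_version_sketch("200101", latest_url)

basename(latest_url)
basename(pinned_url)
```

Recording the versioned URL (or the `version` string itself) alongside an analysis is one simple way to document exactly which extract was used.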

## `oe_download()`: Download OSM extracts

The `oe_download()` function is used to download `.pbf` files representing OSM extracts.