is it possible to do random reads of .tif or .nc files hosted on OSF Storage? #156

Open
mikoontz-vp opened this issue May 15, 2024 · 1 comment

Comments

@mikoontz-vp

Hi all,

This is such a great package and its existence makes it so much easier to recommend OSF as a hosting platform for projects.

I'm trying to figure out whether it is possible to read data, specifically raster layers, directly from OSF Storage without downloading the whole file first. For instance, if I host a .tif on AWS S3, I can use GDAL's /vsis3/ handler with the {terra} package to read the raster metadata without downloading the file, then crop it and download only the portion I need: r = terra::rast(glue::glue('s3://{bucket_name}/{raster_basename}')). Is something like this possible with the OSF API? If so, it could be a hugely valuable feature for {osfr}.

Here's an OSF page for an OSF Storage-hosted .tif in case that's handy for anyone: https://osf.io/kwreq

@mikoontz

mikoontz commented Nov 29, 2024

An update here: I recently learned that the /vsicurl/ approach works on a raster that has a direct http link. I also learned how to get a direct download link for an OSF id from CenterForOpenScience/osf.io#7020 and CenterForOpenScience/osf.io#8256.

So I'm able to do what I want using:

r <- terra::rast("/vsicurl/https://osf.io/download/kwreq")
r

Gives me:

class       : SpatRaster 
dimensions  : 104, 93, 2005  (nrow, ncol, nlyr)
resolution  : 11132, 11132  (x, y)
extent      : -434148, 601128, -656788, 500940  (xmin, xmax, ymin, ymax)
coord. ref. : NAD83 / California Albers (EPSG:3310) 
source      : kwreq 
names       : kwreq_1, kwreq_2, kwreq_3, kwreq_4, kwreq_5, kwreq_6, ... 
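
To tie this back to the original question (read the metadata, then crop before pulling data), here is a minimal sketch. The helper names osf_vsicurl() and read_osf_window() are hypothetical and not part of {osfr} or {terra}; the extent values are an arbitrary window chosen to fall inside the extent printed above. How much data is actually transferred depends on how the file is tiled internally and on the server honouring HTTP range requests, so treat this as an illustration rather than a guarantee of a partial download.

```r
# Hypothetical helper (not part of {osfr}): build a GDAL /vsicurl/ path
# from an OSF file id, using the https://osf.io/download/<id> pattern.
osf_vsicurl <- function(id) {
  stopifnot(is.character(id), length(id) == 1, nzchar(id))
  paste0("/vsicurl/https://osf.io/download/", id)
}

# Hypothetical helper: open the remote raster, keep only the requested
# layers, and crop them to an area of interest (a terra SpatExtent).
# Requires the {terra} package and network access when called.
read_osf_window <- function(id, aoi, layers = 1) {
  r <- terra::rast(osf_vsicurl(id))
  terra::crop(r[[layers]], aoi)
}

# Example call (needs network access); the window is an assumed subset
# of the EPSG:3310 extent shown above:
# aoi <- terra::ext(-100000, 100000, -100000, 100000)
# r_small <- read_osf_window("kwreq", aoi, layers = 1:3)
```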

Note that this also works for vector files, because it is GDAL that implements the /vsicurl/ method. This one is pretty big, so it might take a bit of time; it is probably a good example of the trade-off between downloading the file locally and then reading it versus using the vsi approach:

gpkg <- sf::read_sf("/vsicurl/https://osf.io/download/74xcw")
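
For comparison, the download-then-read alternative could look something like the sketch below. download_then_read() is a hypothetical helper, and the ".gpkg" extension is an assumption based on the variable name above; the OSF link itself does not state the file type.

```r
# Hypothetical helper: download a file to a temporary location, then hand
# the local path to any reader function (e.g. sf::read_sf, terra::rast).
download_then_read <- function(url, reader, fileext = "") {
  tmp <- tempfile(fileext = fileext)
  # mode = "wb" avoids corrupting binary files on Windows
  utils::download.file(url, tmp, mode = "wb", quiet = TRUE)
  reader(tmp)
}

# Example call (needs network access and the {sf} package);
# the ".gpkg" extension is assumed, not confirmed by the link:
# gpkg <- download_then_read("https://osf.io/download/74xcw", sf::read_sf, ".gpkg")
```

This pays the full download cost once, which tends to make sense when most of the file will be read anyway.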

And for completeness, we can also do this with .csv files by passing the https link directly to our file reading function:

csv <- readr::read_csv("https://osf.io/download/rgtvs")
csv_base <- utils::read.csv("https://osf.io/download/rgtvs")
