should ok set a user agent #125

Closed
maelle opened this issue Nov 13, 2019 · 1 comment

maelle (Member) commented Nov 13, 2019

url <- "https://doi.org/10.1093/chemse/bjq042"

crul::HttpClient$new(url)$head()$status_code
#> Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle): GnuTLS recv error (-54): Error in the pull function.

crul::HttpClient$new(url, 
                     opts = list(useragent = "my-header"))$head()$status_code
#> [1] 200

Created on 2019-11-13 by the reprex package (v0.3.0)

sckott (Collaborator) commented Nov 14, 2019

good question.

ok does set a user agent string by default, like:

ok("https://google.com", verbose=TRUE)
#> > HEAD / HTTP/1.1
#> Host: google.com
#> User-Agent: libcurl/7.54.0 r-curl/4.2 crul/0.9.0.9100

the user can change the ua string like:

ok("https://google.com", useragent = "hello world", verbose = TRUE)
#> > HEAD / HTTP/1.1
#> Host: google.com
#> User-Agent: hello world

In your example URL above, I guess we can't do anything automatically, but we can document this. For example, tell users that a FALSE may be misleading depending on their use case: if they want to know whether curl-based scraping will work without fiddling with curl options, then the FALSE is probably correct; but if they are willing to fiddle with curl options, the first step would be to pass verbose = TRUE so they can see what's going on with any redirects and headers. The docs can then cover user agent strings and the fact that some websites block requests based on them.
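A rough sketch of the kind of check the docs could describe, assuming `ok()` keeps accepting curl options such as `useragent` and `verbose` as in the calls above. `compare_user_agents` and its `check` argument are hypothetical names made up for illustration, not part of crul:

```r
# Hypothetical helper (not part of crul): run the same check twice,
# once with the default user agent and once with a custom one.
# `check` defaults to crul::ok but can be swapped for any function
# with a compatible signature.
compare_user_agents <- function(url, ua = "my-header", check = crul::ok) {
  # Treat an error (e.g. a TLS failure) the same as a FALSE result
  safe_check <- function(...) {
    isTRUE(tryCatch(check(url, ...), error = function(e) FALSE))
  }
  c(default = safe_check(), custom = safe_check(useragent = ua))
}

if (interactive()) {
  url <- "https://doi.org/10.1093/chemse/bjq042"
  # First step: look at the request headers and any redirects
  crul::ok(url, verbose = TRUE)
  # Then see whether a different user agent string changes the answer
  compare_user_agents(url)
}
```

If `default` and `custom` differ, the site is likely reacting to the user agent string, so a FALSE from a plain `ok(url)` call says more about the default request than about the URL itself.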

sckott added this to the v1.0 milestone on Jan 27, 2020
sckott closed this as completed in ac681bb on Jul 8, 2020