Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

body_config handling for new curl::form_data function #430

Closed
ampu3ro opened this issue Mar 10, 2017 · 20 comments
Closed

body_config handling for new curl::form_data function #430

ampu3ro opened this issue Mar 10, 2017 · 20 comments
Labels
reprex needs a minimal reproducible example

Comments

@ampu3ro
Copy link

ampu3ro commented Mar 10, 2017

@jeroenooms added a useful new field type for multipart forms that addresses the issue of defining content-type for character data. The form_data() function handles this nicely, but outputs a list with value of class "raw" and type of class "character". In order to play nice with httr, I think body_config() needs to be modified to not convert raw to character (line 58 here)

Say I want to POST this form

url <- "https://my_salesforce_instance/services/data/v39.0/sobjects/Document/"
header <-  httr::add_headers("my_salesforce_auth")

x <- data.frame(FolderId="a0a1300000ZG7u3",Name="test1.png")
json <- jsonlite::toJSON(unbox(x),pretty=T)
metadata <- curl:::form_data(as.character(json),"application/json")

tmp <- tempfile(fileext=".png")
png(tmp)
plot(mtcars)
dev.off()
media <- httr::upload_file(tmp)

body <- list(Info=metadata,Body=media)

Ideally I could write something like

httr::POST(url,header,body=body,encode="multipart",verbose())

but currently only this works

handle <- curl::new_handle()
req <- httr:::request(fields=body)
req <- httr:::request_build("POST",url,req,header)
httr:::request_perform(req,handle)
@hadley
Copy link
Member

hadley commented Jul 24, 2017

Could you please rework your reproducible example to use the reprex package ? That makes it easier to see both the input and the output, formatted in such a way that I can easily re-run in a local session.

@hadley hadley added the reprex needs a minimal reproducible example label Jul 24, 2017
@ampu3ro
Copy link
Author

ampu3ro commented Jul 24, 2017

Sorry, this still isn't exactly reproducible since I have to hide my authorization and you'd need an active Salesforce account, but if I knew of a public API that had this upload formatting, I'd use it. Anyway here's what reprex produced (didn't know about this package, but it's quite helpful!):

url <- "https://my_salesforce_instance/services/data/v39.0/sobjects/Attachment/"
header <-  httr::add_headers(c(Authorization="my_salesforce_auth"))

x <- data.frame(Name="test1.png",ParentId="a031300000GjJj0")
json <- jsonlite::toJSON(jsonlite::unbox(x),pretty=T)
metadata <- curl:::form_data(as.character(json),"application/json")

tmp <- tempfile(fileext=".png")
png(tmp)
plot(mtcars)
dev.off()
#> png 
#>   2
media <- httr::upload_file(tmp)

body <- list(Info=metadata,Body=media)

resp <- httr::POST(url,header,body=body,encode="multipart")
#> Warning in charToRaw(enc2utf8(val)): argument should be a character vector of length 1
#> all but the first element will be ignored
resp
#> Response [https://my_salesforce_instance/services/data/v39.0/sobjects/Attachment/]
#>   Date: 2017-07-24 14:04
#>   Status: 400
#>   Content-Type: application/json;charset=UTF-8
#>   Size: 104 B

handle <- curl::new_handle()
req <- httr:::request(fields=body)
req <- httr:::request_build("POST",url,req,header)
resp <- httr:::request_perform(req,handle)
resp
#> Response [https://my_salesforce_instance/services/data/v39.0/sobjects/Attachment/]
#>   Date: 2017-07-24 14:04
#>   Status: 201
#>   Content-Type: application/json;charset=UTF-8
#>   Size: 54 B
httr::content(resp)
#> $id
#> [1] "00P18000001DAIkEAO"
#> 
#> $success
#> [1] TRUE
#> 
#> $errors
#> list()

@jeroen
Copy link
Member

jeroen commented Jul 24, 2017

The latest version of OpenCPU supports form data as well to post binary payloads. Perhaps you can use this as a prototype.

# Post requests with various form-data types
data_arg <- protolite::serialize_pb(cars)
geom_arg <- jsonlite::toJSON(c("point", "smooth"))
h <- curl::handle_setform(new_handle(), x = 'speed', y = 'dist', 
                          data = curl::form_data(data_arg, type = 'application/protobuf'),
                          geom = curl::form_data(geom_arg, type = 'application/json'))
req <- curl::curl_fetch_memory('https://cran.ocpu.io/ggplot2::qplot', handle = h)

# Get the output
url <- paste0(curl::parse_headers_list(req$headers)$location, "graphics/1/pb")
req <- curl::curl_fetch_memory(url)
reqplot <- protolite::unserialize_pb(req$content)
print(reqplot)

@hadley
Copy link
Member

hadley commented Jul 24, 2017

I think this illustrates the basic problem:

citation <- upload_file(system.file("CITATION"))

POST("https://httpbin.org", 
  body = list(
    string = curl::form_data('{"a": 1}', type = "application/json"),
    file = upload_file(system.file("CITATION"))
  ),
  verbose()
)

@hadley
Copy link
Member

hadley commented Jul 24, 2017

@jeroen is there a good way to test this? When posting to httpbin, I see:

>> --------------------------bd484675b8965050
>> Content-Disposition: form-data; name="y"
>> Content-Type: application/json
>> 
>> {"a": 1}

which is ok, I think. But the returned json has

 $ form   :List of 2
  ..$ x: chr "1"
  ..$ y: chr "{\"a\": 1}"

@hadley
Copy link
Member

hadley commented Jul 24, 2017

The fix is simple, as @ampu3ro it's just a matter of switching from request(fields = lapply(body, as.character)) to request(fields = body). The only downside we now lose coercion for (e.g.) numeric and logical. It's probably better to require the user to send a character or raw vector, but I'd prefer not to change the API.

@jeroen would you consider adding support for other atomic vectors in handle_setform()?

@jeroen
Copy link
Member

jeroen commented Jul 24, 2017

@hadley I think you just need to add another if to https://github.com/r-lib/httr/blob/master/R/body.R#L9 ?

@hadley
Copy link
Member

hadley commented Jul 24, 2017

@jeroen
Copy link
Member

jeroen commented Jul 24, 2017

How does that not break form_file() values? To support form_data() you just need to do the same as you do for form_file()...

@hadley
Copy link
Member

hadley commented Jul 24, 2017

Well it needs a fix there too - but you can also have a list, for form_file() and form_data() values.

@hadley
Copy link
Member

hadley commented Jul 24, 2017

Well maybe, does it make sense to send a single unnamed form_data() object? Probably not, since there are existing ways to send raw data with a specified content type.

@jeroen
Copy link
Member

jeroen commented Jul 24, 2017

No, both form_data() and form_file() only make sense as pieces of a multipart form post.

@hadley
Copy link
Member

hadley commented Jul 24, 2017

httr allows you to use form_file() to submit a single file as the body of a request.

@jeroen
Copy link
Member

jeroen commented Jul 24, 2017

Right you probably shouldn't have used the form_file class to send single files in body_config(). They are really intended to be used by libcurl as part of a multipart form. It probably would have been better if users use base::file() to send binary payloads.

Anyway that ship has probably sailed. The important thing is to not touch the form_data and form_file objects when you pass them to curl::handle_setform().

@hadley
Copy link
Member

hadley commented Jul 24, 2017

Right, so would you consider coercing numeric and logical vectors automatically? I'd prefer not to make an API change in a widely used package (even though using things other than character vectors is likely to be rare)

@jeroen
Copy link
Member

jeroen commented Jul 24, 2017

@hadley
Copy link
Member

hadley commented Jul 24, 2017

It just seems easier to rely on curl to do all the coercion, so that there's no need to make changes in the future - whatever curl supports, httr will support.

@jeroen
Copy link
Member

jeroen commented Jul 24, 2017

Hmm I prefer curl to be a bit more conservative and actually require that the user specifies the data as a string or raw vector. Auto coercing other types leads to ambiguity w.r.t. serialization or number formatting and what not.

It seems the fix is easy on the httr side without need to change the api at all?

@hadley
Copy link
Member

hadley commented Jul 24, 2017

Sure, if you don't want to coerce, it's fine. I can do it.

@hadley hadley closed this as completed in c8f7a69 Jul 24, 2017
@hadley
Copy link
Member

hadley commented Jul 24, 2017

This API will be completely reworked in httr2.

jiwalker-usgs pushed a commit to jiwalker-usgs/httr that referenced this issue Jul 24, 2017
When create the fields for a multipart body request. Fixes r-lib#430.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reprex needs a minimal reproducible example
Projects
None yet
Development

No branches or pull requests

3 participants