Azure Functions (node) doesn't process binary data properly #294

Closed
christian-vorhemus opened this issue Mar 31, 2020 · 14 comments · Fixed by #618
Labels
P1 v4 model 🚀 Related to the new V4 programming model

@christian-vorhemus
Member

christian-vorhemus commented Mar 31, 2020

UPDATE: Per final design, we added a property request.bufferBody that you can use to access the binary data


My goal is to write a function which processes the binary HTTP body (for example, an image). I've created the function using the func CLI (version 2.7.2254) doing the following:

func init --worker-runtime "node" --language "typescript"
func new --name "SampleFunction" --template "HTTP trigger"
npm install
npm install @types/node --save

My function.json looks like this:

{
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "dataType": "binary",
      "direction": "in",
      "name": "req",
      "methods": [
        "get",
        "post"
      ]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "res"
    }
  ],
  "scriptFile": "../dist/SampleFunction/index.js"
}

My expectation would be that now req.body is a Buffer but when I check the type
typeof req.body;

it seems to be a string. Therefore, my attempt to save this file to disk like you see below in my index.ts also fails:

import { AzureFunction, Context, HttpRequest } from "@azure/functions"
const fs = require('fs');

const httpTrigger: AzureFunction = async function (context: Context, req: HttpRequest): Promise<void> {
    context.log('HTTP trigger function processed a request.');
    var file = req.body;
    fs.writeFileSync("file.png", file, 'binary');
};

export default httpTrigger;

Test:

curl -X POST --data-binary "@C:\path\to\image.png" http://localhost:7071/api/SampleHttpFunction

What's the correct way of processing binary data in a node based Azure Function?

@ghost ghost assigned yojagad Mar 31, 2020
@yojagad
Contributor

yojagad commented Mar 31, 2020

Can you try this with the Content-Type set to "application/octet-stream" ?

@christian-vorhemus
Member Author

christian-vorhemus commented Mar 31, 2020

Indeed, if the content type is set to "application/octet-stream" it works.

But why does it fail when Content-Type is more specific, for example "image/png"? "application/octet-stream" isn't telling the web app anything other than it is some binary data. If req.headers looks like this, it fails:

{
  "cache-control": "no-cache",
  "connection": "keep-alive",
  "content-type": "image/png",
  "accept": "*/*",
  "accept-encoding": "gzip, deflate, br",
  "authorization": "Bearer U0V3Q",
  "host": "localhost:7071",
  "user-agent": "PostmanRuntime/7.24.0",
  "content-length": "11460",
  "postman-token": "897aaa99-f0c5-40fb-80db-3d7b24a01572"
}

Or is the function host logic simply saying: If content type equals "application/octet-stream" treat HTTP body as binary file, otherwise as string?

@yojagad
Contributor

yojagad commented Mar 31, 2020

Yeah, there's specific logic around content-type "application/octet-stream" and those starting with "multipart" (https://github.com/Azure/azure-functions-host/blob/dev/src/WebJobs.Script/Extensions/HttpRequestExtensions.cs).
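The rule described here can be sketched roughly as follows. This is a paraphrase of the comment, not the host's actual code: the helper name `isTreatedAsBinary` is hypothetical, and stripping parameters and lowercasing the header value are assumptions made for the illustration.

```typescript
// Hypothetical helper: a paraphrase of the rule described in the comment above,
// NOT the implementation in azure-functions-host.
function isTreatedAsBinary(contentType: string | undefined): boolean {
  if (!contentType) {
    return false;
  }
  // Assumption: ignore parameters such as "; boundary=..." and compare case-insensitively.
  const mediaType = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  return mediaType === "application/octet-stream" || mediaType.startsWith("multipart");
}

console.log(isTreatedAsBinary("application/octet-stream"));        // true
console.log(isTreatedAsBinary("multipart/form-data; boundary=x")); // true
console.log(isTreatedAsBinary("image/png"));                       // false
```

Under this reading, a body sent as "image/png" falls through to the string path, which matches the behavior reported above.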

I'll go ahead and close this thread since this solved the issue. Please feel free to reopen/start new thread if you have any more questions around this.

@yojagad yojagad closed this as completed Mar 31, 2020
@christian-vorhemus
Member Author

Alright, then just out of interest, is this behavior documented somewhere? Because I'd assume this can be a potential pitfall for users who send data to an Azure Function by a browser (at least the browsers I've seen will by default set the Content-Type header based on the actual file type and not only to "application/octet-stream") or use e.g. Postman for testing.

And why do we set "dataType": "binary" in the function.json when apparently the Content-Type header decides how the HTTP body is treated?

@yojagad
Contributor

yojagad commented Apr 2, 2020

Tagging @mhoeger and @pragnagopa who might have answers to this.

@pragnagopa
Member

reopening and moving to nodejs worker repo to provide a code sample.
Using isRaw flag: https://docs.microsoft.com/en-us/azure/azure-functions/functions-reference-node#response-object should provide data as buffer. @mhoeger can you confirm?

@pragnagopa pragnagopa reopened this Apr 2, 2020
@pragnagopa pragnagopa transferred this issue from Azure/azure-functions-host Apr 2, 2020
@pragnagopa pragnagopa assigned mhoeger and unassigned yojagad Apr 2, 2020
@mhoeger
Contributor

mhoeger commented Apr 2, 2020

"isRaw" is about the response but not the request. I also need to double check if that guidance is now deprecated, because "isRaw" was a V1 feature.

The "dataType" property is applied by some bindings, but isn't for the HttpBinding, which is misleading :\ agreed that we should document which ones we special-case to send as binary.

@mhoeger
Contributor

mhoeger commented Mar 26, 2021

@alrod and @AnatoliB - this will be super important to get into the next major version release; it's very much a pain today. We could either make "body" always the raw version or actually populate "rawBody" with the bytes version. Relatedly, most other language workers consume raw bytes for the body, but we couldn't do the same here because of legacy behavior.

@midanilo

midanilo commented May 25, 2022

Same problem here. I spent the past day searching Google for this strange behavior, because my binary upload doesn't have the same size and contents as the original.
Luckily, I found this open ticket with the undocumented tip to set the content-type to application/octet-stream, but if this is behavior known to Microsoft, it could at least be documented.
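A size and content mismatch like this is what one would expect if the bytes are decoded as text at some point. The sketch below assumes UTF-8 decoding (the thread does not state which encoding is used) and shows that the PNG signature does not survive a round trip through a string.

```typescript
// The eight-byte signature at the start of every PNG file.
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Assumption: the body is decoded as UTF-8 text before it reaches the function.
// 0x89 is not valid UTF-8 on its own, so it becomes the replacement character U+FFFD.
const asText = original.toString("utf8");

// Re-encoding the text yields three bytes (EF BF BD) where the single byte 0x89 used to be.
const roundTripped = Buffer.from(asText, "utf8");

console.log(original.length);               // 8
console.log(roundTripped.length);           // 10
console.log(original.equals(roundTripped)); // false
```

Once the replacement has happened, the original byte cannot be recovered from the string, so the fix has to be on the receiving side of the request rather than in the function code.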

@haukepribnow

I also ran into this issue and was very surprised to learn about this unexpected behavior.

After all, the current behavior is very surprising: Even if dataType is set to "binary" in the function.json, then req.rawBody still remains a string in any case? And req.body also remains a string unless the request is of type application/octet-stream? This does not feel right and does also not seem to be documented - at least not here: https://docs.microsoft.com/en-us/azure/azure-functions/functions-triggers-bindings#trigger-and-binding-definitions

In my scenario, I don't have control over the caller, i.e. I cannot enforce that my Function will get Content-Type: application/octet-stream. Since Proxies have been made obsolete with Functions v4, the only chance I see is to plug in a hefty API middleware - which seems to be overkill for my scenario.

I understand that backwards compatibility is important. So maybe a solution could be to introduce req.bodyBuffer - which is guaranteed to always contain a Buffer object of the body.

@ejizba
Contributor

ejizba commented Jun 15, 2022

Hi folks, we're working on a new programming model for Node.js, and one of the benefits will be to improve the experience around backwards compatibility and breaking changes. We should be able to react much faster to requests like this, and you should have more choice about when you adopt the breaking changes. See here for more details: #568

It's one part of our larger effort (tracked by #480) which is currently under way. We will go through the list of breaking changes as we're finishing up the new model and likely address a good chunk of them.

@ejizba
Contributor

ejizba commented Aug 24, 2022

FYI we took two approaches to address this issue:

  1. In the existing programming model, we added a new bufferBody property to the http request object. The existing body and rawBody properties should behave exactly the same, but the new property should always return a buffer. This has been merged, but will take a month or two to roll out in Azure.
  2. In the new programming model, we refactored the http request type in Azure/azure-functions-nodejs-library@8b75882 to have better naming and be closer to the fetch standard. Minor details may still change as we go through a testing/feedback phase, but the main idea should be done. I've started a doc here with more information: https://aka.ms/AzFuncNodeV4
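For the second approach, a fetch-style request would let a handler read the body as bytes no matter which Content-Type was sent. The sketch below assumes the request exposes an `arrayBuffer()` method as in the fetch standard; the `FetchStyleRequest` interface and the `readBodyBytes` helper are hypothetical names used only for illustration, and the real request type comes from the library.

```typescript
// Hypothetical structural type: only the one fetch-style method this sketch relies on.
interface FetchStyleRequest {
  arrayBuffer(): Promise<ArrayBuffer>;
}

// Read the whole body as a Node.js Buffer, independent of the Content-Type header.
async function readBodyBytes(request: FetchStyleRequest): Promise<Buffer> {
  return Buffer.from(await request.arrayBuffer());
}

// Stand-in request carrying the first four bytes of a PNG file.
const pngStart = new ArrayBuffer(4);
new Uint8Array(pngStart).set([0x89, 0x50, 0x4e, 0x47]);
const fakeRequest: FetchStyleRequest = {
  arrayBuffer: async () => pngStart,
};

readBodyBytes(fakeRequest).then((bytes) => {
  console.log(bytes.length); // 4
});
```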

@andersonmorony

andersonmorony commented Aug 25, 2022

I can't believe it. I tried everything (fs, Buffer, every library) to read an xls file and nothing worked. All I had to do was add content-type: application/octet-stream to the header, and it worked!

Thanks!

@ejizba ejizba removed the breaking label Aug 26, 2022
@ejizba
Contributor

ejizba commented Dec 5, 2022

Hi all, just FYI the change for this just finished rolling out in Azure as a part of host v4.14.0. Binary data will always be available in the new req.bufferBody property.
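A minimal sketch of saving an upload via `req.bufferBody` might look like this. The `RequestWithBufferBody` interface and the `saveUpload` helper are hypothetical and exist only so the example runs on its own; in a real function the request object is supplied by the runtime. Treating `bufferBody` as optional is an assumption, made to cover hosts older than the version mentioned above.

```typescript
import { writeFileSync, readFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

// Hypothetical stand-in for the request type; only the property used here.
interface RequestWithBufferBody {
  bufferBody?: Buffer;
}

// Write the raw request bytes to disk and return how many bytes were written.
function saveUpload(req: RequestWithBufferBody, path: string): number {
  const bytes = req.bufferBody;
  if (!bytes) {
    // Assumption: the property is absent on hosts that predate the rollout.
    throw new Error("bufferBody is not available on this request");
  }
  // No encoding argument: a Buffer is written byte for byte.
  writeFileSync(path, bytes);
  return bytes.length;
}

// Usage with a fake request carrying the first four bytes of a PNG file.
const target = join(tmpdir(), "upload-example.png");
const written = saveUpload({ bufferBody: Buffer.from([0x89, 0x50, 0x4e, 0x47]) }, target);
console.log(written); // 4
```

Passing the Buffer straight to `writeFileSync` avoids the string conversion that caused the original problem.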
