-
-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: added transform option #195
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -78,6 +78,8 @@ the 4th byte in the stream. | |
|
||
Enable or disable etag generation, defaults to true. | ||
|
||
This will default to `false` if the [`transform` option is passed](#transform). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This wording is misleading; if the |
||
|
||
##### extensions | ||
|
||
If a given file doesn't exist, try appending one of the given extensions, | ||
|
@@ -104,6 +106,8 @@ in preferred order. | |
Enable or disable `Last-Modified` header, defaults to true. Uses the file | ||
system's last modified value. | ||
|
||
This will default to `false` if the [`transform` option is passed](#transform). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This wording is misleading; if the |
||
|
||
##### maxAge | ||
|
||
Provide a max-age in milliseconds for http caching, defaults to 0. | ||
|
@@ -119,6 +123,55 @@ Serve files relative to `path`. | |
Byte offset at which the stream starts, defaults to 0. The start is inclusive, | ||
meaning `start: 2` will include the 3rd byte in the stream. | ||
|
||
##### transform | ||
|
||
A function which returns a stream which can be used to transform the data. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It may be useful to point to what, exactly, "a stream" means. I assume it means it needs to be a Node.js ReadableStream ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see it is addressed below, which seems like duplicate wording. I guess that it is explained down below, but maybe it doesn't need to be listed twice, idk on this comment, but left it for posterity. |
||
If this option is passed it will change the defaults of `etag` and `lastModified` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This wording is misleading; if the |
||
to `false`. If the transform stream changes the content length, it is also | ||
responsible for updating the `Content-Length` header or changing to | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is likely to be a large foot gun. Is there some way to just make this work without adding such a potential issue? For example, for the intended use cases of this, is it more likely than not that the content length will be different from before? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I thought hard on this. I don't think there is a good solution which doesn't rely on the user setting it if they are modifying the content. So in my case, I need to buffer the entire file to parse the json and then transform. In this case, I can calculate the new length, but if you are doing something like escaping content in a CSV you might be able to do it mid stream and in that way you would not know the content length in advance. In this case, you might consider moving to chunked encoding to avoid it. One of the reference implementations just does that for you, but that is also very presumptuous. TBQH, I feel that this is an advanced feature, requiring advanced understanding. That justifies, to me, documenting it and letting users "do the right thing" for their use case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe this feature is being implemented on the "wrong side" of this module's operations. As in, maybe the transformation should happen prior to the stream entering into this module, perhaps solving the virtual fs stuff folks want can also solve this as well, including keeping all the features of this module working an intact, like ranges, etags, etc. I'm thinking like in a way, the transform is a kind of virtual fs in itself, where the contents, at least, are virtual. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, well if we allowed you to pass in the reference to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So to expand on what I mean by a virtual fs is not those enormous things. More like providing an interface that is much simpler to do the little but that this module needs like "hey, give me a readable stream for the file of this name" or some such. In that way, the returned stream in this case could have all the transforms added at that point, which would be below where things like ranging and similar are done. If the "virtual" thing didn't understand (or able to support) ranges, it would just return the full stream, for example. I am just thinking out loud here, to be honest. I just think having the transform part between this module and the fs seems like a better fit to get all the features working, but happy to hear other solutions. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ahh, ok then yeah I think I can get on board with that. So we would need There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right. And to also be clear: I don't think we even need to copy those names / those specific interfaces. We can "dumb it down" to what is actually needed for this module to do it's operations and let the "default impl" map to those I know we need to fix the TOCTOU issue, so perhaps we can roll those things together. But even if not, we would do well to make this interface not prevent us from resolving that issue, at least :) To be specific, the underlying changes that were being held is to (1) open a fd to the fs path, (2) stat the fd and then (3) read from the fd. I suppose this simplified interface could provide the same mechanical interface or something. |
||
`Transfer-Encoding: chunked`. | ||
|
||
The transform function is given the readable stream from `fs.createReadStream`, | ||
the `response` object and, the `options` passed to `send()`. It must return | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the purpose of passing in the |
||
a `ReadableStream`, you can even just return the `stream` passed to you. | ||
|
||
If you expect the transform to always return the same results, you can re-enable | ||
`etag` and `lastModified` and to use the underlying file as it normally would. | ||
|
||
*Note on range requests:* When a range request is sent, the read stream will | ||
only read part of the file. If your transform is preforming replacements, or | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This would violate the actual http spec, as the range is on the actual recieved output. This will be a large issue for example if the transform doubled the output (maybe encode as hex); the web browser will send a range way past the end of the actual underlying file. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you think defaulting There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If transform starts removing pretty much all the features of this module, at what point is the feature actually just a separate functionality? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, default is not removal. In my use case I actually would force set There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, to clarify i mean the removal as in what happens by default when the transform option is enabled. Yes, folks can then understand the foot guns and re enable certain things, but many will use the defaults. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But I think the defaults are good mostly with the If you are using the The more I think about this, the more I think this is the right approach to a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I disagree on some of this; I think we can make a transform work with all the features if we implement it instead along the lines of a virtual fs, as I poked at in another comment, and not like this implementation. This implementation, ultimately, is just a simple solution that ends up having a lot of pitfalls I think don't need to be there if done differently. Range requests are not uncommon in practice; the module implements specifically what is not uncommon around the web. A very common point (besides audio and video files) that use range requests are pdf files too. That is how the web browser can display the first page of a 50mb pdf without waiting around forever. The browsers make use of range requests for quite a variety of file types under the hood. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I also generally think it is bad API design for the default setting of one option to change based on what another, different option is set to, as it just adds confusion as to what exactly is going to happen with a set of options, as the defaults are now dynamic to other factors. |
||
otherwise relying on always having complete content it might not work. You | ||
can set `acceptRanges: false` to disallow range requests or you will have | ||
to ensure that your transform logic can handle partial data. For an example | ||
of a transform which can work on range requests, see the test labled | ||
`should transform range requests`. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looking at the test, it is honestly hard to tell what exactly is going on there. Is there a better real-world transform like the tobi one that can be used instead? This is mainly because the docs are pointing folks to look there. |
||
|
||
Example: | ||
|
||
```javascript | ||
var http = require('http') | ||
var Readable = require('stream').Readable | ||
var concatStream = require('concat-stream') | ||
var send = require('send') | ||
|
||
http.createServer(function (req, res) { | ||
send(req, req.url, { | ||
transform: function (stream, res, opts) { | ||
var readStream = new Readable({ | ||
read: function () { } | ||
}) | ||
stream.pipe(concatStream(function (d) { | ||
var _d = Buffer.from(d.toString('utf8').replace('tobi', 'not tobi')) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I assume that this example does not work with range requests; just above this it says if it doesn't work with them, be sure to turn them off. So the example should probably head the notice from the docs above as a good example :) |
||
res.setHeader('Content-Length', _d.length) | ||
readStream.push(_d) | ||
readStream.push(null) | ||
})) | ||
return readStream | ||
} | ||
}).pipe(res) | ||
}) | ||
``` | ||
|
||
#### Events | ||
|
||
The `SendStream` is an event emitter and will emit the following events: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -102,6 +102,10 @@ function SendStream (req, path, options) { | |
this.path = path | ||
this.req = req | ||
|
||
this._transform = opts.transform !== undefined | ||
? opts.transform | ||
: undefined | ||
|
||
this._acceptRanges = opts.acceptRanges !== undefined | ||
? Boolean(opts.acceptRanges) | ||
: true | ||
|
@@ -112,7 +116,7 @@ function SendStream (req, path, options) { | |
|
||
this._etag = opts.etag !== undefined | ||
? Boolean(opts.etag) | ||
: true | ||
: (this._transform === undefined) | ||
|
||
this._dotfiles = opts.dotfiles !== undefined | ||
? opts.dotfiles | ||
|
@@ -147,7 +151,7 @@ function SendStream (req, path, options) { | |
|
||
this._lastModified = opts.lastModified !== undefined | ||
? Boolean(opts.lastModified) | ||
: true | ||
: (this._transform === undefined) | ||
|
||
this._maxage = opts.maxAge || opts.maxage | ||
this._maxage = typeof this._maxage === 'string' | ||
|
@@ -794,6 +798,14 @@ SendStream.prototype.stream = function stream (path, options) { | |
|
||
// pipe | ||
var stream = fs.createReadStream(path, options) | ||
if (this._transform) { | ||
// When using a transform stream, the implementor is | ||
// responsible for setting content length or any other | ||
// headers for their stream type. | ||
res.removeHeader('Content-Length') | ||
|
||
stream = this._transform(stream, res, options) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So since the original |
||
} | ||
this.emit('stream', stream) | ||
stream.pipe(res) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, my editor removed the trailing white space. This seems like a good thing, but I can make a separate commit if you think it worth it.