Feature: Response format (JSON / CSV) #22
Conversation
Fantastic work! I have some suggestions about Node streams ~
dataStream: Stream.Readable;
columns: DataColumn[];
}) => {
const csvStream = new Stream.Readable({
How about using a Transform stream instead of creating another Readable stream?
We used the "flowing mode" of the stream (listening on the data event), but we didn't handle backpressure via the .pause() and .resume() functions. It might let us push more data than the buffer should contain.
Here are some experiments showing how to handle backpressure, and their results:
- Push without pause: Like our case now, we always keep pushing data even if the buffer is full.
const Stream = require('stream');
const dest = require('fs').createWriteStream('test.json', 'utf8');
let n = 1e7;
const source = new Stream.Readable({
  objectMode: true,
  read() {
    source.push({ data: n-- });
    if (n === 0) {
      this.push(null);
    }
  }
});
const csvStream = new Stream.Readable({
  objectMode: false,
  read() {
    return null;
  }
});
source
  .on('data', function (dataRow) {
    csvStream.push(JSON.stringify(dataRow), 'utf-8');
  })
  .on('end', () => {
    csvStream.push(null);
  });
csvStream.on('end', () => {
  const used = process.memoryUsage().heapUsed / 1024 / 1024;
  console.log(`${Math.round(used * 100) / 100} MB`);
}).pipe(dest) // 1463.76 MB
- Handle backpressure via the .pause() and .resume() functions:
const Stream = require('stream');
const dest = require('fs').createWriteStream('test.json', 'utf8');
let n = 1e7;
const source = new Stream.Readable({
  objectMode: true,
  read() {
    source.push({ data: n-- });
    if (n === 0) {
      this.push(null);
    }
  }
});
const csvStream = new Stream.Readable({
  objectMode: false,
  read() {
    source.resume();
  }
});
source
  .on('data', function (dataRow) {
    const mightContinue = csvStream.push(JSON.stringify(dataRow), 'utf-8');
    if (!mightContinue) {
      this.pause();
    }
  })
  .on('end', () => {
    csvStream.push(null);
  });
csvStream.on('end', () => {
  const used = process.memoryUsage().heapUsed / 1024 / 1024;
  console.log(`${Math.round(used * 100) / 100} MB`);
}).pipe(dest) // 16.35 MB
- Use a Transform stream:
const Stream = require('stream');
const dest = require('fs').createWriteStream('test.json', 'utf8');
let n = 1e7;
const source = new Stream.Readable({
  objectMode: true,
  read() {
    source.push({ data: n-- });
    if (n === 0) {
      this.push(null);
    }
  }
});
const csvStream = new Stream.Transform({
  objectMode: true,
  transform(chunk, encoding, callback) {
    callback(null, JSON.stringify(chunk));
  }
});
source.pipe(csvStream).on('end', () => {
  const used = process.memoryUsage().heapUsed / 1024 / 1024;
  console.log(`${Math.round(used * 100) / 100} MB`);
}).pipe(dest) // 14.55 MB
Here are some useful links about Node streams:
@wwwy3y3 Could you also share some experiments with Node streams?
Thanks @oscar60310 for reviewing and giving suggestions. I have read the article you provided, and also refactored that part by extending Stream.Transform to build a CsvTransform, then delegated the pipe method to handle the backpressure.
Btw, I refactored the original structure: now the ResponseFormatMiddleware loads built-in formatters and extension formatters to transform the format and add it to the ctx response, so you could check the code in CsvFormatter, not the original respondToCtx code, thanks!
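A minimal sketch of what such a Transform-based CSV formatter could look like (the class name, constructor shape, and the JSON.stringify-based escaping here are my assumptions for illustration, not the PR's actual code):

```typescript
import { Transform, TransformCallback } from 'stream';

// Hypothetical sketch: object rows in, CSV text out.
// Because it is a Transform, pipe() propagates backpressure for us.
class CsvTransform extends Transform {
  private headerSent = false;

  constructor(private columns: string[]) {
    super({ writableObjectMode: true, readableObjectMode: false });
  }

  _transform(row: Record<string, unknown>, _enc: BufferEncoding, cb: TransformCallback) {
    if (!this.headerSent) {
      // emit the header line once, before the first data row
      this.push(this.columns.join(',') + '\n');
      this.headerSent = true;
    }
    // Naive escaping via JSON.stringify; real CSV quoting needs more care.
    this.push(this.columns.map((c) => JSON.stringify(row[c] ?? '')).join(',') + '\n');
    cb();
  }
}
```

A source object stream could then be wired as `dataStream.pipe(new CsvTransform(columns))`, with no manual .pause()/.resume() bookkeeping.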
read: () => null,
});

const dataResult: string[] = [];
We should not push data into memory while handling streams. We can use a 3rd-party lib or a Transform stream to convert JSON objects to JSON strings.
Here are some experiments using Transform streams:
- Original approach: push into memory:
const Stream = require('stream')
const source = new Stream.Readable({
  objectMode: true,
});
const dest = require('fs').createWriteStream('test.json', 'utf8');
const jsonStream = new Stream.Readable({
  objectMode: false,
  read: () => null,
});
const toBuffer = (str) => {
  return Buffer.from(str, 'utf8');
};
const dataResult = [];
// Read data stream and convert the format to a json format stream.
source
  // assume data stream "objectMode" is true to get each data row directly, e.g. { name: 'jack', age: 18, hobby: ['book', 'travel'] }
  .on('data', (dataRow) => {
    // collect and stringify all data rows
    dataResult.push(JSON.stringify(dataRow));
  })
  .on('end', () => {
    // transform formatted data to buffers
    jsonStream.push(toBuffer('['));
    jsonStream.push(toBuffer(dataResult.join()));
    jsonStream.push(toBuffer(']'));
    jsonStream.push(null);
  });
let n = 1e7;
while (n--) {
  source.push({ data: n });
}
source.push(null);
jsonStream.on('end', () => {
  const used = process.memoryUsage().heapUsed / 1024 / 1024;
  console.log(`${Math.round(used * 100) / 100} MB`);
}).pipe(dest) // 422.04 MB
- Transform stream:
const Stream = require('stream')
const source = new Stream.Readable({
  objectMode: true,
});
const dest = require('fs').createWriteStream('test.json', 'utf8');
const toBuffer = (str) => {
  return Buffer.from(str, 'utf8');
};
const jsonToString = new Stream.Transform({
  objectMode: true,
  construct(callback) {
    this.first = true;
    callback(null);
  },
  transform(chunk, encoding, callback) {
    if (this.first) {
      this.push(toBuffer('['));
      this.first = false;
    } else {
      this.push(toBuffer(','));
    }
    this.push(toBuffer(JSON.stringify(chunk)))
    callback(null);
  },
  final(callback) {
    this.push(toBuffer(']'))
    callback(null);
  }
})
let jsonStream = source.pipe(jsonToString);
let n = 1e7;
while (n--) {
  source.push({ data: n });
}
source.push(null);
jsonStream.on('end', () => {
  const used = process.memoryUsage().heapUsed / 1024 / 1024;
  console.log(`${Math.round(used * 100) / 100} MB`);
}).pipe(dest); // 20.52 MB
Thanks @oscar60310 for reviewing and the suggestion. Same as above, I built a JsonStringTransform by extending Stream.Transform, then delegated the pipe method to handle the backpressure in flowing mode.
Also, now I use the ResponseFormatMiddleware to load built-in formatters and extension formatters to transform the format and add it to the ctx response, so you could check the code in JsonFormatter, not the original respondToJson code, thanks!
* add the custom header key "X-Response-Formatted" to notice other response middleware
*/
protected setFormattedNotice(context: KoaRouterContext) {
context.response.set(this.formattedIdentifier, 'true');
I'd prefer to save the value to the context directly instead of setting it in headers, because we don't need to expose the internal context to clients.
context[this.formattedIdentifier] = true;
Thanks for reviewing and the suggestion. Now I use the ResponseFormatMiddleware to load built-in formatters and extension formatters, then check each formatter against the user-provided supported formats to decide which one should format the response, so you will see the new design for handling it.
The handle method in ResponseFormatMiddleware looks like below:
public async handle(context: KoaRouterContext, next: RouteMiddlewareNext) {
// return to skip the middleware, if disabled
if (!this.enabled) return next();
const formatters = await loadUsableFormatters(this.config.extensions);
// get the supported and requested format to use.
const format = checkUsableFormat({
context,
formatters,
supportedFormats: this.supportedFormats,
defaultFormat: this.defaultFormat,
});
context.request.path = context.request.path.split('.')[0];
// go to next to run middleware and route
await next();
// format the response after the route handler has run.
formatters[format].formatToResponse(context);
return;
}
protected setFormattedNotice(context: KoaRouterContext) {
context.response.set(this.formattedIdentifier, 'true');
}

/**
* add the custom header key "X-Response-Formatted" to notice other response middleware
*/
protected isResponseFormatted(context: KoaRouterContext) {
if (context.response.headers[this.formattedIdentifier] === 'true')
return true;
return false;
}
}
It's wise to check whether the response has already been formatted before doing the job, but since we should have only ONE formatter at a time, could we use another middleware to decide which format we should use?
getFormatMiddleware.ts
public async handle(context: KoaRouterContext, next: RouteMiddlewareNext) {
const format = context.request.url.getExtension(); // json / csv .....
context.request.url = context.request.url.split(`.${this.format}`)[0];
context.format = context.request.accepts(this.format);
if(!context.format) throw error
return next()
}
}
jsonResponseMiddleware.ts
public async handle(context: KoaRouterContext, next: RouteMiddlewareNext) {
if (!this.enabled) return next();
if (!this.isFormatSupported()) return next(); // Check context.format === 'json'
await next();
respondToJson(context);
}
In this design, we can set the priority of each formatter by changing the order of the "format" property, i.e. ['csv', 'json'] / ['json', 'csv'].
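The priority idea above could be sketched as a small helper (the function and parameter names are hypothetical, not the project's actual API; the real checkUsableFormat may behave differently):

```typescript
// Hypothetical format negotiation sketch.
// `supported` is in priority order, e.g. ['csv', 'json'].
const pickFormat = (
  urlFormat: string | undefined,   // extension parsed from the URL path
  acceptedFormats: string[],       // formats from the Accept header
  supported: string[],
): string => {
  // The URL extension wins when it is one of the supported formats.
  if (urlFormat && supported.includes(urlFormat)) return urlFormat;
  // Otherwise the first supported format the client accepts wins,
  // so reordering `supported` changes the priority.
  for (const fmt of supported) {
    if (acceptedFormats.includes(fmt)) return fmt;
  }
  throw new Error('no usable response format');
};
```

With supported = ['csv', 'json'], a client accepting both gets csv; flipping the array flips the default.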
Thanks for reviewing and the suggestion. After the previous discussion, I have changed to a similar way, but not everything is delegated to middleware: I created the BaseResponseFormatter and made the ResponseFormatMiddleware load all formatters which inherit BaseResponseFormatter.
const request = context.request;

// if the url ends with the format, start formatting
if (request.url.endsWith(`.${this.format}`)) return true;
Does request.url contain parameters like /foo/bar.json?limit=1,offset=0?
Thanks for reviewing and finding the issue. It has been fixed, please check it.
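One hedged way to parse the extension while ignoring the query string (a hypothetical helper for illustration; the PR's actual fix may differ):

```typescript
// Hypothetical sketch: split a request URL into path and format extension.
const getUrlFormat = (url: string): { path: string; format?: string } => {
  // Drop the query string first, so /foo/bar.json?limit=1 is not
  // treated as having the extension "json?limit=1".
  const [pathPart] = url.split('?');
  const dot = pathPart.lastIndexOf('.');
  // Note: this naive version would also split paths like /v1.2/orders.
  if (dot === -1) return { path: pathPart };
  return { path: pathPart.slice(0, dot), format: pathPart.slice(dot + 1) };
};
```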
// middleware is enabled or not; the default is enabled unless you give "enabled: false" in the config.
protected enabled: boolean;
constructor(name: string, config: MiddlewareConfig) {
super(name, config);

const value = this.getConfig()?.['enabled'] as boolean;
this.enabled = isUndefined(value) ? true : value;
}
Did you mean we only have the enabled option for built-in middlewares?
Yes. Because built-in middlewares are added by default, we provide enabled so users can disable them if they would like to build their own. Otherwise, the enabled option for customized middlewares may be used for other purposes, so I do not enforce that customized middlewares also use enabled.
csvStream.push(toBuffer('\n'));
})
.on('error', (err: Error) => {
logger.debug(`read stream failed, detail error ${err}`);
Can we use logger.warn instead to indicate the error in the production env?
Thanks for reviewing and suggesting, I have changed it to logger.warn.
})
.on('end', () => {
csvStream.push(null);
logger.info('convert to csv format stream > done.');
NIT: In my opinion, logs printed during the regular user journey should be at the "debug" level.
Thanks for reviewing and suggesting, I have changed it to logger.debug.
ctx.response.body = stream;
ctx.response.set(
'Content-disposition',
`attachment; filename=$(unknown).csv`
It's a great idea to set a filename for the CSV API!
packages/serve/test/test-utils.ts

/* istanbul ignore file */
export const arrayToStream = (data: Array<any>): Stream => {
return from2.obj(function (_size, next) {
NIT: We can easily create a Readable stream without the "from2" lib like this:
const arrayToStream = (data) => {
return new Stream.Readable({
objectMode: true,
read() {
this.push(data.pop() || null);
}
});
}
Thanks for suggesting, I have removed from2 and created the stream directly, thanks!
- update "IDataSource" to respond with a clear data result type.
- add "toJsonResponse" and "toCsvResponse" functions to do the formatting and save it to the koa context.
- add csv and json response middleware.
- add test cases for the csv and json response middleware.
- add test cases for the csv format and json format functions.
… refactor response format middleware structure.
- build csv transform and json string transform by extending stream transform and delegating pipe to handle the backpressure issue.
- refactor to use the formatter class "BaseResponseFormatter" as a structured way to handle formatting json ("JsonFormatter") and csv ("CsvFormatter").
- create a response format middleware that loads formatters to check the format.
- remove the original csv and json format middleware.
- refactor to inject the app config, not only the middleware config.
- merge the middleware and response-formatter extension loading into the same one.
- update the csv and json stream test cases to call the csv and json formatters.
- add response format helpers test cases.
- refactor all middlewares to pass the app config.
- refactor the array-to-stream function in "test-utils.ts".
Force-pushed from ebc253e to bbcd1c2
Currently, I keep the config in the original way, e.g.:
- remove "loadUsableFormatters" and create common extensions load function.
Description
Support converting the response result to json and csv format. Notion link
Use Case
Definition
Define the schema for supported response content types.
Request API to know the response format
Tell the API what content format it should respond with:
- Method 1: Pass the Accept header, e.g. { 'Accept': application/json } or { 'Accept': text/csv } (download file).
- Method 2: Put the format at the end of the API route path. E.g. for the /orders API with default format json:
  - .csv > /orders.csv (download file)
  - .json > /orders.json
- When the Accept header and the ending format of the API path are different, there is no need to throw errors; the URL path has the higher priority.
- When the Accept header is an array, pick the first format type that response-format supports; if none is supported, then throw errors.
How To Test / Expected Results
For the test result, please see the below test cases that passed the unit test:
Core package
Serve package
Commit Message
- update IDataSource to respond with a clear data result type.
- add toJsonResponse and toCsvResponse functions to do the formatting and save it to the koa context.
- use the formatter class BaseResponseFormatter as a structured way to handle formatting json (JsonFormatter) and csv (CsvFormatter).
- refactor the array-to-stream function in test-utils.ts.
- remove loadUsableFormatters and create a common extensions load function.