Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Response format (JSON / CSV) #22

Merged
merged 4 commits into from
Aug 2, 2022
Merged

Conversation

kokokuo
Copy link
Contributor

@kokokuo kokokuo commented Jul 15, 2022

Description

Support to convert json and csv format when response result. Notion link

Use Case

  • Step 1 - Define the config for supported response content types
  • Step 2 - Tell API what content format would like to respond

Definition

Define the schema for supported response content types.

# supported response format
response-format:
   enabled: true.  # default is true, if not use built-in response format, disabled it and customize by yourself.
   options: [json, csv]

Request API to know the response format

Tell API what content format would like to respond

  • Method 1: Pass Accept Header

    • JSON format > { 'Accpet': application/json }
    • CSV format > { 'Accpet': text/csv } ( download file )
  • Method 2: explore the format in the end of API route path, e.g: /orders API

    • default format is json.
    • explore format csv > /orders.csv ( download file )
    • explore format json > /orders.json
  • When Accept and the ending format of API path is different, no need to throw errors, make the URL path priority high to get.

  • Accept Header is an array, pick the first format type response-fromat supported, if not, then throw errors.

How To Test / Expected Results

For the test result, please see the below test cases that passed the unit test:

Core package

螢幕快照 2022-07-15 下午9 33 20

Serve package

螢幕快照 2022-07-15 下午9 32 33

Commit Message

  • 6898f8c - feat(core, serve): add csv and json response format middleware.
    • update IDataSource to response clear data result type.
    • add toJsonResponse and toCsvResponse function to do format and save to koa context.
    • add csv and json response middleware.
    • add test cases of csv and json response middleware.
    • add test cases of csv format and json format function.
  • 10dc76d - fix(serve): fix flowing mode backpressure issue for csv, json stream, refactor response format middleware structure.
    • build csv transform and json string transform by extending stream transform and delegate pipe to handle the backpressure issue.
    • refactor to use formatter class BaseResponseFormatter for structure way to handle formatting json JsonFormatter and csv CsvFormatter.
    • create response format middleware to load formatter to check format.
    • remove original csv and json format middleware.
    • refactor to inject app config, not only middleware config.
    • merge middleware and response-formatter load extension to the same one.
  • bbcd1c2 - fix(serve): fix the middleware and original utils response test cases.
    • update the csv and json stream test cases for calling csv, json formatter.
    • add response format helpers test cases.
    • refactor all middlewares to pass app config.
    • refactor array to stream function in test-utils.ts.
  • b3fea7c - feat(serve): refactor load extensions in serve
    • remove loadUsableFormatters and create common extensions load function.`

@kokokuo kokokuo marked this pull request as ready for review July 15, 2022 13:33
@kokokuo kokokuo requested a review from oscar60310 July 15, 2022 13:33
Copy link
Contributor

@oscar60310 oscar60310 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work! I have some suggestion about node stream ~

dataStream: Stream.Readable;
columns: DataColumn[];
}) => {
const csvStream = new Stream.Readable({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using Transform Steam instead of creating another readable steam?

We used "Flowing Mode" of stream (listening on data event), but we didn't handle backpressure via .pause() .resume() functions. It might let us push too much data than the buffer should contain.

Here are some experiences to show how to handle backpressure and their result:

  1. Push without pause: Like our case now, we always keep pushing data even if the buffer is full.

    const Stream = require('stream');
    const dest = require('fs').createWriteStream('test.json', 'utf8');
    
    let n = 1e7;
    const source = new Stream.Readable({
      objectMode: true,
      read() {
        source.push({ data: n-- });
        if (n === 0) {
          this.push(null);
        }
      }
    });
    
    
    const csvStream = new Stream.Readable({
      objectMode: false,
      read() {
        return null;
      }
    });
    
    source
      .on('data', function (dataRow) {
        csvStream.push(JSON.stringify(dataRow), 'utf-8');
      })
      .on('end', () => {
        csvStream.push(null);
      });
    
    csvStream.on('end', () => {
      const used = process.memoryUsage().heapUsed / 1024 / 1024;
      console.log(`${Math.round(used * 100) / 100} MB`);
    }).pipe(dest)
    
    // 1463.76 MB
  2. Handle Backpressure via .pause .resume functions:

    const Stream = require('stream');
    const dest = require('fs').createWriteStream('test.json', 'utf8');
    
    let n = 1e7;
    const source = new Stream.Readable({
      objectMode: true,
      read() {
        source.push({ data: n-- });
        if (n === 0) {
          this.push(null);
        }
      }
    });
    
    
    const csvStream = new Stream.Readable({
      objectMode: false,
      read() {
        source.resume();
      }
    });
    
    source
      .on('data', function (dataRow) {
        const mightContinue = csvStream.push(JSON.stringify(dataRow), 'utf-8');
        if (!mightContinue) {
          this.pause();
        }
      })
      .on('end', () => {
        csvStream.push(null);
      });
    
    
    csvStream.on('end', () => {
      const used = process.memoryUsage().heapUsed / 1024 / 1024;
      console.log(`${Math.round(used * 100) / 100} MB`);
    }).pipe(dest)
    
    // 16.35 MB
  3. Use transform stream

    const Stream = require('stream');
    const dest = require('fs').createWriteStream('test.json', 'utf8');
    
    let n = 1e7;
    const source = new Stream.Readable({
      objectMode: true,
      read() {
        source.push({ data: n-- });
        if (n === 0) {
          this.push(null);
        }
      }
    });
    
    const csvStream = new Stream.Transform({
      objectMode: true,
      transform(chunk, encoding, callback) {
        callback(null, JSON.stringify(chunk), 'utf-8');
      }
    });
    
    
    source.pipe(csvStream).on('end', () => {
      const used = process.memoryUsage().heapUsed / 1024 / 1024;
      console.log(`${Math.round(used * 100) / 100} MB`);
    }).pipe(dest)
    
    // 14.55 MB

Here are some useful links about node stream:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wwwy3y3 Could you also show some experience with Node Steams?

Copy link
Contributor Author

@kokokuo kokokuo Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @oscar60310 for reviewing and giving suggestions, I have read the article your provided, and also refactor the part by extending Stream.Transform to build a CsvTransfom to use, then delegate pipe method to handle the back pressure.

Btw, I refactor the original structure, now I make the ResponseFormatMiddleware to load built-in formatters and extension formatters to transform the format and add to ctx response.

so you could check the code in CsvFormatter, not the original respondToCtx code, thanks!

read: () => null,
});

const dataResult: string[] = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not push data into memory while handling streams. We can use 3rd party or Transform steam to convert JSON objects to JSON strings.

Here are some expriences using Transform steams:

  1. Origin: Push into memory:
    const Stream = require('stream')
    const source = new Stream.Readable({
      objectMode: true,
    });
    
    const dest = require('fs').createWriteStream('test.json', 'utf8');
    
    const jsonStream = new Stream.Readable({
      objectMode: false,
      read: () => null,
    });
    
    
    const toBuffer = (str) => {
      return Buffer.from(str, 'utf8');
    };
    
    const dataResult = [];
    // Read data stream and convert the format to json format stream.
    source
      // assume data stream "objectMode" is true to get data row directly. e.g: { name: 'jack', age: 18, hobby:['book', 'travel'] }
      .on('data', (dataRow) => {
        // collect and stringify all data rows
        dataResult.push(JSON.stringify(dataRow));
      })
      .on('end', () => {
        // transform format data to buffer
        jsonStream.push(toBuffer('['));
        jsonStream.push(toBuffer(dataResult.join()));
        jsonStream.push(toBuffer(']'));
        jsonStream.push(null);
      });
    
    let n = 1e7;
    while (n--) {
      source.push({ data: n });
    }
    source.push(null);
    
    
    
    jsonStream.on('end', () => {
      const used = process.memoryUsage().heapUsed / 1024 / 1024;
      console.log(`${Math.round(used * 100) / 100} MB`);
    }).pipe(dest)
    
    // 422.04 MB
  2. Transform steam:
    const Stream = require('stream')
    const source = new Stream.Readable({
      objectMode: true,
    });
    const dest = require('fs').createWriteStream('test.json', 'utf8');
    
    
    const toBuffer = (str) => {
      return Buffer.from(str, 'utf8');
    };
    
    const jsonToString = new Stream.Transform({
      objectMode: true,
      construct(callback) {
        this.first = true;
        callback(null);
      },
      transform(chunk, encoding, callback) {
        if (this.first) {
          this.push(toBuffer('['));
          this.first = false;
        } else {
          this.push(toBuffer(','));
        }
        this.push(toBuffer(JSON.stringify(chunk)))
        callback(null);
      },
      final(callback) {
        this.push(toBuffer(']'))
        callback(null);
      }
    })
    
    let jsonStream = source.pipe(jsonToString);
    
    
    let n = 1e7;
    while (n--) {
      source.push({ data: n });
    }
    source.push(null);
    
    
    jsonStream.on('end', () => {
      const used = process.memoryUsage().heapUsed / 1024 / 1024;
      console.log(`${Math.round(used * 100) / 100} MB`);
    }).pipe(dest);
    
    // 20.52 MB

Copy link
Contributor Author

@kokokuo kokokuo Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for @oscar60310 reviewing and suggestion, same as above, I build a JsonStringTransfom by extending Stream.Transform, then delegate pipe method to handle the back pressure in flowing mode.

Also, now I use the ResponseFormatMiddleware to load built-in formatters and extension formatters to transform the format and add to ctx response.

so you could check the code in JsonFormatter, not the original respondToJson code, thanks!

* add the custom header key "X-Response-Formatted" to notice other response middleware
*/
protected setFormattedNotice(context: KoaRouterContext) {
context.response.set(this.formattedIdentifier, 'true');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to save the value to context directly instead of setting it in headers because we don't need to expose the internal context to clients.

context[this.formattedIdentifier] =  true;

Copy link
Contributor Author

@kokokuo kokokuo Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing and suggestion, because now I use the ResponseFormatMiddleware to load built-in formatters and extension formatters, then according to the user-provided support formats to check that all formatters which formatted or not, so you will see the new design way to handle it.

The handle method in ResponseFormatMiddleware like below:

public async handle(context: KoaRouterContext, next: RouteMiddlewareNext) {
    // return to skip the middleware, if disabled
    if (!this.enabled) return next();

    const formatters = await loadUsableFormatters(this.config.extensions);
    // get supported and request format to use.
    const format = checkUsableFormat({
      context,
      formatters,
      supportedFormats: this.supportedFormats,
      defaultFormat: this.defaultFormat,
    });

    context.request.path = context.request.path.split('.')[0];
    // go to next to run middleware and route
    await next();
    // format the response and route handler ran.
    formatters[format].formatToResponse(context);
    return;
  }

Comment on lines 82 to 94
protected setFormattedNotice(context: KoaRouterContext) {
context.response.set(this.formattedIdentifier, 'true');
}

/**
* add the custom header key "X-Response-Formatted" to notice other response middleware
*/
protected isResponseFormatted(context: KoaRouterContext) {
if (context.response.headers[this.formattedIdentifier] === 'true')
return true;
return false;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's wise to check whether the response had been formatted before doing the job, but since we should have only ONE formator at the same time, could we use another middleware to decide which format we should use?

getFormatMiddleware.ts

  public async handle(context: KoaRouterContext, next: RouteMiddlewareNext) {
    const format = context.request.url.getExtension(); // json / csv .....
    context.request.url = context.request.url.split(`.${this.format}`)[0];
    context.format = context.request.accepts(this.format);
    if(!context.format) throw error
    return next()
  }
}

jsonResponseMiddleware.ts

  public async handle(context: KoaRouterContext, next: RouteMiddlewareNext) {
    if (!this.enabled) return next();
    if (!this.isFormatSupported()) return next(); // Check context.format === 'json'
    await next();

    respondToJson(context);

  }

In this design, we can set the priority of each formator by changing the order of "format" property. i.e. ['csv', 'json'] / ['json', 'csv']

Copy link
Contributor Author

@kokokuo kokokuo Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing and suggestion, after the previous discussion, I have changed to a similar way, but not all delegated to middleware, I created the BaseResponseFormatter and make the ResponseFormatMiddleware to load all formatters which inherit BaseResponseFormatter.

const request = context.request;

//if url is end with the format, start to formatting
if (request.url.endsWith(`.${this.format}`)) return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do request.url contain parameters like /foo/bar.json?limit=1,offset=0 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing and finding issue, I have been fixed, please check it

Comment on lines 28 to 40
// middleware is enabled or not, default is enabled beside you give "enabled: false" in config.
protected enabled: boolean;
constructor(name: string, config: MiddlewareConfig) {
super(name, config);

const value = this.getConfig()?.['enabled'] as boolean;
this.enabled = isUndefined(value) ? true : value;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did it mean we only have enabled option for built-in middlewares?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because built-in middlewares have been added default, so we provide enabled to let user could close if they would like to build itself, otherwise, the enabled for customized middlewares may use for other purposes, so I do not enforce the customized middlewares also use enabled.

csvStream.push(toBuffer('\n'));
})
.on('error', (err: Error) => {
logger.debug(`read stream failed, detail error ${err}`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use logger.warn instead to indicate the error in production env?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing and suggesting, I have changed to logger.warn

})
.on('end', () => {
csvStream.push(null);
logger.info('convert to csv format stream > done.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: In my opinion, the log prints during the regular user journey should be at "debug" level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing and suggesting, I have changed it to logger.debug.

ctx.response.body = stream;
ctx.response.set(
'Content-disposition',
`attachment; filename=${filename}.csv`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a greate idea to set a filename for CSV API!


/* istanbul ignore file */
export const arrayToStream = (data: Array<any>): Stream => {
return from2.obj(function (_size, next) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: We can easily create Readable strean without "from2" lib like this:

const arrayToStream = (data) => {
  return new Stream.Readable({
    objectMode: true,
    read() {
      this.push(data.pop() || null);
    }
  });
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for suggesting, I have removed from2 to create it, thanks !

kokokuo added 3 commits July 28, 2022 12:29
- update "IDataSource" to response clear data result type.
- add "toJsonResponse" and "toCsvResponse" function to do format and save to koa context.
- add csv and json response middleware.
- add test cases of csv and json response middleware.
- add test cases of csv format and json format function.
… refactor response format middleware structure.

- build csv transform and json string transform by extending stream transform and delegate pipe to handling the backpressue issue.
- refactor to use formatter class "BaseResponseFormatter" for structure way to handle formatting json "JsonFormatter" and csv "CsvFormatter".
- create response format middleware to load formatter to check format.
- remove original csv and json format middleware.
- refactor to inject app config, not only middleware config.
- merge middleware and response-formatter load extension to the same one.
- update the csv and json stream test cases for calling csv, json formatter.
- add response format helpers test cases.
- refactor all middlewares to pass app config.
- refactor array to stream function in "test-utils.ts".
@kokokuo kokokuo force-pushed the feature/response-format branch from ebc253e to bbcd1c2 Compare July 28, 2022 04:31
@kokokuo
Copy link
Contributor Author

kokokuo commented Jul 28, 2022

Currently, I keep the config in an original way e.g: response-format and support formats still put in middlewares in the config until the first united extension sample and code structure appeared.

- remove "loadUsableFormatters" and create common extensions load function.
@oscar60310 oscar60310 merged commit c85e6c0 into develop Aug 2, 2022
@oscar60310 oscar60310 deleted the feature/response-format branch August 2, 2022 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants