I was looking for an easy way to handle CSV as strings (not files) today, and had to hack together a solution based on query_csv(), below.
This could be a documentation issue (or a lack of attention on my end), but it seems there are low-level (browser) APIs and a high-level (Node) API, with no middle ground.
It might be nice to expose some friendly API to support middle ground use cases. The boilerplate is rather long now.
Or, for my case, just support CSV strings, if that is in scope of the library.
If I'm missing something and there is an easier solution, please do share. :-)
Sorry for the TypeScript. If you're interested, I can try to submit a pure-JS PR, provided you share your preferences on the API design.
```typescript
interface QueryFileOptions {
  /**
   * Whether the CSV file has a header row.
   * Default false.
   */
  hasHeader: boolean;
  /**
   * Comment prefix to treat lines as comments.
   * Lines starting with this prefix will be ignored.
   */
  commentPrefix: string;
  /**
   * Input CSV encoding. Default 'utf-8'.
   */
  encoding: 'utf-8' | 'latin-1' | 'binary';
  /**
   * Input CSV field delimiter. Default ','.
   */
  inputDelimiter: string;
  /**
   * Input CSV parsing policy. Default 'quoted_rfc'.
   */
  inputPolicy: 'simple' | 'quoted' | 'quoted_rfc';
  /**
   * Output CSV field delimiter. Default `inputDelimiter`.
   */
  outputDelimiter: string;
  /**
   * Output CSV formatting policy. Default `inputPolicy`.
   */
  outputPolicy: 'simple' | 'quoted' | 'quoted_rfc';
}

function loadOptions(query: string, partialOptions?: Partial<QueryFileOptions>): QueryFileOptions {
  const defaultOptions: Partial<QueryFileOptions> = {
    hasHeader: true,
    encoding: 'utf-8',
    inputDelimiter: ',',
    inputPolicy: 'quoted_rfc'
  };
  const options = { ...defaultOptions, ...partialOptions };
  options.outputDelimiter ??= options.inputDelimiter;
  options.outputPolicy ??= options.inputPolicy;

  if (options.encoding === 'latin-1') {
    options.encoding = 'binary';
  }
  if (options.inputDelimiter === '"' && options.inputPolicy === 'quoted') {
    throw new RbqlIOHandlingError('Double quote delimiter is incompatible with "quoted" policy');
  }
  if (!isAscii(query) && options.encoding === 'binary') {
    throw new RbqlIOHandlingError('To use non-ascii characters in query enable UTF-8 encoding instead of latin-1/binary');
  }
  if ((!isAscii(options.inputDelimiter!) || !isAscii(options.outputDelimiter!)) && options.encoding === 'binary') {
    throw new RbqlIOHandlingError('To use non-ascii characters in query enable UTF-8 encoding instead of latin-1/binary');
  }
  return options as QueryFileOptions;
}

// TODO: Extend for several tables?
class JoinRegistry implements RBQLJoinTableRegistry {
  private recordIterator: RBQLInputIterator;

  constructor(recordIterator: RBQLInputIterator) {
    this.recordIterator = recordIterator;
  }

  get_iterator_by_table_id(tableId: string) {
    if (tableId !== 'b') {
      throw new RbqlIOHandlingError(`Unable to find join table "${tableId}"`);
    }
    return this.recordIterator;
  }

  // TODO: Implement?
  get_warnings(_warnings: string[]) {}
}

async function queryImpl(
  query: string,
  inputStream: Readable,
  outputStream: Writable,
  options: QueryFileOptions,
  joinStream?: Readable
): Promise<void> {
  const inputIterator = new CSVRecordIterator(
    inputStream, null, options.encoding, options.inputDelimiter,
    options.inputPolicy, options.hasHeader, null, 'a', 'a'
  );
  const outputWriter = new CSVWriter(
    outputStream, true, options.encoding, options.outputDelimiter, options.outputPolicy
  );
  const joinRegistry = joinStream && new JoinRegistry(
    new CSVRecordIterator(
      joinStream, null, options.encoding, options.inputDelimiter,
      options.inputPolicy, options.hasHeader, null, 'b', 'b'
    )
  );
  // TODO: Handle warnings
  const warnings: string[] = [];
  // Promise passthrough, no await.
  return rbql.query(query, inputIterator, outputWriter, warnings, joinRegistry);
}

async function query(
  data: string,
  query: string,
  joinData?: string,
  partialOptions?: Partial<QueryFileOptions>
): Promise<string> {
  const options = loadOptions(query, partialOptions);
  const inputStream = Readable.from(data, { objectMode: false });
  const outputStream = new WritableStreamBuffer();
  const joinStream = joinData ? Readable.from(joinData, { objectMode: false }) : undefined;
  await queryImpl(query, inputStream, outputStream, options, joinStream);
  // TODO: Get rid of this ||.
  return outputStream.getContentsAsString(options.encoding) || '(Error!)';
}
```
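The snippet above calls `isAscii` without showing its definition. A minimal sketch of what such a helper might look like, assuming it only needs to check that every character is in the 7-bit ASCII range (this is a hypothetical stand-in, not necessarily the helper the snippet actually uses):

```typescript
// Hypothetical stand-in for the isAscii helper referenced in loadOptions.
// Returns true when every UTF-16 code unit of the string is below 0x80.
// The empty string counts as ASCII.
function isAscii(text: string): boolean {
  return /^[\x00-\x7F]*$/.test(text);
}

// A plain query passes; a query with an accented character does not.
const plainQueryOk = isAscii("select a1, a2");
const accentedQueryOk = isAscii("select a1 where a2 == 'naïve'");
```

With this definition, `loadOptions` would reject the second query only when the encoding is `'binary'` (or `'latin-1'`, which the snippet maps to `'binary'`).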
Thanks, I like this idea. The only thing I would change: I think streams can be more efficient than strings, and functionality-wise they are more basic/generic, so what we actually need is a stream API, similar to what your internal queryImpl does. As your snippet shows, it is fairly trivial to convert a string into a stream, so the API caller can easily do this outside the function.
IMO it should look something like this:

```js
async function query_csv_stream(query_text, input_stream, input_delim, input_policy, output_stream, output_delim, output_policy, output_warnings, join_stream=null, with_headers=false, comment_prefix=null, user_init_code='', options=null)
```
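To illustrate the point that string handling can stay on the caller's side of a stream API, here is a rough sketch using only Node's built-in `stream` module. The `StreamQuery` type, `StringSink`, `runOnStrings`, and the `passthrough` stand-in are all hypothetical names made up for this example; `query_csv_stream` is only a proposal at this point, so nothing here calls it.

```typescript
import { Readable, Writable } from "stream";
import { pipeline } from "stream/promises";

// Hypothetical shape of a stream-based query function (an assumption,
// much narrower than the signature proposed above).
type StreamQuery = (input: Readable, output: Writable) => Promise<void>;

// Writable that accumulates everything written to it in memory.
class StringSink extends Writable {
  private chunks: Buffer[] = [];

  _write(
    chunk: Buffer | string,
    encoding: BufferEncoding,
    callback: (error?: Error | null) => void
  ): void {
    this.chunks.push(typeof chunk === "string" ? Buffer.from(chunk, encoding) : chunk);
    callback();
  }

  contents(): string {
    return Buffer.concat(this.chunks).toString("utf-8");
  }
}

// Caller-side adapter: wrap the input string in a stream, run the
// stream-based function, and return the collected output as a string.
async function runOnStrings(run: StreamQuery, data: string): Promise<string> {
  const input = Readable.from(data, { objectMode: false });
  const sink = new StringSink();
  await run(input, sink);
  return sink.contents();
}

// Stand-in for a real query: copies input to output unchanged.
const passthrough: StreamQuery = (input, output) => pipeline(input, output);
```

A caller would then write something like `const csvOut = await runOnStrings(passthrough, "a,b\n1,2\n")`, swapping `passthrough` for a wrapper around the real stream API once it exists. Unlike the `|| '(Error!)'` fallback in the original snippet, an empty result here is simply an empty string.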