
Streaming BulkLoad for large table imports #523

Closed
chdh opened this issue Mar 2, 2017 · 1 comment

@chdh
Collaborator

chdh commented Mar 2, 2017

The current BulkLoad implementation requires that all data rows be accumulated in memory before any of them can be sent to the database server. This is not suitable for large table imports.

A streaming bulk insert implementation must use data flow control to apply back pressure when rows arrive faster than they can be sent to the database. This could be modeled after the standard Node.js stream.Writable interface: write() returns false to tell the producer to pause, and a 'drain' event signals that it may resume.
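
As a rough sketch of those mechanics (not the tedious API — BulkLoadStream and sendRow are made-up names), an object-mode stream.Writable subclass gets exactly this flow control for free from the standard Writable contract:

```js
const { Writable } = require('stream');

// Hypothetical sketch only. In object mode, write() returns false once
// the internal buffer exceeds highWaterMark, and 'drain' is emitted when
// the pending rows have been flushed — the producer never needs to know
// how fast the connection is.
class BulkLoadStream extends Writable {
  constructor(sendRow) {
    super({ objectMode: true, highWaterMark: 16 });
    this.sendRow = sendRow; // (row, done) => void — stands in for the TDS layer
  }

  // Called once per row; invoking the callback only after the row has
  // been handed to the connection is what throttles fast producers.
  _write(row, _encoding, callback) {
    this.sendRow(row, callback);
  }
}
```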

It has to be determined whether a future BulkLoad streaming class should extend the stream.Writable class of the Node.js API (or the readable-stream package), or just implement the same interface. The objective is to make it compatible with stream.Readable.pipe().
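
For illustration, pipe() compatibility with the sketch above could then look like this (the row source and the no-op sender are placeholders, not part of tedious):

```js
const { Readable } = require('stream');

// Any object-mode Readable of row objects can feed the sink; back
// pressure is handled entirely by pipe().
let i = 0;
const rowSource = new Readable({
  objectMode: true,
  read() {
    if (i < 1000000) {
      this.push({ id: i, value: 'row ' + i });
      i++;
    } else {
      this.push(null); // end of data
    }
  },
});

const sink = new BulkLoadStream((row, done) => setImmediate(done));

rowSource.pipe(sink)
  .on('finish', () => console.log('all rows sent'))
  .on('error', (err) => console.error(err));
```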

@arthurschreiber
Collaborator

I'll take another look soon and will let you know how I think this would be best implemented.
