Bulk insertion
ClickHouse is specialized in storing huge volumes of logs/metrics data. While it accepts single-row inserts (and this client supports them), the specialized MergeTree family of engines works best when data is inserted in bulk. ClickHouse.Client supports this scenario via the specialized `ClickHouseBulkCopy` class.
Using `ClickHouseBulkCopy` requires:

- Target connection (a `ClickHouseConnection` instance)
- Target table name (the `DestinationTableName` property)
- Data source (an `IDataReader` or `IEnumerable<object[]>`; an `IDataReader` sketch follows the example below)

Notes:

- `ClickHouseBulkCopy` utilizes TPL Dataflow to process batches of data, in 4 parallel insertion 'threads' by default
- The following parameters are tweakable: `BatchSize`, `MaxDegreeOfParallelism`
- Before copying, a `SELECT * FROM <table> LIMIT 0` query is performed to get information about the target table structure. The types of the provided objects must (reasonably) match the target table structure
```csharp
using System;
using System.Linq;
using ClickHouse.Client.ADO;
using ClickHouse.Client.Copy;

using var connection = new ClickHouseConnection("Host=<host>;Driver=Binary;<..other..>");

using var bulkCopyInterface = new ClickHouseBulkCopy(connection)
{
    DestinationTableName = "<database>.<table>",
    BatchSize = 100000
};

// Example data to test with: a single Int64 column
var count = 100000;
var values = Enumerable.Range(0, count).Select(i => new object[] { (long)i });

await bulkCopyInterface.WriteToServerAsync(values);
Console.WriteLine(bulkCopyInterface.RowsWritten);
```
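
The data source does not have to be an in-memory collection: per the list above, an `IDataReader` can also be used, and the degree of parallelism can be tuned via `MaxDegreeOfParallelism`. Below is a minimal sketch, assuming `WriteToServerAsync` has an overload accepting an `IDataReader` (as implied above); host, database, and table names are placeholders.

```csharp
using System;
using ClickHouse.Client.ADO;
using ClickHouse.Client.Copy;

using var sourceConnection = new ClickHouseConnection("Host=<source-host>;Driver=Binary");
using var targetConnection = new ClickHouseConnection("Host=<target-host>;Driver=Binary");

// Read rows from a source table; any ADO.NET IDataReader could be used here
await sourceConnection.OpenAsync();
using var command = sourceConnection.CreateCommand();
command.CommandText = "SELECT * FROM <database>.<source_table>";
using var reader = command.ExecuteReader();

using var bulkCopy = new ClickHouseBulkCopy(targetConnection)
{
    DestinationTableName = "<database>.<target_table>",
    BatchSize = 100000,
    MaxDegreeOfParallelism = 2 // throttle parallel insertion 'threads' (default is 4)
};

// Assumption: an IDataReader overload of WriteToServerAsync, as implied by the notes above
await bulkCopy.WriteToServerAsync(reader);
Console.WriteLine(bulkCopy.RowsWritten);
```

Keeping `BatchSize` large means fewer, bigger inserts against the MergeTree engine, which matches the bulk-insertion recommendation at the top of this page.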