Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ColumnWriteBuffer recycling #395

Merged

Conversation

AndreyNudko
Copy link
Contributor

Return buffers to the pool after executing prepared INSERT statement, so they can be reused again even if statement is not closed

Return buffers to the pool after executing prepared INSERT statement, so they can be reused again even if statement is not closed
@AndreyNudko
Copy link
Contributor Author

One [I think fairly common] pattern for uploading data to db is something akin to the following pseudo-code

while(isRunnning) {
  try (Connection con = openConnection();
       PreparedStatement stmt = prepareInsertStatement(con))
  {
    while(isRunning) {
      pollSomeQueueAndBatchNextMessage(stmt);
      if (timeToWriteToDb()) {
        stmt.executeUpdate();
      }
    }
  } catch(SQLException e) {
    handleError(e);
  }
}

Essentially you prepare statement once and keep using it until there's some connection error, in which case you throw current connection away and start again.
It turns out even after PR #382 this pattern still causes some memory churn.
What happens is that executeUpdate() sets blockInit=false, then re-initilizes existing block; this works ok - re-init returns buffer to the pool and immediately obtains it from pool again.
However next setObject() calls initBlockIfPossible(), which throws current block away w/o returning ColumnWriterBuffer to the pool
It then creates new block, but now pool is empty and new ColumnWriteBuffer-s have to be instantiated
This PR improves ColumnWriteBuffer recycling by returning them to the pool when marking block as uninitialized
This should improve situation for code like above, as well as for cases when JDBC connection pool implements statement caching.

@codecov
Copy link

codecov bot commented Dec 25, 2021

Codecov Report

Merging #395 (bb66e4b) into master (830d40e) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #395   +/-   ##
=========================================
  Coverage     62.43%   62.43%           
  Complexity     1250     1250           
=========================================
  Files           135      135           
  Lines          6640     6640           
  Branches        516      516           
=========================================
  Hits           4146     4146           
  Misses         2219     2219           
  Partials        275      275           
Impacted Files Coverage Δ
...c/statement/ClickHousePreparedInsertStatement.java 53.21% <100.00%> (+0.30%) ⬆️
...housepower/jdbc/statement/ClickHouseStatement.java 54.65% <100.00%> (+0.53%) ⬆️
...rc/main/java/com/github/housepower/data/Block.java 89.70% <0.00%> (-1.48%) ⬇️
.../com/github/housepower/buffer/ByteArrayWriter.java 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 830d40e...bb66e4b. Read the comment docs.

@sundy-li sundy-li merged commit 2234592 into housepower:master Dec 25, 2021
@sundy-li
Copy link
Member

LGTM, thanks!

@AndreyNudko AndreyNudko deleted the improve_column_buffer_recycling branch December 25, 2021 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants