@maskit
Member

@maskit maskit commented Nov 6, 2017

delete_stream() removes a stream from priority_tree, but here the stream destroys itself without calling delete_stream(). The destroyed stream can then still be used while HTTP2_SESSION_EVENT_XMIT is being processed.

I think this fixes #2207.
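
For readers following along, the failure mode described above can be sketched in isolation. This is a minimal toy model, not the Traffic Server code: `ToyStream`, `ToyConnectionState`, `destroy_without_delete()` and `delete_then_destroy()` are hypothetical names, the priority tree is flattened into a vector, and "destroying" only sets a flag so the example stays well-defined instead of actually touching freed memory.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real Traffic Server classes.
struct ToyStream {
  int id;
  bool destroyed; // set instead of freeing memory so the demo stays well-defined
  explicit ToyStream(int i) : id(i), destroyed(false) {}
};

struct ToyConnectionState {
  std::vector<ToyStream *> priority_tree; // flattened stand-in for the dependency tree

  void add_stream(ToyStream *s) {
    assert(s != nullptr);
    priority_tree.push_back(s);
  }

  // Plays the role delete_stream() has in the description: unlink from the tree.
  void delete_stream(ToyStream *s) {
    priority_tree.erase(std::remove(priority_tree.begin(), priority_tree.end(), s),
                        priority_tree.end());
  }

  // Stand-in for the walk done while handling a transmit event: counts tree
  // entries that still point at streams which were already destroyed.
  int stale_entries_seen_by_xmit() const {
    int stale = 0;
    for (const ToyStream *s : priority_tree) {
      if (s->destroyed) {
        ++stale;
      }
    }
    return stale;
  }
};

// Buggy path: the stream destroys itself and never tells the connection state.
inline void destroy_without_delete(ToyStream &s) {
  s.destroyed = true;
}

// Fixed path: unlink from the tree first, then destroy.
inline void delete_then_destroy(ToyConnectionState &cs, ToyStream &s) {
  cs.delete_stream(&s);
  s.destroyed = true;
}
```

In this model, a stream that goes through `destroy_without_delete()` is still reachable from the tree walk, while one that goes through `delete_then_destroy()` is not. In the real code the stale entry would point at freed or reused memory rather than a flagged object.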

@maskit maskit added the HTTP/2 label Nov 6, 2017
@maskit maskit added this to the 8.0.0 milestone Nov 6, 2017
@maskit maskit self-assigned this Nov 6, 2017
@maskit maskit requested review from masaori335 and shinrich November 6, 2017 03:14
if (terminate_stream && reentrancy_count == 0) {
Http2ClientSession *h2_parent = static_cast<Http2ClientSession *>(parent);
SCOPED_MUTEX_LOCK(lock, h2_parent->connection_state.mutex, this_ethread());
h2_parent->connection_state.delete_stream(this);
Contributor

This block looks similar to L667-L681 (Http2Stream::destroy()).

// release_stream and delete_stream indirectly call each other and seem to have a lot of commonality
// Should get resolved at somepoint.
Http2ClientSession *h2_parent = static_cast<Http2ClientSession *>(parent);
SCOPED_MUTEX_LOCK(lock, h2_parent->connection_state.mutex, this_ethread());
h2_parent->connection_state.release_stream(this);
// Current Http2ConnectionState implementation uses a memory pool for instantiating streams and DLL<> stream_list for storing
// active streams. Destroying a stream before deleting it from stream_list and then creating a new one + reusing the same chunk
// from the memory pool right away always leads to destroying the DLL structure (deadlocks, inconsistencies).
// The following is meant as a safety net since the consequences are disastrous. Until the design/implementation changes it
// seems
// less error prone to (double) delete before destroying (noop if already deleted).
if (h2_parent->connection_state.delete_stream(this)) {
Warning("Http2Stream was about to be deallocated without removing it from the active stream list");
}

How about calling delete_stream() before release_stream() in Http2Stream::destroy()?
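
The hazard that the quoted comment warns about (destroying a stream while it is still on the active list, then immediately reusing the same chunk from the memory pool) can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ToyPool`, `ToyActiveList`, `PooledStream` and `reuse_after_destroy()` are hypothetical, the pool is a fixed array that returns the most recently freed slot first, and a plain vector stands in for the intrusive DLL<> list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy types; the real code uses a pool allocator and an intrusive DLL<> list.
struct PooledStream {
  int id = 0;
};

// Tiny fixed pool that hands the most recently returned slot back first.
class ToyPool {
public:
  ToyPool() {
    for (auto &slot : slots_) {
      free_slots_.push_back(&slot);
    }
  }
  ToyPool(const ToyPool &) = delete;
  ToyPool &operator=(const ToyPool &) = delete;

  PooledStream *alloc(int id) {
    assert(!free_slots_.empty());
    PooledStream *s = free_slots_.back();
    free_slots_.pop_back();
    s->id = id;
    return s;
  }

  void give_back(PooledStream *s) { free_slots_.push_back(s); }

private:
  PooledStream slots_[4];
  std::vector<PooledStream *> free_slots_;
};

struct ToyActiveList {
  std::vector<PooledStream *> entries; // stand-in for the active stream list

  void add(PooledStream *s) { entries.push_back(s); }

  // Returns true only if the entry was actually present and removed.
  bool remove(PooledStream *s) {
    auto it = std::find(entries.begin(), entries.end(), s);
    if (it == entries.end()) {
      return false;
    }
    entries.erase(it);
    return true;
  }

  std::size_t count(const PooledStream *s) const {
    return static_cast<std::size_t>(std::count(entries.begin(), entries.end(), s));
  }
};

// Destroys one stream, then allocates a new one that reuses the same slot.
// Returns how many list entries point at the reused slot afterwards.
inline std::size_t reuse_after_destroy(bool delete_first) {
  ToyPool pool;
  ToyActiveList list;

  PooledStream *old_stream = pool.alloc(1);
  list.add(old_stream);

  if (delete_first) {
    list.remove(old_stream); // unlink before the memory goes back to the pool
  }
  pool.give_back(old_stream); // "destroy": the slot is immediately reusable

  PooledStream *new_stream = pool.alloc(3); // same address as old_stream in this toy pool
  list.add(new_stream);
  return list.count(new_stream);
}
```

When the unlink step is skipped, the reused slot ends up on the list twice. With a vector that is merely a duplicate entry; with an intrusive doubly linked list, linking a node that is already linked is what would corrupt the structure.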

Member Author

I think the block in destroy() is part of the safety net. Your suggestion may strengthen that safety net; however, I think destroy() should only destroy, and we shouldn't depend on the safety net.

Member

Yes, I agree that the delete_stream/release_stream/destroy relationship isn't very satisfying. I'll look a bit, and see if we can unify things a bit.

Member

I have a different fix that has been running on a prod box for 30 minutes (a new record!). I made a change in release_stream(): it was removing the stream from the connection_state, which meant that the actual delete_stream() call would exit immediately without cleaning up the priority tree.

I have to run right now, but I'll put up another PR later this evening, assuming the prod box doesn't go crazy before then.
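
The interaction described in this comment can be modelled roughly as follows. This sketch only mirrors the behaviour stated in the comment; it is not the code from either PR. `SketchStream`, `SketchConnectionState` and the `release_removes_from_list` switch are hypothetical, and both containers are plain vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy types used only to model the behaviour described in the comment.
struct SketchStream {
  int id;
  explicit SketchStream(int i) : id(i) {}
};

inline bool contains(const std::vector<SketchStream *> &v, const SketchStream *s) {
  return std::find(v.begin(), v.end(), s) != v.end();
}

inline void erase_from(std::vector<SketchStream *> &v, SketchStream *s) {
  v.erase(std::remove(v.begin(), v.end(), s), v.end());
}

struct SketchConnectionState {
  std::vector<SketchStream *> stream_list;   // active streams
  std::vector<SketchStream *> priority_tree; // flattened stand-in for the priority tree
  bool release_removes_from_list;            // true models the behaviour described as the problem

  explicit SketchConnectionState(bool release_removes)
    : release_removes_from_list(release_removes) {}

  void add_stream(SketchStream *s) {
    assert(s != nullptr);
    stream_list.push_back(s);
    priority_tree.push_back(s);
  }

  void release_stream(SketchStream *s) {
    if (release_removes_from_list) {
      erase_from(stream_list, s); // the tree is left untouched here
    }
  }

  // Returns true only if the stream was still listed and has now been removed.
  bool delete_stream(SketchStream *s) {
    if (!contains(stream_list, s)) {
      return false; // early exit: the priority tree is never cleaned up
    }
    erase_from(priority_tree, s);
    erase_from(stream_list, s);
    return true;
  }

  bool in_priority_tree(const SketchStream *s) const { return contains(priority_tree, s); }
};
```

With `release_removes_from_list` set to true, a `release_stream()` followed by `delete_stream()` leaves the stream in the tree because `delete_stream()` returns early. With it set to false, the same sequence unlinks the stream from both containers.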

Member

My fix is in PR #2781

@maskit
Member Author

maskit commented Nov 7, 2017

[approve ci autest]

@zwoop
Contributor

zwoop commented Nov 7, 2017

Testing on Docs now.

@zwoop zwoop merged commit faa3743 into apache:master Nov 7, 2017
@zwoop
Contributor

zwoop commented Nov 7, 2017

Cherry-picked to 7.1.2.

@zwoop zwoop modified the milestones: 8.0.0, 7.1.2 Nov 7, 2017

Successfully merging this pull request may close these issues.

H2 BAD_ACCESS at Mutex_lock under load
