Skip to content

Conversation

@bneradt
Copy link
Contributor

@bneradt bneradt commented Jul 14, 2022

This implements the HTTP/2 to origin feature.

@bneradt bneradt added the HTTP/2 label Jul 14, 2022
@bneradt bneradt added this to the 10-Dev milestone Jul 14, 2022
@bneradt bneradt self-assigned this Jul 14, 2022
@bneradt bneradt marked this pull request as draft July 14, 2022 20:59
@maskit
Copy link
Member

maskit commented Jul 15, 2022

Since H2 to Origin has been unmarged for very long time, I'm slightly lowering a bar for merging this.

I still think issue #5230 should be fixed eventually (not just for performance but also privacy), but I'm going to let you off the hook. I actually have an incomplete patch (which works for simple use case at minimum) to pass an HttpHdr for request to HttpSM without marshaling and unmarshaling. Maybe we can add sensitive flag later to keep HPACK header representation. For now, I'd like to see some warning or note on the doc that says header representation won't be kept and sensitive headers can be stored into HPACK dynamic table.

// If the number of clients is 0, HTTP2_SESSION_EVENT_FINI is not received or sent, and session is active,
// then mark the connection as inactive
session->do_clear_session_active();
session->set_no_activity_timeout();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be safe, but I'm wondering if inactivity timeout currently works for incoming H2 sessions... Even if it works, the timeout might be based on another setting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall running into idle H2 sessions timing out. Definitely for outbound connections. I don't recall if I saw it for inbound or not. I assume this would be an issue for incoming H2 sessions as well. But probably more apparent in the case of origin sessions sitting in a reuse pool.

}

uint32_t buf_len = resp_hdr->length_get() * 2; // Make it double just in case
uint32_t buf_len = send_hdr->length_get() * 2; // Make it double just in case
Copy link
Member

@maskit maskit Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you make another PR for this kind of name changes (e.g. send_hdr, peer_settings, local_settings, etc.)? That would reduce the volume of this PR (which is great for reviewers so they can focus on the logical changes for H2 to Origin), and would also ease maintaining this branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good: #9084.

ink_assert(c_write > 0);
if (c->write_vio == nullptr) {
consumer_handler(VC_EVENT_ERROR, c);
} else if (c->write_vio->ntodo() == 0 && c->alive) {
Copy link
Member

@maskit maskit Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just this line, but I don't get why these changes to HttpTunnel is needed for H2 to Origin. I'm wondering (and concerned) if these affect H1 and client transactions.

@bneradt
Copy link
Contributor Author

bneradt commented Jul 18, 2022

I appreciate the thoughtful review, @maskit. And thank you, too, for repeating some of your comments from the previous PR. I'm working through your observations now and will spin off other PRs per your suggestions.

field->value_set(headers->m_heap, headers->m_mime, value, value_len);
}
// Remove the host header field, redundant to the authority field
// For istio/envoy, having both was causing 404 responses
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, because RFC 9113 says:

An intermediary that forwards a request over HTTP/2 MAY retain any
Host header field.

Removing Host header should be fine, but I wonder why it causes 404. It could be because of mismatch between the generated :authority pseudo header and the retained Host header.

The RFC also says:

Clients MUST NOT generate a request with a Host header field that
differs from the ":authority" pseudo-header field. A server
SHOULD treat a request as malformed if it contains a Host header
field that identifies an entity that differs from the entity in
the ":authority" pseudo-header field.

An intermediary that forwards a request over HTTP/2 MUST construct
an ":authority" pseudo-header field using the authority
information from the control data of the original request, unless
the original request's target URI does not contain authority
information (in which case it MUST NOT generate ":authority").
Note that the Host header field is not the sole source of this
information; see Section 7.2 of [HTTP].

and ATS does gather the information from the request line if available. We might want to check generated :authority is really appropriate.

I don't remember why we didn't remove Host header. There may be no reason, but it might be because removing it is unnecessary/optional.

Copy link
Member

@maskit maskit Aug 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a TODO comment at right after your changes (on line 651). We can remove the comment.

Also, I finished reviewing changes in http2_convert_header_from_2_to_1_1 and http2_convert_header_from_1_1_to_2. No additional change is needed except the TODO comment I mentioned above. If you make a separate PR for these changes, I can approve it immediately.

I'd say we can make the changes on master. Since we both modified the functions and my change is going to be on master, that would be easier to manage than having two different changes on master and 10-Dev.

MIMEField *host = headers->field_find(MIME_FIELD_HOST, MIME_LEN_HOST);
if (host == nullptr) {
host = headers->field_create(MIME_FIELD_HOST, MIME_LEN_HOST);
headers->field_attach(host);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, Host header was unnecessary for ATS internal process if the request URL is absolute form, but adding it makes sense.

This block is fine at all and no change is needed, but I realized one thing while I'm checking RFC 9110. According to the RFC, Host header should be at the beginning of a header field list.

A user agent that sends Host SHOULD send it as the first
field in the header section of a request.

That makes sense and we do that for pseudo headers though, that's not easy to do with our MIMEHdr structure, because it doesn't allow us to insert/prepend headers.

I also checked RFC 9113 (HTTP/2) and it says:

All pseudo-header fields MUST appear in a field block before all
regular field lines. Any request or response that contains a pseudo-
header field that appears in a field block after a regular field line
MUST be treated as malformed (Section 8.1.1).

So it's MUST for pseudo headers, but it's SHOULD for Host header. It would be nice if we could have Host header at the beginning but I don't think it has to be done on this PR.

if (this->is_draining()) { // For a case we already checked Connection header and it didn't exist
Http2SsnDebug("Preparing for graceful shutdown because of draining state");
this->connection_state.set_shutdown_state(HTTP2_SHUTDOWN_NOT_INITIATED);
} /*else if (this->connection_state.get_stream_error_rate() >
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this part should be completely removed. If ATS is used as a forward proxy, it could be attacked by malicious/misbehaving origin servers.

There's no graceful shutdown mechanism defined on the spec, but we could stop making new requests on the connection that has high error rate to make the connection inactive and close it sooner rather than later.

We could also mark the origin server as a bad H2 server and use H1 on future connections. It should work as fallback mechanism. I think we need this kind of marking in the future anyway to choose a transport protocol (TCP or QUIC) for an origin server.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if we are going to say H2 to Origin is experimental support, like QUIC, we can ignore security concerns for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add this back in. As I dimly recall in my testing, the shutdown due to spurious errors seemed high. But could have been due to other reasons.

Copy link
Member

@maskit maskit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an approval for changes around H2.

I'm not sure about connection management part. I still think we should make the change separately, but if that's not an option, please wait for another approval from someone else.

@bneradt
Copy link
Contributor Author

bneradt commented Jan 26, 2023

This is an approval for changes around H2.

I'm not sure about connection management part. I still think we should make the change separately, but if that's not an option, please wait for another approval from someone else.

Thank you @maskit . Again, I appreciate all the feedback you've provided on this feature.

@keesspoelstra is carefully reviewing and testing this in the meantime too. I'll wait upon at least his review.

zwoop and others added 7 commits January 31, 2023 14:53
* Fix an error on SSL config reload (plus some cleanup).

This seems to have eliminated some ERROR diags we were seeing in Yahoo Prod when
doing config reloads.

The SSLSecret public functions no longer return pointers into the unorded_map of
secrets, they return a copy of the secret data. This seemed thread unsafe. A
periodic poll running in the background can update the secret data for an entry
for a secret name in the map.

To avoid exporting pointers, I had to change the prototype of TSSslSecretGet().
Hopefully there are no existing plugins that are already using this TS API function,
so breaking this rule will be moot. I added a new TS API type, TSAllocatedVarLenData,
in order to be able to cleanly return a copy of the secret data.

* YTSATS-4067: Fix deadlock with secret_map_mutex (apache#740)

1. getOrLoadSecret grabbed the secret_map_mutex and called loadSecret.
2. loadSecret dispatched to Continations that registered for the
   TS_EVENT_SSL_SECRET event. This would try to grab the Continuation's
   lock.
3. In the meantime, those Continuations could call setSecret which would
   try to grab the secret_map_mutex. If this Continuation did this while
   holding the lock that step 2 is waiting upon, then there will be a
   deadlock between the Continuation lock and the secret_map_mutex
   between the two threads.

This patch avoids the deadlock by releasing the secret_map_mutex lock
before calling loadSecret. It also updates the secret_map when loading
secrets from a file in loadSecret.

---------

Co-authored-by: Brian Neradt <brian.neradt@verizonmedia.com>
@zwoop zwoop modified the milestones: 10-Dev, 10.0.0 Feb 2, 2023
zwoop and others added 6 commits February 1, 2023 17:39
This implements the HTTP/2 to origin feature.
When ConnectingEntry handled a connection failure, it could clear the
error value passed to set_connect_fail. This would incorrectly result in
the server not being marked down by hostdb. This patch ensures that the
error code passed to set_connect_fail is always an error value when
handling a connection failure.
@bneradt
Copy link
Contributor Author

bneradt commented Feb 2, 2023

Going to close this and re-open on master now that 10-Dev is master.

See: #9366

@bneradt bneradt closed this Feb 2, 2023
@bneradt bneradt mentioned this pull request Feb 2, 2023
@zwoop zwoop removed this from the 10.0.0 milestone Feb 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants