Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Media-type parser #61987

Merged
merged 32 commits into from
Sep 17, 2020
Merged

Media-type parser #61987

merged 32 commits into from
Sep 17, 2020

Conversation

pgomulka
Copy link
Contributor

@pgomulka pgomulka commented Sep 4, 2020

Splitting method XContentType.fromMediaTypeOrFormat into two separate methods. This will help to validate media type provided in Accept or Content-Type headers.
Extract parsing logic from XContentType (fromMediaType and fromFormat methods) to a separate MediaTypeParser class. This will help reuse the same parsing logic for XContentType and TextFormat (used in sql)

Media-Types type/subtype; parameters parsing is in defined https://tools.ietf.org/html/rfc7231#section-3.1.1.1

based on #61845

part of Allow parsing Content-Type and Accept headers with version #61427

@pgomulka pgomulka added WIP :Core/Infra/REST API REST infrastructure and utilities labels Sep 4, 2020
@pgomulka pgomulka self-assigned this Sep 4, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/REST API)

@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Sep 4, 2020
@pgomulka
Copy link
Contributor Author

pgomulka commented Sep 7, 2020

@elasticmachine run elasticsearch-ci/2

@pgomulka
Copy link
Contributor Author

pgomulka commented Sep 7, 2020

@elasticmachine run elasticsearch-ci/packaging-sample-windows

@pgomulka pgomulka requested review from bpintea and jaymode September 8, 2020 11:48
@pgomulka pgomulka changed the title WIP Media-type parser Media-type parser Sep 8, 2020
@pgomulka pgomulka removed the WIP label Sep 8, 2020
@pgomulka pgomulka requested a review from jakelandis September 8, 2020 13:53
Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SQL related changes look good, but I wonder if we can't simplify the change a bit.

Comment on lines 68 to 69
XContentType acceptHeader = null;
XContentType contentTypeHeader = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if this media-type origin split is strictly necessary: the previous code tried to fetch a token from somewhere - in order of preference: enforced format (CBOR), format URL attribute, Accept header, Content-Type header and repeatedly checking accept against null - then validate that token against a media type or as a format (#fromMediaTypeOrFormat()).

The new code checks if it's a format (in getXContentType() @ L143), or then if it's a media type, by one of these XContentType instances (@ L147). But the source preferences is already stored in the way the accept member is set (by those subsequent null checks), so could we not simply serially invoke #fromMediaType(accept) and #fromFormat(accept) and return the first non-null value to simplify the code? Similar to what we do in TextFormat#fromMediaTypeOrFormat().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I have refactored this as per your suggestions

@@ -32,7 +34,7 @@
/**
* Templating class for displaying SQL responses in text formats.
*/
enum TextFormat {
enum TextFormat implements MediaType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the extension necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in order to use a MediaType parser for both TextFormat and XContentType i had to introduce an interface that is implemented by both

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Thanks.

@@ -92,7 +97,8 @@ protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient cli
* that doesn't parse it'll throw an {@link IllegalArgumentException}
* which we turn into a 400 error.
*/
XContentType xContentType = accept == null ? XContentType.JSON : XContentType.fromMediaTypeOrFormat(accept);
//TODO PG this all logic needs a review from SQL team
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove reminder.

@@ -197,6 +204,7 @@ String maybeEscape(String value, Character delimiter) {
boolean hasHeader(RestRequest request) {
String header = request.param(URL_PARAM_HEADER);
if (header == null) {
//TODO PG in most places we only assume one accept header
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. Having multiple header fields with same name is legal though, I guess we should leave it as is.

@@ -82,7 +82,8 @@ private XContentType detectContentType(String content) {
//There must be a __<content_type__:: prefix so the minimum length before detecting '__::' is 3
int endOfContentName = content.indexOf("__::", 3);
if (endOfContentName != -1) {
return XContentType.fromMediaTypeOrFormat(content.substring(2, endOfContentName));
//TODO PG what do we expect here?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this todo, answered in another comment.


package org.elasticsearch.common.xcontent;

public interface MediaType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add javadocs to the class and methods?

}

/**
* parsing media type that follows https://tools.ietf.org/html/rfc2616#section-3.7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. the rfc2616 is obsolete.
rfc7231 is the updated one

Map<String, String> parameters = new HashMap<>();
for (int i = 1; i < split.length; i++) {
String[] keyValueParam = split[i].trim().split("=");
// should we validate that there are no spaces between key = value?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per the RFC, it is not allowed so I guess we should

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SQL bits LGTM.

@pgomulka
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/bwc

@pgomulka
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/default-distro

is(nullValue()));

assertThat(mediaTypeParser.parseMediaType(mediaType + "; key = value"),
is(nullValue()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test for assertThat(mediaTypeParser.parseMediaType(mediaType + "; key=") I think it will error with index out of bounds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would mean that the result of String[] keyValueParam = split[i].trim().split("="); has length = 1
it is checked in a next line.

if (keyValueParam.length != 2 || hasSpaces(keyValueParam[0]) || hasSpaces(keyValueParam[1])) {

added a testcase


/**
* A media type object that contains all the information provided on a Content-Type or Accept header
* // TODO PG to be extended with getCompatibleAPIVersion and more
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you leave these specific TODO's out. IMO TODO's left in the code base should be rare and general enough such that any future dev can understand and implement it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. removed the todo


/**
* Returns a corresponding format for a MediaType. i.e. json for application/json media type
* Can differ from the MediaType's subtype i.e plain/text but format is txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Copy link
Contributor

@jakelandis jakelandis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for iterations. (a couple very minor new comments, with this PR or in the future)

@pgomulka
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/1
failed test is being fixed in #62411

Copy link
Member

@jaymode jaymode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>
@pgomulka pgomulka merged commit 86ba732 into elastic:master Sep 17, 2020
jakelandis added a commit to jakelandis/elasticsearch that referenced this pull request Oct 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/REST API REST infrastructure and utilities Team:Core/Infra Meta label for core/infra team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants