fields & fieldlists interfaces and implementation #122

jonemo · 2023-01-19T08:11:27Z

This separates the Field and Fields interfaces, implementation, and tests from my WIP branch http-interface-update-2. The goals of doing this are:

to unblock @dlm6693's work on the AWS sigv4 signer
to facilitate the discussion about quoting and escaping of header values with @nateprewitt

This PR only contains the interface and implementation of Field and Fields. All the work in actually using them for requests and responses happens on the http-interface-update-2 branch.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

python-packages/smithy-python/smithy_python/_private/http/__init__.py

python-packages/smithy-python/smithy_python/interfaces/http.py

python-packages/smithy-python/tests/unit/test_http_fields.py

JordonPhillips · 2023-01-19T14:20:52Z

python-packages/smithy-python/smithy_python/_private/http/__init__.py

@@ -14,11 +14,13 @@
 # TODO: move all of this out of _private


+from collections import OrderedDict


You don't need this. All dicts are insertion-ordered in Python now as part of the contract, and you don't seem to be relying on any of the remaining niche features of OrderedDict.
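For readers less familiar with the distinction, here is a minimal sketch of what a plain `dict` already guarantees (insertion order, since Python 3.7) and two of the features `OrderedDict` still adds on top. The keys and values are made up for illustration and are not taken from the PR.

```python
from collections import OrderedDict

pairs = [("host", "example.com"), ("accept", "*/*")]

# A plain dict keeps insertion order (a language guarantee since Python 3.7).
plain = dict(pairs)
plain_order = list(plain)

# OrderedDict can additionally reorder entries in place ...
ordered = OrderedDict(pairs)
ordered.move_to_end("host")
moved_order = list(ordered)

# ... and its equality check is order-sensitive when both sides are OrderedDicts.
ordered_equal = OrderedDict(pairs) == OrderedDict(reversed(pairs))
plain_equal = dict(pairs) == dict(reversed(pairs))
```

Whether either of these extras matters depends on how `Fields` ends up defining equality.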

I know. And I agree that this should be discussed. For now this is taken verbatim from the spec which says this:

entries: OrderedDict[str, Field] # OrderedMap<String, Field>

See also this comment thread above: #122 (comment)

cc @nateprewitt to chime in since he wrote that spec

The code in the specification was meant to convey the ideas to a broad audience of language teams. It has some oddities like this because the orderedness of Python dictionaries isn't necessarily widely known. So the pseudo-code examples shouldn't be taken as gospel beyond maybe interface framing.

Hah, I'm glad we talked about it then. So we don't need to keep OrderedDict for its order-maintaining property. But now I know why the order matters, and it means that I need to fix my Fields.__eq__ implementation. And the new __eq__ will benefit from the equality definition of OrderedDict, so I'm going to keep it anyway!


JordonPhillips · 2023-01-19T16:50:49Z

python-packages/smithy-python/tests/unit/test_http_fields.py

+        (["v,a,l,1", "val2"], '"v,a,l,1",val2'),
+        # Double quotes are escaped with a single backslash. The second backslash below
+        # is for escaping the actual backslash in the string for Python.
+        (['"quotes"', "val2"], '\\"quotes\\",val2'),


The RFC only mentions escaping in the context of quoted values. You also need to test for escaping backslashes.

Suggested change

(['"quotes"', "val2"], '\\"quotes\\",val2'),

(['"quotes"', "val2"], '"\\"quotes\\"",val2'),

(["foo,bar\\", "val2"], '"foo,bar\\\\",val2'),

So if I only escape quotes inside of quoted values, then values with pre-existing quotes are indistinguishable from those where smithy-python added the quotes:

raw field value    serialized value in header
a,b,c              "a,b,c"
"a,b,c"            "a,b,c" or "\"a,b,c\""?

Is that what we want?

You always escape double quotes, but when you do you need to wrap the whole element in double quotes. So:

['a', '"', 'b'] -> "a,\",b"

I am surprised by that. Nowhere in the discussion so far have we considered putting quotes around multiple Field values.

python-packages/smithy-python/smithy_python/_private/http/__init__.py

nateprewitt · 2023-01-19T18:32:05Z


python-packages/smithy-python/smithy_python/_private/http/__init__.py

python-packages/smithy-python/smithy_python/interfaces/http.py

Co-authored-by: Nate Prewitt <nate.prewitt@gmail.com>

jonemo · 2023-01-20T06:45:10Z

I updated the field value serialization logic in a way that incorporates all (?) the suggestions and constraints from sigv4 test cases. Specifically:

Spaces in values no longer trigger quoting. That only leaves commas as reason for quoting.
Double quotes are only escaped when quoting was applied. Same for backslashes.
When a string is already quoted (starts and ends with double quotes) it doesn't get modified, even if it contains additional double quotes or backslashes.

I understand how we arrived here, but I don't like it because:

The logic for quoting and escaping is complicated. Even I, as the person who just wrote it, find it hard to predict the outcome.
It isn't round-trip-able. Because we don't know if surrounding quotes were part of the original value or added by our serializer, the deserializer can't know whether to remove them. Not a problem per se, but the deserialization utility method that @JordonPhillips asked for will be another place with complex, difficult-to-predict behavior.
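To make the three rules above easier to follow, here is a minimal sketch of them as code. It is my reading of the list in this comment, not the merged implementation. The function name `quote_and_escape_field_value` is borrowed from a commit message in this PR, while `serialize_values` and the `", "` separator handling are assumptions for illustration.

```python
def quote_and_escape_field_value(value: str) -> str:
    """Sketch of the rules described above (not the merged code)."""
    # Rule 3: a value that already starts and ends with a double quote is left alone.
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        return value
    # Rule 1: only a comma triggers quoting; spaces do not.
    if "," not in value:
        return value
    # Rule 2: when quoting, escape backslashes first, then double quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def serialize_values(values: list[str]) -> str:
    # Hypothetical helper: join the individually processed values.
    return ", ".join(quote_and_escape_field_value(v) for v in values)


# The round-trip problem: two different raw values serialize identically.
from_unquoted = quote_and_escape_field_value("a,b,c")
from_prequoted = quote_and_escape_field_value('"a,b,c"')
```

Both `from_unquoted` and `from_prequoted` come out as `"a,b,c"`, which is exactly why a deserializer cannot tell whether the surrounding quotes belonged to the original value.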

jonemo · 2023-01-20T07:42:39Z

Note to self, I haven't dealt with this yet:

    All field names are case insensitive and
    case-variance must be treated as equivalent.
    Names MAY be normalized but SHOULD be preserved
    for accuracy during transmission.
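One way to satisfy both halves of that requirement is to key lookups on a normalized name while keeping the name as it was supplied. The sketch below is only an illustration of that idea. `NormalizedNames` and its methods are hypothetical and do not appear in this PR.

```python
from typing import Iterable


class NormalizedNames:
    """Hypothetical helper: case-insensitive lookup that preserves the given name."""

    def __init__(self) -> None:
        # lowercase name -> (name as most recently set, values)
        self._entries: dict[str, tuple[str, list[str]]] = {}

    def set(self, name: str, values: Iterable[str]) -> None:
        self._entries[name.lower()] = (name, list(values))

    def get(self, name: str) -> list[str]:
        return self._entries[name.lower()][1]

    def wire_name(self, name: str) -> str:
        # The spelling to use when the field is transmitted.
        return self._entries[name.lower()][0]


names = NormalizedNames()
names.set("Content-Type", ["text/plain"])
```

Here `names.get("content-type")` finds the entry, and `names.wire_name("CONTENT-TYPE")` still returns the spelling that was originally set.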

python-packages/smithy-python/smithy_python/_private/http/__init__.py

dlm6693 · 2023-01-23T15:50:11Z

python-packages/smithy-python/smithy_python/_private/http/__init__.py

+
+    See :func:`Field.as_string` for quoting and escaping logic.
+    """
+    CHARS_TO_QUOTE = (",", '"')


Let's move this to a module-level constant.

Why? It's an implementation detail of this function. I would prefer to not make it part of the public interface of this module.

Why not? We don't need to redefine it every time this function gets called. It is a constant after all. Assuming that's why you made it upper case.

Why not?

Because then others can start importing it and changing it becomes a backwards-incompatible change.

I made it lowercase to make it look less like a module-level constant.
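For comparison, the two options discussed in this thread look roughly like this. The function names are hypothetical. The underscore-prefixed module-level name is a third possibility that nobody in the thread proposed, shown only because the leading underscore is the usual Python convention for "not part of the public interface". As far as I know, CPython typically compiles a tuple of literals into a single constant, so the per-call cost of the local version should be small, but that is an implementation detail rather than a language guarantee.

```python
# Hypothetical module-level alternative; the underscore marks it as private.
_CHARS_TO_QUOTE = (",", '"')


def needs_quoting_local(value: str) -> bool:
    # Function-local and lowercase, matching the choice described above.
    chars_to_quote = (",", '"')
    return any(char in value for char in chars_to_quote)


def needs_quoting_module(value: str) -> bool:
    # Same check, reading the module-level tuple instead.
    return any(char in value for char in _CHARS_TO_QUOTE)
```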

python-packages/smithy-python/smithy_python/_private/http/__init__.py

JordonPhillips · 2023-01-24T14:31:56Z

python-packages/smithy-python/smithy_python/_private/http/__init__.py

@@ -193,16 +193,16 @@ def __eq__(self, other: object) -> bool:
        )

    def __repr__(self) -> str:
-        return f'Field(name="{self.name}", value=[{self.value}], kind={self.kind})'
+        return f"Field(name={self.name!r}, value={self.value!r}, kind={self.kind!r})"


Yep, you can also use !s for str and !a for ascii.
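A quick illustration of the three f-string conversion flags mentioned here, using a made-up value. Each flag applies the corresponding built-in (`repr`, `str`, `ascii`) before formatting.

```python
name = "caf\u00e9"  # example value containing one non-ASCII character

as_repr = f"{name!r}"    # same as repr(name): adds quotes
as_str = f"{name!s}"     # same as str(name): the text itself
as_ascii = f"{name!a}"   # same as ascii(name): quotes plus escaped non-ASCII

# Why !r helps in __repr__: embedded quotes stay unambiguous.
tricky = 'has "quotes"'
shown = f"Field(name={tricky!r})"
```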

nateprewitt

A few minor comments/questions but otherwise I think this looks good!

python-packages/smithy-python/smithy_python/_private/http/__init__.py

nateprewitt · 2023-01-25T16:22:21Z

python-packages/smithy-python/smithy_python/_private/http/__init__.py

+        :param encoding: The string encoding to be used when converting the ``Field``
+        name and value from ``str`` to ``bytes`` for transmission.
+        """
+        init_fields = [fld for fld in initial] if initial is not None else []


Was there a reason we're creating a new list here instead of just using initial? Is the concern we're going to get a generator or some other unindexable value?

The reason I kept this list comprehension here is that I wanted to allow any Iterable in the constructor signature. I could limit the type of initial to just list | None, or I could use list() instead of the list comprehension.
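A small sketch of the `list()` option, assuming the constructor should accept any iterable, including one-shot generators. The `materialize` helper is hypothetical. Copying into a list up front matters when the input is iterated more than once, which the constructor in this PR appears to do (once to collect names and once to build the entries).

```python
from typing import Iterable, Optional


def materialize(initial: Optional[Iterable[str]] = None) -> list[str]:
    # list() copies any iterable, so a generator is consumed exactly once here
    # and the result can then be iterated as often as needed.
    return list(initial) if initial is not None else []


source = ["a", "b"]
from_list = materialize(source)
from_generator = materialize(item for item in source)
from_nothing = materialize()
```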

nateprewitt · 2023-01-25T16:28:10Z

python-packages/smithy-python/smithy_python/_private/http/__init__.py

+                f"{', '.join(non_unique_names)}."
+            )
+        init_tuples = zip(init_field_names, init_fields)
+        self.entries: OrderedDict[str, interfaces.http.Field] = OrderedDict(init_tuples)


I know we'd arrived at using an OrderedDict in a previous discussion. Was the reason so that we perform the following by default for header ordering?

    def __eq__(self, other):
        return dict.__eq__(self, other) and all(map(_eq, self, other))

Hah, took me a minute to understand. Yes, after the introduction of __iter__ this snippet would indeed work.

And yes, I decided to stick with OrderedDict because it gives me the ordering check for free as part of the equality check:

    >>> from collections import OrderedDict
    >>> tuples1 = [('a', 1), ('b', 2)]
    >>> tuples2 = [('b', 2), ('a', 1)]
    >>> dict1, dict2 = dict(tuples1), dict(tuples2)
    >>> od1, od2 = OrderedDict(tuples1), OrderedDict(tuples2)
    >>> dict1 == dict2
    True
    >>> od1 == od2
    False

nateprewitt · 2023-01-25T16:30:28Z

python-packages/smithy-python/smithy_python/_private/http/__init__.py

+    def get_by_type(self, kind: FieldPosition) -> list[interfaces.http.Field]:
+        """Helper function for retrieving specific types of fields.
+
+        Used to grab all headers or all trailers.
+        """
+        return [entry for entry in self.entries.values() if entry.kind is kind]


I think this works fine for now. I am curious though if we'd ever want to track this on insertion/removal to avoid iterating every header each time.

You mean make entries a dict of dicts that gets accessed like self.entries[FieldPosition.HEADER]["x-my-header-name"]? Of course doable, but would require double bookkeeping because the API also expects the complete list of all fields to be maintained in order.
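To show what the double bookkeeping might involve, here is a rough sketch that keeps one ordered mapping of all names plus a per-kind index. Everything in it is hypothetical: `Position` stands in for `FieldPosition`, and `IndexedFields` stores only names and kinds rather than real `Field` objects.

```python
from collections import defaultdict
from enum import Enum


class Position(Enum):  # hypothetical stand-in for FieldPosition
    HEADER = 0
    TRAILER = 1


class IndexedFields:
    def __init__(self) -> None:
        self.entries = {}                   # name -> kind, overall insertion order
        self._by_kind = defaultdict(dict)   # kind -> {name: None}, per-kind order

    def set(self, name: str, kind: Position) -> None:
        # Re-setting a name removes it first so both structures append it last
        # and their relative orders stay consistent.
        if name in self.entries:
            self.remove(name)
        self.entries[name] = kind
        self._by_kind[kind][name] = None

    def remove(self, name: str) -> None:
        kind = self.entries.pop(name)
        del self._by_kind[kind][name]

    def get_by_type(self, kind: Position) -> list:
        # No scan over all entries; only the index for this kind is read.
        return list(self._by_kind[kind])


fields = IndexedFields()
fields.set("a", Position.HEADER)
fields.set("b", Position.TRAILER)
fields.set("c", Position.HEADER)
```

Every mutation now has to touch two structures, which is the cost being weighed against the simple list comprehension in the diff.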

nateprewitt · 2023-01-25T16:33:49Z

python-packages/smithy-python/smithy_python/interfaces/http.py

+    """
+    Header field. In HTTP this is a header as defined in RFC 9114 Section 6.3.
+    Implementations of other protocols may use this FieldPosition for similar types
+    of metadata.
+    """


I'm curious where this pattern of including the docstring after what it's discussing comes from. I've only ever seen it in this repo and it strikes me as unintuitive. Am I missing a new convention? 😅

I do this because many years ago I learned that this is the one way to make IDEs pick it up as the docstring. I always assumed, but never validated, that this is also true for documentation generators. It's surprisingly difficult to find documentation for this behavior, but this Stack Overflow answer suggests that I am not the only one who arrived at this conclusion.

A possible explanation for why it works this way is this: Consider the alternative of putting the enum entry docstring before the entry. The first entry's docstring would be immediately adjacent to the enum class docstring, and docs generators and IDEs wouldn't know which string belongs to the class and which to the entry:

    class MyEnum(Enum):
        """My class docstring"""
        """... is directly next to my entry docstring!"""
        FIRST_ENTRY = 0
        """My second entry docstring"""
        SECOND_ENTRY = 1

I also believe that, for doc generation, not putting it below the variable can cause issues depending on which generator you're using.

An alternative that I've come across, if preferred, is similar to how we format docstrings for functions:

    class MyEnum(Enum):
        """My class docstring

        Attributes
        ----------
        FIRST_ENTRY: int
            First entry doc
        SECOND_ENTRY: int
            Second entry doc
        """
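As a side note on what happens at runtime with the docstring-after-member style: the sketch below suggests that only the first string literal in the class body becomes `__doc__`, while the strings following the members are evaluated and discarded. IDEs and documentation generators that support this style therefore have to read those strings from the source. The enum here is a made-up example, not the `FieldPosition` from this PR.

```python
from enum import Enum


class MyEnum(Enum):
    """My class docstring"""

    FIRST_ENTRY = 0
    """First entry doc"""
    SECOND_ENTRY = 1
    """Second entry doc"""


class_doc = MyEnum.__doc__
member_names = [member.name for member in MyEnum]
```

The member strings do not create extra enum members and do not change the class docstring.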

python-packages/smithy-python/smithy_python/interfaces/http.py

Co-authored-by: Nate Prewitt <nate.prewitt@gmail.com>

nateprewitt

⛵

jonemo · 2023-01-29T06:21:12Z

@dlm6693 Merging is currently blocked because your latest review is "changes requested". Anything you still want me to do here?

dlm6693 · 2023-01-29T17:22:26Z

@jonemo gonna take one final pass now

dlm6693

📦

jonemo requested review from nateprewitt and dlm6693 January 19, 2023 08:11

fields & fieldlists

bc06d72

jonemo force-pushed the http-fields branch from c4b27b0 to bc06d72 Compare January 19, 2023 08:16

dlm6693 suggested changes Jan 19, 2023

JordonPhillips requested changes Jan 19, 2023

dlm6693 reviewed Jan 19, 2023

nateprewitt reviewed Jan 19, 2023

jonemo added 3 commits January 19, 2023 13:42

docstrings

0dd4bcb

Field.value cannot be None, test escaping of backslashes

c4a9b3f

missing "Protocol"

f635bd4

jonemo force-pushed the http-fields branch from aeeecc3 to f635bd4 Compare January 20, 2023 01:35

jonemo and others added 2 commits January 19, 2023 20:05

dots in docstrings

20fe974

Co-authored-by: Nate Prewitt <nate.prewitt@gmail.com>

updated quoting & escaping rules

db11902

jonemo added 3 commits January 20, 2023 00:13

updated equality rules, check for duplicated initial field names

09b5263

no return type annotation for __init__

356cf18

", " instead of "," as field separator

ff24fcb

jonemo force-pushed the http-fields branch from cb7f4bc to 3d42db4 Compare January 20, 2023 07:33

jonemo added 2 commits January 20, 2023 00:47

__iter__ and improved __repr__ for Fields

06e5c5f

move quote_and_escape_field_value to utility method

d13ee5c

jonemo force-pushed the http-fields branch from cb8a939 to d13ee5c Compare January 20, 2023 07:47

jonemo added 5 commits January 21, 2023 21:14

updated field value quoting and escaping rules

cf3d219

drop Field.get_value_list(), Field.add as_tuples()

af481e3

normalize field names in Fields

8060ea4

accept any iterable for field values in Field and fields in Fields

73e2079

type hints for tests

ad6d285

jonemo force-pushed the http-fields branch from 068a924 to ad6d285 Compare January 22, 2023 04:53

jonemo requested a review from JordonPhillips January 23, 2023 07:14

jonemo requested review from dlm6693 and nateprewitt January 23, 2023 07:14

dlm6693 suggested changes Jan 23, 2023

JordonPhillips previously approved these changes Jan 23, 2023

grammer, naming, reprs

d3aed2a

jonemo dismissed JordonPhillips’s stale review via d3aed2a January 23, 2023 21:39

Field.value --> Field.values

5d5afd2

JordonPhillips previously approved these changes Jan 24, 2023

nateprewitt reviewed Jan 25, 2023

Apply suggestions from code review

6aea01a

Co-authored-by: Nate Prewitt <nate.prewitt@gmail.com>

jonemo dismissed JordonPhillips’s stale review via 6aea01a January 27, 2023 22:14

use kwargs everywhere

c9df2f4

jonemo requested a review from dlm6693 January 28, 2023 00:08

nateprewitt approved these changes Jan 28, 2023

dlm6693 approved these changes Jan 29, 2023

jonemo merged commit 300a17d into develop Jan 29, 2023

jonemo deleted the http-fields branch January 29, 2023 18:00

jonemo mentioned this pull request Feb 4, 2023

HTTP interface updates (part 2) #131

Merged

jonemo mentioned this pull request Feb 21, 2023

Add doc formatting #137

Merged
