-
Notifications
You must be signed in to change notification settings - Fork 17
fields & fieldlists interfaces and implementation #122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
@@ -14,11 +14,13 @@ | |||
# TODO: move all of this out of _private | |||
|
|||
|
|||
from collections import OrderedDict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need this - all dicts are insertion-ordered in python now as part of the contract, and you don't seem to be be relying on any of the remaining niche features of OrderedDict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know. And I agree that this should be discussed. For now this is taken verbatim from the spec which says this:
entries: OrderedDict[str, Field] # OrderedMap<String, Field>
See also this comment thread above: #122 (comment)
cc @nateprewitt to chime in since he wrote that spec
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code in the specification was to convey the ideas to a broad audience of language teams. I have some oddities like this in the orderedness of Python dictionaries isn't necessarily widely known. So the pseudo-code examples shouldn't be taken as gospel beyond maybe interface framing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, I'm glad we talked about it then. So we don't need to keep OrderedDict
for it's order-maintaining property. But now I know why the order matters and it means that I need to fix my Fields.__eq__
implementation. And the new __eq__
will benefit form the equality definition of OrderedDict
, so I'm going to keep it anyway!
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
(["v,a,l,1", "val2"], '"v,a,l,1",val2'), | ||
# Double quotes are escaped with a single backslash. The second backslash below | ||
# is for escaping the actual backslash in the string for Python. | ||
(['"quotes"', "val2"], '\\"quotes\\",val2'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RFC only mentions escaping in the context of quoted values. You also need to test for escaping backslashes.
(['"quotes"', "val2"], '\\"quotes\\",val2'), | |
(['"quotes"', "val2"], '"\\"quotes\\"",val2'), | |
(["foo,bar\\", "val2"], '"foo,bar\\\\",val2'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I only escape quotes inside of quoted values, then values with pre-existing quotes are indistinguishable from those were smithy-python added the quotes:
raw field value | serialized value in header |
---|---|
a,b,c |
"a,b,c" |
"a,b,c" |
"a,b,c" or "\"a,b,c"\" ? |
Is that what we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You always escape double quotes, but when you do you need to wrap the whole element in double quotes. So:
['a', '"', 'b']
-> "a,\",b"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised by that. Nowhere in the discussion so far have we considered putting quotes around multiple Field values.
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
@@ -14,11 +14,13 @@ | |||
# TODO: move all of this out of _private | |||
|
|||
|
|||
from collections import OrderedDict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code in the specification was to convey the ideas to a broad audience of language teams. I have some oddities like this in the orderedness of Python dictionaries isn't necessarily widely known. So the pseudo-code examples shouldn't be taken as gospel beyond maybe interface framing.
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Nate Prewitt <nate.prewitt@gmail.com>
I updated the field value serialization logic in a way that incorporates all (?) the suggestions and constraints from sigv4 test cases. Specifically:
I understand how we arrived here, but I don't like it because:
|
Note to self, I haven't dealt with this yet:
|
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
|
||
See :func:`Field.as_string` for quoting and escaping logic. | ||
""" | ||
CHARS_TO_QUOTE = (",", '"') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this to a module-level constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? It's an implementation detail of this function. I would prefer to not make it part of the public interface of this module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? We don't need to redefine it every time this function gets called. It is a constant after all. Assuming that's why you made it upper case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not?
Because then others can start importing it and changing it becomes a backwards-incompatible change.
I made it lowercase to make it look less like a module-level constant.
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
@@ -193,16 +193,16 @@ def __eq__(self, other: object) -> bool: | |||
) | |||
|
|||
def __repr__(self) -> str: | |||
return f'Field(name="{self.name}", value=[{self.value}], kind={self.kind})' | |||
return f"Field(name={self.name!r}, value={self.value!r}, kind={self.kind!r})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, you can also use !s
for str
and !a
for ascii
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments/questions but otherwise I think this looks good!
python-packages/smithy-python/smithy_python/_private/http/__init__.py
Outdated
Show resolved
Hide resolved
:param encoding: The string encoding to be used when converting the ``Field`` | ||
name and value from ``str`` to ``bytes`` for transmission. | ||
""" | ||
init_fields = [fld for fld in initial] if initial is not None else [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason we're creating a new list here instead of just using initial
? Is the concern we're going to get a generator or some other unindexable value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I kept this list comprehension here is that I wanted to allow any Iterable in the constructor signature. I could limit the type of initial
to just list | None
, or I could use list()
instead of the list comprehension.
f"{', '.join(non_unique_names)}." | ||
) | ||
init_tuples = zip(init_field_names, init_fields) | ||
self.entries: OrderedDict[str, interfaces.http.Field] = OrderedDict(init_tuples) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we'd arrived at using an OrderedDict in a previous discussion. The reason we'd arrived at this was so we perform this by default for header ordering?
def __eq__(self, other):
return dict.__eq__(self, other) and all(map(_eq, self, other))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, took me a minute to understand. Yes, after the introduction of __iter__
this snippet would indeed work.
And yes, I decided to stick with OrderDict
because it gives me the ordering check for free as part of the equality check:
>>> tuples1 = [('a', 1), ('b', 2)]
>>> tuples2 = [('b', 2), ('a', 1)]
>>> dict1, dict2 = dict(tuples1), dict(tuples2)
>>> od1, od2 = OrderedDict(tuples1), OrderedDict(tuples2)
>>> dict1 == dict2
True
>>> od1 == od2
False
def get_by_type(self, kind: FieldPosition) -> list[interfaces.http.Field]: | ||
"""Helper function for retrieving specific types of fields. | ||
|
||
Used to grab all headers or all trailers. | ||
""" | ||
return [entry for entry in self.entries.values() if entry.kind is kind] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this works fine for now. I am curious though if we'd ever want to track this on insertion/removal to avoid iterating every header each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean make entries a dict of dicts that gets accessed like self.entries[FieldPosition.HEADER]["x-my-header-name"]
? Of course doable, but would require double bookkeeping because the API also expects the complete list of all fields to be maintained in order.
""" | ||
Header field. In HTTP this is a header as defined in RFC 9114 Section 6.3. | ||
Implementations of other protocols may use this FieldPosition for similar types | ||
of metadata. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious where this pattern of including the docstring after what it's discussing. I've only ever seen it in this repo and it strikes me as unintuitive. Am I missing a new convention? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do this because many years ago I learned that this is the one way to make IDEs pick it up as the docstring. I always assumed but never validated that this is also true for documentation generators. It's surprisingly difficult to find documentation for this behavior, but this Stackoverflow answer suggests that I am not the only one who arrived at this conclusion.
A possible explanation for why it works this way is this: Consider the alternative of putting the enum entry docstring before the entry. The first entry's docstring would be immediately adjacent to the enum class docstring. Python would merge those into one string, and docs generators and IDEs wouldn't know which part belongs to the class and which to the entry:
class MyEnum(Enum):
"""My class docstring"""
"""... is directly next to my entry docstring!"""
FIRST_ENTRY = 0
"""My second entry docstring"""
SECOND_ENTRY = 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe also for doc generation not doing it below the variable can cause issues depending on which generator you're using.
An alternative that I've come across if preferred is similar to how we format doc strings for functions:
class MyEnum(Enum):
"""My class docstring
Attributes
- - - - - - - - - -
FIRST_ENTRY: int
First entry doc
SECOND_ENTRY: int
Second entry doc
"""
Co-authored-by: Nate Prewitt <nate.prewitt@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛵
@dlm6693 Merging is currently blocked because your latest review is "changes requested". Anything you still want me to do here? |
@jonemo gonna take one final pass now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📦
This separates the
Field
andFields
interfaces, implementation, and tests from my WIP branchhttp-interface-update-2
. The goals of doing this areThis PR only contains the interface and implementation of
Field
andFields
. All the work in actually using them for requests and responses happens on thehttp-interface-update-2
branch.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.