-
Notifications
You must be signed in to change notification settings - Fork 2
Instate msgspec type registration mechanism #199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d6e488f
to
da57d5c
Compare
|
||
def __init__(self, changes: dict[Path, ChangedFile]) -> None: | ||
self.__changes = MappingProxyType(changes) | ||
def __init__(self, changed_files: Iterable[ChangedFile]) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes it so the caller can directly instantiate the class without first forming a mapping which also simplifies the disk storage format so there is no longer a redundant top-level key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, definitely makes it easier to use 👍
@property | ||
def changes(self) -> MappingProxyType[Path, ChangedFile]: | ||
return self.__changes | ||
def paths(self) -> MappingProxyType[str, ChangedFile]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This provides the previous interface of all changes being represented by a mapping of relative paths to the ChangedFile
class and removes the requirement for pass-through dictionary methods. The keys are now strings because msgspec actually doesn't support any other types for dictionary keys which is why the decoding hook had to manually coerce the type.
A benefit to this is that callers possessing str
instances no longer have to import the Path
class for membership checks.
return cls(changes) | ||
|
||
def __or__(self, other: Self) -> Self: | ||
return type(self)(chain(self.files, other.files)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't really find a way to satisfy this other than having a classmethod
(like the removed from_iter
) that returns an instance of cls
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems very clean to me like this anyway, good trick !
assert seen.path == expected_change.path | ||
assert seen.type == expected_change.type | ||
assert seen.patch == expected_change.patch | ||
assert actual.paths == expected.paths |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think relying on the mapping's equality check is equivalent, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the two are functionally equivalent - and probably faster.
However, individually comparing each file and each file's attribute allowed for easier debugging, since you get a lower amount of output on the stdout (only the actually differing part gets printed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this new mechanism for registering hooks !
src/dda/types/hooks.py
Outdated
from collections.abc import Callable | ||
|
||
|
||
class Hooks(Struct, frozen=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
class Hooks(Struct, frozen=True): | |
class Hook(Struct, frozen=True): |
It feels very weird to me to have a plural class name, usually I prefer reserving the plural version to refer to collections of instances of that class.
Although here I guess we have an encode hook + a decode hook, ergo "hooks" - still I think it makes sense to consider the (encode hook, decode hook) pair as a single "Hook" object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched to the singular version.
self.__changes = MappingProxyType(changes) | ||
def __init__(self, changed_files: Iterable[ChangedFile]) -> None: | ||
self.__changed = MappingProxyType({str(c.path): c for c in changed_files}) | ||
self.__files = tuple(self.__changed.values()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why a tuple
instead of a set
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to maintain the order.
return cls(changes) | ||
|
||
def __or__(self, other: Self) -> Self: | ||
return type(self)(chain(self.files, other.files)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems very clean to me like this anyway, good trick !
src/dda/utils/git/changeset.py
Outdated
@cached_property | ||
def __change_types( | ||
self, | ||
) -> dict[Literal[ChangeType.ADDED, ChangeType.MODIFIED, ChangeType.DELETED], MappingProxyType[str, ChangedFile]]: | ||
added: dict[str, ChangedFile] = {} | ||
modified: dict[str, ChangedFile] = {} | ||
deleted: dict[str, ChangedFile] = {} | ||
for change in self.files: | ||
if change.type == ChangeType.ADDED: | ||
added[str(change.path)] = change | ||
elif change.type == ChangeType.MODIFIED: | ||
modified[str(change.path)] = change | ||
elif change.type == ChangeType.DELETED: | ||
deleted[str(change.path)] = change | ||
else: # no cov | ||
msg = f"Unexpected change type: {change.type}" | ||
raise ValueError(msg) | ||
|
||
return { | ||
ChangeType.ADDED: MappingProxyType(added), | ||
ChangeType.MODIFIED: MappingProxyType(modified), | ||
ChangeType.DELETED: MappingProxyType(deleted), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we not just sort into buckets dynamically ?
@cached_property | |
def __change_types( | |
self, | |
) -> dict[Literal[ChangeType.ADDED, ChangeType.MODIFIED, ChangeType.DELETED], MappingProxyType[str, ChangedFile]]: | |
added: dict[str, ChangedFile] = {} | |
modified: dict[str, ChangedFile] = {} | |
deleted: dict[str, ChangedFile] = {} | |
for change in self.files: | |
if change.type == ChangeType.ADDED: | |
added[str(change.path)] = change | |
elif change.type == ChangeType.MODIFIED: | |
modified[str(change.path)] = change | |
elif change.type == ChangeType.DELETED: | |
deleted[str(change.path)] = change | |
else: # no cov | |
msg = f"Unexpected change type: {change.type}" | |
raise ValueError(msg) | |
return { | |
ChangeType.ADDED: MappingProxyType(added), | |
ChangeType.MODIFIED: MappingProxyType(modified), | |
ChangeType.DELETED: MappingProxyType(deleted), | |
} | |
@cached_property | |
def __change_types( | |
self, | |
) -> dict[ChangeType, MappingProxyType[str, ChangedFile]]: | |
tmp: defaultdict[ChangeType, dict[str, ChangedFile]] = defaultdict(dict) | |
for change in self.files: | |
tmp[change.type][str(change.path)] = change | |
return {k: MappingProxyType(v) for k, v in tmp.items()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, don't know what I was thinking. Thanks!
|
||
def __init__(self, changes: dict[Path, ChangedFile]) -> None: | ||
self.__changes = MappingProxyType(changes) | ||
def __init__(self, changed_files: Iterable[ChangedFile]) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, definitely makes it easier to use 👍
assert seen.path == expected_change.path | ||
assert seen.type == expected_change.type | ||
assert seen.patch == expected_change.patch | ||
assert actual.paths == expected.paths |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the two are functionally equivalent - and probably faster.
However, individually comparing each file and each file's attribute allowed for easier debugging, since you get a lower amount of output on the stdout (only the actually differing part gets printed).
@property | ||
def modified(self) -> MappingProxyType[str, ChangedFile]: | ||
"""Set of files that were modified.""" | ||
return self.__change_types[ChangeType.MODIFIED] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should convert our keys to Path
objects within these methods, if the expectation is that our callers will mostly be using Path
s instead of raw str
s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so then they can always str(p)
whereas membership checks for callers that only have str
objects would require an import.
* Instate msgspec type registration mechanism * Update ChangeSet API * address review 695d20c
There are 2 commits: