-
-
Notifications
You must be signed in to change notification settings - Fork 417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(types): add some types based on MonkeyType #1152
Conversation
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
return isinstance(value, self.types_to_memoize) | ||
|
||
def serialize(self): | ||
def serialize(self) -> Dict[int, Any]: # type: ignore[override] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This time I just ignored the mismatch.
@@ -134,7 +151,8 @@ def get_regexp_width(expr): | |||
raise ImportError('`regex` module must be installed in order to use Unicode categories.', expr) | |||
regexp_final = expr | |||
try: | |||
return [int(x) for x in sre_parse.parse(regexp_final).getwidth()] | |||
# TODO: bug? Returns an int? | |||
return [int(x) for x in sre_parse.parse(regexp_final).getwidth()] # type: ignore[attr-defined] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sre_parse.parse(regexp_final).getwidth()
returns an int, which you can't iterate over. Not sure what was intended here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getwidth() returns a tuple of (min_width, max_width)
>>> sre_parse.parse('a+').getwidth()
(1, 4294967295)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's quite an obscure feature. Most re libraries in other languages don't even have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in typeshed now, but missed today's mypy 0.960 release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, can't resolve until I fix the comment (tomorrow).
return [x for x in l if not (x in dedup or dedup.add(x))] | ||
# This returns None, but that's expected | ||
return [x for x in l if not (x in dedup or dedup.add(x))] # type: ignore[func-returns-value] | ||
# 2x faster (ordered in PyPy and CPython 3.6+, gaurenteed to be ordered in Python 3.7+) | ||
# return list(dict.fromkeys(l)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This surprises mypy, since it's using the None return. But I think using list(dict.fromkeys(l))
is simpler and faster (2x when I was trying it on a million ints).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't write this function, but looks like for someone else it was the opposite. Don't know why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this was before 3.6+ was required, like during the 2.7 days, then this wouldn't work. And OrderedDict is 2x slower (OrderedDict and the current implementation are nearly identical for me). I'm not testing on the actual data, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test it on heavily duplicated lists?
It's possible that fromkeys() keeps overwriting existing values, making it slower in that case.
(Just guessing, no idea if it's true)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, 1,000,000 items with 100 unique values.
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
d156be7
to
f2e73d7
Compare
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
@henryiii I'm sorry it took me so long to review this PR! I will try to be more prompt in the future. I have two issues with this PR:
I created a new PR where I revert only these changes. Let me know if you agree with it - #1191 |
That's fine, I'm assuming this will be an iterative process, and that's fine. :) |
Typing is a bit too sparse to run mypyc, so I've tried to fill in a couple of modules (lark.py and utils.py) with the help of monkeytype. This is based on what was seen running the tests more than what things should be, so input and cleanup would be helpful!