-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-37884: Optimize Fraction() and statistics.mean() #15329
bpo-37884: Optimize Fraction() and statistics.mean() #15329
Conversation
self._numerator, self._denominator = math._as_integer_ratio(numerator) | ||
return self | ||
except TypeError: | ||
raise TypeError("argument should be a string or a number, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the name of the encompassing type is "numeric" rather than "number", can we adjust the error message?
raise TypeError("argument should be a string or a number, " | |
raise TypeError("argument type should be str or numeric, " |
Source: https://docs.python.org/3.9/library/stdtypes.html#numeric-types-int-float-complex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But an instance of a numeric type is a number, is not?
And if use "argument type", it should be "str", not "string".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But an instance of a numeric type is a number, is not?
It is, but I think that it's a bit more useful to users to specify the actual types in this case since it's a TypeError
. Also, when searching the docs for "number", the relevant documentation page ("Built-in Types") does not come up as a suggestion, instead they'll likely encounter the page for the "numbers" module (which would not be relevant to the error). When searching for "numeric", more relevant results are found, including "Built-in Types".
And if use "argument type", it should be "str", not "string".
I'll update the suggestion accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a standard message used in many sites. For example:
>>> float([])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: float() argument must be a string or a number, not 'list'
If you want to change it, please open a separate issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to change it, please open a separate issue.
Alright, I'll take a look at what other areas use this error message and consider whether or not it should be addressed. That would definitely go outside of the scope of this PR.
|
||
Return integer ratio. | ||
|
||
Return a pair of integers, whose ratio is exactly equal to the original |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we specify that the pair of integers returned are contained within a tuple?
Return a pair of integers, whose ratio is exactly equal to the original | |
Return a tuple containing a pair of integers, whose ratio is exactly equal to the original |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was copied from the docstring of float.as_integer_ratio
. Other as_integer_ratio
methods have similar wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This suggestion was based on the docstring Raymond recently created for Fraction.as_integer_ratio()
, I made a similar suggestion that he added to the PR.
If it would be helpful, I could create a separate PR to make a similar change to float.as_integer_ratio()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tuple is the pair, so I find "tuple containing a pair" a bit confusing. That almost sounds like ((numerator, denominator),)
or something. Why not be explicit and write "a tuple (numerator, denominator)
"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not be explicit and write "a tuple (numerator, denominator)"?
That would be an improvement. I mostly just wanted to specify that the function returned a tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Pair" is a synonym of 2-element tuple. If you think that this term is incorrect, please open a separate issue and analyze all uses of in in the code and in the documentation (there are a lot of occurrences).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Pair" is a synonym of 2-element tuple.
Either is probably fine, I just figured it was worth trying to be more technically descriptive. Thanks.
Am I right that this is the same as #15210 but with a private function instead? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I performed the same tests mentioned in the bpo issue to verify the results and compared between the latest commit to master (master:24fe46081b) to the PR branch (math-as_integer_ratio2:e658d19083).
OS: Arch Linux 5.2.8
CPU: Intel i5-4460
./python -m timeit -s "from fractions import Fraction as F" "F(123)"
200000 loops, best of 5
Master: 1.42 usec
PR: 1.55 usec
./python -m timeit -s "from fractions import Fraction as F" "F(1.23)"
100000 loops, best of 5
Master: 2.92 usec per loop
PR: 2.14 usec
./python -m timeit -s "from fractions import Fraction as F; f = F(22, 7)" "F(f)"
100000 loops, best of 5
Master: 2.47 usec
PR: 1.93 usec
./python -m timeit -s "from statistics import mean; a = [1]*1000" "mean(a)"
500 loops, best of 5
Master: 930 usec
PR: 640 usec
./python -m timeit -s "from statistics import mean; a = [1.23]*1000" "mean(a)"
200 loops, best of 5
Master: 1.31 msec
PR: 1.34 msec
./python -m timeit -s "from statistics import mean; from fractions import Fraction as F; a = [F(22, 7)]*1000" "mean(a)"
200 loops, best of 5
Master: 1.31 msec
PR: 1.09 msec
./python -m timeit -s "from statistics import mean; from decimal import Decimal as D; a = [D('1.23')]*1000" "mean(a)"
100 loops, best of 5
Master: 2.08 msec
PR: 2 msec
It also looks like the Travis doctest and docbuild was failing on the other PR. As far as functionality goes, the changes to the code in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supporting arbitrary objects with an as_integer_ratio
method is a non-trivial change and should be documented in a NEWS
entry.
Yes, it is.
Agreed. |
return PyTuple_Pack(2, x, _PyLong_One); | ||
} | ||
|
||
if (_PyObject_LookupAttrId(x, &PyId_as_integer_ratio, &as_integer_ratio) < 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the context, I can roughly tell what this conditional is doing. From my understanding, _PyObject_LookupAttrId()
is assessing whether or not the PyObject x contains an as_integer_ratio
attribute. If a value less than zero is returned (usually -1), it does not contain that attribute.
However, I'm not certain that I understand where PyId_as_integer_ratio
is coming from or how PyId
actually works. I was unable to find any documentation on PyId
or _Py_IDENTIFIER()
, so I'm guessing it's an internal part of the C-API (since it's prefixed with an underscore).
My best guess is that a reference to PyId_as_integer_ratio
was created when _Py_IDENTIFIER(as_integer_ratio)
was used.
I'm fairly new to the C-API, so I'm trying to learn more about it so that I can be more helpful in PR reviews that involve it. Particularly the internal implementation details that aren't in the documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is described in the header: Include/cpython/object.h
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for letting me know where to look. The code comments there addressed my question:
PyId_foo is a static variable, either on block level or file level. On first usage, the string "foo" is interned, and the structures are linked.
https://bugs.python.org/issue37884