bpo-37884: Optimize Fraction() and statistics.mean() #15329

Closed
27 changes: 8 additions & 19 deletions Lib/fractions.py
@@ -9,6 +9,7 @@
 import operator
 import re
 import sys
+from math import _as_integer_ratio

 __all__ = ['Fraction', 'gcd']

@@ -115,22 +116,7 @@ def __new__(cls, numerator=0, denominator=None, *, _normalize=True):
         self = super(Fraction, cls).__new__(cls)

         if denominator is None:
-            if type(numerator) is int:
-                self._numerator = numerator
-                self._denominator = 1
-                return self
-
-            elif isinstance(numerator, numbers.Rational):
-                self._numerator = numerator.numerator
-                self._denominator = numerator.denominator
-                return self
-
-            elif isinstance(numerator, (float, Decimal)):
-                # Exact conversion
-                self._numerator, self._denominator = numerator.as_integer_ratio()
-                return self
-
-            elif isinstance(numerator, str):
+            if isinstance(numerator, str):
                 # Handle construction from strings.
                 m = _RATIONAL_FORMAT.match(numerator)
                 if m is None:
@@ -156,10 +142,13 @@ def __new__(cls, numerator=0, denominator=None, *, _normalize=True):
                             denominator *= 10**-exp
                 if m.group('sign') == '-':
                     numerator = -numerator
-
             else:
-                raise TypeError("argument should be a string "
-                                "or a Rational instance")
+                try:
+                    self._numerator, self._denominator = _as_integer_ratio(numerator)
+                    return self
+                except TypeError:
+                    raise TypeError("argument should be a string or a number, "
Contributor

@aeros aeros Aug 19, 2019

Since the name of the encompassing type is "numeric" rather than "number", can we adjust the error message?

Suggested change
- raise TypeError("argument should be a string or a number, "
+ raise TypeError("argument type should be str or numeric, "

Source: https://docs.python.org/3.9/library/stdtypes.html#numeric-types-int-float-complex

Member Author

But an instance of a numeric type is a number, is it not?

And if we use "argument type", it should be "str", not "string".

Contributor

@aeros aeros Aug 19, 2019

> But an instance of a numeric type is a number, is it not?

It is, but I think it's more useful to users to specify the actual types in this case, since it's a TypeError. Also, when searching the docs for "number", the relevant documentation page ("Built-in Types") does not come up as a suggestion; instead, users will likely land on the page for the "numbers" module, which is not relevant to the error. Searching for "numeric" finds more relevant results, including "Built-in Types".

> And if we use "argument type", it should be "str", not "string".

I'll update the suggestion accordingly.

Member Author

This is a standard message used in many places. For example:

>>> float([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float() argument must be a string or a number, not 'list'

If you want to change it, please open a separate issue.

Contributor

> If you want to change it, please open a separate issue.

Alright, I'll take a look at what other areas use this error message and consider whether or not it should be addressed. That would definitely go outside of the scope of this PR.
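
To make the convention discussed in this thread concrete, the short sketch below calls a few constructors with an unsupported argument and collects the resulting TypeError messages. The exact wording differs between constructors and between Python versions, so the code relies only on the exception type; the helper name `type_error_message` is hypothetical and not part of this PR.

```python
from fractions import Fraction


def type_error_message(constructor, bad_value):
    """Return the TypeError message raised by constructor(bad_value)."""
    try:
        constructor(bad_value)
    except TypeError as exc:
        return str(exc)
    return None  # no TypeError was raised


# Each constructor rejects a list with a TypeError; the wording of the
# message depends on the constructor and on the Python version in use.
for constructor in (float, int, Fraction):
    print(constructor.__name__, "->", type_error_message(constructor, []))
```

Printing the messages side by side is a quick way to see which call sites already follow the "must be a string or a number, not '...'" pattern before proposing a change in a separate issue.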

+                                    "not %s" % type(numerator).__name__) from None

         elif type(numerator) is int is type(denominator):
             pass # *very* normal case
18 changes: 1 addition & 17 deletions Lib/statistics.py
@@ -225,27 +225,11 @@ def _exact_ratio(x):
     x is expected to be an int, Fraction, Decimal or float.
     """
     try:
-        # Optimise the common case of floats. We expect that the most often
-        # used numeric type will be builtin floats, so try to make this as
-        # fast as possible.
-        if type(x) is float or type(x) is Decimal:
-            return x.as_integer_ratio()
-        try:
-            # x may be an int, Fraction, or Integral ABC.
-            return (x.numerator, x.denominator)
-        except AttributeError:
-            try:
-                # x may be a float or Decimal subclass.
-                return x.as_integer_ratio()
-            except AttributeError:
-                # Just give up?
-                pass
+        return math._as_integer_ratio(x)
     except (OverflowError, ValueError):
         # float NAN or INF.
         assert not _isfinite(x)
         return (x, None)
-    msg = "can't convert type '{}' to numerator/denominator"
-    raise TypeError(msg.format(type(x).__name__))


 def _convert(value, T):
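
The simplified `_exact_ratio` keeps one behaviour worth spelling out: non-finite values are not an error, they come back as `(x, None)`. The sketch below illustrates that contract using only the public `as_integer_ratio()` methods of `float` and `Decimal`. The function name `exact_ratio_sketch` is hypothetical, and it does not call the private `math._as_integer_ratio` helper proposed in this PR.

```python
from decimal import Decimal


def exact_ratio_sketch(x):
    """Return (numerator, denominator) for x, or (x, None) if x is NaN/inf."""
    try:
        return x.as_integer_ratio()
    except (OverflowError, ValueError):
        # float/Decimal infinities raise OverflowError, NaNs raise ValueError.
        return (x, None)


print(exact_ratio_sketch(0.25))
print(exact_ratio_sketch(Decimal("0.875")))
print(exact_ratio_sketch(float("inf")))
```

A `None` denominator lets the caller detect NaN or infinity without a separate `isfinite` check on every element.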
30 changes: 30 additions & 0 deletions Lib/test/test_math.py
@@ -12,6 +12,8 @@
 import random
 import struct
 import sys
+from decimal import Decimal
+from fractions import Fraction


 eps = 1E-05
@@ -293,6 +295,34 @@ def testAcosh(self):
         self.assertRaises(ValueError, math.acosh, NINF)
         self.assertTrue(math.isnan(math.acosh(NAN)))

+    @support.cpython_only
+    def testAsIntegerRatio(self):
+        as_integer_ratio = math._as_integer_ratio
+        self.assertEqual(as_integer_ratio(0), (0, 1))
+        self.assertEqual(as_integer_ratio(3), (3, 1))
+        self.assertEqual(as_integer_ratio(-3), (-3, 1))
+        self.assertEqual(as_integer_ratio(False), (0, 1))
+        self.assertEqual(as_integer_ratio(True), (1, 1))
+        self.assertEqual(as_integer_ratio(0.0), (0, 1))
+        self.assertEqual(as_integer_ratio(-0.0), (0, 1))
+        self.assertEqual(as_integer_ratio(0.875), (7, 8))
+        self.assertEqual(as_integer_ratio(-0.875), (-7, 8))
+        self.assertEqual(as_integer_ratio(Decimal('0')), (0, 1))
+        self.assertEqual(as_integer_ratio(Decimal('0.875')), (7, 8))
+        self.assertEqual(as_integer_ratio(Decimal('-0.875')), (-7, 8))
+        self.assertEqual(as_integer_ratio(Fraction(0)), (0, 1))
+        self.assertEqual(as_integer_ratio(Fraction(7, 8)), (7, 8))
+        self.assertEqual(as_integer_ratio(Fraction(-7, 8)), (-7, 8))
+
+        self.assertRaises(OverflowError, as_integer_ratio, float('inf'))
+        self.assertRaises(OverflowError, as_integer_ratio, float('-inf'))
+        self.assertRaises(ValueError, as_integer_ratio, float('nan'))
+        self.assertRaises(OverflowError, as_integer_ratio, Decimal('inf'))
+        self.assertRaises(OverflowError, as_integer_ratio, Decimal('-inf'))
+        self.assertRaises(ValueError, as_integer_ratio, Decimal('nan'))
+
+        self.assertRaises(TypeError, as_integer_ratio, '0')
+
     def testAsin(self):
         self.assertRaises(TypeError, math.asin)
         self.ftest('asin(-1)', math.asin(-1), -math.pi/2)
14 changes: 13 additions & 1 deletion Modules/clinic/mathmodule.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

78 changes: 78 additions & 0 deletions Modules/mathmodule.c
@@ -3306,9 +3306,87 @@ math_comb_impl(PyObject *module, PyObject *n, PyObject *k)
 }


+/*[clinic input]
+math._as_integer_ratio
+    x: object
+    /
+
+Return integer ratio.
+
+Return a pair of integers, whose ratio is exactly equal to the original
Contributor

Can we specify that the pair of integers returned is contained within a tuple?

Suggested change
- Return a pair of integers, whose ratio is exactly equal to the original
+ Return a tuple containing a pair of integers, whose ratio is exactly equal to the original

Member Author

This was copied from the docstring of float.as_integer_ratio. Other as_integer_ratio methods have similar wording.

Contributor

This suggestion was based on the docstring Raymond recently created for Fraction.as_integer_ratio(); I made a similar suggestion there, which he added to the PR.

If it would be helpful, I could create a separate PR to make a similar change to float.as_integer_ratio().

Contributor

The tuple is the pair, so I find "tuple containing a pair" a bit confusing. That almost sounds like ((numerator, denominator),) or something. Why not be explicit and write "a tuple (numerator, denominator)"?

Contributor

> Why not be explicit and write "a tuple (numerator, denominator)"?

That would be an improvement. I mostly just wanted to specify that the function returned a tuple.

Member Author

"Pair" is a synonym of 2-element tuple. If you think that this term is incorrect, please open a separate issue and analyze all uses of it in the code and in the documentation (there are a lot of occurrences).

Contributor

> "Pair" is a synonym of 2-element tuple.

Either is probably fine, I just figured it was worth trying to be more technically descriptive. Thanks.
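
For readers following the wording debate, the snippet below shows what the existing public `as_integer_ratio()` methods on `float` and `Decimal` return: a plain 2-tuple that unpacks as `(numerator, denominator)`. It uses only standard-library calls; the variable names are illustrative.

```python
from decimal import Decimal

values = (0.875, Decimal("-0.875"))
ratios = [value.as_integer_ratio() for value in values]

for value, ratio in zip(values, ratios):
    # The "pair" is an ordinary tuple, so it unpacks into two integers.
    numerator, denominator = ratio
    print(repr(value), type(ratio).__name__, numerator, denominator)
```

In both cases the denominator is positive and the sign is carried by the numerator, matching the docstring under review.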

+number and with a positive denominator.
+[clinic start generated code]*/
+
+static PyObject *
+math__as_integer_ratio(PyObject *module, PyObject *x)
+/*[clinic end generated code: output=2e4f43d93f6e7850 input=b54b48dd6bbe22ea]*/
+{
+    _Py_IDENTIFIER(as_integer_ratio);
+    _Py_IDENTIFIER(numerator);
+    _Py_IDENTIFIER(denominator);
+    PyObject *ratio, *as_integer_ratio, *numerator, *denominator;
+
+    if (PyLong_CheckExact(x)) {
+        return PyTuple_Pack(2, x, _PyLong_One);
+    }
+
+    if (_PyObject_LookupAttrId(x, &PyId_as_integer_ratio, &as_integer_ratio) < 0) {
Contributor

@aeros aeros Aug 19, 2019

Based on the context, I can roughly tell what this conditional is doing. From my understanding, _PyObject_LookupAttrId() is assessing whether or not the PyObject x contains an as_integer_ratio attribute. If a value less than zero is returned (usually -1), it does not contain that attribute.

However, I'm not certain that I understand where PyId_as_integer_ratio is coming from or how PyId actually works. I was unable to find any documentation on PyId or _Py_IDENTIFIER(), so I'm guessing it's an internal part of the C-API (since it's prefixed with an underscore).

My best guess is that a reference to PyId_as_integer_ratio was created when _Py_IDENTIFIER(as_integer_ratio) was used.

I'm fairly new to the C-API, so I'm trying to learn more about it so that I can be more helpful in PR reviews that involve it. Particularly the internal implementation details that aren't in the documentation.

Member Author

It is described in the header: Include/cpython/object.h.

Contributor

Ah, thanks for letting me know where to look. The code comments there addressed my question:

> PyId_foo is a static variable, either on block level or file level. On first usage, the string "foo" is interned, and the structures are linked.
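
As a loose Python analogy for the quoted header comment (not the actual C mechanism, which lives in Include/cpython/object.h), the sketch below models an identifier slot that interns its string the first time it is used and reuses that object afterwards. The class `LazyIdentifier` and the function `lookup_attr` are hypothetical names invented for this illustration.

```python
import sys


class LazyIdentifier:
    """Hypothetical stand-in for a statically declared identifier slot."""

    def __init__(self, name):
        self._raw = name
        self._interned = None  # filled in on first use

    def get(self):
        # Intern the string once, then hand back the same object every time.
        if self._interned is None:
            self._interned = sys.intern(self._raw)
        return self._interned


# Declared once, analogous to declaring the identifier at the top of a function.
PyId_as_integer_ratio = LazyIdentifier("as_integer_ratio")


def lookup_attr(obj, identifier):
    """Return the attribute, or None when the object does not have it."""
    return getattr(obj, identifier.get(), None)


print(lookup_attr(0.5, PyId_as_integer_ratio)())
print(lookup_attr("text", PyId_as_integer_ratio))
```

The point of the pattern is that the attribute name is created and interned once, so repeated lookups do not rebuild the string.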

+        return NULL;
+    }
+    if (as_integer_ratio) {
+        ratio = _PyObject_CallNoArg(as_integer_ratio);
+        Py_DECREF(as_integer_ratio);
+        if (ratio == NULL) {
+            return NULL;
+        }
+        if (!PyTuple_Check(ratio)) {
+            PyErr_Format(PyExc_TypeError,
+                         "unexpected return type from as_integer_ratio(): "
+                         "expected tuple, got '%.200s'",
+                         Py_TYPE(ratio)->tp_name);
+            Py_DECREF(ratio);
+            return NULL;
+        }
+        if (PyTuple_GET_SIZE(ratio) != 2) {
+            PyErr_SetString(PyExc_ValueError,
+                            "as_integer_ratio() must return a 2-tuple");
+            Py_DECREF(ratio);
+            return NULL;
+        }
+    }
+    else {
+        if (_PyObject_LookupAttrId(x, &PyId_numerator, &numerator) < 0) {
+            return NULL;
+        }
+        if (numerator == NULL) {
+            PyErr_Format(PyExc_TypeError,
+                         "required a number, not '%.200s'",
+                         Py_TYPE(x)->tp_name);
+            return NULL;
+        }
+        if (_PyObject_LookupAttrId(x, &PyId_denominator, &denominator) < 0) {
+            Py_DECREF(numerator);
+            return NULL;
+        }
+        if (denominator == NULL) {
+            Py_DECREF(numerator);
+            PyErr_Format(PyExc_TypeError,
+                         "required a number, not '%.200s'",
+                         Py_TYPE(x)->tp_name);
+            return NULL;
+        }
+        ratio = PyTuple_Pack(2, numerator, denominator);
+        Py_DECREF(numerator);
+        Py_DECREF(denominator);
+    }
+    return ratio;
+}
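
The control flow of the C function above can be approximated in pure Python as follows. This is a reading aid under stated assumptions, not the PR's implementation: `as_integer_ratio_sketch` is a hypothetical name, `getattr` with a default stands in for the C attribute lookup, and an attribute whose value is `None` is treated as missing, which the C code would not do.

```python
from decimal import Decimal
from fractions import Fraction


def as_integer_ratio_sketch(x):
    """Hypothetical pure-Python rendering of the C helper in this diff."""
    # Fast path: an exact int maps to (x, 1).
    if type(x) is int:
        return (x, 1)

    # Prefer an as_integer_ratio() method when the object provides one.
    method = getattr(x, "as_integer_ratio", None)
    if method is not None:
        ratio = method()
        if not isinstance(ratio, tuple):
            raise TypeError(
                "unexpected return type from as_integer_ratio(): "
                "expected tuple, got %r" % type(ratio).__name__)
        if len(ratio) != 2:
            raise ValueError("as_integer_ratio() must return a 2-tuple")
        return ratio

    # Otherwise fall back to the numerator/denominator attributes.
    try:
        return (x.numerator, x.denominator)
    except AttributeError:
        raise TypeError(
            "required a number, not %r" % type(x).__name__) from None


print(as_integer_ratio_sketch(3))
print(as_integer_ratio_sketch(Decimal("0.875")))
print(as_integer_ratio_sketch(Fraction(-7, 8)))
```

The ordering mirrors the C code: the exact-int check comes first because it avoids any attribute lookup, and the numerator/denominator fallback only runs for objects without an `as_integer_ratio` method.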
+
+
 static PyMethodDef math_methods[] = {
     {"acos", math_acos, METH_O, math_acos_doc},
     {"acosh", math_acosh, METH_O, math_acosh_doc},
+    MATH__AS_INTEGER_RATIO_METHODDEF
     {"asin", math_asin, METH_O, math_asin_doc},
     {"asinh", math_asinh, METH_O, math_asinh_doc},
     {"atan", math_atan, METH_O, math_atan_doc},