Arithmetic on tensor module elements, manifold objects: Always Return a Copy #30302

mjungmath · 2020-08-06T09:59:08Z

This question arose from ticket #30239, comment:36.

Should FiniteRankFreeModule and manifold objects always return a mutable copy, even for trivial operations? At least, this would be a consistent behavior.

As pointed out by Matthias, this already holds true for FreeModule:

sage: M = FreeModule(QQ, 3)
sage: v = M([1,2,3])
sage: w = v + 0
sage: w == v
True
sage: w is v
False

I feel quite torn about this, but slightly tend to the copy-version.

Addendum:

For FreeModule, we also have the following behavior:

sage: M = FreeModule(QQ, 3)
sage: M(0)
(0, 0, 0)
sage: M.zero()
(0, 0, 0)
sage: M.zero() is M(0)
False

I don't think that a parent should do that, especially when it already has a zero method. Should that be changed?

CC: @egourgoulhon @tscrim @mkoeppe

Component: misc

Branch/Commit: u/gh-mjungmath/return_copy_always_scalarfields @ b7b0b3f

Issue created by migration from https://trac.sagemath.org/ticket/30302

The text was updated successfully, but these errors were encountered:

mkoeppe · 2020-08-06T11:44:50Z

comment:3

Both M() and M(0) return a new mutable element. That's how one creates a new vector, whose components can then be modified. If you want the immutable 0 element, you can use M.zero().

The arithmetic operations should either always create an immutable element or always create a new mutable element. It would be very inconvenient if the result of an operation was sometimes immutable, sometimes a new mutable element, depending on the input. (And it would be highly problematic if sometimes it would return an existing mutable element.)

egourgoulhon · 2020-08-06T14:38:56Z

comment:4

My concern here would be performance. For complicated symbolic expressions, creating a copy can have a significant CPU cost. The main concern is about scalar fields, because tensor fields arithmetics always end up to scalar fields arithmetics, the scalar fields being the tensor components in a given frame. For the time being, ScalarField._add_() starts with

        if self._is_zero:
            return other
        if other._is_zero:
            return self

If we change this to return a copy (I understand the arguments put forward by Matthias), then I would advocate for extensive benchmarks, with complicated tensor fields (those in the doctests are too simple), such as in those mentionned here.

mkoeppe · 2020-08-06T16:54:39Z

comment:5

Symbolic expressions are immutable and therefore do not need to be copied.

The cost of making a copy should be dominated by copying the dictionaries in the Components objects.

mjungmath · 2020-08-06T17:14:20Z

comment:6

If making a copy is so costly, we should try to optimize it.

If I remember correctly, until Sage 9 or so, all elements emerged from arithmetics were created from scratch, even the trivial ones. Even if copying would slow the current code down a bit, it should be still faster than versions before Sage 9.

egourgoulhon · 2020-08-06T19:04:27Z

comment:7

Replying to @mjungmath:

If making a copy is so costly, we should try to optimize it.

Copying a symbolic expression with hundreds of terms will always remain slower than returning self or other.

If I remember correctly, until Sage 9 or so, all elements emerged from arithmetics were created from scratch, even the trivial ones.

That's not true: already in Sage 7.4, we have the code snippet shown in comment:4.

mjungmath · 2020-08-06T19:04:59Z

comment:8

Replying to @mjungmath:

If I remember correctly, until Sage 9 or so, all elements emerged from arithmetics were created from scratch, even the trivial ones. Even if copying would slow the current code down a bit, it should be still faster than versions before Sage 9.

My memory was partially wrong. For scalar fields, the _is_zero variable was already implemented, for tensor fields it was not.

Furtermore, I've compared some computation times:

Current state:

sage: M = Manifold(2, 'M', structure='topological') # the 2-dimensional sphere S^2
sage: U = M.open_subset('U') # complement of the North pole
sage: c_xy.<x,y> = U.chart() # stereographic coordinates from the North pole
sage: V = M.open_subset('V') # complement of the South pole
sage: c_uv.<u,v> = V.chart() # stereographic coordinates from the South pole
sage: M.declare_union(U,V)   # S^2 is the union of U and V
sage: xy_to_uv = c_xy.transition_map(c_uv, (x/(x^2+y^2), y/(x^2+y^2)),
....:                                intersection_name='W',
....:                                restrictions1= x^2+y^2!=0,
....:                                restrictions2= u^2+v^2!=0)
sage: uv_to_xy = xy_to_uv.inverse()
sage: f = M.scalar_field({c_xy: 1/(1+x^2+y^2), c_uv: (u^2+v^2)/(1+u^2+v^2)},
....:                    name='f')
sage: %timeit f+0
The slowest run took 177.32 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 4.22 µs per loop
sage: %timeit f*1
The slowest run took 50.53 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 11.8 µs per loop

Returning a copy:

sage: M = Manifold(2, 'M', structure='topological') # the 2-dimensional sphere S^2
sage: U = M.open_subset('U') # complement of the North pole
sage: c_xy.<x,y> = U.chart() # stereographic coordinates from the North pole
sage: V = M.open_subset('V') # complement of the South pole
sage: c_uv.<u,v> = V.chart() # stereographic coordinates from the South pole
sage: M.declare_union(U,V)   # S^2 is the union of U and V
sage: xy_to_uv = c_xy.transition_map(c_uv, (x/(x^2+y^2), y/(x^2+y^2)),
....:                                intersection_name='W',
....:                                restrictions1= x^2+y^2!=0,
....:                                restrictions2= u^2+v^2!=0)
sage: uv_to_xy = xy_to_uv.inverse()
sage: f = M.scalar_field({c_xy: 1/(1+x^2+y^2), c_uv: (u^2+v^2)/(1+u^2+v^2)},
....:                    name='f')
sage: %timeit f+0
The slowest run took 23.78 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 5: 60.6 µs per loop
sage: %timeit f*1
The slowest run took 24.43 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 5: 33.4 µs per loop

That's a heavy loss of comutational time.

mjungmath · 2020-08-06T19:05:58Z

comment:9

Replying to @egourgoulhon:

Replying to @mjungmath:

If making a copy is so costly, we should try to optimize it.

Copying a symbolic expression with hundreds of terms will always remain slower than returning self or other.

If I remember correctly, until Sage 9 or so, all elements emerged from arithmetics were created from scratch, even the trivial ones.

That's not true: already in Sage 7.4, we have the code snippet shown in comment:4.

Yes sorry, I was mistaken. See above.

mkoeppe · 2020-08-06T19:11:38Z

comment:10

Replying to @mjungmath:

Returning a copy:

... could you share the changes that you made for benchmarking this?

egourgoulhon · 2020-08-06T19:14:10Z

comment:11

Replying to @mkoeppe:

Symbolic expressions are immutable and therefore do not need to be copied.

You are right. Actually they are not copied when copying chart functions (and hence would not be copied when copying scalar fields): the code of ChartFunction.copy() is indeed:

        resu = type(self)(self.parent())
        for kk, vv in self._express.items():
            resu._express[kk] = vv
        resu._expansion_symbol = self._expansion_symbol
        resu._order = self._order
        return resu

Here if kk is 'SR', vv is a symbolic expression.

mjungmath · 2020-08-06T19:16:07Z

Branch: u/gh-mjungmath/return_copy_always_scalarfields

mkoeppe · 2020-08-06T19:16:28Z

comment:13

Another implementation strategy if you consider Components objects an implementation detail that is not exposed to the user: Change them to "copy-on-write" semantics

New commits:

`4f3b69f`	Trac 30291: simple checks for trivial cases in `_mul_`, `_add_` and _div_
`b7b0b3f`	`return copies`

mkoeppe · 2020-08-06T19:16:28Z

Commit: b7b0b3f

mjungmath · 2020-08-06T19:17:44Z

comment:14

I've pushed my changes.

mjungmath · 2020-08-06T19:22:37Z

comment:15

Here's the output of

sage: %prun %timeit f+0

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   366666    1.064    0.000    1.584    0.000 chart_func.py:318(__init__)
   366666    1.033    0.000    2.745    0.000 chart_func.py:879(copy)
   122222    0.716    0.000    3.990    0.000 scalarfield.py:1480(copy)
       10    0.505    0.051    5.468    0.547 <magic-timeit>:1(inner)
   366666    0.395    0.000    0.463    0.000 chart.py:457(__getitem__)
   122222    0.391    0.000    0.483    0.000 scalarfield.py:1060(__init__)
   122222    0.176    0.000    0.543    0.000 scalarfield.py:1154(is_trivial_zero)
    61111    0.150    0.000    2.201    0.000 scalarfield.py:2423(_add_)
    61111    0.145    0.000    2.659    0.000 scalarfield.py:2632(_lmul_)
    61111    0.111    0.000    0.230    0.000 chart_func.py:810(is_trivial_zero)
    61111    0.103    0.000    2.762    0.000 unital_algebras.py:54(from_base_ring)
   549999    0.099    0.000    0.099    0.000 {method 'parent' of 'sage.structure.element.Element' objects}
   488888    0.087    0.000    0.087    0.000 {method 'items' of 'dict' objects}
   122222    0.084    0.000    0.313    0.000 scalarfield.py:1212(<genexpr>)
   366885    0.068    0.000    0.068    0.000 {built-in method builtins.isinstance}
   366671    0.058    0.000    0.058    0.000 {built-in method builtins.len}
    61111    0.052    0.000    0.073    0.000 calculus_method.py:278(is_trivial_zero)
   122222    0.047    0.000    0.047    0.000 scalarfield.py:1327(_init_derived)
    61111    0.046    0.000    0.046    0.000 chart_func.py:460(expr)
   122222    0.045    0.000    0.045    0.000 subset.py:483(manifold)
   122222    0.041    0.000    0.041    0.000 {method 'is_trivial_zero' of 'sage.symbolic.expression.Expression' objects}
    61112    0.038    0.000    0.326    0.000 {built-in method builtins.all}
    61111    0.015    0.000    0.015    0.000 {method 'values' of 'dict' objects}
    ...

with my pushed changes.

mjungmath · 2020-08-06T19:38:29Z

comment:16

        resu = type(self)(self.parent())
-        for kk, vv in self._express.items():
-            resu._express[kk] = vv
+       resu._express = self._express.copy()
        resu._expansion_symbol = self._expansion_symbol

is slightly faster btw:

sage: %timeit f+0
The slowest run took 5.64 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 5: 59.3 µs per loop

And changing back to _is_zero instead of is_trivial_zero() yields a further minimal improvement:

sage: %timeit f+0
The slowest run took 17.22 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 5: 55.5 µs per loop

mjungmath · 2020-08-06T19:45:21Z

comment:17

However, regarding my prun test, I think that

        resu = type(self)(self.parent())

is the costly line. And that one shouldn't be an issue for more complicated expressions.

mjungmath · 2020-08-07T11:34:43Z

comment:18

Replying to @mkoeppe:

Another implementation strategy if you consider Components objects an implementation detail that is not exposed to the user: Change them to "copy-on-write" semantics

Something like this https://pypi.org/project/cowdict/ ?

mjungmath · 2020-08-07T11:55:39Z

comment:19

I wonder, is copying the chart function even necessary? If new expressions are set/added, a new chart function is created anyway,isn't it?

If the user wants the chart function directly, one can create a copy there. Or we state all chart functions belonging to scalar fields automatically as immutable (available since #30310). If the user wants to modify it, he must copy it manually.

mkoeppe · 2020-08-07T15:33:36Z

comment:20

Replying to @mjungmath:

Replying to @mkoeppe:

Another implementation strategy if you consider Components objects an implementation detail that is not exposed to the user: Change them to "copy-on-write" semantics

Something like this https://pypi.org/project/cowdict/ ?

Yes, same idea but with a different granularity. Add a flag to Component instances that keeps track of whether it belongs to a unique FiniteRankFreeModule instance. In FiniteRankFreeModule, don't copy the component, just mark it non-unique; instead, each time that you want to mutate a component, check first whether it's unique and if not, first copy it, then mutate. But complex code like this should only be written if absolutely necessary.

mjungmath · 2020-08-07T15:44:31Z

comment:21

Replying to @mkoeppe:

Replying to @mjungmath:

Replying to @mkoeppe:

Another implementation strategy if you consider Components objects an implementation detail that is not exposed to the user: Change them to "copy-on-write" semantics

Something like this https://pypi.org/project/cowdict/ ?

Yes, same idea but with a different granularity. Add a flag to Component instances that keeps track of whether it belongs to a unique FiniteRankFreeModule instance. In FiniteRankFreeModule, don't copy the component, just mark it non-unique; instead, each time that you want to mutate a component, check first whether it's unique and if not, first copy it, then mutate. But complex code like this should only be written if absolutely necessary.

That sounds like something we should attack.

However, the ingredient of scalar fields is ChartFunction, not Component. What do you think about my proposal in comment:19?

Or, alternatively, we apply the very same idea you proposed for Component to ChartFunction.

Edit:

But complex code like this should only be written if absolutely necessary.

I just overread this tiny but important part.

What about simply making those Components immutable which are bound to tensors? Similar as my proposal in comment:19. Each time the components of a tensor are changed via set_comp or add_comp, a whole new Component is created from scratch (self._new_comp is invoked). Making those components immutable which are used for tensors, wouldn't change much. Then we don't have to copy whole Components anymore, but only the dictionaries.

mkoeppe · 2020-08-07T16:04:48Z

comment:22

Replying to @mjungmath:

Replying to @mkoeppe:

Replying to @mjungmath:

Replying to @mkoeppe:

Another implementation strategy if you consider Components objects an implementation detail that is not exposed to the user: Change them to "copy-on-write" semantics

[...]

Add a flag to Component instances that keeps track of whether it belongs to a unique FiniteRankFreeModule instance. [...]

That sounds like something we should attack.

I think they're lower-hanging fruit for improving efficiency. For example, #30307 could make copying components faster.

However, the ingredient of scalar fields is ChartFunction, not Component. What do you think about my proposal in comment:19?

Sorry, I'm not up to speed on chart functions yet.

mjungmath · 2020-08-07T16:10:02Z

comment:24

Watch my edit. Sorry for the chaos.

tscrim · 2020-08-07T23:03:15Z

comment:25

Another useful thing for benchmarking is %lprun; see this tutorial.

mkoeppe · 2021-02-13T20:51:01Z

comment:27

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

mjungmath added this to the sage-9.2 milestone Aug 6, 2020

mjungmath added the p: major / 3 label Aug 6, 2020

This comment has been minimized.

Sign in to view

mkoeppe changed the title ~~Always Return a Copy~~ Arithmetic on tensor module elements, manifold objects: Always Return a Copy Aug 7, 2020

mkoeppe modified the milestones: sage-9.2, sage-9.3 Aug 29, 2020

mkoeppe modified the milestones: sage-9.3, sage-9.4 Feb 13, 2021

mkoeppe modified the milestones: sage-9.4, sage-9.5 Jul 19, 2021

mkoeppe modified the milestones: sage-9.5, sage-9.6 Dec 14, 2021

mkoeppe modified the milestones: sage-9.6, sage-9.7 Mar 5, 2022

mkoeppe modified the milestones: sage-9.7, sage-9.8 Aug 31, 2022

mkoeppe removed this from the sage-9.8 milestone Jan 29, 2023

egourgoulhon mentioned this issue Jan 30, 2023

SageManifolds metaticket 2 #30525

Open

kwankyu mentioned this issue Jan 7, 2023

Guide for parents with immutable elements #32353

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Arithmetic on tensor module elements, manifold objects: Always Return a Copy #30302

Arithmetic on tensor module elements, manifold objects: Always Return a Copy #30302

mjungmath commented Aug 6, 2020

This comment has been minimized.

This comment has been minimized.

mkoeppe commented Aug 6, 2020

egourgoulhon commented Aug 6, 2020

mkoeppe commented Aug 6, 2020

mjungmath commented Aug 6, 2020

egourgoulhon commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mkoeppe commented Aug 6, 2020

egourgoulhon commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mkoeppe commented Aug 6, 2020

mkoeppe commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 7, 2020

mjungmath commented Aug 7, 2020

mkoeppe commented Aug 7, 2020

mjungmath commented Aug 7, 2020

mkoeppe commented Aug 7, 2020

mjungmath commented Aug 7, 2020

tscrim commented Aug 7, 2020

mkoeppe commented Feb 13, 2021

Arithmetic on tensor module elements, manifold objects: Always Return a Copy #30302

Arithmetic on tensor module elements, manifold objects: Always Return a Copy #30302

Comments

mjungmath commented Aug 6, 2020

This comment has been minimized.

This comment has been minimized.

mkoeppe commented Aug 6, 2020

egourgoulhon commented Aug 6, 2020

mkoeppe commented Aug 6, 2020

mjungmath commented Aug 6, 2020

egourgoulhon commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mkoeppe commented Aug 6, 2020

egourgoulhon commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mkoeppe commented Aug 6, 2020

mkoeppe commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 6, 2020

mjungmath commented Aug 7, 2020

mjungmath commented Aug 7, 2020

mkoeppe commented Aug 7, 2020

mjungmath commented Aug 7, 2020

mkoeppe commented Aug 7, 2020

mjungmath commented Aug 7, 2020

tscrim commented Aug 7, 2020

mkoeppe commented Feb 13, 2021