Inplace Composite and ScalarLoop Ops with multiple outputs #1322

Merged: 5 commits into pymc-devs:main on Apr 8, 2025

Conversation

@ricardoV94 (Member) commented on Mar 25, 2025

Closes #138

This allows inplacing an arbitrary number of inputs in Elemwise Composite (i.e., fused) Ops. The restriction existed because, in the original codegen, writing to an aliased output could affect the computation of other outputs that still referenced the just-overwritten aliased input.

The fix was trivial: only use the output names at the end, after all nodes are computed. Until then, values are stored in temporary variables, as already happens for regular intermediate nodes in the Composite.
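For illustration, here is a minimal sketch (not code from this PR; the specific graph is made up) of the user-visible effect: a fused multi-output Elemwise graph whose Composite may now inplace more than one output. Whether the two outputs actually end up in a single Composite, and how many get inplaced, depends on the fusion and inplace rewrites of the chosen mode.

```python
import pytensor
import pytensor.tensor as pt

x = pt.vector("x")
y = pt.vector("y")
# Two elemwise outputs sharing inputs; FAST_RUN may fuse them into one Composite
out1 = pt.exp(x) + y
out2 = pt.log1p(x) * y

fn = pytensor.function([x, y], [out1, out2], mode="FAST_RUN")

# Inplaced nodes advertise a destroy_map of the form {output_index: [input_indices]};
# with this change a multi-output Composite can list several inplaced outputs.
for node in fn.maker.fgraph.apply_nodes:
    if node.op.destroy_map:
        print(node.op, node.op.destroy_map)
```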

The same was done for the ScalarLoop. The final outputs are only assigned once the inner loop has finished, so there is no risk of aliasing too soon.
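For reference, a rough ScalarLoop sketch (assuming the init/update/constant keyword arguments shown below; not code from this PR): the carried value lives in temporaries while the loop runs, and the output storage is written only after the last iteration, which is what makes inplacing the outputs safe.

```python
from pytensor import function
from pytensor.scalar import float64, int64
from pytensor.scalar.loop import ScalarLoop

n_steps = int64("n_steps")
x0 = float64("x0")
const = float64("const")

# One loop step: x <- x + const, repeated n_steps times
loop_op = ScalarLoop(init=[x0], update=[x0 + const], constant=[const])
x_final = loop_op(n_steps, x0, const)

fn = function([n_steps, x0, const], x_final)
print(fn(5, 0.0, 1.0))  # expected 5.0 after five "+ const" steps
```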

A typo-bug in the Numba vectorize code is also fixed.


📚 Documentation preview 📚: https://pytensor--1322.org.readthedocs.build/en/1322/

@ricardoV94 added labels bug, enhancement, numba, performance, memory_optimization on Mar 25, 2025
@ricardoV94 (Member, Author) commented:
Marked as TODO because I think the ScalarLoop can already be inplaced as well, since it uses temporary variables for the inner carry. Just need to add a test.

codecov bot commented Mar 25, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.04%. Comparing base (8a7356c) to head (d03dc8a).
Report is 19 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1322      +/-   ##
==========================================
+ Coverage   81.98%   82.04%   +0.05%     
==========================================
  Files         188      203      +15     
  Lines       48489    48837     +348     
  Branches     8673     8689      +16     
==========================================
+ Hits        39756    40067     +311     
- Misses       6582     6619      +37     
  Partials     2151     2151              
Files with missing lines (Coverage Δ):
pytensor/link/numba/dispatch/vectorize_codegen.py 92.03% <ø> (+0.88%) ⬆️
pytensor/scalar/basic.py 80.48% <100.00%> (-0.04%) ⬇️
pytensor/scalar/loop.py 94.37% <100.00%> (+4.25%) ⬆️
pytensor/tensor/rewriting/elemwise.py 91.02% <100.00%> (-0.11%) ⬇️

... and 29 files with indirect coverage changes


@ricardoV94 changed the title from "Inplace Composite Ops with multiple outputs" to "Inplace Composite and ScalarLoop Ops with multiple outputs" on Mar 26, 2025
@ricardoV94 ricardoV94 requested a review from Copilot March 26, 2025 21:56
@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds support for inplacing multiple outputs in Composite and ScalarLoop operations while also fixing a typo in the Numba vectorize code. Key changes include:

  • New tests in various test suites (scalar, numba, and tensor) to verify the inplacing behavior of multi-output composite ops.
  • Updates to the ScalarLoop op in pytensor/scalar/loop.py to allow passing extra keyword arguments (and corresponding changes in clone).
  • Minor API adjustments in the numba dispatch code and candidate outputs handling in elemwise rewriting.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Summary per file:
tests/scalar/test_loop.py Adds tests for elemwise inplacing with multiple outputs.
tests/link/numba/test_elemwise.py Updates tests to cover inplace output type and multi-output inplacing.
tests/tensor/rewriting/test_elemwise.py Adjusts multi-output inplacing tests with new parametrization.
pytensor/scalar/loop.py Modifies the __init__ and clone methods to accept extra kwargs.
pytensor/scalar/basic.py Updates super() calls and bumps the c_code_cache_version_outer.
pytensor/tensor/rewriting/elemwise.py Removes candidate_input_idxs and uses a direct range for candidate outputs.
pytensor/link/numba/dispatch/vectorize_codegen.py Fixes a typo by replacing _get_value() with _getvalue().
Comments suppressed due to low confidence (2)

pytensor/tensor/rewriting/elemwise.py:165

  • The candidate_input_idxs abstraction was removed in favor of using 'range(len(node.outputs))'. Please double-check that this change does not inadvertently allow in-place modifications on outputs that should remain protected.
candidate_outputs = [i for i in range(len(node.outputs)) if i not in baseline]

pytensor/link/numba/dispatch/vectorize_codegen.py:268

  • The call to _getvalue() replaces the previous _get_value() method; confirm that this API change is consistent across all relevant objects.
outputs[inplace_idx]._getvalue()

@ricardoV94 ricardoV94 marked this pull request as ready for review March 27, 2025 10:42
@Copilot Copilot AI left a comment


Pull Request Overview

This PR enables inplacing for Composite and ScalarLoop operations with multiple outputs by deferring output assignment until all node computations are complete. Key changes include:

  • Adjusting in-place optimizations in Composite and ScalarLoop ops.
  • Updating tests to pass explicit dtypes and include trust_input flags.
  • Refining code generation and cloning logic in scalar and tensor rewriting components.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Summary per file:
tests/tensor/test_math_scipy.py Added trust_input and explicit dtype parameters in function calls.
tests/tensor/rewriting/test_elemwise.py Modified test for multi-output inplacing with parametrized linker.
tests/scalar/test_loop.py Enhanced tests for in-place behavior in ScalarLoop ops.
tests/link/numba/test_elemwise.py Adjusted test naming and added new test for multiple inplaced outputs.
pytensor/tensor/rewriting/elemwise.py Simplified candidate output indexing for inplacing.
pytensor/scalar/loop.py Updated clone signature, in-place node code generation, and renaming for clarity.
pytensor/scalar/basic.py Revised __str__ method formatting for cleaner output.
pytensor/link/numba/dispatch/vectorize_codegen.py Fixed a typo in the function call accessing a value.
Comments suppressed due to low confidence (3)

tests/tensor/rewriting/test_elemwise.py:1127

  • The new assertion for 'destroy_map' is stricter than before and assumes that the first output is always the one inplaced. Please verify that this change covers all valid scenarios for multi-output composites.
assert destroy_map == {0: [0]}

tests/scalar/test_loop.py:301

  • Using 'is' to verify in-place aliasing for numpy arrays might lead to false negatives if equivalent but non-identical objects are returned. Confirm that this identity check is necessary and robust for in-place testing.
assert xv_res is (n_test, x0v_test, y0v_test, cv_test)[mutate_arg_idx]

pytensor/scalar/loop.py:308

  • The newly introduced assignment block that maps carry variables to outputs should be reviewed to ensure it does not cause redundant or conflicting assignments in complex loop scenarios.
            _c_code += f"{out} = {carry};\n"

@jessegrabowski (Member) left a comment


lgtm

@@ -115,7 +116,7 @@ def fgraph(self):
         self._fgraph = fgraph
         return self._fgraph
 
-    def clone(self):
+    def clone(self, name=None, **kwargs):
jessegrabowski (Member) commented on this change:

Unrelated, but I just checked and the signature of clone varies quite a bit across the codebase. That's pretty maddening!

ricardoV94 (Member, Author) replied:

Yeah, I don't think it's standardized, unlike node.clone.

@@ -431,11 +431,13 @@ def test_gammaincc_ddk_performance(benchmark):
     x = vector("x")
 
     out = gammaincc(k, x)
-    grad_fn = function([k, x], grad(out.sum(), wrt=[k]), mode="FAST_RUN")
+    grad_fn = function(
jessegrabowski (Member) commented on this change:

What's the story with this test? The changes seem unrelated to the rest of the PR.

ricardoV94 (Member, Author) replied:

To better benchmark the Op I set trust_input=True, which required being careful with the dtypes. The ScalarLoop is used in the gradient of gammaincc.
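For context, a minimal sketch of that benchmarking pattern (the input values are made up, and the trust_input flag is set on the compiled function here since that works across PyTensor versions): with input validation skipped, the caller must pass arrays that already have the declared dtypes.

```python
import numpy as np
from pytensor import function, grad
from pytensor.tensor import vector
from pytensor.tensor.math import gammaincc

k = vector("k")
x = vector("x")
out = gammaincc(k, x)

grad_fn = function([k, x], grad(out.sum(), wrt=[k]), mode="FAST_RUN")
grad_fn.trust_input = True  # skip input validation for tighter benchmarking

# With validation off, dtypes must match the symbolic inputs exactly
# (float64 with the default floatX).
k_val = np.linspace(1.0, 10.0, 100)
x_val = np.linspace(0.1, 5.0, 100)
grad_fn(k_val, x_val)
```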

@ricardoV94 ricardoV94 merged commit 3c66aa6 into pymc-devs:main Apr 8, 2025
73 checks passed
Labels: bug, enhancement, memory_optimization, numba, performance
Projects: None yet
Development

Successfully merging this pull request may close these issues.

Implement specialized inplace rewriter for Elemwise Composite Ops
2 participants