pstats.Stats.get_stats_profile can't handle functions with the same name #126850

Open
skmendez opened this issue Nov 15, 2024 · 2 comments
Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

skmendez commented Nov 15, 2024

Bug report

Bug description:

MCVE

import cProfile
import time
import pstats


class A:
    def foo(self):
        time.sleep(1)


class B:
    def foo(self):
        time.sleep(2)


pr = cProfile.Profile()
pr.enable()
A().foo()
B().foo()
pr.create_stats()

ps = pstats.Stats(pr).sort_stats("cumtime")
ps.print_stats(5)

When I ran this in a Jupyter notebook, it printed:

         61 function calls in 3.003 seconds

   Ordered by: cumulative time
   List reduced from 21 to 5 due to restriction <5>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        3    0.000    0.000    3.003    1.001 /usr/local/my_venv/lib/python3.10/site-packages/IPython/core/interactiveshell.py:3424(run_code)
        3    0.000    0.000    3.002    1.001 {built-in method builtins.exec}
        2    3.002    1.501    3.002    1.501 {built-in method time.sleep}
        1    0.000    0.000    2.002    2.002 /usr/tmp/ipykernel_104016/1099215389.py:12(foo)
        1    0.000    0.000    1.000    1.000 /usr/tmp/ipykernel_104016/1099215389.py:7(foo)

This correctly shows two different functions named foo. However, because get_stats_profile keys the StatsProfile.func_profiles dict on the bare function name, only one of them is kept: whichever comes last in iteration order. Since I sorted by cumtime, that is A.foo:

print(ps.get_stats_profile().func_profiles["foo"])

outputs:

FunctionProfile(ncalls='1', tottime=0.0, percall_tottime=0.0, cumtime=1.0, percall_cumtime=1.0, file_name='/usr/tmp/ipykernel_104016/1099215389.py', line_number=7)

Maybe get_stats_profile should key the func_profiles dictionary on something that's actually unique per profile entry, like filename:lineno(function)?
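Until that changes, one possible workaround is to read the underlying Stats.stats mapping directly, since it is keyed on the (file_name, line_number, func_name) tuple. This is only a sketch: the filename:lineno(function) key string and the unique_foo name are my own choices, and the methods return quickly instead of sleeping so it runs fast.

```python
import cProfile
import pstats


class A:
    def foo(self):
        return sum(range(1000))


class B:
    def foo(self):
        return sum(range(2000))


pr = cProfile.Profile()
pr.enable()
A().foo()
B().foo()
pr.disable()

ps = pstats.Stats(pr)

# Stats.stats maps (file_name, line_number, func_name) to
# (primitive calls, total calls, tottime, cumtime, callers),
# so both foo methods get their own entry here.
unique_foo = {
    f"{file_name}:{line_number}({func_name})": (nc, tt, ct)
    for (file_name, line_number, func_name), (cc, nc, tt, ct, callers) in ps.stats.items()
    if func_name == "foo"
}

# get_stats_profile() keys on func_name alone, so only one "foo" is left.
collapsed = ps.get_stats_profile().func_profiles

for key, (ncalls, tottime, cumtime) in unique_foo.items():
    print(key, ncalls, tottime, cumtime)
print(collapsed["foo"])
```

The tuple key is the same information print_stats shows in its last column, so nothing is lost compared with the printed report.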

CPython versions tested on:

3.10

Operating systems tested on:

Linux

skmendez added the type-bug (An unexpected behavior, bug, or error) label Nov 15, 2024

skmendez (Author) commented

cc: @Olshansk who added this in #15495

picnixz added the stdlib (Python modules in the Lib dir) label Nov 15, 2024

Olshansk (Contributor) commented

tl;dr Great idea. See the approach below for creating a composite key. Would you be up for making a CPython contribution?

References:

Here are some links as a reference:

Potential (Untested) Approach

I used ChatGPT to sketch a quick and easy way to fix this.

Step 1: Define a new structure UniqueFunctionName

class UniqueFunctionName:
    """
    A class to uniquely represent a function name with its file name and line number.
    """
    def __init__(self, func_name, file_name, line_number):
        self.func_name = func_name
        self.file_name = file_name
        self.line_number = line_number

    def to_str(self):
        """
        Converts the unique function name to a string representation.
        Example: "function_name (filename.py:42)"
        """
        return f"{self.func_name} ({self.file_name}:{self.line_number})"

    @classmethod
    def from_str(cls, unique_str):
        """
        Parses a string representation to recreate a UniqueFunctionName instance.
        Example input: "function_name (filename.py:42)"
        """
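        try:
            # Split on the first " (" and the last ":" so that paths containing
            # parentheses or colons (e.g. Windows drive letters) still parse.
            func_name, location = unique_str.split(" (", 1)
            location = location.rstrip(")")
            file_name, line_number = location.rsplit(":", 1)
            return cls(func_name, file_name, int(line_number))
        except ValueError:
            raise ValueError(f"Invalid format for unique function name: {unique_str}") from None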

    def __repr__(self):
        return f"UniqueFunctionName(func_name={self.func_name!r}, file_name={self.file_name!r}, line_number={self.line_number})"

Step 2: Use the new structure in the func_profiles map

- func_profiles[func_name] = func_profile
+ unique_func = UniqueFunctionName(func_name, file_name, line_number)
+ func_profiles[unique_func.to_str()] = func_profile
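To see what a composite key would give callers without patching pstats, here is a rough standalone sketch. unique_func_profiles is a hypothetical helper, not part of the stdlib, and its rounding only approximates what get_stats_profile does internally. It reuses the "func_name (file_name:line_number)" format from Step 1 and the existing pstats.FunctionProfile dataclass.

```python
import cProfile
import pstats


def unique_func_profiles(stats):
    """Hypothetical helper: build FunctionProfile entries keyed on
    "func_name (file_name:line_number)" so same-named functions don't collide."""
    profiles = {}
    for (file_name, line_number, func_name), (cc, nc, tt, ct, _callers) in stats.stats.items():
        # cc is the primitive call count, nc the total call count.
        ncalls = str(nc) if nc == cc else f"{nc}/{cc}"
        profiles[f"{func_name} ({file_name}:{line_number})"] = pstats.FunctionProfile(
            ncalls=ncalls,
            tottime=round(tt, 3),
            percall_tottime=round(tt / nc, 3) if nc else -1,
            cumtime=round(ct, 3),
            percall_cumtime=round(ct / cc, 3) if cc else -1,
            file_name=file_name,
            line_number=line_number,
        )
    return profiles


class A:
    def foo(self):
        return 1


class B:
    def foo(self):
        return 2


pr = cProfile.Profile()
pr.enable()
A().foo()
B().foo()
pr.disable()

profiles = unique_func_profiles(pstats.Stats(pr))
foo_keys = sorted(key for key in profiles if key.startswith("foo ("))
print(foo_keys)  # one key per foo definition
```

A real patch would also have to decide what happens to existing code that looks up func_profiles["foo"], which is probably part of the hard last 10%.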

Step 3: Submit the contribution

@skmendez Are you up for taking this on? I think going through the CPython contribution process teaches a lot about OSS contributions. It'll also uncover the last (hardest) 10% of pushing this over the finish line.
