DOC: Correct pandas sources should be discoverable by sphinx autodoc #38397

Open
@plcplc

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example

"""
A module demonstrating the inability to automatically discover link targets of references to pandas objects.

.. autofunction:: foo
"""
import pandas as pd

def foo(df: pd.DataFrame) -> None:
    """
    :param df: A data frame
    """

Problem description

Currently, Sphinx autodoc imports the module and, by inspecting pd.DataFrame.__module__ at runtime, determines that the fully qualified name of the type of the parameter df is pandas.core.frame.DataFrame. However, the pandas objects.inv file indexes the DataFrame documentation as pandas.DataFrame, so Sphinx fails to connect these two names for the same type.
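
The mismatch can be seen without running Sphinx at all. The sketch below only approximates how a dotted name can be derived from runtime metadata; it is not autodoc's actual code. `runtime_qualified_name` is a hypothetical helper, and `json.JSONDecoder` (defined in `json.decoder`, re-exported from `json`) stands in for the same re-export pattern when pandas is not installed.

```python
import json


def runtime_qualified_name(obj: type) -> str:
    """Build a dotted name from runtime metadata (hypothetical helper)."""
    return f"{obj.__module__}.{obj.__qualname__}"


# The stdlib shows the same pattern: the public name is json.JSONDecoder,
# but the class reports the submodule it was defined in.
print(runtime_qualified_name(json.JSONDecoder))  # json.decoder.JSONDecoder

try:
    import pandas as pd
except ImportError:
    pd = None  # pandas is optional for this sketch

if pd is not None:
    # On the version described in this issue this is reported as
    # pandas.core.frame.DataFrame; other versions may differ.
    print(runtime_qualified_name(pd.DataFrame))
```

The name printed here is the one that has to match an entry in objects.inv for the cross-reference to resolve.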

Instead, I have to resort to manually specifying the type as :type df: pandas.DataFrame, which reduces the value gained from using autodoc in the first place. It also cost me hours trying to figure out how my setup was misconfigured.

I also notice that if I do:

"""
A module demonstrating the ability to automatically discover link targets of references to pandas objects.

.. autofunction:: foo
"""
import pandas as pd

pd.DataFrame.__module__ = "pandas"

def foo(df: pd.DataFrame) -> None:
    """
    :param df: A data frame
    """

Then the automatic references are recognised properly, and I get a clickable hyperlink to the official pandas docs for DataFrame.
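
If the override is kept as a workaround, it may be safer to apply it from one place (for example the project's conf.py, assuming autodoc imports the documented modules in the same process) and to check that the class really is re-exported from the public module first. A minimal sketch, where `alias_public_module` is a hypothetical helper and not part of pandas or Sphinx:

```python
import importlib


def alias_public_module(cls: type, public_module: str) -> str:
    """Point cls.__module__ at public_module if cls is re-exported there.

    Hypothetical helper. Returns the resulting dotted name and raises
    ValueError when public_module does not expose the very same class.
    """
    module = importlib.import_module(public_module)
    if getattr(module, cls.__name__, None) is not cls:
        raise ValueError(
            f"{public_module} does not re-export {cls.__qualname__}"
        )
    cls.__module__ = public_module
    return f"{cls.__module__}.{cls.__qualname__}"


try:
    import pandas as pd
except ImportError:
    pd = None  # pandas not installed; nothing to patch

if pd is not None:
    # Same effect as the assignment above, with a sanity check first.
    print(alias_public_module(pd.DataFrame, "pandas"))
```

This mutates a third-party class for the whole process, so it is a stopgap for documentation builds and not a fix for the underlying mismatch.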

Expected Output

I expected the indexed names in objects.inv to align with the runtime inspected type names.

I don't know whether the code in Sphinx or the code in pandas ought to change, but my feeling is that it would be unproductive to expect Sphinx to divine meaning from the full dynamic breadth of possibilities that Python offers. It seems to me that there should be some structuring convention that, if adhered to, would fix this.

Output of pd.show_versions()

(I don't think this is relevant for this bug)