GH61405 Expose arguments in DataFrame.query #61413

Open
wants to merge 9 commits into
base: main
from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -401,6 +401,7 @@ Other API changes
- Index set operations (like union or intersection) will now ignore the dtype of
an empty ``RangeIndex`` or empty ``Index`` with object dtype when determining
the dtype of the resulting Index (:issue:`60797`)
- :meth:`DataFrame.query` does not accept ``**kwargs`` anymore and requires passing keywords for desired arguments (:issue:`61405`)

.. ---------------------------------------------------------------------------
.. _whatsnew_300.deprecations:
95 changes: 86 additions & 9 deletions pandas/core/frame.py
@@ -4477,18 +4477,58 @@ def _get_item(self, item: Hashable) -> Series:

@overload
def query(
self, expr: str, *, inplace: Literal[False] = ..., **kwargs
self,
expr: str,
*,
parser: Literal["pandas", "python"] = ...,
engine: Literal["python", "numexpr"] | None = ...,
local_dict: dict[str, Any] | None = ...,
global_dict: dict[str, Any] | None = ...,
resolvers: list[Mapping] | None = ...,
level: int = ...,
inplace: Literal[False] = ...,
) -> DataFrame: ...

@overload
def query(self, expr: str, *, inplace: Literal[True], **kwargs) -> None: ...
def query(
self,
expr: str,
*,
parser: Literal["pandas", "python"] = ...,
engine: Literal["python", "numexpr"] | None = ...,
local_dict: dict[str, Any] | None = ...,
global_dict: dict[str, Any] | None = ...,
resolvers: list[Mapping] | None = ...,
level: int = ...,
inplace: Literal[True],
) -> None: ...

@overload
def query(
self, expr: str, *, inplace: bool = ..., **kwargs
self,
expr: str,
*,
parser: Literal["pandas", "python"] = ...,
engine: Literal["python", "numexpr"] | None = ...,
local_dict: dict[str, Any] | None = ...,
global_dict: dict[str, Any] | None = ...,
resolvers: list[Mapping] | None = ...,
level: int = ...,
inplace: bool = ...,
) -> DataFrame | None: ...

def query(self, expr: str, *, inplace: bool = False, **kwargs) -> DataFrame | None:
def query(
self,
expr: str,
*,
parser: Literal["pandas", "python"] = "pandas",
engine: Literal["python", "numexpr"] | None = None,
local_dict: dict[str, Any] | None = None,
global_dict: dict[str, Any] | None = None,
resolvers: list[Mapping] | None = None,
level: int = 0,
inplace: bool = False,
) -> DataFrame | None:
"""
Query the columns of a DataFrame with a boolean expression.

@@ -4507,11 +4547,41 @@ def query(self, expr: str, *, inplace: bool = False, **kwargs) -> DataFrame | No

See the documentation for :meth:`DataFrame.eval` for details on
referring to column names and variables in the query string.
parser : {'pandas', 'python'}, default 'pandas'
The parser to use to construct the syntax tree from the expression. The
default of ``'pandas'`` parses code slightly differently from standard
Python. Alternatively, you can parse an expression using the
``'python'`` parser to retain strict Python semantics. See the
:ref:`enhancing performance <enhancingperf.eval>` documentation for
more details.
engine : {'python', 'numexpr'}, default 'numexpr'

The engine used to evaluate the expression. Supported engines are

- None : tries to use ``numexpr``, falls back to ``python``
- ``'numexpr'`` : This default engine evaluates pandas objects using
numexpr for large speed ups in complex expressions with large frames.
- ``'python'`` : Performs operations as if you had ``eval``'d in top
level python. This engine is generally not that useful.

More backends may be available in the future.
local_dict : dict or None, optional
A dictionary of local variables, taken from locals() by default.
global_dict : dict or None, optional
A dictionary of global variables, taken from globals() by default.
resolvers : list of dict-like or None, optional
A list of objects implementing the ``__getitem__`` special method that
you can use to inject an additional collection of namespaces to use for
variable lookup. For example, this is used in the
:meth:`~DataFrame.query` method to inject the
``DataFrame.index`` and ``DataFrame.columns``
variables that refer to their respective :class:`~pandas.DataFrame`
instance attributes.
level : int, optional
The number of prior stack frames to traverse and add to the current
scope. Most users will **not** need to change this parameter.
inplace : bool
Whether to modify the DataFrame rather than creating a new one.
**kwargs
See the documentation for :func:`eval` for complete details
on the keyword arguments accepted by :meth:`DataFrame.query`.

Returns
-------
@@ -4624,8 +4694,15 @@ def query(self, expr: str, *, inplace: bool = False, **kwargs) -> DataFrame | No
if not isinstance(expr, str):
msg = f"expr must be a string to be evaluated, {type(expr)} given"
raise ValueError(msg)
kwargs["level"] = kwargs.pop("level", 0) + 1
kwargs["target"] = None
kwargs: Any = {
"level": level + 1,
"target": None,
"parser": parser,
"engine": engine,
"local_dict": local_dict,
"global_dict": global_dict,
"resolvers": resolvers or (),
}

res = self.eval(expr, **kwargs)
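
The hunk above replaces the open-ended ``**kwargs`` with explicit keyword-only parameters, then rebuilds a dict to forward to ``eval``. A minimal standalone sketch of that pattern follows. It does not use pandas: ``eval_stub`` and ``query_sketch`` are hypothetical names invented for illustration, and the stub only echoes what it receives.

```python
from typing import Any, Literal, Optional


def eval_stub(expr: str, **kwargs: Any) -> dict:
    """Hypothetical stand-in for ``DataFrame.eval``; it only echoes its inputs."""
    return {"expr": expr, **kwargs}


def query_sketch(
    expr: str,
    *,
    parser: Literal["pandas", "python"] = "pandas",
    engine: Optional[Literal["python", "numexpr"]] = None,
    local_dict: Optional[dict] = None,
    global_dict: Optional[dict] = None,
    resolvers: Optional[list] = None,
    level: int = 0,
) -> dict:
    """Forward explicit keyword-only arguments the way the diff does."""
    if not isinstance(expr, str):
        raise ValueError(f"expr must be a string to be evaluated, {type(expr)} given")
    kwargs = {
        "level": level + 1,            # account for this wrapper's own frame
        "target": None,
        "parser": parser,
        "engine": engine,
        "local_dict": local_dict,
        "global_dict": global_dict,
        "resolvers": resolvers or (),  # None becomes an empty tuple
    }
    return eval_stub(expr, **kwargs)


forwarded = query_sketch("a > 1", level=2)
print(forwarded["level"], forwarded["resolvers"])

try:
    query_sketch("a > 1", engin="python")  # misspelled keyword
except TypeError as exc:
    print("rejected at call time:", exc)
```

With the explicit signature, a misspelled or unsupported keyword is rejected by Python at the call site instead of being passed along through ``**kwargs``.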

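The ``level + 1`` in the forwarded dict exists because ``query`` adds one stack frame between the caller and ``eval``. The sketch below illustrates the idea with the standard library only. ``lookup``, ``wrapper`` and ``user_code`` are hypothetical helpers, and pandas' real scope resolution is more involved than this.

```python
import sys
from typing import Any


def lookup(name: str, level: int = 0) -> Any:
    """Resolve ``name`` from the local variables of a calling frame.

    ``level=0`` means the direct caller of ``lookup``; each extra level
    walks one frame further up the stack.
    """
    # The +1 skips lookup's own frame.
    frame = sys._getframe(level + 1)
    return frame.f_locals[name]


def wrapper(name: str, level: int = 0) -> Any:
    """Hypothetical wrapper that sits between the user and ``lookup``."""
    # Bump the level so lookup searches the user's frame, not this one.
    return lookup(name, level=level + 1)


def user_code() -> int:
    threshold = 42
    return wrapper("threshold")


print(user_code())
```

If ``wrapper`` forwarded ``level`` unchanged, ``lookup`` would search the wrapper's own locals and fail to find ``threshold``.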