PERF: Add explicit query compiler method for len/shape checks #7397

Closed
noloerino opened this issue Sep 12, 2024 · 0 comments · Fixed by #7398
Labels
Interfaces and abstractions (Issues with Modin's QueryCompiler, Algebra, or BaseIO objects) · new feature/request 💬 (Requests and pull requests for new features) · Performance 🚀 (Performance related issues and pull requests)

Comments

@noloerino
Collaborator

Is your feature request related to a problem? Please describe.
Currently, calling len(pd.DataFrame(...)) will materialize the frame's index and compute its length.

Some storage formats (including pandas, via the PandasDataFrame object) have more efficient ways of computing a frame's dimensions, or cache them internally. Adding an explicit query compiler method (get_axis_len(axis: [0, 1]) -> int) would let us take advantage of this. Accordingly, calls to len(self.index) in frontend code should be replaced with len(self), and calls to len(self.columns) with self._query_compiler.get_axis_len(1), to avoid unnecessary materialization.
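
A minimal sketch of the idea, not Modin's actual implementation: `ToyQueryCompiler`, `ToyFrame`, and the cached `row_lengths` / `column_widths` lists are hypothetical stand-ins, and the sketch assumes the storage format already knows its per-partition sizes. Only the method name `get_axis_len` and the `_query_compiler` attribute come from the proposal above.

```python
from typing import List, Tuple


class ToyQueryCompiler:
    """Hypothetical stand-in for a query compiler; not Modin's real class."""

    def __init__(self, row_lengths: List[int], column_widths: List[int]):
        # Assumption: the storage format caches per-partition sizes.
        self._row_lengths = row_lengths
        self._column_widths = column_widths
        self.index_materializations = 0  # counts trips down the expensive path

    @property
    def index(self) -> range:
        # Building the full index is the cost the proposal wants to avoid.
        self.index_materializations += 1
        return range(sum(self._row_lengths))

    def get_axis_len(self, axis: int) -> int:
        """Return the length of an axis (0 = rows, 1 = columns) from cached sizes."""
        if axis not in (0, 1):
            raise ValueError(f"axis must be 0 or 1, got {axis!r}")
        sizes = self._row_lengths if axis == 0 else self._column_widths
        return sum(sizes)


class ToyFrame:
    """Hypothetical frontend object that delegates length checks."""

    def __init__(self, query_compiler: ToyQueryCompiler):
        self._query_compiler = query_compiler

    def __len__(self) -> int:
        # Replaces len(self.index): no index is materialized.
        return self._query_compiler.get_axis_len(0)

    @property
    def shape(self) -> Tuple[int, int]:
        # Replaces (len(self.index), len(self.columns)).
        return len(self), self._query_compiler.get_axis_len(1)


qc = ToyQueryCompiler(row_lengths=[4, 4, 2], column_widths=[3, 1])
frame = ToyFrame(qc)
print(len(frame), frame.shape, qc.index_materializations)  # 10 (10, 4) 0
```

The counter stays at zero because neither `len(frame)` nor `frame.shape` touches `index`; a backend without cached sizes could still fall back to `len(self.index)` inside `get_axis_len`.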

@noloerino added the Performance 🚀, new feature/request 💬, and Interfaces and abstractions labels on Sep 12, 2024
anmyachev added a commit that referenced this issue Sep 21, 2024
Signed-off-by: Jonathan Shi <jhshi07@gmail.com>
Co-authored-by: Anatoly Myachev <anatoliimyachev@mail.com>