fix for 61123 read_excel-nrows-param-reads-extra-rows #61127

zanuka · 2025-03-15T05:07:16Z

Issue: GH-61123
When reading Excel files with pd.read_excel and specifying nrows=4, the behavior differs depending on whether there’s a blank row between tables. For a file with two tables (each with a header and 3 data rows), nrows=4 should yield a DataFrame with one header and 3 data rows (shape (3, n)). However:

In test1.xlsx (with a blank row), it correctly reads the first table (header + 3 rows).
In test2.xlsx (no blank row), it incorrectly includes the second table’s header as a data row, resulting in a shape of (4, n).

This inconsistency occurs because read_excel doesn’t properly respect table boundaries when tables are adjacent, despite the nrows limit.
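The two layouts and the reported shapes can be pictured with a small pure-Python model. This is only a sketch: the column names, the cell values, and the `model_read` helper are hypothetical. The rule it applies (take the header plus `nrows` further sheet rows, then drop fully blank rows) is an assumption that happens to match the shapes reported above, not a description of pandas internals.

```python
from typing import List, Optional, Tuple

Row = List[Optional[object]]

# Hypothetical sheet layouts: two tables, each a header plus 3 data rows.
FIRST: List[Row] = [["a", "b"], [1, 2], [3, 4], [5, 6]]
SECOND: List[Row] = [["c", "d"], [7, 8], [9, 10], [11, 12]]
WITH_BLANK_ROW: List[Row] = FIRST + [[None, None]] + SECOND  # like test1.xlsx
ADJACENT: List[Row] = FIRST + SECOND                         # like test2.xlsx


def is_blank(row: Row) -> bool:
    """A row counts as blank when every cell is empty."""
    return all(cell is None or cell == "" for cell in row)


def model_read(rows: List[Row], nrows: int) -> Tuple[Row, List[Row]]:
    """Assumed model: slice header + nrows sheet rows, then drop blank rows."""
    header, *body = rows[: nrows + 1]
    data = [row for row in body if not is_blank(row)]
    return header, data
```

Under this model, `model_read(WITH_BLANK_ROW, 4)` returns 3 data rows, while `model_read(ADJACENT, 4)` returns 4, the last one being the second table's header `["c", "d"]`.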

Fix:

Modified pandas/io/excel/_base.py and related reader modules (_openpyxl.py, _pyxlsb.py, _xlrd.py) to ensure nrows limits reading to the specified number of rows, excluding subsequent table headers even when tables are adjacent.
Added a new test test_excel_read_tables_with_and_without_blank_row in pandas/tests/io/excel/test_readers.py to verify that nrows=4 consistently returns a DataFrame with shape (3, 2) (header + 3 data rows) for both cases.

Changes:

Updated Excel reader logic to stop at nrows without parsing beyond table boundaries.
Ensured consistent behavior across openpyxl, pyxlsb, and xlrd engines.
Squashed commits into a single commit for clarity.

Verification:

Tested with test1.xlsx (blank row) and test2.xlsx (no blank row).
Confirmed both now yield a DataFrame with shape (3, 2) and only the first table’s data.

Steps to Test:

Run pytest pandas/tests/io/excel/test_readers.py::TestReaders::test_excel_read_tables_with_and_without_blank_row.
Verify that df1.shape == (3, 2) and df2.shape == (3, 2).
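For a manual check outside the test suite, the two workbooks could be rebuilt and read back roughly as follows. This is a minimal sketch, not the test added in this PR: the cell values, column names, and helper names (`two_table_rows`, `write_workbook`, `read_shape`, `demo`) are made up for illustration, and it assumes pandas and openpyxl are installed. The third-party imports sit inside the functions so the layout helper can be used on its own.

```python
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple

Row = List[Optional[object]]


def two_table_rows(blank_row_between: bool) -> List[Row]:
    """Two tables (header + 3 data rows each), optionally separated by a blank row."""
    first: List[Row] = [["a", "b"], [1, 2], [3, 4], [5, 6]]
    second: List[Row] = [["c", "d"], [7, 8], [9, 10], [11, 12]]
    gap: List[Row] = [[None, None]] if blank_row_between else []
    return first + gap + second


def write_workbook(path: Path, rows: List[Row]) -> None:
    """Write the rows to the active sheet of a new .xlsx workbook."""
    from openpyxl import Workbook  # lazy import: only needed when writing files

    wb = Workbook()
    ws = wb.active
    for row in rows:
        ws.append(row)
    wb.save(str(path))


def read_shape(path: Path, nrows: int = 4) -> Tuple[int, int]:
    """Read the sheet back with read_excel and return the DataFrame shape."""
    import pandas as pd  # lazy import: only needed when reading files

    return pd.read_excel(path, nrows=nrows).shape


def demo() -> None:
    """Build both layouts in a temporary directory and print the shapes."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, blank in (("test1.xlsx", True), ("test2.xlsx", False)):
            path = Path(tmp) / name
            write_workbook(path, two_table_rows(blank))
            print(name, read_shape(path))
```

Calling `demo()` prints one shape per file; according to the description above, both should be (3, 2) once the fix is applied.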

Related Files:

pandas/io/excel/_base.py
pandas/io/excel/_openpyxl.py
pandas/io/excel/_pyxlsb.py
pandas/io/excel/_xlrd.py
pandas/tests/io/excel/test_readers.py

Closes #61123

⚡️ Commit from Jolt AI ⚡️ Fix Excel Test Indentation (https://app.usejolt.ai/code-chat/0d4546cc-38b6-4754-ae0a-55afa71f01ab) Description: Fix Excel Test Indentation
fixes tests

zanuka requested a review from rhshadrach as a code owner March 15, 2025 05:07

zanuka closed this Mar 15, 2025
