Commit 19c1de1

doc changes
deprecate passing non-existing column in .to_excel(..., columns=)
1 parent 9020827 commit 19c1de1

5 files changed

+30
-18
lines changed

doc/source/advanced.rst

+1-1
@@ -1009,7 +1009,7 @@ The different indexing operation can potentially change the dtype of a ``Series`
 series1 = pd.Series([1, 2, 3])
 series1.dtype
-res = series1[[0,4]]
+res = series1.reindex([0, 4])
 res.dtype
 res
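The dtype change that this documentation hunk refers to can be reproduced with a short sketch. It assumes pandas is installed and reuses the `series1` and `res` names from the diff; nothing else is taken from the commit.

```python
import pandas as pd

# An integer Series with the default index 0, 1, 2.
series1 = pd.Series([1, 2, 3])

# Label 4 is not in the index, so reindex inserts NaN for it.
# NaN cannot be stored in an integer column, so the result is upcast to float.
res = series1.reindex([0, 4])

print(series1.dtype)
print(res.dtype)
print(res)
```

Unlike the list-based `series1[[0, 4]]` that the hunk removes, `reindex` is the documented way to ask for labels that may be absent.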

doc/source/indexing.rst

+9-8
@@ -335,7 +335,7 @@ Selection By Label
 .. warning::

-   Starting in 0.21.0, pandas will show a ``FutureWarning`` if indexing with a list-of-lables and not ALL labels are present. In the future
+   Starting in 0.21.0, pandas will show a ``FutureWarning`` if indexing with a list with missing labels. In the future
    this will raise a ``KeyError``. See :ref:`list-like Using loc with missing keys in a list is Deprecated <indexing.deprecate_loc_reindex_listlike>`

 pandas provides a suite of methods in order to have **purely label based indexing**. This is a strict inclusion based protocol.
@@ -644,12 +644,12 @@ For getting *multiple* indexers, using ``.get_indexer``
 .. _indexing.deprecate_loc_reindex_listlike:

-Indexing with missing list-of-labels is Deprecated
---------------------------------------------------
+Indexing with list with missing labels is Deprecated
+----------------------------------------------------

 .. warning::

-   Starting in 0.21.0, using ``.loc`` or ``[]`` with a list-like containing one or more missing labels, is deprecated, in favor of ``.reindex``.
+   Starting in 0.21.0, using ``.loc`` or ``[]`` with a list with one or more missing labels, is deprecated, in favor of ``.reindex``.

 In prior versions, using ``.loc[list-of-labels]`` would work as long as *at least 1* of the keys was found (otherwise it
 would raise a ``KeyError``). This behavior is deprecated and will show a warning message pointing to this section. The
@@ -672,7 +672,6 @@ Previous Behavior
 .. code-block:: ipython

-
    In [4]: s.loc[[1, 2, 3]]
    Out[4]:
    1 2.0
@@ -683,6 +682,8 @@ Previous Behavior
 Current Behavior

+.. code-block:: ipython
+
    In [4]: s.loc[[1, 2, 3]]
    Passing list-likes to .loc with any non-matching elements will raise
    KeyError in the future, you can use .reindex() as an alternative.
@@ -720,7 +721,7 @@ Having a duplicated index will raise for a ``.reindex()``:
    s = pd.Series(np.arange(4), index=['a', 'a', 'b', 'c'])
    labels = ['c', 'd']

-.. code-block:: python
+.. code-block:: ipython

    In [17]: s.reindex(labels)
    ValueError: cannot reindex from a duplicate axis
@@ -734,7 +735,7 @@ axis, and then reindex.
 However, this would *still* raise if your resulting index is duplicated.

-.. code-block:: python
+.. code-block:: ipython

    In [41]: labels = ['a', 'd']
@@ -959,7 +960,7 @@ when you don't know which of the sought labels are in fact present:
    s[s.index.isin([2, 4, 6])]

    # compare it to the following
-   s[[2, 4, 6]]
+   s.reindex([2, 4, 6])

 In addition to that, ``MultiIndex`` allows selecting a separate level to use
 in the membership check:

doc/source/whatsnew/v0.21.0.txt

+6-4
@@ -270,10 +270,10 @@ We have updated our minimum supported versions of dependencies (:issue:`15206`,
 .. _whatsnew_0210.api_breaking.loc:

-Indexing with missing list-of-labels is Deprecated
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Indexing with a list with missing labels is Deprecated
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Previously, selecting at least 1 valid label with a list-like indexer would always succeed, returning ``NaN`` for missing labels.
+Previously, selecting with a list of labels, where one or more labels were missing would always succeed, returning ``NaN`` for missing labels.
 This will now show a ``FutureWarning``, in the future this will raise a ``KeyError`` (:issue:`15747`).
 This warning will trigger on a ``DataFrame`` or a ``Series`` for using ``.loc[]`` or ``[[]]`` when passing a list-of-labels with at least 1 missing label.
 See the :ref:`deprecation docs <indexing.deprecate_loc_reindex_listlike>`.
@@ -288,7 +288,6 @@ Previous Behavior
 .. code-block:: ipython

-
    In [4]: s.loc[[1, 2, 3]]
    Out[4]:
    1 2.0
@@ -299,6 +298,8 @@ Previous Behavior
 Current Behavior

+.. code-block:: ipython
+
    In [4]: s.loc[[1, 2, 3]]
    Passing list-likes to .loc or [] with any missing label will raise
    KeyError in the future, you can use .reindex() as an alternative.
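A minimal migration sketch for the deprecated pattern, assuming pandas is installed and that `s` is a three-element integer Series with the default index (the hunk does not show how `s` is built, so that part is an assumption). The `Index.intersection` variant is an additional idiom for keeping only the labels that exist; it is not part of this commit.

```python
import pandas as pd

# Assumed setup: the default index is 0, 1, 2, so label 3 is missing.
s = pd.Series([1, 2, 3])
labels = [1, 2, 3]

# Replacement suggested by the warning: missing labels become NaN.
reindexed = s.reindex(labels)

# Alternative when missing labels should be dropped rather than filled.
present_only = s.loc[s.index.intersection(labels)]

print(reindexed)
print(present_only)
```

`reindex` keeps the length of the requested label list, while the intersection form returns only the rows that are actually present and so keeps the original integer dtype.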
@@ -628,6 +629,7 @@ Deprecations
 - :func:`SeriesGroupBy.nth` has deprecated ``True`` in favor of ``'all'`` for its kwarg ``dropna`` (:issue:`11038`).
 - :func:`DataFrame.as_blocks` is deprecated, as this is exposing the internal implementation (:issue:`17302`)
 - ``pd.TimeGrouper`` is deprecated in favor of :class:`pandas.Grouper` (:issue:`16747`)
+- Passing a non-existent column in ``.to_excel(..., columns=)`` is deprecated and will raise a ``KeyError`` in the future (:issue:`17295`)

 .. _whatsnew_0210.deprecations.argmin_min

pandas/io/formats/excel.py

+11-4
@@ -359,11 +359,18 @@ def __init__(self, df, na_rep='', float_format=None, cols=None,
     # all missing, raise
     if not len(Index(cols) & df.columns):
-        raise KeyError
+        raise KeyError(
+            "passed columns are not ALL present in the dataframe")
+
+    # deprecated in gh-17295
+    # 1 missing is ok (for now)
+    if len(Index(cols) & df.columns) != len(cols):
+        warnings.warn(
+            "columns must be a subset of the "
+            "dataframe columns; this will raise "
+            "a KeyError in the future",
+            FutureWarning)

-    # 1 missing is ok
-    # TODO(jreback) this should raise
-    # on *any* missing columns
     self.df = df.reindex(columns=cols)
 self.columns = self.df.columns
 self.float_format = float_format
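The two-stage check in this hunk (raise when no requested column exists, warn when only some are missing) can be sketched without pandas. `check_requested_columns` is a hypothetical helper written for illustration; it is not the function in `pandas/io/formats/excel.py`, and it uses plain lists in place of `Index` objects.

```python
import warnings


def check_requested_columns(requested, available):
    """Hypothetical stand-in for the column check shown in the diff."""
    present = [col for col in requested if col in available]

    # All requested columns are missing: fail immediately.
    if not present:
        raise KeyError("none of the requested columns are present")

    # Some columns are missing: allowed for now, but deprecated.
    if len(present) != len(requested):
        warnings.warn(
            "columns must be a subset of the dataframe columns; "
            "this will raise a KeyError in the future",
            FutureWarning,
            stacklevel=2,
        )

    # Like the reindex in the diff, keep every requested name.
    return list(requested)


# Record warnings so the deprecation can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = check_requested_columns(["B", "C"], ["A", "B"])

print(kept)
print([str(w.message) for w in caught])
```

Here `["B", "C"]` against columns `["A", "B"]` follows the warning path, which is the same case the updated test in `test_invalid_columns` exercises.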

pandas/tests/io/test_excel.py

+3-1
@@ -1808,7 +1808,9 @@ def test_invalid_columns(self):
 write_frame = DataFrame({'A': [1, 1, 1],
                          'B': [2, 2, 2]})

-write_frame.to_excel(path, 'test1', columns=['B', 'C'])
+with tm.assert_produces_warning(FutureWarning,
+                                check_stacklevel=False):
+    write_frame.to_excel(path, 'test1', columns=['B', 'C'])
 expected = write_frame.reindex(columns=['B', 'C'])
 read_frame = read_excel(path, 'test1')
 tm.assert_frame_equal(expected, read_frame)
