Commit 03aa067

Merge pull request #4715 from jreback/hdf_format
API: change nomeclature in HDFStore to use format=fixed(f) | table(t)
2 parents 3073835 + e6b4e60 commit 03aa067

6 files changed: +231 -167 lines changed

doc/source/io.rst

+20 -12
@@ -1794,27 +1794,31 @@ similar to how ``read_csv`` and ``to_csv`` work. (new in 0.11.0)
 
    os.remove('store_tl.h5')
 
-.. _io.hdf5-storer:
+.. _io.hdf5-fixed:
 
-Storer Format
-~~~~~~~~~~~~~
+Fixed Format
+~~~~~~~~~~~~
+
+.. note::
+
+   This was prior to 0.13.0 the ``Storer`` format.
 
 The examples above show storing using ``put``, which write the HDF5 to ``PyTables`` in a fixed array format, called
-the ``storer`` format. These types of stores are are **not** appendable once written (though you can simply
+the ``fixed`` format. These types of stores are are **not** appendable once written (though you can simply
 remove them and rewrite). Nor are they **queryable**; they must be
 retrieved in their entirety. These offer very fast writing and slightly faster reading than ``table`` stores.
-This format is specified by default when using ``put`` or by ``fmt='s'``
+This format is specified by default when using ``put`` or ``to_hdf`` or by ``format='fixed'`` or ``format='f'``
 
 .. warning::
 
-   A ``storer`` format will raise a ``TypeError`` if you try to retrieve using a ``where`` .
+   A ``fixed`` format will raise a ``TypeError`` if you try to retrieve using a ``where`` .
 
    .. code-block:: python
 
-       DataFrame(randn(10,2)).to_hdf('test_storer.h5','df')
+       DataFrame(randn(10,2)).to_hdf('test_fixed.h5','df')
 
-       pd.read_hdf('test_storer.h5','df',where='index>5')
-       TypeError: cannot pass a where specification when reading a non-table
+       pd.read_hdf('test_fixed.h5','df',where='index>5')
+       TypeError: cannot pass a where specification when reading a fixed format.
                   this store must be selected in its entirety
 
 
@@ -1827,7 +1831,11 @@ Table Format
 format. Conceptually a ``table`` is shaped very much like a DataFrame,
 with rows and columns. A ``table`` may be appended to in the same or
 other sessions. In addition, delete & query type operations are
-supported. This format is specified by ``fmt='t'`` to ``append`` or ``put``.
+supported. This format is specified by ``format='table'`` or ``format='t'``
+to ``append`` or ``put`` or ``to_hdf``
+
+This format can be set as an option as well ``pd.set_option('io.hdf.default_format','table')`` to
+enable ``put/append/to_hdf`` to by default store in the ``table`` format.
 
 .. ipython:: python
    :suppress:
@@ -1854,7 +1862,7 @@ supported. This format is specified by ``fmt='t'`` to ``append`` or ``put``.
 
 .. note::
 
-   You can also create a ``table`` by passing ``fmt='t'`` to a ``put`` operation.
+   You can also create a ``table`` by passing ``format='table'`` or ``format='t'`` to a ``put`` operation.
 
 .. _io.hdf5-keys:
 
@@ -2363,7 +2371,7 @@ Starting in 0.11, passing a ``min_itemsize`` dict will cause all passed columns
 External Compatibility
 ~~~~~~~~~~~~~~~~~~~~~~
 
-``HDFStore`` write storer objects in specific formats suitable for
+``HDFStore`` write ``table`` format objects in specific formats suitable for
 producing loss-less roundtrips to pandas objects. For external
 compatibility, ``HDFStore`` can read native ``PyTables`` format
 tables. It is possible to write an ``HDFStore`` object that can easily

doc/source/release.rst

+2 -1
@@ -108,10 +108,11 @@ pandas 0.13
     - removed the ``warn`` argument from ``open``. Instead a ``PossibleDataLossError`` exception will
       be raised if you try to use ``mode='w'`` with an OPEN file handle (:issue:`4367`)
     - allow a passed locations array or mask as a ``where`` condition (:issue:`4467`)
-    - the ``fmt`` keyword now replaces the ``table`` keyword; allowed values are ``s|t``
     - add the keyword ``dropna=True`` to ``append`` to change whether ALL nan rows are not written
       to the store (default is ``True``, ALL nan rows are NOT written), also settable
       via the option ``io.hdf.dropna_table`` (:issue:`4625`)
+    - the ``format`` keyword now replaces the ``table`` keyword; allowed values are ``fixed(f)|table(t)``
+      the ``Storer`` format has been renamed to ``Fixed``
   - ``JSON``
 
     - added ``date_unit`` parameter to specify resolution of timestamps. Options

doc/source/v0.13.0.txt

+5 -5
@@ -79,17 +79,17 @@ API changes
     - allow a passed locations array or mask as a ``where`` condition (:issue:`4467`).
       See :ref:`here<io.hdf5-where_mask>` for an example.
 
-    - the ``fmt`` keyword now replaces the ``table`` keyword; allowed values are ``s|t``
-      the same defaults as prior < 0.13.0 remain, e.g. ``put`` implies 's' (Storer) format
-      and ``append`` imples 't' (Table) format
+    - the ``format`` keyword now replaces the ``table`` keyword; allowed values are ``fixed(f)`` or ``table(t)``
+      the same defaults as prior < 0.13.0 remain, e.g. ``put`` implies 'fixed` or 'f' (Fixed) format
+      and ``append`` imples 'table' or 't' (Table) format
 
       .. ipython:: python
 
          path = 'test.h5'
          df = DataFrame(randn(10,2))
-         df.to_hdf(path,'df_table',fmt='t')
+         df.to_hdf(path,'df_table',format='table')
          df.to_hdf(path,'df_table2',append=True)
-         df.to_hdf(path,'df_storer')
+         df.to_hdf(path,'df_fixed')
          with get_store(path) as store:
              print store
 

pandas/core/generic.py

+9
@@ -678,6 +678,15 @@ def to_hdf(self, path_or_buf, key, **kwargs):
               and if the file does not exist it is created.
           ``'r+'``
               It is similar to ``'a'``, but the file must already exist.
+        format : 'fixed(f)|table(t)', default is 'fixed'
+            fixed(f) : Fixed format
+                Fast writing/reading. Not-appendable, nor searchable
+            table(t) : Table format
+                Write as a PyTables Table structure which may perform worse but
+                allow more flexible operations like searching / selecting subsets
+                of the data
+        append : boolean, default False
+            For Table formats, append the input data to the existing
         complevel : int, 1-9, default 0
             If a complib is specified compression will be applied
             where possible
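
Reviewer sketch (not part of the commit): the naming rule this change documents can be read as two steps. First the short aliases ``f`` and ``t`` map to ``fixed`` and ``table``; then, when no ``format`` is passed, the store falls back to the ``io.hdf.default_format`` option and finally to the per-method default (``put``/``to_hdf`` imply fixed, ``append`` implies table). The helper names below are hypothetical, the precedence order is an assumption drawn from the doc text, and raising ``ValueError`` for an unknown name is a choice made for this sketch; none of it is taken from the pandas implementation.

```python
# Hypothetical helpers illustrating the documented naming rule;
# this is NOT the code pandas uses internally.

# Short and long spellings accepted by the ``format`` keyword.
_FORMAT_ALIASES = {"f": "fixed", "fixed": "fixed", "t": "table", "table": "table"}

# Per-method defaults stated in the 0.13.0 notes.
_METHOD_DEFAULTS = {"put": "fixed", "to_hdf": "fixed", "append": "table"}


def normalize_format(fmt):
    """Return the canonical name ('fixed' or 'table') for a format spelling."""
    if fmt not in _FORMAT_ALIASES:
        # Error type is an assumption of this sketch.
        raise ValueError("invalid format %r, expected fixed(f) or table(t)" % (fmt,))
    return _FORMAT_ALIASES[fmt]


def resolve_format(method, fmt=None, default_format=None):
    """Choose a format: explicit keyword, then option value, then method default.

    ``default_format`` stands in for the ``io.hdf.default_format`` option.
    """
    if fmt is None:
        fmt = default_format or _METHOD_DEFAULTS[method]
    return normalize_format(fmt)


if __name__ == "__main__":
    print(resolve_format("put"))                          # no keyword, no option
    print(resolve_format("append"))                       # append default
    print(resolve_format("put", default_format="table"))  # option overrides default
    print(resolve_format("append", fmt="f"))              # explicit keyword wins
```

Under this reading, the old ``fmt='s'`` spelling is simply no longer a valid name, which matches the release note that ``format`` replaces the earlier keyword.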
