@@ -44,16 +44,6 @@ The categorical data type is useful in the following cases:
 * As a signal to other Python libraries that this column should be treated as a categorical
   variable (e.g. to use suitable statistical methods or plot types).

-.. note::
-
-    In contrast to R's `factor` function, categorical data is not converting input values to
-    strings and categories will end up the same data type as the original values.
-
-.. note::
-
-    In contrast to R's `factor` function, there is currently no way to assign/change labels at
-    creation time. Use `categories` to change the categories after creation time.
-
 See also the :ref:`API docs on categoricals<api.categorical>`.

 .. _categorical.objectcreation:
@@ -113,19 +103,18 @@ Categorical data has a specific ``category`` :ref:`dtype <basics.dtypes>`:
 DataFrame Creation
 ~~~~~~~~~~~~~~~~~~

-Columns in a ``DataFrame`` can be batch converted to categorical, either at the time of construction
-or after construction. The conversion to categorical is done on a column by column basis; labels present
-in a one column will not be carried over and used as categories in another column.
+Similar to the previous section where a single column was converted to categorical, all columns in a
+``DataFrame`` can be batch converted to categorical either during or after construction.

-Columns can be batch converted by specifying ``dtype="category"`` when constructing a ``DataFrame``:
+This can be done during construction by specifying ``dtype="category"`` in the ``DataFrame`` constructor:

 .. ipython:: python

     df = pd.DataFrame({'A': list('abca'), 'B': list('bccd')}, dtype="category")
     df.dtypes

-Note that the categories present in each column differ; since the conversion is done on a column by column
-basis, only labels present in a given column are categories:
+Note that the categories present in each column differ; the conversion is done column by column, so
+only labels present in a given column are categories:

 .. ipython:: python

@@ -135,15 +124,15 @@ basis, only labels present in a given column are categories:

 .. versionadded:: 0.23.0

-Similarly, columns in an existing ``DataFrame`` can be batch converted using :meth:`DataFrame.astype`:
+Analogously, all columns in an existing ``DataFrame`` can be batch converted using :meth:`DataFrame.astype`:

 .. ipython:: python

     df = pd.DataFrame({'A': list('abca'), 'B': list('bccd')})
     df_cat = df.astype('category')
     df_cat.dtypes

-This conversion is likewise done on a column by column basis:
+This conversion is likewise done column by column:

 .. ipython:: python

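The column-by-column behaviour described in this hunk can be checked with a short standalone sketch. It reuses the `A`/`B` frame from the diff; the variable names `cats_a` and `cats_b` are illustrative and not part of the documented example.

```python
import pandas as pd

# Same toy frame as in the documentation example above.
df = pd.DataFrame({"A": list("abca"), "B": list("bccd")})

# Batch-convert every column to the category dtype.
df_cat = df.astype("category")

# Each column derives its categories only from its own values,
# so the two columns end up with different category sets.
cats_a = list(df_cat["A"].cat.categories)
cats_b = list(df_cat["B"].cat.categories)

print(cats_a)  # ['a', 'b', 'c']
print(cats_b)  # ['b', 'c', 'd']
```

Column `A` never gains `'d'` and column `B` never gains `'a'`, which is what "only labels present in a given column are categories" means in practice.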
@@ -191,7 +180,7 @@ are consistent among all columns.
 categories for each column, the ``categories`` parameter can be determined programatically by
 ``categories = pd.unique(df.values.ravel())``.

-If you already have `codes` and `categories`, you can use the
+If you already have ``codes`` and ``categories``, you can use the
 :func:`~pandas.Categorical.from_codes` constructor to save the factorize step
 during normal constructor mode:

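As a rough illustration of the `from_codes` path mentioned here, the sketch below builds a categorical directly from integer codes. The specific codes and category labels are made up for the example; `-1` is the sentinel pandas uses for a missing value.

```python
import pandas as pd

# Hypothetical inputs: integer codes that index into the category list.
codes = [0, 1, 0, 2, -1]  # -1 marks a missing value
categories = ["train", "test", "validation"]

# Build the categorical from the codes directly, with no factorize pass.
cat = pd.Categorical.from_codes(codes, categories=categories)

values = cat.tolist()
print(values[:4])          # ['train', 'test', 'train', 'validation']
print(pd.isna(values[4]))  # True
```

This is mainly useful when the codes already exist, for example as the output of an earlier encoding step, so the values do not need to be scanned again to find the unique labels.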
@@ -216,6 +205,16 @@ To get back to the original ``Series`` or NumPy array, use
     s2.astype(str)
     np.asarray(s2)

+.. note::
+
+    In contrast to R's `factor` function, categorical data is not converting input values to
+    strings; categories will end up the same data type as the original values.
+
+.. note::
+
+    In contrast to R's `factor` function, there is currently no way to assign/change labels at
+    creation time. Use `categories` to change the categories after creation time.
+
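The two notes can be illustrated with a minimal sketch. It assumes an integer-valued series and uses `Series.cat.rename_categories` to relabel after creation. That method is swapped in here as one way to change labels; the note itself only mentions `categories`, so treat the choice as an assumption. The labels `low`, `mid`, and `high` are invented for the example.

```python
import pandas as pd

# Integer input: the categories keep an integer dtype and are not
# converted to strings, unlike R's factor.
s = pd.Series([1, 2, 3, 1], dtype="category")
categories_dtype = s.cat.categories.dtype
print(categories_dtype)

# Labels cannot be assigned at creation time, so relabel afterwards.
renamed = s.cat.rename_categories({1: "low", 2: "mid", 3: "high"})
print(renamed.tolist())  # ['low', 'mid', 'high', 'low']
```

The original series `s` is left unchanged; `rename_categories` returns a new object with the new labels.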
 .. _categorical.categoricaldtype:

 CategoricalDtype