Description
A bunch of different classes have one or more of the attributes data, _data, values, _values, plus an assortment of external_values, internal_values, formatting_values, and get_values. These mean different things in different places.
Maintenance would be easier if the naming conventions were more uniform. Index has all four of these attributes and I'm not sure there exists a nice backwards-compatible way to reconcile them with the naming in Series/DataFrame. Any thoughts? Does anyone else think this matters?
(Motivating example: "Where are all the places in the code that touch a BlockManager? Let's just grep for \.data...")
The lowest-hanging fruit for cleanup here is in the Accessor classes. StringAccessor, SeriesPlotMethods, and FramePlotMethods all define _data to point back to their parent Series/Index, Series, and DataFrame, respectively. I suggest that _data be replaced with just _parent. The other two existing accessors, CategoricalAccessor and CombinedDatetimelikeProperties, use categories and values for these, respectively. Ideally these would get standardized to _parent in the process.
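A rough sketch of what the accessor convention might look like. The classes here (ParentAccessor, ToyStringAccessor, ToySeries) are hypothetical stand-ins made up for illustration, not the actual pandas classes, and the only point is that every accessor reaches its owner through the same _parent name:

```python
class ParentAccessor:
    """Hypothetical base class: every accessor stores its owner as _parent."""

    def __init__(self, parent):
        self._parent = parent


class ToyStringAccessor(ParentAccessor):
    """Toy stand-in for a string accessor (not pandas' implementation)."""

    def upper(self):
        # Read from the owner via _parent and return a new object of the same type.
        return type(self._parent)(s.upper() for s in self._parent.values)


class ToySeries:
    """Toy stand-in for Series, holding a plain list of values."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def str(self):
        # The accessor is handed the object it belongs to.
        return ToyStringAccessor(self)


result = ToySeries(["a", "b"]).str.upper()
```

With this layout, grepping for _parent would find accessor back-references only, and grepping for _data would no longer pick them up.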
Another option would be to change NDFrame._data to something like NDFrame._mgr so that there is little risk of name overlap. I expect this would meet more resistance than the accessor cleanup idea.
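One way the rename could be made less disruptive is to keep _data around as a deprecated alias for a while. The following is only a sketch under that assumption: ToyFrame is a made-up class, and the deprecation path shown is my suggestion rather than anything that exists in pandas today.

```python
import warnings


class ToyFrame:
    """Toy stand-in for NDFrame; the manager lives under the new name _mgr."""

    def __init__(self, mgr):
        self._mgr = mgr

    @property
    def _data(self):
        # Old name kept as a read-only alias that warns on access.
        warnings.warn(
            "_data is deprecated; use _mgr instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._mgr
```

Existing code that reads obj._data would keep working but emit a warning, while new code and greps could target _mgr unambiguously.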