Variable labels as a dataframe field

I use both Stata and Pandas. Many Stata users attach variable labels that describe the columns more clearly than the column names do. For example, running this in Stata

```
sysuse auto.dta
describe
```

gives something like

| variable name | storage type | variable label |
| --- | --- | --- |
| make | str18 | Make and Model |
| price | int | Price |
| mpg | int | Mileage (mpg) |

For me (maybe for others too) it would be useful to have an optional field in a DataFrame with a column label dictionary. The keys would be the columns (not necessarily all of them) and the values the string labels.

Something similar already exists in `pandas.io.stata.StataReader`, whose `variable_labels` (see [the docs](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.stata.StataReader.variable_labels.html)) lets you import these labels when reading a Stata `.dta` file.

I know I could just carry around a dictionary with this information, but I think it's cleaner and less error prone to set it and save it within a DataFrame.
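To make the idea concrete, here is a minimal sketch of what "labels stored within the DataFrame" could look like. It assumes a pandas version that has the experimental `DataFrame.attrs` dictionary; the `"variable_labels"` key and the `label_for` helper are names made up for this illustration, not part of pandas, and the data values are only illustrative.

```python
import pandas as pd

# Small frame loosely modelled on the auto dataset (illustrative values).
df = pd.DataFrame({"make": ["AMC Concord"], "price": [4099], "mpg": [22]})

# Assumption: labels are kept under a self-chosen key in the experimental
# DataFrame.attrs dict. Not every column needs a label.
df.attrs["variable_labels"] = {
    "make": "Make and Model",
    "mpg": "Mileage (mpg)",
}


def label_for(frame, column):
    """Return the stored label for a column, or the column name if none is set."""
    return frame.attrs.get("variable_labels", {}).get(column, column)


print(label_for(df, "mpg"))    # labelled column
print(label_for(df, "price"))  # unlabelled column falls back to its name
```

Since `attrs` is documented as experimental, and not every operation is guaranteed to carry it over to the result, this is closer to a workaround than to the dedicated field proposed here.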

Additionally, storing the labels would allow a Stata/Pandas round trip without loss of information, since `to_stata` could check whether this field exists and write the labels back out. (`to_stata` might already accept a `variable_labels` dictionary as an option, but at least I didn't see it documented.)
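The round trip described above can be sketched as follows. This assumes a pandas version in which `DataFrame.to_stata` accepts a `variable_labels` argument and `StataReader` can be used as a context manager (both true of recent releases, as far as I know); the file name and data values are only illustrative, and the labels still travel in a separate dictionary rather than inside the DataFrame.

```python
import os
import tempfile

import pandas as pd
from pandas.io.stata import StataReader

# Labels carried alongside the frame in a plain dictionary.
labels = {"price": "Price", "mpg": "Mileage (mpg)"}
df = pd.DataFrame({"price": [4099, 4749], "mpg": [22, 17]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "auto_subset.dta")  # illustrative file name

    # Write the data together with the labels.
    df.to_stata(path, write_index=False, variable_labels=labels)

    # Read both the data and the labels back.
    with StataReader(path) as reader:
        restored = reader.read()
        restored_labels = reader.variable_labels()

print(restored_labels)
```

With a label field on the DataFrame itself, the explicit `labels` dictionary and the separate `variable_labels()` call would no longer be needed.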

My coding prowess is quite limited, but I'd be happy to at least write test code and help out if somebody gets this started.


Variable labels as a dataframe field #11179
