Reading Stata value labels fails for all value labels prior to an empty value label #219

Closed
NilsEnevoldsen opened this issue Nov 23, 2020 · 8 comments

@NilsEnevoldsen
NilsEnevoldsen commented Nov 23, 2020

Reported as tidyverse/haven#551

Variables whose value labels appear on the label list prior to one with an empty value label are not assigned their value labels.

Step 1:

input x y z
0 0 0
end

lab def xlab 0 "Foo"
lab val x xlab

lab def ylab 0 "Bar"
lab val y ylab

lab def ylab 0 "", modify // This is one way to create an "empty" value label. It does *not* create a value label with an empty string for value 0.

lab def zlab 0 "FooBar"
lab val z zlab

label list
save example_input

Step 2: readstat example_input.dta example_output.dta

Actual example_output.dta :

[Screenshot: Screen Shot 2020-11-22 at 9 06 35 PM]

Expected example_output.dta :

[Screenshot: Screen Shot 2020-11-22 at 9 06 49 PM]

(Minor detail: the blueness of the 0 indicates the presence of an [empty] value label for variable y; I have no strong opinion on whether this value label should be preserved in the output file.)
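
For readers who want to check the result without opening Stata, here is a minimal Python sketch. It assumes the third-party pyreadstat package (which wraps ReadStat) is installed; the helper names `labels_by_variable` and `lost_labels` are hypothetical and not part of any library. The expected mapping is written by hand from the Stata script above, because reading the input file with an affected ReadStat version could itself drop the labels.

```python
def labels_by_variable(path):
    """Map each variable in a .dta file to its value-label dict, or None."""
    # Imported lazily: pyreadstat is a third-party package, assumed installed.
    import pyreadstat

    _, meta = pyreadstat.read_dta(path, metadataonly=True)
    result = {}
    for name in meta.column_names:
        # variable_to_label: variable name -> label set name (e.g. "xlab")
        label_set = meta.variable_to_label.get(name)
        # value_labels: label set name -> {value: label text}
        result[name] = meta.value_labels.get(label_set) if label_set else None
    return result


def lost_labels(expected, actual):
    """Return variables that should carry a non-empty label set but do not."""
    return sorted(
        name for name, labels in expected.items()
        if labels and not actual.get(name)
    )


# Expected labels, transcribed by hand from the Stata script in this report.
# ylab is left as None because the reporter has no strong opinion on it.
EXPECTED = {"x": {0: "Foo"}, "y": None, "z": {0: "FooBar"}}
```

Calling `lost_labels(EXPECTED, labels_by_variable("example_output.dta"))` would list any variable whose labels did not survive the conversion. The outcome depends on the ReadStat version bundled with the installed pyreadstat.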

@evanmiller
Contributor

Can you please provide a copy of example_input.dta? Thanks

@NilsEnevoldsen
Author

Yes, I have attached four example_inputs. I used Stata 14 to generate them with the following code:

save example_input_14
saveold example_input_13, version(13)
saveold example_input_12, version(12)
saveold example_input_11, version(11)

example_input_14.dta (version 118) and example_input_13.dta (version 117) exhibit the incorrect behavior when processed by ReadStat. example_input_12.dta (version 115) and example_input_11.dta (also version 115? I'm unclear on this) exhibit correct behavior when processed by ReadStat. (In the latter cases, they do both drop ylab, which I don't feel strongly about.)

example_input.zip
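
Since the format version of each file is in question here, the following sketch reads it straight from the file. It relies on two properties of the DTA layout: formats 117 and later begin with an XML-like header containing a `<release>` tag, while older formats store the format number in the first byte. The function names are hypothetical.

```python
import re

# Formats 117+ open with this literal header; the release is a 3-digit number.
_RELEASE = re.compile(rb"<stata_dta><header><release>(\d{3})</release>")


def dta_format_version(head: bytes) -> int:
    """Return the DTA format number given the first bytes of a .dta file."""
    if not head:
        raise ValueError("empty input")
    match = _RELEASE.match(head)
    if match:
        return int(match.group(1))
    # Older formats (115 and earlier): the first byte is the format number.
    return head[0]


def dta_format_version_of(path: str) -> int:
    """Read just enough of the file at `path` to determine its format."""
    with open(path, "rb") as f:
        return dta_format_version(f.read(64))
```

Running `dta_format_version_of` on each attached file would show which format `saveold` actually wrote for each `version()` option.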

evanmiller added a commit that referenced this issue Nov 28, 2020
See #219. This should fix empty value labels for DTA versions 117 and later
@evanmiller
Contributor

Thanks. See if this commit fixes things:

263ba39

@NilsEnevoldsen
Author

I get an error on every DTA file with 1.1.5-rc0.

% readstat example_input_16.dta example_output_16.dta
Error beginning row: Unable to write data
Converted 1 variables and 1 rows in 0.00 seconds
Error processing example_input_16.dta: The parsing was aborted (callback returned non-zero value)

@evanmiller
Contributor

Please try the dev branch rather than the RC

@NilsEnevoldsen
Author

Ah, right, I just saw 93e33e2

@NilsEnevoldsen
Author

Yes, this fix works for me, with no immediately apparent side effects. I tested example DTAs from Stata versions 11 through 16, as well as my own "in the wild" problematic example, and that reported at tidyverse/haven#551.

Thank you for your prompt fix! Will this make it into 1.1.5?

@evanmiller
Contributor

> Will this make it into 1.1.5?

Yes
