ieduplicates: issue with variable formats #103

Closed
luizaandrade opened this issue Nov 1, 2017 · 4 comments

Comments

@luizaandrade
Collaborator

When a variable in Stata has a different format than the one saved in Excel, the command throws an error saying observations were deleted.

For example, when you have a date variable, it is saved as a string in the Excel report but is a number in Stata, so the command can't match observations. The same happens for labeled variables and missing values. For now, I think the best solution is to convert the variables that cause these errors to strings, but we should find a way to fix this, or at least make the help file informative about this possibility.

@luizaandrade
Collaborator Author

@kbjarkefur the solution I found for this problem when I was creating the reports was to first convert the variables to string, then create the report. Do you think there's a more efficient way to do this?

@kbjarkefur
Contributor

Is it this error message that you get:

One or several observations in the Excel report are no longer found in the data set.

I agree that what you suggest is a good solution: tostring the variables before generating the report, and tostring them again before merging with the report. I see only one issue: we do not want to return the data set with any modifications to the variables, so we need to be able to revert this.

The solution is probably to make a clone of the variable under a new name. Then the last thing we do is clone it back. Also, I think this error (if it is the error I quoted above) cannot be caused by variables listed in keepvar(), only by those in uniquevar(), so that might reduce the work we need to do.

*Create a tempvar clone and order it next to the original var 
*this will have to be repeated over all uniquevars
tempvar iedup1
clonevar `iedup1' = uniquevar1
order `iedup1', after(uniquevar1)

tostring uniquevar1 , replace //we should probably use *force* here as well

*generate report, merge back the report etc.

* before returning the data set delete the var we modified and replace it with the clone
drop uniquevar1
clonevar uniquevar1 = `iedup1'
order uniquevar1 , after(`iedup1')

What do you think?

@luizaandrade luizaandrade added the minor bug Bug unlikely to lead to incorrect analysis label Feb 1, 2018
@luizaandrade luizaandrade self-assigned this Apr 3, 2018
@kbjarkefur
Contributor

Since no changes are ever made to the uniquevars through the report, we do not need to keep the clones after we have merged. Something like this.

*Create a tempvar clone and order it next to the original var 
*this will have to be repeated over all uniquevars
tempvar iedup1
clonevar `iedup1' = uniquevar1
order `iedup1', after(uniquevar1)

tostring uniquevar1 , replace //we should probably use *force* here as well

*generate report, merge back the report etc.

* before returning the data set, drop the tempvar clone
drop `iedup1'

@luizaandrade
Collaborator Author

I cannot reproduce this error. I've managed to include both a date variable and a labelled factor without problems, so I will close this issue and we can come back to this if we ever come across this error again.

@luizaandrade luizaandrade added error could not be replicated and removed minor bug Bug unlikely to lead to incorrect analysis labels Apr 24, 2018