Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solution: Converting data frame to JSON #9

Open
wjones127 opened this issue Jun 23, 2016 · 3 comments
Open

Solution: Converting data frame to JSON #9

wjones127 opened this issue Jun 23, 2016 · 3 comments

Comments

@wjones127
Copy link

I found a working solution to Wes's problem of converting a dataframe into the JSON format he wanted.

The approach differs from Wes's mainly in that I just do the nesting of subquestions for every question, and later prune out the subquestions that have NA as their id. His approach was pretty close, but there is a bit of a frustrating limitation to the nest() function: it can't be applied twice in a row to the same data frame. You can see in the nest_subquestions function that I had to create separate data frames with the nested subquestions and options, and then merge them together; this couldn't be done in the same data frame without getting an error.

library(dplyr)
library(tidyr)
library(jsonlite)
library(purrr)

dummyDF <- data_frame(sectionId = c(rep(1,9),rep(2,3)),
                                            questionId = c(rep(1,3),rep(2,6),rep(3,3)),
                                            subquestionId = c(rep(NA,3),rep("2a",3),rep("2b",3),rep(NA,3)),
                                            deptManagerQId = c(rep("m1",3),rep("m2",3),rep("m3",3),rep("m4",3)),
                                            deptEmployeeQId = c(rep("e1",3),rep("e3",3),rep("e4",3),rep("e7",3)),
                                            optionId = rep(c(1,2,3),4),
                                            text = rep(c("yes","neutral","no"),4))

# Define nesting functions
# i.e. for each set of questions in a section...
nest_subquestions <- function(questions)  {
    with_subquestions <- questions %>% 
        group_by(questionId) %>%
        nest(subquestionId, deptManagerQId, deptEmployeeQId, .key=subquestions)  %>%
        mutate(subquestions = subquestions)

    with_options <- questions %>%
        group_by(questionId) %>%
        nest(optionId, text, .key=options)

    left_join(with_subquestions, with_options, by="questionId")
}

# Define pruning function
remove_null <- function(subquestions) {
    print(is.na(subquestions$subquestionId[1]))
    if (is.na(subquestions$subquestionId[1])) {
        subquestions <- NA
    }
}
# For each set of questions...
prune_subquestions <- function(questions) {
    questions %>%
        mutate(subquestions = ifelse(sapply(subquestions, has_null_id),
                                                                 NA,
                                                                 subquestions))
}
# For each question...
has_null_id <- function(subquestions) {
    is.na(subquestions$subquestionId[1])
}

list2 <- dummyDF %>% nest(-sectionId, .key=questions) %>% 
    mutate(questions = map(questions, nest_subquestions),
                 questions = map(questions, prune_subquestions)) %>%
    list(sections = .)

list2 %>% toJSON(pretty = T, na = c("null")) 
@ismayc
Copy link

ismayc commented Jun 23, 2016

Beautiful! Well done, @wjones127.

@smithjd
Copy link

smithjd commented Jun 24, 2016

I'm looking at your code @wjones127 , and wondering whether you found that using the map function from purr was essential to your solution or whether it was more of an experiment? Wes was using the more traditional seq_along...

@wjones127
Copy link
Author

Yeah I had purrr in there because I though I might need some of it's functionality, but I think in the final code all the map calls could be replaced with lapply.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants