How to change a dataframe in R into a named list?

I have a dataframe ("samp") in R with student IDs and the exams that each student took.

      student_id math_exam spanish_exam
           <int>     <dbl>        <dbl>
    1          1         0            1
    2          2         1            0
    3          3         0            0
    4          4         1            1

I would like to make a named list, keyed by student ID, that holds the names of the exams each student has taken instead of the 0s and 1s. So student 1 would show just the spanish exam, while student 4 would show math exam, spanish exam.

I think I am close using the replace command, so I did some basic testing to see whether I could just replace all the 1's with a column name:

    replace(samp, grepl(1, samp, perl=TRUE), names(samp)[2])

But instead I replaced everything with the same column name like so:

      student_id math_exam spanish_exam
    1  math_exam math_exam    math_exam
    2  math_exam math_exam    math_exam
    3  math_exam math_exam    math_exam
    4  math_exam math_exam    math_exam

I also tried specifying a single column, like samp$math_exam, but got the same result. Is using replace a good idea? I am still fairly new to R, so apologies if I'm asking too much. Any guidance on this would be wonderful! Thank you


Here's a way to split the data.frame by student, reshape each piece to a long format and return only the exams taken.

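    library(tidyr)

    xy <- data.frame(student_id = 1:4, math_exam = c(0, 1, 0, 1), spanish_exam = c(1, 0, 0, 1))

    # One single-row data.frame per student; stored under a new name so that
    # xy itself stays a data.frame for later use
    xy_split <- split(xy, xy$student_id)

    result <- lapply(xy_split, FUN = function(x) {
      # Wide to long: one row per (student, exam) pair
      out <- gather(x, key = exam, value = taken, -student_id)
      # Keep only the exams taken, then drop the 0/1 indicator column
      out[out$taken == 1, ][, -3]
    })

    # Stack the per-student pieces back into one data.frame
    do.call(rbind, result)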

    student_id         exam
1            1 spanish_exam
2            2    math_exam
4.1          4    math_exam
4.2          4 spanish_exam
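
The result above is a long data.frame rather than the named list the question asks for. Here is a minimal base R sketch of one way to get a named list directly from the wide data. It is a different technique from the split/gather approach above, and it assumes `samp` looks like the data in the question (rebuilt here as a plain data.frame); `exam_cols`, `taken` and `exams_taken` are names made up for the illustration.

```r
# Assumed reconstruction of the question's data
samp <- data.frame(student_id = 1:4,
                   math_exam = c(0, 1, 0, 1),
                   spanish_exam = c(1, 0, 0, 1))

# Every column except the ID is treated as an exam indicator
exam_cols <- setdiff(names(samp), "student_id")

# Logical matrix: TRUE where a student took the exam
taken <- as.matrix(samp[exam_cols]) == 1

# For each row, keep the names of the exam columns that are TRUE
exams_taken <- lapply(seq_len(nrow(taken)), function(i) exam_cols[taken[i, ]])

# Name the list elements by student ID
names(exams_taken) <- samp$student_id

exams_taken
```

Students with no exams (student 3 here) stay in the list as an empty character vector, so `exams_taken[["4"]]` returns both exam names while `exams_taken[["3"]]` has length zero.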

If you fancy a dplyr solution...

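    library(dplyr)

    # xy must be the original data.frame here, not the split list;
    # gather() comes from tidyr, loaded above
    xy %>%
      group_by(student_id) %>%
      gather(key = exam, value = taken, -student_id) %>%
      filter(taken == 1) %>%
      select(-taken)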

Source: local data frame [4 x 2]
Groups: student_id [3]

  student_id         exam
       <int>        <chr>
1          2    math_exam
2          4    math_exam
3          1 spanish_exam
4          4 spanish_exam
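
As for why the replace attempt in the question overwrote everything, the sketch below may help. It assumes `samp` matches the data shown (rebuilt as a plain data.frame, although the printout suggests a tibble); `col_strings`, `hits` and `overwritten` are illustrative names. The short version: `grepl()` works on a character vector, so the data frame is collapsed to one string per column rather than checked cell by cell.

```r
# Assumed reconstruction of the question's data
samp <- data.frame(student_id = 1:4,
                   math_exam = c(0, 1, 0, 1),
                   spanish_exam = c(1, 0, 0, 1))

# grepl() coerces non-character input with as.character(); for a data frame
# that yields one deparsed string per column, not one value per cell
col_strings <- as.character(samp)
length(col_strings)

# The pattern 1 becomes "1", and each of those column strings contains a "1",
# so the result is TRUE for every column
hits <- grepl(1, samp)

# replace(x, list, values) is essentially x[list] <- values, so an all-TRUE
# index selects every column and fills each one with the replacement value
overwritten <- replace(samp, hits, names(samp)[2])
overwritten
```

So `replace()` is not wrong in itself, but the index it receives has to pick out individual cells (for example a logical matrix such as `samp == 1`), and a single replacement value can only ever write one column name.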