Extract data from one data frame to another data frame with a different line length

advertisements

I have two data.frames as follows:

df1 <- data.frame(A=c("lee","eeu","ees"), B=c("lee","ggu","1su"), C=c(1,1,1)

    A   B C
1 lee lee 1
2 eeu ggu 1
3 ees 1su 1

df2 <- data.frame (X=c("lee","1su","eeu","ggu"), Y=c("3k3","4k4","5k","2ee"), Z=c("ggg","","","ooo"), ZA=c("vvv","","",""))

    X   Y   Z  ZA
1 lee 3k3 ggg vvv
2 1su 4k4
3 eeu  5k
4 ggu 2ee ooo

I want to expand df1 by matching df1$B with df2$X. When df1$B = df2$X, I want to add additional rows to the new_df1 with new B = other entries in df2 on the same row, but keeping A and C the same.

new_df1 is expected to be as follows:

 A   B  C
lee 3k3 1 ### df1$B1= df2$X1= lee
lee ggg 1
lee vvv 1
eeu 2ee 1 ### df1$B2= df2$X4= ggu
eeu ooo 1
ees 4k4 1 ### df1$B3= df2$X2= lsu

My past experience on using lapply seems to be very memory-demanding, is it possible to be done without using lapply?


I think what you wnat is a subset of this:

require(reshape2)
merge(df1,melt(df2, id.var="X"), by.x="B", by.y="X", all=TRUE)
     B    A  C variable value
1  1su  ees  1        Y   4k4
2  1su  ees  1        Z
3  1su  ees  1       ZA
4  ggu  eeu  1        Y   2ee
5  ggu  eeu  1        Z   ooo
6  ggu  eeu  1       ZA
7  lee  lee  1        Y   3k3
8  lee  lee  1        Z   ggg
9  lee  lee  1       ZA   vvv
10 eeu <NA> NA        Y    5k
11 eeu <NA> NA        Z
12 eeu <NA> NA       ZA

I assigned that object to "M1" (and later noticed that it did not need all=TRUE)

M1 <- merge(df1,melt(df2, id.var="X"), by.x="B", by.y="X")
subset(M1, value != "" , select=c(A,value, C) )
    A value C
1 ees   4k4 1
4 eeu   2ee 1
5 eeu   ooo 1
7 lee   3k3 1
8 lee   ggg 1
9 lee   vvv 1