I have two data.frames as follows:
df1 <- data.frame(A=c("lee","eeu","ees"), B=c("lee","ggu","1su"), C=c(1,1,1))
A B C
1 lee lee 1
2 eeu ggu 1
3 ees 1su 1
df2 <- data.frame (X=c("lee","1su","eeu","ggu"), Y=c("3k3","4k4","5k","2ee"), Z=c("ggg","","","ooo"), ZA=c("vvv","","",""))
X Y Z ZA
1 lee 3k3 ggg vvv
2 1su 4k4
3 eeu 5k
4 ggu 2ee ooo
I want to expand df1 by matching df1$B against df2$X. Where df1$B equals df2$X, I want new_df1 to get one row for each non-empty entry in the other columns of that df2 row, with that entry as the new B and with A and C kept the same.
new_df1 is expected to be as follows:
A B C
lee 3k3 1 ### df1$B1= df2$X1= lee
lee ggg 1
lee vvv 1
eeu 2ee 1 ### df1$B2= df2$X4= ggu
eeu ooo 1
ees 4k4 1 ### df1$B3= df2$X2= 1su
In my experience, lapply is very memory-demanding. Can this be done without using lapply?
I think what you want is a subset of this:
require(reshape2)
merge(df1,melt(df2, id.var="X"), by.x="B", by.y="X", all=TRUE)
B A C variable value
1 1su ees 1 Y 4k4
2 1su ees 1 Z
3 1su ees 1 ZA
4 ggu eeu 1 Y 2ee
5 ggu eeu 1 Z ooo
6 ggu eeu 1 ZA
7 lee lee 1 Y 3k3
8 lee lee 1 Z ggg
9 lee lee 1 ZA vvv
10 eeu <NA> NA Y 5k
11 eeu <NA> NA Z
12 eeu <NA> NA ZA
I assigned that result to "M1" (and later noticed that it did not need all=TRUE, which only adds the unmatched eeu rows):
M1 <- merge(df1,melt(df2, id.var="X"), by.x="B", by.y="X")
subset(M1, value != "" , select=c(A,value, C) )
A value C
1 ees 4k4 1
4 eeu 2ee 1
5 eeu ooo 1
7 lee 3k3 1
8 lee ggg 1
9 lee vvv 1
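If you would rather avoid the reshape2 dependency, the same result can probably be had in base R with match() and rep(), still without lapply. This is only a sketch under a few assumptions of mine: that df2$X has no duplicates (match() only returns the first hit), that the columns are character rather than factor (hence stringsAsFactors=FALSE), and that empty cells are "" rather than NA. The names idx, keep, vals and n are just illustrative.

```r
df1 <- data.frame(A=c("lee","eeu","ees"), B=c("lee","ggu","1su"), C=c(1,1,1),
                  stringsAsFactors=FALSE)
df2 <- data.frame(X=c("lee","1su","eeu","ggu"), Y=c("3k3","4k4","5k","2ee"),
                  Z=c("ggg","","","ooo"), ZA=c("vvv","","",""),
                  stringsAsFactors=FALSE)

# Row of df2 whose X equals each df1$B (NA where there is no match)
idx  <- match(df1$B, df2$X)
keep <- !is.na(idx)

# Non-key columns of the matched df2 rows, one matrix row per kept df1 row
vals <- as.matrix(df2[idx[keep], -1, drop=FALSE])
n    <- ncol(vals)

# Transpose so the values are read out row by row, and repeat A and C to match
new_df1 <- data.frame(A=rep(df1$A[keep], each=n),
                      B=as.vector(t(vals)),
                      C=rep(df1$C[keep], each=n),
                      stringsAsFactors=FALSE)

# Drop the empty cells and renumber the rows
new_df1 <- new_df1[new_df1$B != "", ]
rownames(new_df1) <- NULL
new_df1
```

Unlike the merge() version, this keeps the rows in df1's original order and already names the column B, so on the sample data it should give the six rows of the expected new_df1 as posted in the question.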