I have a df called t_compl
with three columns of dates.
date_op_indl date_compl usdato.x
1984-11-22 1984-11-22 1983-09-07
2004-11-16 NA 1994-10-27
1996-09-10 1996-09-10 1982-11-09
1986-05-24 1986-05-24 1982-11-22
1989-08-22 1989-06-13 1983-02-11
I want to create a fourth variable (t_compl$time
) with the time difference between usdato.x
and date_compl
or date_op_indl
in years. I want to choose the earliest date from date_compl
and date_op_indl
.
I have tried
t_compl$time<- ifelse((t_compl$date_compl-t_compl$usdato.x)<=(t_compl$date_op_indl-t_compl$usdato.x), ((t_compl$date_compl-t_compl$usdato.x)/365.25), ((t_compl$date_op_indl-t_compl$usdato.x)/365.25))
Because some date_compl
are missing I want to use compl_op_indl
for the calculation
t_compl$time[is.na(t_compl$time)] <- ((t_compl$date_op_indl-t_compl$usdato.x)/365.25)
and then get this error
Warning message:
In t_compl$time[is.na(t_compl$time)] <- ((t_compl$date_op_indl - :
number of items to replace is not a multiple of replacement length
The time calculations also turn out completely wrong for the compl_date
that were NA
.
How to make this time difference calculation in R?
DF <- read.table(text="date_op_indl date_compl usdato.x
1984-11-22 1984-11-22 1983-09-07
2004-11-16 NA 1994-10-27
1996-09-10 1996-09-10 1982-11-09
1986-05-24 1986-05-24 1982-11-22
1989-08-22 1989-06-13 1983-02-11", header=TRUE)
DF[] <- lapply(DF, as.Date)
Use pmin
to calculate the minimum date for each observation (thereby ignoring NA
values):
DF$time <- difftime(do.call(pmin, c(DF[, c("date_op_indl", "date_compl")], na.rm = TRUE)),
DF$usdato.x, units="days")
# date_op_indl date_compl usdato.x time
# 1 1984-11-22 1984-11-22 1983-09-07 442 days
# 2 2004-11-16 <NA> 1994-10-27 3673 days
# 3 1996-09-10 1996-09-10 1982-11-09 5054 days
# 4 1986-05-24 1986-05-24 1982-11-22 1279 days
# 5 1989-08-22 1989-06-13 1983-02-11 2314 days