A: Why doesn't my function return the vector as expected?

    corr <- function(directory, threshold) {
      files <- list.files(directory, full.names = TRUE)
      nu <- numeric()
      for(i in length(files)) {
        my_data <- read.csv(files[i])
        if (sum(complete.cases(my_data)) >= threshold) {
          vec_sul <- my_data[complete.cases(my_data),]$sulfate
          vec_nit <- my_data[complete.cases(my_data),]$nitrate
          nu <- c(nu, cor(vec_sul, vec_nit))
        }
      }
      nu
    }

I have a set of .csv files in a directory that I pass as the first argument to the function above. The second argument is a threshold value. The goal is to read every file in the directory and check whether its number of complete cases is at least the threshold.

For each file that meets this criterion, I compute the correlation between two of its variables, sulfate and nitrate. The correlation values for all qualifying files should be concatenated into a numeric vector. When the loop finishes, I want the function to return that vector of correlations computed inside the "if" block.

    cr <- corr("specdata", 150)

When I run the line above in the console, I get a numeric variable that is null. Could someone help me fix the code?


This kind of mistake has been seen many times, yet it still happens. You want

    i in 1:length(files)

You get numeric(0) (the "numeric null" you talk about) because for(i in length(files)) iterates over the single value length(files), so your loop only reads the final file. I guess the final file does not satisfy sum(complete.cases(my_data)) >= threshold, so nothing is added to nu, which was initialized as numeric(0).
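To make the difference concrete, here is a small sketch that records which indices each loop header visits. The file names are made up for illustration and nothing is read from disk. It also uses seq_along(files), a common alternative to 1:length(files) that additionally behaves sensibly when files is empty (1:0 would count down through 1 and 0).

```r
# Hypothetical file names; only the length of the vector matters here
files <- c("a.csv", "b.csv", "c.csv")

# Loop header from the question: iterates over the single value length(files)
visited_buggy <- integer()
for (i in length(files)) visited_buggy <- c(visited_buggy, i)

# Corrected header: iterates over every index of files
visited_fixed <- integer()
for (i in seq_along(files)) visited_fixed <- c(visited_fixed, i)

print(visited_buggy)  # only the last index, 3
print(visited_fixed)  # every index: 1 2 3
```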


Also, I would like to point out that

    vec_sul <- my_data[complete.cases(my_data),]$sulfate
    vec_nit <- my_data[complete.cases(my_data),]$nitrate
    nu <- c(nu, cor(vec_sul, vec_nit))

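can be replaced by

    nu <- c(nu, with(my_data, cor(sulfate, nitrate, use = "complete.obs")))

provided the other columns of my_data contain no missing values: use = "complete.obs" drops only the rows where sulfate or nitrate is NA, whereas complete.cases(my_data) checks every column.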
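Putting the loop fix together with a single complete.cases() call, a minimal sketch of the corrected function might look like the following. The demo directory and its two small files are hypothetical stand-ins for specdata, invented only so the example runs on its own. The column names sulfate and nitrate are taken from the question.

```r
corr <- function(directory, threshold) {
  files <- list.files(directory, full.names = TRUE)
  nu <- numeric()
  for (i in seq_along(files)) {          # visit every file, not just the last
    my_data <- read.csv(files[i])
    ok <- complete.cases(my_data)        # compute the row filter once
    if (sum(ok) >= threshold) {
      nu <- c(nu, cor(my_data$sulfate[ok], my_data$nitrate[ok]))
    }
  }
  nu
}

# Hypothetical demo data standing in for the real "specdata" directory
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)

# First file: 3 complete cases, with nitrate = 2 * sulfate
write.csv(data.frame(sulfate = c(1, 2, 3, NA), nitrate = c(2, 4, 6, 8)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)

# Second file: only 1 complete case, so it falls below the threshold used here
write.csv(data.frame(sulfate = c(1, NA, NA, NA), nitrate = c(1, 2, 3, 4)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

cr <- corr(demo_dir, 3)
print(cr)  # one correlation, from the first file only
```

Growing nu with c() inside the loop keeps the sketch close to the original code. For a large number of files, collecting the results in a list and calling unlist() at the end would avoid the repeated copying.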