R produces "unsupported URL scheme" error when getting data from https sites


R version 3.0.1 (2013-05-16) for Windows 8, knitr version 1.5, RStudio 0.97.551

I am using knitr to render the markdown for my R code. As part of my analysis I download various data sets from the web. knitr is fine with getting data from http sites, but not from https ones, where it generates an "unsupported URL scheme" message. I know that when using the download.file function on a Mac, the method parameter has to be set to curl to get data from an https site; however, this doesn't help when using knitr.

What do I need to do so that knitr will fetch data from HTTPS websites?

Edit: Here is the code chunk that returns an error in knitr but works without error when run directly in R.

```{r}
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download.file(fileurl, destfile = "C:/Users/xxx/yyy")
```


Edit (May 2016): As of R 3.3.0, download.file() should handle SSL websites automatically on all platforms, making the rest of this answer moot.
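
For readers on R 3.3.0 or later, the plain download.file() route might look roughly like the sketch below. The helper name `fetch_csv` and the temporary destination file are assumptions made for illustration; they do not come from the question. `mode = "wb"` is set so the file is written byte for byte, which matters on Windows.

```r
# Hypothetical helper (not from the original post): download a CSV and
# read it into a data.frame.
# Assumes R >= 3.3.0, where download.file() is expected to handle https URLs.
fetch_csv <- function(url, destfile = tempfile(fileext = ".csv")) {
  # mode = "wb" writes the bytes unchanged; quiet = TRUE hides the progress bar
  status <- download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
  # download.file() returns 0 on success
  if (status != 0) stop("download failed with status ", status)
  read.csv(destfile)
}

# Usage (requires network access):
# hid <- fetch_csv("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv")
```

Writing to a temporary file avoids hard-coding a destination path inside the knitr document.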

You want something like this:

library(RCurl)
data <- getURL("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv",
               ssl.verifypeer=0L, followlocation=1L)

That reads the data into memory as a single string. You'll still have to parse it into a dataset in some way. One strategy is:

writeLines(data,'temp.csv')
read.csv('temp.csv')

You can also parse the data directly without writing it to a file:

read.csv(text=data)
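
To see what the `text` argument does without touching the network, here is a small sketch that uses a made-up string in place of the result of getURL(). The variable name `csv_text` and its contents are invented for illustration.

```r
# Stand-in for the single string returned by getURL(); the contents are made up.
csv_text <- "id,value\n1,10\n2,20\n"

# read.csv() parses the string in memory when it is passed via `text`,
# so no intermediate file is needed.
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
str(df)
```

The first line of the string is treated as the header, just as it would be when reading from a file.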

Edit: A much easier option is actually to use the rio package:

library("rio")
import("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv")

This will read directly from the HTTPS URL and return a data.frame.