Get the first elements (take function) of a DStream

I am looking for a way to retrieve the first elements of a DStream created as follows:

val dstream = ssc.textFileStream(args(1)).map(x => x.split(",").map(_.toDouble))

Unfortunately, there is no take function on a DStream (as there is on an RDD), so dstream.take(2) does not work.

Does anyone have an idea how to do this? Thanks.


You can use the transform method of DStream: take the first n elements of each batch's RDD into a local array, then filter the original RDD down to the elements contained in that array. This returns a new DStream holding those n elements for each batch (possibly more if the batch contains duplicates of them).

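val n = 10
val partOfResult = dstream.transform { rdd =>
  // take(n) returns the first n elements of this batch's RDD to the driver
  val firstN = rdd.take(n)
  // The elements are Array[Double], and Scala arrays compare by reference,
  // so match on contents with sameElements rather than contains
  rdd.filter(x => firstN.exists(_.sameElements(x)))
}
partOfResult.print()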
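
If filtering by membership is awkward for your element type (here the elements are Array[Double], and Scala arrays compare by reference, not by content), another option is to turn the taken elements straight back into an RDD. This is only a minimal sketch, assuming the Spark core and Spark Streaming libraries are on the classpath. The helper names firstN and takeFirst are hypothetical and not part of Spark.

```scala
import scala.reflect.ClassTag

import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper: first n elements of an RDD, wrapped as a new small RDD.
// take(n) brings the elements to the driver; parallelize redistributes them.
def firstN[T: ClassTag](rdd: RDD[T], n: Int): RDD[T] =
  rdd.sparkContext.parallelize(rdd.take(n).toSeq)

// Hypothetical helper: apply firstN to the RDD of every batch in the stream.
def takeFirst[T: ClassTag](stream: DStream[T], n: Int): DStream[T] =
  stream.transform((rdd: RDD[T]) => firstN(rdd, n))
```

With the stream from the question, takeFirst(dstream, 10).print() would then show at most 10 elements per batch. No equality check is involved, so duplicates in the batch do not inflate the result.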