Effect of splittability and statefulness on parallel Stream processing

I am working with parallel Stream processing and have noticed that if I use a plain array as the stream source, it gets processed very fast. If I use an ArrayList, the processing gets a bit slower, and with a LinkedList or some binary tree it gets slower still.

All of this suggests that the more splittable the stream source is, the faster the processing will be, which would make arrays and ArrayList the most efficient sources for parallel streams. Is that true? If so, should we always use an array or an ArrayList when we want to process a stream in parallel? And how should a LinkedList or a BlockingQueue be used with a parallel stream?

The other thing is the statefulness of the intermediate operations chosen. If I perform stateless operations like filter() and map(), performance is high, but if I perform stateful operations like distinct(), sorted(), limit() or skip(), it takes a lot of time and the parallel stream gets slower again. Does that mean we should avoid stateful intermediate operations in parallel streams? If so, what is the workaround?


Not a bad question, IMO.

Of course an array and an ArrayList are going to split much better than a LinkedList or some kind of tree. You can look at how their Spliterators are implemented to convince yourself. The poorly splitting sources usually start with some batch size (1024 elements) and increase it from there; LinkedList does that, and Files.lines did too, if I remember correctly. So yes, using an array or an ArrayList will parallelize very well.
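As a rough illustration of that difference (my own minimal sketch, not part of the original post), you can compare what the first trySplit() does on an ArrayList versus a LinkedList of the same size. The exact numbers depend on the JDK, but on the versions I have looked at the ArrayList spliterator splits roughly in half, while the LinkedList spliterator peels off a first batch of about 1024 elements:

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Spliterator;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;

    public class SplitDemo {

        public static void main(String[] args) {
            List<Integer> numbers = IntStream.rangeClosed(1, 10_000)
                    .boxed()
                    .collect(Collectors.toList());

            show("ArrayList ", new ArrayList<>(numbers));
            show("LinkedList", new LinkedList<>(numbers));
        }

        // Splits the source once and prints how many elements each side estimates.
        // Expected (JDK-dependent): ArrayList roughly 5000 / 5000, LinkedList roughly 1024 / 8976.
        private static void show(String label, List<Integer> list) {
            Spliterator<Integer> right = list.spliterator();
            Spliterator<Integer> left = right.trySplit();
            System.out.println(label + " -> "
                    + (left == null ? "no split" : left.estimateSize())
                    + " / " + right.estimateSize());
        }
    }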

If you want better parallel support for structures like LinkedList, you could write your own spliterator; I think StreamEx did that for Files.lines, so that it starts with a smaller batch size. There is a related question about this, by the way.
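As for the "how do I use a LinkedList or a BlockingQueue with a parallel stream" part of the question: the simplest workaround I know of (my suggestion, not something the Stream API requires) is to copy the elements into an array-backed list first, so that the splitting happens on a random-access structure. The copy is an extra O(n) pass, so it only pays off when the per-element work is expensive enough:

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class CopyBeforeParallel {

        public static void main(String[] args) {
            List<String> linked = new LinkedList<>(List.of("a", "bb", "ccc", "dddd"));
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(10, false, linked);

            // Copying into an ArrayList gives the parallel stream a spliterator
            // that splits in halves instead of peeling off small batches.
            int totalLength = new ArrayList<>(queue).parallelStream()
                    .mapToInt(String::length)
                    .sum();

            System.out.println(totalLength); // 10
        }
    }

For a BlockingQueue that is still being written to you would of course have to drain it first (e.g. with drainTo) and stream the snapshot.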

The other thing is that when you use a stateful intermediate operation, the intermediate operations upstream of it effectively become stateful too: they can no longer process elements lazily, one at a time. Let me provide an example:

  IntStream.of(1, 3, 5, 2, 6)
            .filter(x -> {
                System.out.println("Filtering : " + x);
                return x > 2;
            })
            .sorted()
            .peek(x -> System.out.println("Peek : " + x))
            .boxed()
            .collect(Collectors.toList());

This will print:

Filtering : 1
Filtering : 3
Filtering : 5
Filtering : 2
Filtering : 6
Peek : 3
Peek : 5
Peek : 6

Because you have used sorted(), and filter() sits above it, filter() has to process all of the elements before anything downstream runs, so that sorted() is applied to the correct ones.

On the other hand, if you drop sorted():

  IntStream.of(1, 3, 5, 2, 6)
            .filter(x -> {
                System.out.println("Filtering : " + x);
                return x > 2;
            })
            // .sorted()
            .peek(x -> System.out.println("Peek : " + x))
            .boxed()
            .collect(Collectors.toList());

The output is going to be:

Filtering : 1
Filtering : 3
Peek : 3
Filtering : 5
Peek : 5
Filtering : 2
Filtering : 6
Peek : 6

Generally I do agree: I try to avoid stateful intermediate operations if I can. Maybe you don't actually need sorted(), or maybe you can collect into a TreeSet instead, etc. But I don't overthink it: if I need one, I just use it, and maybe measure afterwards to see whether it is really a bottleneck.
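To make the TreeSet idea concrete (again my own sketch, not from the question): if what you ultimately want is a sorted, de-duplicated result, you can let the terminal collector do that work instead of putting distinct()/sorted() in the middle of the parallel pipeline. Whether that is actually faster is something you would have to measure:

    import java.util.TreeSet;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;

    public class CollectToTreeSet {

        public static void main(String[] args) {
            // Instead of .distinct().sorted() inside the pipeline, the collector
            // builds the sorted, de-duplicated result at the end.
            TreeSet<Integer> result = IntStream.of(5, 3, 5, 1, 2, 2)
                    .parallel()
                    .filter(x -> x > 1)
                    .boxed()
                    .collect(Collectors.toCollection(TreeSet::new));

            System.out.println(result); // [2, 3, 5]
        }
    }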

Unless you are really hitting performance problems around this, I would not take it too much into account, especially since you need lots of elements before parallel processing gives any speed benefit at all.

Here is a related question that shows that you really do need lots of elements to see a performance gain.