Collectors in Java 8

Oct 31, 2019
hackajob Staff

In our previous Java 8 article, we spoke about the benefits of the Collection API. In today’s write-up, we’ll be taking a closer look at Collectors and how to use them in Java 8. Collectors are closely related to the Stream API and are used as the last step whilst performing Stream operations. Put simply, they ‘collect’ the elements of the Stream into a collection.

What exactly are Collectors?

As per the API documentation, a Collector is ‘a mutable reduction operation that accumulates input elements into a mutable result container, optionally transforming the accumulated result into a final representation after all input elements have been processed’.

In layman’s terms, a Collector is an interface that can be used to reduce a stream to a Collection or a single value. What’s more, a Collector can optionally transform the accumulated result into some other form once all the elements have been collected.

Stream.collect

As we mentioned earlier, a Collector is applied to a Stream and can then be used to reduce the Stream to a collection. The ‘Stream’ interface has a ‘collect’ method that accepts a ‘Collector’ instance as an argument and reduces the elements in the input stream as per the specified Collector.

For argument’s sake, suppose you want to convert a ‘Stream’ into a ‘List’. The following code uses a Collector for this:

Stream<String> inputStream = Stream.of("January", "February", "March");

List<String> list = inputStream.collect(Collectors.toList());

list.forEach(str -> System.out.print(str+" "));

The code above invokes the ‘collect’ method on the input stream. As mentioned earlier, the ‘Stream.collect’ method accepts a ‘Collector’ instance as input. Here, the ‘Collectors.toList’ method is used. The ‘Collectors’ class has several static methods that return different types of ‘Collector’. Don’t worry, we’ll be going over them in the next section.

From there, the ‘Collectors.toList’ method returns a ‘Collector’ that can convert a ‘Stream’ into a List. The ‘Stream.collect’ method uses this ‘Collector’ to obtain a List from the input ‘Stream’. When this code is executed, it’ll print the following output to the console:

January February March

Collectors Class

As mentioned earlier, the ‘Collectors’ class has static factory methods which can be used to create various types of collector instances. Each method in this class returns a ‘Collector’ that performs a particular type of reduction operation on a ‘Stream’. The ‘Collector’ can then be applied to a ‘Stream’ via the ‘Stream.collect’ method. As an example, we’ve already seen the ‘Collectors.toList’ method, which creates a ‘Collector’ for converting a ‘Stream’ to a List. Some of the other methods in the ‘Collectors’ class are explained below:

toSet/toCollection

The ‘toSet’ and ‘toCollection’ methods can be used to get a ‘Collector’ that converts an input ‘Stream’ to a ‘Set’ or to a ‘Collection’ of your choice (‘toCollection’ accepts a ‘Supplier’ that creates the target collection). They’re fairly similar to the ‘toList’ method demonstrated above, with the following code showcasing the ‘toSet’ method explicitly:

Stream<String> animals = Stream.of("cat","dog","lion");

Set<String> animalsSet = animals.collect(Collectors.toSet());

animalsSet.forEach(str -> System.out.print(str+" "));

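The example above prints the three elements. Because ‘toSet’ makes no guarantee about the iteration order of the returned ‘Set’, the order may differ from the one shown here: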

cat dog lion

Note that ‘toList’, ‘toSet’ and ‘toCollection’ are among the most commonly used methods in the ‘Collectors’ class.
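Since ‘toCollection’ isn’t shown above, here is a minimal sketch of it, assuming you want the elements in a sorted ‘TreeSet’. The class and method names are purely illustrative.

```java
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ToCollectionDemo {

    // Collects the stream into a TreeSet, which keeps its elements sorted
    // and drops duplicates.
    public static TreeSet<String> collectSorted(Stream<String> input) {
        return input.collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        TreeSet<String> animals = collectSorted(Stream.of("lion", "cat", "dog", "cat"));
        System.out.println(animals); // prints [cat, dog, lion]
    }
}
```

The constructor reference ‘TreeSet::new’ acts as the ‘Supplier’, so any other ‘Collection’ implementation could be swapped in the same way.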

Counting

In a nutshell, the ‘Collectors.counting’ method returns a ‘Collector’ that counts the number of elements in the 'Stream' on which it’s applied. The following code demonstrates how this works:

List<Integer> list = Arrays.asList(5, 3, 11, 15, 9, 2, 5, 11);

long count = list.stream().collect(Collectors.counting());

System.out.println("Count:"+count);

This code prints the following output:

Count:8

Joining

The ‘Collectors.joining’ method returns a ‘Collector’ that links together the elements within the input ‘Stream’ and turns them into a 'String'. The following lines of code show how this works:

Stream<String> fruits = Stream.of("Hello","World");

String str = fruits.collect(Collectors.joining());

System.out.println("str:"+str);

This code prints the following output:

str:HelloWorld

There’s also an overloaded version of the ‘joining’ method that concatenates the input elements by using a specified delimiter. With this in mind, you can rewrite the above code as:

String str = fruits.collect(Collectors.joining(" "));

This will print the following output:

str:Hello World

It’s also key to note that the ‘Collectors.joining’ method can only be applied to a ‘Stream’ whose elements are ‘CharSequence’ instances (such as ‘String’), so if you try to use it on a ‘Stream’ of any other data type, it'll result in a compilation error.
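There’s a third overload of ‘joining’ that also takes a prefix and a suffix, and elements of another type can be mapped to ‘String’ before joining. The sketch below shows both; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JoiningDemo {

    // Joins the words with a delimiter, a prefix and a suffix.
    public static String bracketed(List<String> words) {
        return words.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    // Integers cannot be joined directly, so they are mapped to Strings first.
    public static String joinNumbers(List<Integer> numbers) {
        return numbers.stream()
                .map(String::valueOf)
                .collect(Collectors.joining("-"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed(Arrays.asList("Hello", "World"))); // prints [Hello, World]
        System.out.println(joinNumbers(Arrays.asList(1, 2, 3)));        // prints 1-2-3
    }
}
```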

MaxBy

The ‘Collectors.maxBy’ method returns a ‘Collector’ that obtains the largest element in the input stream as per the specified ‘Comparator’, with the value being returned as an ‘Optional<T>’. The following code demonstrates this:

List<Integer> numbers = Arrays.asList(12,7,32,99,67,45);

Optional<Integer> max = numbers.stream().collect(Collectors.maxBy((a, b) -> Integer.compare(a, b)));

System.out.println(max);

Remember that the ‘Collectors.maxBy’ method accepts a ‘Comparator’ instance. Here, a lambda expression is specified, so when this code is executed, it prints the following output:

Optional[99]

Just like ‘maxBy’, there’s also a ‘Collectors.minBy’ method that produces the smallest element in the input ‘Stream’ as per the specified ‘Comparator’.
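As a quick sketch of ‘minBy’, the example below uses ‘Comparator.naturalOrder()’ instead of a hand-written lambda and also shows what comes back for an empty list. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class MinByDemo {

    // Returns the smallest number, or an empty Optional for an empty list.
    public static Optional<Integer> smallest(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.minBy(Comparator.naturalOrder()));
    }

    public static void main(String[] args) {
        System.out.println(smallest(Arrays.asList(12, 7, 32, 99, 67, 45))); // prints Optional[7]
        System.out.println(smallest(Collections.<Integer>emptyList()));     // prints Optional.empty
    }
}
```

The ‘Optional’ return type is what allows the empty case to be represented without resorting to ‘null’.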

SummingInt

The ‘Collectors.summingInt’ method returns a ‘Collector’ that produces the sum of the results of applying a function to each element in the input stream. The ‘Collector’ applies a function that returns an ‘int’ to each element in the stream and then returns the result of adding all those values together.

The following code demonstrates this:

public class Student {

private int id;

private String name;

private int marks;

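public Student(int id, String name, int marks) { this.id = id; this.name = name; this.marks = marks; }

public int getMarks() { return marks; }

//remaining getters and setters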

}

import java.util.ArrayList;

import java.util.List;

import java.util.stream.Collectors;

public class SummingIntDemo {

public static void main(String[] args) {

List<Student> students = new ArrayList<Student>();

students.add(new Student(1, "Jane", 86));

students.add(new Student(2, "Tom", 72));

students.add(new Student(3, "Bill", 93));

int sum = students.stream().collect(Collectors.summingInt(student -> student.getMarks()));

System.out.println(sum);

}

}

Above, a ‘Student’ class with fields corresponding to id, name and marks is defined and an ‘ArrayList’ of ‘Student’ objects is created. The ‘Collectors.summingInt’ method is then invoked. It accepts a ‘ToIntFunction’, an in-built functional interface that takes an input of any data type and returns an ‘int’. Here, a lambda expression that accepts a ‘Student’ object and returns its marks is used, so the ‘Collector’ applies this function to each ‘Student’ object to obtain the marks, adds them up and returns the result.

When this code is executed, it prints the following output:

251

Just like ‘summingInt’, there are also ‘summingLong’ and ‘summingDouble’ methods, which work in the same way but produce a long or double result respectively.
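A short sketch of the two variants follows. To keep it self-contained it sums word lengths and prices instead of the ‘Student’ marks; the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SummingVariantsDemo {

    // summingLong takes a ToLongFunction and produces a Long.
    public static long totalLength(List<String> words) {
        return words.stream().collect(Collectors.summingLong(word -> word.length()));
    }

    // summingDouble takes a ToDoubleFunction and produces a Double.
    public static double total(List<Double> prices) {
        return prices.stream().collect(Collectors.summingDouble(price -> price));
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("Jane", "Tom", "Bill"))); // prints 11
        System.out.println(total(Arrays.asList(1.5, 2.25, 3.0)));              // prints 6.75
    }
}
```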

PartitioningBy

The ‘partitioningBy’ method returns a ‘Collector’ that splits the input stream into two parts based on a condition specified via a 'Predicate'. The following code demonstrates this:

Stream<Integer> input = Stream.of(12, 9, 43, 14, 34);

Map<Boolean, List<Integer>> partitionedData = input.collect(Collectors.partitioningBy(num -> num % 2 == 0));

System.out.println("Partitioned Map:" + partitionedData);

In the code above, a ‘Stream’ of integers is defined and the ‘Collectors.partitioningBy’ method is used. This method accepts a ‘Predicate’ instance and as we mentioned previously, ‘Predicate’ is an in-built functional interface that accepts an input of any data type and returns a boolean value.

Here, a lambda expression that accepts a number, checks if it is even and returns a boolean value accordingly is used. So the ‘Collector’ applies this condition to every element in the input stream and returns a ‘Map’. Note that the map has 2 entries, one corresponding to the elements that do match the condition and the other corresponding to the elements that do not match the condition. When this code is executed, it prints the following output:

Partitioned Map:{false=[9, 43], true=[12, 14, 34]}
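‘partitioningBy’ also has an overload that takes a second, downstream ‘Collector’ which is applied to each of the two partitions. As a sketch, the example below combines it with ‘Collectors.counting’ to count the even and odd numbers instead of listing them; the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitioningCountDemo {

    // Key true holds the count of even numbers, key false the count of odd ones.
    public static Map<Boolean, Long> countEvenAndOdd(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.partitioningBy(num -> num % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Boolean, Long> counts = countEvenAndOdd(Arrays.asList(12, 9, 43, 14, 34));
        System.out.println("Even: " + counts.get(true));  // prints Even: 3
        System.out.println("Odd: " + counts.get(false));  // prints Odd: 2
    }
}
```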

A closer look at the Collector Interface

So far, we’ve seen how to obtain the ‘Collector’ instance via static methods on the ‘Collectors’ class. Now, we’ll be taking a closer look at the ‘Collector’ interface as a whole.

The definition of the Collector interface

The Collector interface is defined as: ‘Collector<T,A,R>’

Here ‘T’ represents the type of element in the 'Stream'

‘A’ represents the type of result container to hold collected elements

‘R’ represents the type of the final output

The ‘Collector’ interface declares four main methods, plus a ‘characteristics’ method that describes properties of the collector. The four work together to convert the input stream into the result, and we’ve outlined them below:

1) supplier() - returns a ‘Supplier’ that creates the container which holds the collected elements.

2) accumulator() - returns a ‘BiConsumer’ that adds an element from the Stream to the container.

3) combiner() - returns a ‘BinaryOperator’ that merges two partial results obtained by separate accumulation operations. It generally only comes into play with parallel streams.

4) finisher() - returns a ‘Function’ that converts the collected elements into a different form, which is the final result type.
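To make the four components concrete, here is a minimal sketch that passes one function per component to the static ‘Collector.of’ factory method. The resulting collector gathers strings into a list and then wraps it so it can’t be modified; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class CollectorOfDemo {

    // T = String, A = List<String> (the container), R = List<String> (unmodifiable view)
    public static Collector<String, List<String>, List<String>> unmodifiableListCollector() {
        return Collector.of(
                ArrayList::new,                                         // supplier
                List::add,                                              // accumulator
                (left, right) -> { left.addAll(right); return left; },  // combiner
                Collections::unmodifiableList);                         // finisher
    }

    public static void main(String[] args) {
        List<String> months = Stream.of("January", "February", "March")
                .collect(unmodifiableListCollector());
        System.out.println(months); // prints [January, February, March]
    }
}
```

Here the container type ‘A’ and the result type ‘R’ happen to be the same interface, but the finisher still changes behaviour: calling ‘add’ on the returned list throws an ‘UnsupportedOperationException’.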

How it works

The ‘Stream.collect’ method calls the methods on the ‘Collector’ interface to obtain its individual components and performs the reduction operation using them. First, the ‘supplier’ creates the result container (for example a ‘Collection’ or a ‘Map’) that will hold the collected elements. Then, the ‘accumulator’ adds each element of the input stream to that container. When the stream is processed in parallel, several containers are filled separately and the ‘combiner’ merges these partial results. Finally, the ‘finisher’ converts the container into the final result.
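The sequence can be sketched by calling a collector’s components by hand. This is only a rough illustration of the steps, not how the JDK actually implements ‘Stream.collect’, and the helper names are hypothetical. The second helper imitates a parallel run by filling two containers and merging them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ManualCollectDemo {

    // Roughly the steps of a sequential collect(): supplier, accumulator, finisher.
    public static <T, A, R> R collectSequentially(Collector<? super T, A, R> collector,
                                                  List<T> elements) {
        A container = collector.supplier().get();
        for (T element : elements) {
            collector.accumulator().accept(container, element);
        }
        return collector.finisher().apply(container);
    }

    // Imitates a parallel run: two halves are accumulated separately, then combined.
    public static <T, A, R> R collectInTwoParts(Collector<? super T, A, R> collector,
                                                List<T> elements) {
        int middle = elements.size() / 2;
        A left = collector.supplier().get();
        A right = collector.supplier().get();
        for (T element : elements.subList(0, middle)) {
            collector.accumulator().accept(left, element);
        }
        for (T element : elements.subList(middle, elements.size())) {
            collector.accumulator().accept(right, element);
        }
        A merged = collector.combiner().apply(left, right);
        return collector.finisher().apply(merged);
    }

    public static void main(String[] args) {
        List<String> months = Arrays.asList("January", "February", "March");
        // prints January, February, March
        System.out.println(collectSequentially(Collectors.joining(", "), months));
        // prints [January, February, March]
        System.out.println(collectInTwoParts(Collectors.<String>toList(), months));
    }
}
```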

If you want to create a custom collector, you need to implement the ‘Collector’ interface and provide implementations for all of its methods (the four above plus ‘characteristics’), or pass the component functions to the static ‘Collector.of’ factory method. To save developers the trouble of going through all this, Java 8 added the ‘Collectors’ class, which provides ‘Collector’ instances for common reduction operations.
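For illustration, a hand-written collector might look like the sketch below. ‘UpperCaseJoiner’ is a made-up example that concatenates strings in upper case; it isn’t part of the JDK.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical custom collector: T = String, A = StringBuilder, R = String.
public class UpperCaseJoiner implements Collector<String, StringBuilder, String> {

    @Override
    public Supplier<StringBuilder> supplier() {
        return StringBuilder::new; // creates the empty container
    }

    @Override
    public BiConsumer<StringBuilder, String> accumulator() {
        return (builder, element) -> builder.append(element.toUpperCase());
    }

    @Override
    public BinaryOperator<StringBuilder> combiner() {
        return (left, right) -> left.append(right); // merges two partial results
    }

    @Override
    public Function<StringBuilder, String> finisher() {
        return StringBuilder::toString; // converts the container to the final result
    }

    @Override
    public Set<Characteristics> characteristics() {
        return Collections.emptySet(); // no special properties are declared
    }

    public static void main(String[] args) {
        String result = Stream.of("cat", "dog").collect(new UpperCaseJoiner());
        System.out.println(result); // prints CATDOG
    }
}
```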

Overall, ‘Collectors’ are a really crucial feature added by Java 8 and are super useful for either converting a ‘Stream’ into a ‘Collection’ or for performing other reduction operations upon the ‘Stream’. Fancy learning more about Java 8? Make sure to check out our other articles and take a peek at our website.