Sort and group a java collection

advertisements

I have an object which has a name and a score. I would like to sort a collection of such objects so that they are grouped by name and sorted by maximum score in each group (and within the group by descending score as well).

let me demonstrate what I intend to achieve. assume I have these objects(name, score):

(a, 3)
(a, 9)
(b, 7)
(b, 10)
(c, 8)
(c, 3)

then I would like them to be sorted like this:

(b, 10)
(b, 7)
(a, 9)
(a, 3)
(c, 8)
(c, 3)

is this feasible with a Comparator? I can't figure it out, so any hints would be appreciated.


No, you can't do it with a single sort with a single Comparator.

You have to group by name, and sort the groups, by highest score in group.

Then you need to flatten the groups back to a list.

With Java 8

Edit: Since i wrote this answer, Java 8 has come out, which simplifies the problem a lot:

import java.util.*;
import static java.util.Comparator.*;
import static java.util.stream.Collectors.*;


List<Record> result = records.stream()
    .sorted(comparingInt(Record::getScore).reversed())
    .collect(groupingBy(Record::getName, LinkedHashMap::new, toList()))
    .values().stream()
    .flatMap(Collection::stream)
    .collect(toList());

First we sort by score reversed, and then we group using a LinkedHashMap, which will preserve the insertion order for the keys, so keys with higher score will come first.

Sorting first is OK if the groups are small, so the redundant compares between objects in different groups don't hurt so much.

Also, with this method, duplicates are preserved.


Alternatively, if you don't care about preserving duplicates, you can:

Comparator<Record> highestScoreFirst = comparingInt(Record::getScore).reversed();

List<Record> result = records.stream()
        .collect(groupingBy(Record::getName,
                toCollection(() -> new TreeSet<>(highestScoreFirst))))
        .values().stream()
        .sorted(comparing(SortedSet::first, highestScoreFirst))
        .flatMap(Collection::stream)
        .collect(toList());

Where the records are grouped into sorted TreeSets, instead of sorting the values as the first operation of the stream, and then the sets are sorted by their first, highest value.

Grouping before sorting is appropriate if the groups are big, to cut down on redundant compares.


Implementing Comparable:

And you can make it shorter by having your record implement Comparable

public class Record implements Comparable<Record> {
    @Override
    public int compareTo(Record other) {
        // Highest first
        return -Integer.compare(getScore(), other.getScore());

        /* Or equivalently:
           return Integer.compare(other.getScore(), getScore());
        */
    }
    ...
}


List<Record> result = records.stream()
    .collect(groupingBy(Record::getName, toCollection(TreeSet::new)))
    .values().stream()
    .sorted(comparing(SortedSet::first))
    .flatMap(Collection::stream)
    .collect(toList());


Before Java 8

Edit: Here is a really rough unit test that demonstrates one way to do it. I haven't cleaned it up as much as i would have liked.

Stuff like this is painful in Java, and i would normally use Google Guava for this.

import org.junit.Test;

import java.util.*;

import static java.util.Arrays.asList;
import static org.junit.Assert.assertEquals;

public class GroupSortTest {

    @Test
    public void testGroupSort() {
        List<Record> records = asList(
                new Record("a", 3),
                new Record("a", 9),
                new Record("b", 7),
                new Record("b", 10),
                new Record("c", 8),
                new Record("c", 3));

        List<SortedMap<Integer, Record>> recordsGroupedByName = groupRecordsByNameAndSortedByScoreDescending(records);
        Collections.sort(recordsGroupedByName, byHighestScoreInGroupDescending());
        List<Record> result = flattenGroups(recordsGroupedByName);

        List<Record> expected = asList(
                new Record("b", 10),
                new Record("b", 7),
                new Record("a", 9),
                new Record("a", 3),
                new Record("c", 8),
                new Record("c", 3));

        assertEquals(expected, result);
    }

    private List<Record> flattenGroups(List<SortedMap<Integer, Record>> recordGroups) {
        List<Record> result = new ArrayList<Record>();
        for (SortedMap<Integer, Record> group : recordGroups) {
            result.addAll(group.values());
        }
        return result;
    }

    private List<SortedMap<Integer, Record>> groupRecordsByNameAndSortedByScoreDescending(List<Record> records) {
        Map<String, SortedMap<Integer, Record>> groupsByName = new HashMap<String, SortedMap<Integer, Record>>();
        for (Record record : records) {
            SortedMap<Integer, Record> group = groupsByName.get(record.getName());
            if (null == group) {
                group = new TreeMap<Integer, Record>(descending());
                groupsByName.put(record.getName(), group);
            }
            group.put(record.getScore(), record);
        }
        return new ArrayList<SortedMap<Integer, Record>>(groupsByName.values());
    }

    private DescendingSortComparator descending() {
        return new DescendingSortComparator();
    }

    private ByFirstKeyDescending byHighestScoreInGroupDescending() {
        return new ByFirstKeyDescending();
    }

    private static class ByFirstKeyDescending implements Comparator<SortedMap<Integer, Record>> {
        public int compare(SortedMap<Integer, Record> o1, SortedMap<Integer, Record> o2) {
            return o2.firstKey().compareTo(o1.firstKey());
        }
    }

    private static class DescendingSortComparator implements Comparator<Comparable> {
        public int compare(Comparable o1, Comparable o2) {
            return o2.compareTo(o1);
        }
    }
}