Batching and Partitioning of List in Java

Overview

In this tutorial, we will explore how to split a Java List, Set, or Iterable into batches, that is, divide it into partitions of the same size (the last partition may be smaller). We will see examples of achieving this with plain Java as well as with libraries.

Note: Keep in mind that, in the List-based approaches, the partitions are sublist views of the original list, which means that changes to the elements of the original list are reflected in the partitions. A partition is not a copy of the data, so don't confuse it with a deep copy.
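
The view behavior described in the note can be seen with plain List.subList, which the manual implementations below rely on. The following is a minimal sketch; the class and method names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of any library.
class SubListViewDemo {

    // Returns the first batch as seen AFTER the backing list was changed.
    static List<Integer> firstBatchAfterUpdate() {
        List<Integer> original = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6));
        List<Integer> firstBatch = original.subList(0, 3); // a view, not a copy
        original.set(0, 99); // non-structural change to the backing list
        return firstBatch;
    }

    // Returns an independent copy that no longer tracks the original list.
    static List<Integer> copiedBatchAfterUpdate() {
        List<Integer> original = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6));
        List<Integer> copy = new ArrayList<>(original.subList(0, 3));
        original.set(0, 99);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(firstBatchAfterUpdate());  // [99, 2, 3]
        System.out.println(copiedBatchAfterUpdate()); // [1, 2, 3]
    }
}
```

If a batch has to stay stable while the source list keeps changing, wrapping it in a new ArrayList as in the second method is one option. Also note that the List.subList documentation leaves the behavior of a view undefined once the backing list is structurally modified (elements added or removed) other than through the view itself.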

Using Custom batch list Implementation

Let's write a custom implementation by extending AbstractList.

The partition logic in most libraries is very similar to this one.

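import java.util.AbstractList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Declared as a top-level class here; add the static modifier back if you nest it inside another class.
public class PartitionList<T> extends AbstractList<List<T>> {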
    private final List<T> list;
    private final int size;

    public static <T> List<List<T>> of(List<T> list, int size) {
        return new PartitionList<>(list, size);
    }

    public static <T> List<List<T>> of(Set<T> set, int size) {
        return new PartitionList<>(set, size);
    }

    private PartitionList(List<T> list, int size) {
        validate(list, size);
        this.list = list;
        this.size = size;
    }

    private PartitionList(Set<T> set, int size) {
        validate(set, size);
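        // A Set has no positional sub-lists, so copy it into an unmodifiable list (Stream.toList() needs Java 16+).
        this.list = set.stream().toList();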
        this.size = size;
    }

    private static <T> void validate(Collection<T> list, int size) {
        if (list == null) {
            throw new NullPointerException("List must not be null");
        } else if (size <= 0) {
            throw new IllegalArgumentException("Size must be greater than 0");
        }
    }

    @Override
    public List<T> get(int index) {
        int listSize = size();
        if (index < 0) {
            throw new IndexOutOfBoundsException("Index " + index + " must not be negative");
        } else if (index >= listSize) {
            throw new IndexOutOfBoundsException("Index " + index + " must be less than size " + listSize);
        } else {
            int start = index * size;
            int end = Math.min(start + size, list.size());
            return list.subList(start, end);
        }
    }

    @Override
    public int size() {
        return (int) Math.ceil((double) list.size() / (double) size);
    }

    @Override
    public boolean isEmpty() {
        return list.isEmpty();
    }
}

Let's try it with an example.

public static void main(String[] args) {
    List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    List<List<Integer>> lists = PartitionList.of(data, 4);
    for (List<Integer> list : lists) {
        System.out.println(list);
    }
}

Output:

[1, 2, 3, 4]
[5, 6, 7, 8]
[9, 10]

Using Stream Classes

Most of the time we only want to stream or iterate through the partitions once and nothing more. In that case, we can create a Stream of batches directly instead of creating a List of batches.

public static <T> Stream<List<T>> getBatchStream(List<T> source, int batchSize) {
    if (batchSize <= 0) {
        throw new IllegalArgumentException("Unexpected batch size " + batchSize);
    }
    int listSize = source.size();
    if (listSize == 0) {
        return Stream.empty();
    }
    int numberOfBatch = (listSize - 1) / batchSize;
    return IntStream.range(0, numberOfBatch + 1)
            .mapToObj(n -> source.subList(n * batchSize, n == numberOfBatch ? listSize : (n + 1) * batchSize));
}

If you look closely, we first validate the input and then compute numberOfBatch, which is the index of the last batch (the number of batches minus one). We then iterate over an IntStream from 0 to numberOfBatch inclusive and create a sublist for each batch index. Both manual implementations use the same core logic: the subList method of List.

public static void main(String[] args) {
    List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    getBatchStream(data, 3).forEach(System.out::println);
}

Output:

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
[10]
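
The two manual implementations compute the number of batches differently: PartitionList.size() takes the ceiling of a floating-point division, while getBatchStream uses integer arithmetic only. The sketch below puts both formulas side by side to show that they agree; the class and method names are hypothetical.

```java
// Hypothetical demo class comparing the two batch-count formulas used above.
class BatchCountDemo {

    // Formula from PartitionList.size(): ceiling of a floating-point division.
    static int countWithCeil(int listSize, int batchSize) {
        return (int) Math.ceil((double) listSize / (double) batchSize);
    }

    // Formula from getBatchStream: (listSize - 1) / batchSize is the index of
    // the last batch, so adding 1 gives the count. An empty list has 0 batches.
    static int countWithIntegerMath(int listSize, int batchSize) {
        return listSize == 0 ? 0 : (listSize - 1) / batchSize + 1;
    }

    public static void main(String[] args) {
        System.out.println(countWithCeil(10, 3));        // 4
        System.out.println(countWithIntegerMath(10, 3)); // 4
        System.out.println(countWithCeil(10, 4));        // 3
        System.out.println(countWithIntegerMath(10, 4)); // 3
    }
}
```

For 10 elements and a batch size of 3, both formulas give 4, which matches the four lines of output above.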

Using Google Guava

Guava provides two different ways to partition a collection.

First, add the Guava dependency (Gradle):

implementation 'com.google.guava:guava:33.1.0-jre'

Create List of Lists, List<List<T>>

This way we get a List of batches, which we can traverse forward and backward or access by index.

Example:

public static void main(String[] args) {
    List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    List<List<Integer>> listBatch = Lists.partition(data, 3);
}

This creates batches of size 3 in listBatch; the last batch holds only the one remaining element.

Create Iterable of List, Iterable<List<T>>

This way we get an Iterable of batches, which we can only traverse forward.

A nice property of this approach is that the first parameter of the partition method is an Iterable<T>, so you can pass any kind of Iterable and divide it into partitions or batches.

public static void main(String[] args) {
    List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    Iterable<List<Integer>> iterableBatch = Iterables.partition(data, 3);
}
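
To make the forward-only idea concrete without depending on Guava, here is a minimal sketch that batches any Iterable using only the JDK. It is not Guava's implementation; the class name IterableBatches and its partition method are hypothetical, and unlike the subList approach each batch here is a fresh copy rather than a view.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical helper class, shown only to illustrate forward-only batching.
class IterableBatches {

    // Lazily groups the elements of any Iterable into batches of at most batchSize.
    static <T> Iterable<List<T>> partition(Iterable<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("Unexpected batch size " + batchSize);
        }
        // Iterable has a single abstract method, so a lambda can supply iterator().
        return () -> new Iterator<List<T>>() {
            private final Iterator<T> it = source.iterator();

            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public List<T> next() {
                if (!it.hasNext()) {
                    throw new NoSuchElementException();
                }
                // Pull up to batchSize elements from the underlying iterator.
                List<T> batch = new ArrayList<>(batchSize);
                while (it.hasNext() && batch.size() < batchSize) {
                    batch.add(it.next());
                }
                return batch;
            }
        };
    }

    public static void main(String[] args) {
        // A Set works as well as a List because only iterator() is required.
        Set<Integer> data = new LinkedHashSet<>(List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
        for (List<Integer> batch : partition(data, 3)) {
            System.out.println(batch);
        }
    }
}
```

Because the batches are built on demand from an Iterator, nothing is computed until the loop asks for the next batch, and there is no way to jump back to an earlier one without starting a new iteration.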

Using Apache Commons Collections

Apache Commons Collections also provides a partition function.

Here is the dependency:

implementation 'org.apache.commons:commons-collections4:4.5.0-M1'

The ListUtils utility class provides many utility functions for List, one of which divides a list into partitions or batches.

public static void main(String[] args) {
    List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    List<List<Integer>> batch = ListUtils.partition(data, 3);
}