Delete a sub-array from an array of ints in the fastest way?

In Java, we have an array int[] a = new int[10000000]; fully filled with arbitrary numbers. Quite often the code needs to remove an arbitrary subsequence, that is, a set of elements that may be non-contiguous.

The reason to use int[] over LinkedList is the speed gain when iterating over the elements. Currently no elements are removed, so a lot of rubbish stays in the array for as long as the application runs. Removing elements might give a speed-up, which makes this quite an interesting question.

How can a subsequence be removed from an array in the fastest possible way?


It depends on whether you want to shorten the array or if you can allow unused elements at the end of the array. The tool for this is System.arraycopy. To shorten the array, you will need to allocate a new one:

// Removes original[removeStart..removeEnd) and returns a new, shorter array.
public int[] remove(int[] original, int removeStart, int removeEnd) {
    int originalLen = original.length;
    int[] a = new int[originalLen - (removeEnd - removeStart)];
    System.arraycopy(original, 0, // from original[0]
        a, 0,                     // to a[0]
        removeStart);             // this many elements
    System.arraycopy(original, removeEnd, // from original[removeEnd]
        a, removeStart,                   // to a[removeStart]
        originalLen - removeEnd);         // this many elements
    return a;
}

To just compact an array:

System.arraycopy(array, removeEnd, // from array[removeEnd]
    array, removeStart,            // to array[removeStart]
    array.length - removeEnd);     // this number of elements

You don't have to worry about overlapping ranges; arraycopy correctly deals with those.
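A minimal sketch of the in-place variant: since the array keeps its physical length, you have to track the logical size yourself. The class name CompactDemo, the method removeRange and the size parameter are made up for illustration, and removeEnd is assumed to be exclusive, as in the snippets above.

```java
import java.util.Arrays;

public class CompactDemo {
    /**
     * Removes array[removeStart..removeEnd) from the first `size` live
     * elements by shifting the tail left. Returns the new logical size.
     * Slots past the returned size still hold stale values.
     */
    public static int removeRange(int[] array, int size, int removeStart, int removeEnd) {
        System.arraycopy(array, removeEnd,  // from array[removeEnd]
            array, removeStart,             // to array[removeStart]
            size - removeEnd);              // only the live tail
        return size - (removeEnd - removeStart);
    }

    public static void main(String[] args) {
        int[] data = {10, 11, 12, 13, 14, 15, 16, 17};
        int size = removeRange(data, data.length, 2, 5); // drop 12, 13, 14
        System.out.println(size);                                       // prints 5
        System.out.println(Arrays.toString(Arrays.copyOf(data, size))); // prints [10, 11, 15, 16, 17]
    }
}
```

Passing the logical size instead of array.length means repeated removals only move the elements that are still live.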

If you have a discontinuous range of elements to remove, you can either generalize one of these solutions (less moving things around, but more complex code) or you can remove each continuous block separately (easier to program but you will be moving around data that you will be discarding).

If you have scattered indices to remove, I would do it by hand. The design depends on whether it is scattered individual indices or whether it is a collection of ranges. With the latter (this is untested, but it should give you the idea):

// The remove method below needs: import java.util.List;
/**
 * Simple class to hold the start (inclusive) and end (exclusive) of a range.
 */
public static class Range implements Comparable<Range> {
    int start;
    int end;
    public int compareTo(Range other) {
        if (start < other.start) return -1;
        if (start > other.start) return 1;
        if (end < other.end) return -1;
        if (end > other.end) return 1;
        return 0;
    }
}
/**
 * Remove a list of ranges from an array.
 * @param original the array from which to remove the values.
 * @param toRemove the list of ranges to remove. This must be sorted in
 *    ascending order of range start, and the ranges must not overlap.
 * @param compact flag indicating whether to simply compact the original
 *    array or to copy the values into a new array. If false, will allocate
 *    a new array of the exact size needed to contain the elements not removed.
 * @return the original array (compacted, with stale values left at the end)
 *    if compact is true; otherwise the newly allocated array.
 */
public int[] remove(int[] original, List<Range> toRemove, boolean compact) {
    int[] a;
    if (compact) {
        a = original;
    } else {
        int len = 0;
        for (Range range : toRemove) len += range.end - range.start;
        a = new int[original.length - len];
    }
    int nextSource = 0;
    int nextDest = 0;
    for (Range range : toRemove) {
        if (nextSource < range.start) {
            System.arraycopy(original, nextSource, a, nextDest,
                range.start - nextSource);
            nextDest += range.start - nextSource;
            nextSource = range.start;
        }
        nextSource = range.end;
    }
    if (nextSource < original.length) {
        System.arraycopy(original, nextSource, a, nextDest,
            original.length - nextSource);
    }
    return a;
}
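
For the other case, scattered individual indices, a single pass with a read position and a write position is probably simplest. This is only a sketch: the class name ScatteredRemoval and the method removeIndices are made up, and it assumes the indices to remove arrive as an int[] sorted in ascending order, without duplicates and all within bounds.

```java
import java.util.Arrays;

public class ScatteredRemoval {
    /**
     * Compacts `array` in place, dropping the elements at the given indices.
     * Assumes sortedIndices is ascending, duplicate-free and in bounds.
     * Returns the new logical length; slots past it hold stale values.
     */
    public static int removeIndices(int[] array, int[] sortedIndices) {
        int dest = 0; // next write position
        int k = 0;    // next entry of sortedIndices to match
        for (int src = 0; src < array.length; src++) {
            if (k < sortedIndices.length && sortedIndices[k] == src) {
                k++;                        // this element is removed: skip it
            } else {
                array[dest++] = array[src]; // this element is kept: move it down
            }
        }
        return dest;
    }

    public static void main(String[] args) {
        int[] arr = {5, 6, 7, 8, 9, 10};
        int n = removeIndices(arr, new int[]{1, 3, 4});
        System.out.println(Arrays.toString(Arrays.copyOf(arr, n))); // prints [5, 7, 10]
    }
}
```

Each element is read once and written at most once, so the cost is linear in the array length no matter how many indices are removed. When the removed indices cluster into long runs, the Range-based version with System.arraycopy is likely the better fit.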