Extending binary search to find the first and last index of a key value in a sorted array

The problem is to extend the binary search algorithm to find all occurrences of a target value in a sorted array as efficiently as possible. Concretely, the inputs of the algorithm are (1) a sorted array of integers, where some numbers may appear more than once, and (2) a target integer to search for. The output should be a pair of index values indicating the first and last occurrence of the integer in the array, if it occurs at all. The source code could be in C#, C, or C++.

Also, what are the maximum and minimum numbers of comparisons that might be needed to find the indexes?


If you are a little clever, you can define two different binary search functions. One returns the index of the first appearance of the searched-for value and the other returns the index of its last appearance. From your knowledge of binary search, you should be able to determine the maximum and minimum number of comparisons.

Using two binary searches should be the fastest method on average, in my opinion. For instance, if you use just one binary search to find the first item and then search linearly, the worst case is when the entire array holds the same value. For an array of length 10000, this would give 10013 comparisons in the worst case, while using two binary searches would give 28 comparisons in the worst case for the same array. Of course, for the same array size, the best case for the binary/linear search method would be 14 comparisons, while the best case for the two-binary-searches method is 26 comparisons.
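To see where figures like these can come from, here is a small sketch in C. It assumes a binary search that makes exactly one element comparison per halving step and keeps halving until one element is left; the function names are made up for illustration, and this is my reading of the arithmetic, not necessarily how the numbers were originally derived.

```c
/* Hypothetical helper: fewest halving steps to shrink n elements to 1,
   i.e. floor(log2(n)), when the search always lands in the smaller half. */
int min_steps(int n)
{
    int steps = 0;
    while (n > 1) {
        n /= 2;            /* smaller half has floor(n/2) elements */
        steps++;
    }
    return steps;
}

/* Hypothetical helper: most halving steps to shrink n elements to 1,
   i.e. ceil(log2(n)), when the search always lands in the larger half. */
int max_steps(int n)
{
    int steps = 0;
    while (n > 1) {
        n = (n + 1) / 2;   /* larger half has ceil(n/2) elements */
        steps++;
    }
    return steps;
}
```

For n = 10000 these give 13 and 14 steps. Two binary searches then cost between 2 * 13 = 26 and 2 * 14 = 28 comparisons, while one binary search followed by a linear scan over the remaining 9999 equal elements costs 14 + 9999 = 10013 in the worst case.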

** Update

Okay, here is a binary search to find the first appearance of an element in an array. I'll give you a recursive function (you can of course make it iterative and optimize it in other ways). It searches for the int val in the array a of ints, between the indexes left and right inclusive. Also, I haven't been careful about finding the midpoint: if the array is really large, right+left can overflow.

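int bs1(int a[], int val, int left, int right)
{
    /* One candidate left: return it (the caller must still check a[left] == val). */
    if(right == left) return left;
    int mid = (right+left)/2;   /* truncates, so mid < right whenever left < right */

    /* If val is greater than a[mid], the first match lies strictly to the right;
       otherwise it is at mid or to its left. */
    if(val > a[mid]) return bs1(a, val, mid+1, right);
    else return bs1(a, val, left, mid);
}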

However, after you get an index back, you should check that it actually refers to the correct value. If val is not in the array, the returned index will correspond to the next element larger than val, or to the last index searched if every element is smaller than val.
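A minimal sketch of that check, assuming a non-empty search range starting at index 0. The wrapper name find_first and the -1 "not found" convention are my own choices for illustration; bs1 is repeated (with const added) only so the snippet compiles on its own.

```c
/* First-occurrence search from the answer, repeated here for self-containment. */
int bs1(const int a[], int val, int left, int right)
{
    if (right == left) return left;
    int mid = (right + left) / 2;

    if (val > a[mid]) return bs1(a, val, mid + 1, right);
    else return bs1(a, val, left, mid);
}

/* Hypothetical wrapper: index of the first occurrence of val in a[0..n-1],
   or -1 if val does not occur (or the array is empty). */
int find_first(const int a[], int n, int val)
{
    if (n <= 0) return -1;              /* bs1 needs at least one element */
    int i = bs1(a, val, 0, n - 1);
    return (a[i] == val) ? i : -1;      /* verify the candidate index */
}
```

With the array {1, 2, 2, 2, 5, 7}, find_first returns 1 for the value 2 and -1 for the value 3, because bs1 lands on the 5 at index 4 in the second case.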

A few minor changes to this will make a function that finds the last element. The keys to doing this are using the comparators correctly and remembering that integer division always truncates.
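One way those changes might look, as a sketch rather than the definitive version: flip the comparison and round the midpoint up instead of down, so that the range still shrinks on every call. The names bs2 and find_last, and the -1 "not found" convention, are assumptions of mine.

```c
/* Sketch of a last-occurrence search on a[left..right] (inclusive bounds). */
int bs2(const int a[], int val, int left, int right)
{
    if (right == left) return left;

    /* Round the midpoint UP: plain truncation would give mid == left for a
       two-element range, and the "keep mid" branch below would never shrink. */
    int mid = left + (right - left + 1) / 2;

    /* If val is smaller than a[mid], the last match lies strictly to the left;
       otherwise it is at mid or to its right. */
    if (val < a[mid]) return bs2(a, val, left, mid - 1);
    else return bs2(a, val, mid, right);
}

/* Hypothetical wrapper: index of the last occurrence of val in a[0..n-1],
   or -1 if val does not occur (or the array is empty). */
int find_last(const int a[], int n, int val)
{
    if (n <= 0) return -1;
    int i = bs2(a, val, 0, n - 1);
    return (a[i] == val) ? i : -1;      /* same check as for the first index */
}
```

As with the first-occurrence search, the returned index has to be checked: if val is absent, bs2 ends on the last element smaller than val, or on the first index searched if every element is larger.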