|
Binary Search |
Binary search is a more specialized algorithm than sequential search as it takes advantage of data that has been sorted. The underlying idea of binary search is to divide the sorted data into two halves and to examine the data at the point of the split. Since the data is sorted, we can easily ignore one half or the other depending on where the data we're looking for lies in comparison to the data at the split. This makes for a much more efficient search than linear search.
When learning linear search, you were asked to perform an exercise with a phone book. Go get the phonebook again. Let's say we're looking for the name 'John Smith'. Open up the phone book approximately half way and look at the name at the top of the page. What does it say? Probably a name that begins with an 'M' or some letter in that vicinity. Now think to yourself, does Smith come before or after this in the phonebook? After, right? So you can ignore the entire first half of the phone book. Now open the remaining half about half way. You're probably somewhere near the 'T's. Does Smith come before or after 'T' in the phonebook? Before. So you can ignore the latter half. Continue doing this until you find the name for which you're looking.
What you've just done is a binary search. Binary search involves binary decisions, decisions with two choices. At each step in the process you can eliminate half of the data you're searching. This is the way humans look up most information in big volumes, such as a phonebook or a dictionary. We guess a place in the middle of the book, then move forward or backward depending on the location you're at relative to the location of what you're looking for. This works because all of the data is sorted, in alphabetical order in the case of a phonebook or dictionary.
Binary search is much faster than linear search for most data sets. If you look at each item in order, you may have to look at every item in the data set before you find the one you are looking for. With binary search, you eliminate half of the data with each decision. If there are n items, then after the first decision you eliminate n/2 of them. After the second decision you've eliminated 3n/4 of them. After the third decision you've eliminated 7n/8 of them. Etc. In other words, binary search is O(log n) . You can see that for a large data set, binary search would be much better than linear search.
In general, binary search operates on one of two data structures: arrays and trees.
The first thing to do when coding up any algorithm is to define the algorithm clearly and in such a way that it is easy to turn into code.
The array that we are searching must be sorted for binary search to work. For this example, we'll assume that the input array is sorted in ascending order. The basic idea is that you divide the array being searched into two subarrays, and compare the middle element to the value for which you're searching. There are now three possible cases: 1. The value is equal to the middle element. In this case, the element has been found and you are done. 2. The value is greater than the middle element. In this case, if the value is in the array, it will be in the upper half of the array (ie. one of the elements after the middle element). 3. The value is less than the middle element. In this case, if the value is in the array, it will be one of the elements in the lower half of the array, before the middle element.
For cases 2 or 3, we take the proper subarray (either the array of elements before the middle element or the one after it) and repeat the same process: We compare the middle element in the subarray to the value. If the value is equal to the middle element, we are done. Otherwise, we perform a search on one of these new subarrays.
Now in more detailed terms: 1. Compute the subscript of the middle element of the set being searched. 2. If the array bounds are "improper" then return "value not found." 3. Else if the target is the middle element, return the subscript of the middle element. 4. Else if the target is less than the middle value then go back to step 1 and search the subarray from "first" to "middle - 1." 5. Else go back to step 1 and search the subarray from "middle + 1" to "last."
We should have no problem now turning this into code:
|
int
binary_search(int arr[], int find, int first, int last) |
| #include <stdio.h> #define SIZE 15 int binarySearch( const int [], int, int, int ); void printHeader( void ); void printRow( const int [], int, int, int ); int main() { int a[ SIZE ], i, key, result; for ( i = 0; i <= SIZE - 1; i++ ) a[ i ] = 2 * i; printf( "Enter a number between 0 and 28: " ); scanf( "%d", &key ); printHeader(); result = binarySearch( a, key, 0, SIZE - 1 ); if ( result != -1 ) printf( "\n%d found in array element %d\n", key, result ); else printf( "\n%d not found\n", key ); return 0; } int binarySearch( const int b[], int searchKey, int low, int high ) { int middle; while ( low <= high ) { middle = ( low + high ) / 2; printRow( b, low, middle, high ); if ( searchKey == b[ middle ] ) return middle; else if ( searchKey < b[ middle ] ) high = middle - 1; else low = middle + 1; } return -1; /* searchKey not found */ } /* Print a header for the output */ void printHeader( void ) { int i; printf( "\nSubscripts:\n" ); for ( i = 0; i <= SIZE - 1; i++ ) printf( "%3d ", i ); printf( "\n" ); for ( i = 1; i <= 4 * SIZE; i++ ) printf( "-" ); printf( "\n" ); } /* Print one row of output showing the current part of the array being processed. */ void printRow( const int b[], int low, int mid, int high ) { int i; for ( i = 0; i <= SIZE - 1; i++ ) if ( i < low || i > high ) printf( " " ); else if ( i == mid ) printf( "%3d*", b[ i ] ); /* mark middle value */ else printf( "%3d ", b[ i ] ); printf( "\n" ); } |
|
Example |
Say we're searching for the value 37 in the following array:

Set initial first, middle and last
valuemiddle value to be their average:
We then compare 37 to the value at the middle location. Is 37 = = 45 ? No, it is less than 45. So we update the last pointer to be middle - 1, and readjust the middle pointer accordingly:

Is 37 = = 35 ? No. It is greater than 35. So we update the first and middle pointers accordingly:

Is 37 == 37? Yes! We found it:
