Interpolation Search Algorithm


You stored 2GB of data in your computer and you successfully executed a search in a blink of an eye. Great! You may have used binary search or linear search. What is it trillion bytes of data stored over several servers? Interpolation search is here to help you.

Interpolation Search Algorithm is a search algorithm that has been inspired by the way humans search through a telephone book for a particular name, the key value by which the book's entries are ordered. It is an improvement above binary search. In binary search, we always move to the middle element whereas interpolation search moves to a different element in order to reduce the search space further.

Assumption: The must condition is that the list is sorted.
The element list is assumed to be equally distributed but even if this condition is not satisfied, the algorithm works but with less efficiency.

Algorithm

The mathematics to find the position of the element to be compared is calculated by:

pos = lo + [ (x-list[low])*(high-low) / (list[high]-list[low]) ]

list[] ==> List (initial search space)
x     ==> Element to be searched
low    ==> Starting index in arr[]
high    ==> Ending index in arr[]

The steps involved in this algorithm are:

  • Step 1: In a loop, calculate the value of pos using the above formula.
  • Step 2: If it is a match, return the index of the item, and exit.
  • Step 3: If the item is less than the element at position pos, calculate the target position of the left sub-array. Otherwise calculate the same in the right sub-array.
  • Step 4: Repeat until a match is found or the search space reduces to zero.

Complexity

  • Worst case time complexity: O(N)
  • Average case time complexity: O(log log N)
  • Best case time complexity: O(1)
  • Space complexity: O(1)

On assuming a uniform distribution of the data on the linear scale used for interpolation, the performance can be shown to be O(log log n).
Dynamic Interpolation Search is possible in o(log log n) time using a novel data structure.

Implementations

Implementation of Linear search algorithm in 16 languages that includes C, C++, Java, C#, Racket, Go, Haskell, JavaScript, Kotlin, PHP, Ruby, Rust, Scala, Shell and Swift.

  • C
  • C++
  • Java
  • Go
  • PHP
  • Python

Tab 1


#include <stdio.h>
// Part of Cosmos by OpenGenus Foundation
int interpolationSearch(int* array, int n, int key){
    int high = n-1;
    int low = 0;
    int mid;
    while ( (array[high] != array[low]) && (key >= array[low]) && (key <= array[high])) {
        mid = low + (int)( ((float)(high-low) / (array[high] - array[low])) * (key - array[low]) );
        if(array[mid] == key)
            return mid;
        else if (array[mid] < key)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return -1;
}
void test(){
    // Array of items on which search will be conducted.
    int array[10] = {1, 5, 9, 12, 16, 18, 25, 56, 76, 78};
    int n = sizeof(array)/sizeof(array[0]);
    // Element to be searched
    int key = 25;
    int index = interpolationSearch(array, n, key);
    if(index !=-1)
        printf("Element found at index %d\n", index);
    else
        printf("EElement not found.\n");
}
int main() {
    test(25);
    return 0;
}
  

Tab 2


 /*
 Part of Cosmos by OpenGenus Foundation
 interpolation search synopsis
 warning: in order to follow the convention of STL, the interface is [begin, end) !!!
 // [begin, end)
template::value_type,
         typename _Compare>
_Random_Access_Iter
interpolationSearch(_Random_Access_Iter begin,
                     _Random_Access_Iter end,
                     _Type const &find,
                     _Compare compare);

// [begin, end)
template<typename _Random_Access_Iter,
typename _Type = typename std::iterator_traits<_Random_Access_Iter>::value_type>
_Random_Access_Iter
interpolationSearch(_Random_Access_Iter begin, _Random_Access_Iter end, _Type const &find);
*/

include <functional>

// [begin, end)
template<typename _Random_Access_Iter,
typename _Type = typename std::iterator_traits<_Random_Access_Iter>::value_type,
typename _Compare>
_Random_Access_Iter
interpolationSearch(_Random_Access_Iter begin,
_Random_Access_Iter end,
_Type const &find,
_Compare compare)
{
auto notFound = end;
if (begin != end)
{
--end;
// TODO: replace '<=' with '!=' or else to be more useful,
// (e.g., allow input iterator can be passed)
while (begin <= end)
{
// Now we will enquire the position keeping in mind the uniform distribution in mind
auto pos = begin;
if (compare(*begin, *end))
{
auto len = std::distance(begin, end) * (find - *begin) / (*end - *begin);
std::advance(pos, len);
}
if (pos < begin || pos > end)
break;
else if (compare(*pos, find))
begin = ++pos;
else if (compare(find, *pos))
end = --pos;
else
return pos;
}
}
return notFound;
}
// [begin, end)
template<typename _Random_Access_Iter,
typename _Type = typename std::iterator_traits<_Random_Access_Iter>::value_type>
_Random_Access_Iter
interpolationSearch(_Random_Access_Iter begin, _Random_Access_Iter end, _Type const &find)
{
return interpolationSearch(begin, end, find, std::less<_Type>());
}

Tab 3


public class Interpolation {
	    static int arr[] = new int[]{10, 12, 13, 15, 17, 19, 20, 21, 22, 23,
	                                         24, 33, 35, 42, 49};
	 	public  static int interpolationSearch(int x)
	    {
	        // Find indexes of two corners
	        int lo = 0, hi = (arr.length - 1);
	        // Since array is sorted, an element present
	        // in array must be in range defined by corner
	        while (lo <= hi && x >= arr[lo] && x <= arr[hi])
	        {
	            int pos = lo + (((hi-lo) /
	                  (arr[hi]-arr[lo]))*(x - arr[lo]));
            if (arr[pos] == x)
                return pos;
      
            if (arr[pos] < x)
                lo = pos + 1;
      
            else
                hi = pos - 1;
        }
        return -1;
    }	   
    public static void main(String[] args) 
    {
         int x = 20; 
         int index = interpolationSearch(x);
          
         // If element was found
         if (index != -1)
            System.out.println("Element found at index " + index);
         else
            System.out.println("Element not found.");
    }
}

Tab 4


package main
import "fmt"
func interpolationSearch(array []int, key int) int {
	min, max := array[0], array[len(array)-1]
	low, high := 0, len(array)-1
	for {
		if key < min {
			return low
		}
		if key > max {
			return high + 1
		}
		var guess int
		if high == low {
			guess = high
		} else {
			size := high - low
			offset := int(float64(size-1) * (float64(key-min) / float64(max-min)))
			guess = low + offset
		}
		if array[guess] == key {
			for guess > 0 && array[guess-1] == key {
				guess--
			}
			return guess
		}
		if array[guess] > key {
			high = guess - 1
			max = array[high]
		} else {
			low = guess + 1
			min = array[low]
		}
	}
}
func main() {
	items := []int{1, 2, 9, 20, 31, 45, 63, 70, 100}
	fmt.Println(interpolationSearch(items, 63))
}
  

Tab 5


/*
 * Part of Cosmos by OpenGenus Foundation
 * @param array $array
 * @param int   $search
 * @return int
 */
function interpolationSearch(array $array, $search)
{
    list ($low, $high) = [0, count($array) - 1];
    while (($array[$high] !== $array[$low]) && ($search >= $array[$low]) && ($search <= $array[$high])) {
        $middle = $low + (($search - $array[$low]) * ($high - $low) / ($array[$high] - $array[$low]));
        if ($array[$middle] < $search) {
            $low = $middle + 1;
        } else if ($search < $array[$middle]) {
            $high = $middle - 1;
        } else {
            return $middle;
        }
    }
    return $search === $array[$low] ? $low : -1;
}
echo sprintf("Position of %d is %d\n", $search = 33, interpolationSearch($array = [7, 11, 13, 33, 66], $search));
  

Tab 6


  def interpolationSearch(arr, n, x):
    low = 0
    high = (n - 1)
    while low <= high and x >= arr[low] and x <= arr[high]:
        mid = low + int(
            ((float(high - low) / (arr[high] - arr[low])) * (x - arr[low])))
        if arr[mid] == x:
            return mid
        if arr[mid] < x:
            low = mid + 1
        else:
            high = mid - 1
    return -1
def test(x):
    arr = [10, 12, 13, 16, 18, 19, 20, 21, 22, 23, 24, 33, 35, 42, 47]
    n = len(arr)
    index = interpolationSearch(arr, n, x)
    if index != -1:
        print "The element", x, "is at the index", index
    else:
        print "Element", x, "not found!"
test(123)
  

Optimizations

An average performance of O(log log N) can be achieved by using a dynamic data-structure. Read this: Interpolation search in log log N


Applications

  • A potential fast search algorithm to search within an uniformly distributed and sorted search space (like a phone book)