Open-Source Internship opportunity by OpenGenus for programmers. Apply now.
We have explained how to solve the 2 Sum Closest problem that is Find 2 elements with sum closest to a target efficiently. We have explained 3 different approaches which involves the use of Binary Search and Two Pointer technique.
Table of Contents
- Introduction
- Approach 1 - Brute Force
- Approach 2 - Binary Search
- Approach 3 - Two Pointers
- Overview
Prerequisites: Two Pointer Technique, Binary Search
Introduction
In this article we will cover the basis of the 2 sum closest problem and different approaches to tackling it, as well as the complexities for said algorithms to give a holistic view of the problem. Our aim is to find a pair of integers in a given array whose sum is closest to a target number.
For example:
array: [-1,2,1,-4]
target: 4
There are 6 possible sums available. If we subtract the sum of all 6 pairs from 4 and compare them, we will be able to identify the difference closest to the 0, and therefore the closest pair.
(4-(-1+2)) = 3
(4-(-1+1)) = 4
(4-(-1-4)) = 9
(4-(2+1)) = 1
(4-(2-4)) = 6
As we can see from our calculations the sum closest to 0 was 1, therefore the pair of elements that return the closest to our target of 4 is 2,1.
output: (2,1)
Approach 1
This is approach is using the same method as in the example. If we subtract a sum value from the target, we can find the difference between them, from this, we can deduce that a perfect sum would have a difference of 0 (since that means the sum equals the target). We then just have to iterate through all possible pairs in the array and return the ones whose difference is closest to the target. Using this method, we will be able to find any sum closest to the target with an n sized array however it is naive and so we will later cover a more efficient method.
To solve this problem, there are some simple steps:
- Identify all possible pairs (order of numbers doesn't matter since we are summing them together)
- Subtract each pair's sum from the target
- Compare each value to the target, whichever sum is closest, return its pair
For example:
array: [1,7,4,-3]
target: 5
There are 6 possible pairs therefore we need to do 6 comparisons.
(5-(1+7)) = -3
(5-(1+4)) = 0
(5-(1-3)) = 7
(5-(7+4)) = -6
(5-(7-3)) = 1
(5-(4-3)) = = 4
closest sum to 0: 0
Therefore, closest sum pair to target: (1,4)
output: (1,4)
Python Implementation
def two_sum(arr, target):
pairs = [(a, b) for idx, a in enumerate(arr) for b in arr[idx + 1:]]
sum = []
for a, b in pairs:
sum.append(a+b)
difference = lambda sum : abs(target - sum)
closest_values = min(sum, key=difference)
print(pairs[sum.index(closest_values)])
two_sum(arr=[1,6,-3,4,7,8,3], target=2)
Explanation
array: [1,6,-2,4,7,8,3]
target: 2
Firstly, we use enumeration and list comprehension to create a list of all possible pairs(a,b). Then, we create a new list called sum and add all of the pairs we previously together and store the results in sum. We then use a lambda function to calculate the difference between the target and sums and then the built-in min function to calculate the closest value in sum to the target, we output the pair in pairs whose index matches that of the closest sum. This way we can always find whether there is a sum or if not, the closest one to it!
closest difference: 0
index of 1 in sums: 11
pair in pairs at index 11: (-2,4)
output: (-2,4)
This algorithm has a time complexity of O(N^2) and space complexity of O(1)
Approach 2
To improve our first approach, we can implement a binary search method through our sum array. If we can create an array of sums (like in the previous approach) we can then find the closest value to the target in that array using a binary search, and then output the pair that is the same index as the closest value in sums to get our closest pair.
Below is an example of a binary search, to be applied to a test sum array:
sum array: [2,5,6,7,8,8,9]
target: 4
midpoint: 7
right of midpoint: 8
left of midpoint: 6
8 - 4 = 4, 4 indices away
6 - 4 = 2, 2 indices away
Since 2<4 we know that anything further right of the midpoint will only keep increasing since our array is sorted, therefore we can disregard this half.
new midpoint: 5
right of midpoint: 6
left of midpoint: 2
6 - 4 = 2, 2 indices away
2 - 4 = -2, 2 indices away
We know that once again everything to the right of the midpoint will not get us any closer to the target, therefore we can discard this half.
new midpoint: 2
right of midpoint: 5
left of midpoint: None
5 - 4 = 1, 1 index away
No more elements left to search -> closest value to target = 5
output: 5
Python Implementation
def closest_two(arr,target):
pairs = [(a, b) for idx, a in enumerate(arr) for b in arr[idx + 1:]]
sum = []
for a, b in pairs:
sum.append(a+b)
min_diff = float("inf")
low = 0
high = len(sum)-1
closest_num = None
if len(sum) == 0:
return None
if len(sum) == 1:
return pairs[0]
while low <= high:
mid = (low+high)//2
if mid + 1 < len(sum):
min_diff_right = abs(sum[mid+1] - target)
if mid > 0:
min_diff_left = abs(sum[mid-1] - target)
if min_diff_left < min_diff:
min_diff = min_diff_left
closest_num = sum[mid-1]
if min_diff_right < min_diff:
min_diff = min_diff_right
closest_num = sum[mid+1]
if sum[mid] < target:
low = mid + 1
elif sum[mid] > target:
high = mid - 1
else:
return pairs[sum.index(mid)]
return pairs[sum.index(closest_num)]
arr = sorted([1,6,-2,4,7,8,3])
print(closest_two(arr,target=16))
Explanation
array: [1,6,-2,4,7,8,3]
target: 16
minimum difference: Inf
low: 0
high: length of array - 1
closest number: None
Firstly, we create our sums array as we did in the first approach, by finding all pairs from the input and then creating a new array by adding each pair together and storing their values. Then before constructing our algorithm, we add in some anomaly handling by, if there are no items in the sum array, return None and if there is only one item, then simply return the first pair in pairs, since that is the only possible pair that can be closest.
Next to start our algorithm, firstly we make checks as to not be searching outside of the array boundaries. Then, we check to see if the value between the left and right elements are smaller than the ones found previously. We can then use the binary search implementation (same steps as in example) to move around the midpoint appropriately. We also include the scenario where there is an exact match, in this case we can return the pair in pairs with the same index in sum as our mid value. After iterating through our array we will have a cloesest number value, this is the closest number in the sums array to the target, to get the pair we simply return the pair in pairs with the same index in sum as the closest number variable to get our final 2 numbers, as in approach 1.
output: (7,8)
This approach has a time complexity of O(NlogN) for sorting the array, and O(NlogN) for the binary search.
Approach 3
A more elegant solution to this problem is by utilising two pointers, i and j. We place i at the front of the array and j at the end, then during each iteration, find the elements at index i and j and update the pair of elements (i,j) if abs((i+j)-target) is the minimum out of all the pairs processed. This loops whilst i < j is true.
Using these rules we can increment each pointer:
- If the sum of i and j is less than the target, increment i as i=i+1
- If the sum of i and j is more than the target, decrement j as j=j-1
For example:
array: [2,3,5,8,12,6,9]
target: 10
Step 1: sort array -> [2,3,12,5,6,8,9]
Step 2: iterate through ->
i=0 j=6
2 9
2 + 9 = 11 > target -> j-1
index_i, index_j = 0,6 as best current pair
i=0 j=5
2 8
2 + 8 = 10 = target -> i+1
index_i, index_j = 0,5 as best pair
Our algorithm will iterate through the rest of the numbers incrementing i and decrementing j as shown however the index_i and index_j variables will not change as there is no better pair than (2,8).
Step 3: We finally print the elements in our array at index_i and index_j which in this cause is 2 and 8
output: (2,8)
Python Implementation
def two_sum(arr, target):
index_i, index_j = 0, 0
i, j, diff = 0, len(arr)-1, float("inf")
while j > i:
if abs(arr[i] + arr[j] - target) < diff:
index_i = i
index_j = j
diff = abs(arr[i] + arr[j] - target)
if arr[i] + arr[j] > target:
j -= 1
else:
i += 1
print((arr[index_i], arr[index_j]))
arr=sorted([1,6,-2,4,7,8,3])
two_sum(arr,n=len(arr),target=2)
Explanation
array: [1,6,-2,4,7,8,3]
target: 2
difference: Inf
i, j = 0, length of array-1
index_i, index_j: 0
Step 1: Array gets sorted -> [-2,1,3,4,6,7,8]
Step 2: Iterate through ->
i=0 j=6
-2 8
-2 + 8 = 6 > target -> j - 1
index_i=0 index j=6 as no closer pair
i=0 j=5
-2 7
-2 + 7 = 5 > target -> j - 1
index_i=0 index_j=5 as no closer pair
i=0 j=4
-2 6
-2 + 6 = 4 > target -> j - 1
index_i=0 index_j=4 as no closer pair
i=0 j=3
-2 4
-2 + 4 = 2 = target -> i + 1
index_i=0 index_j=3 as no closer pair
The rest of the array is iterated through but since there are no closer pairs than -2+4 to the target, index_i and index_j do not change.
Step 3: We then can print the numbers with index_i and index_j in our sorted array to be returned with the closest value pair, in this case -2 and 4.
Output: (-2,4)
This approach has a time complexity of O(NlogN) for sorting the array and O(N) for finding the pair. This is the most efficient of our three solutions (will be explained in the overview)
Overview
In this article, we have gone over multiple ways to solve the 2 sum closest problem, one brute force method, one using binary search, and the final using pointers. I have included Python implementations so that you will be able to better understand how to solve this problem in code, these approaches can quite easily be adapted to k sum closest problems.
Comparison
With each approach in this article at OpenGenus, we get more and more efficient. Brute force is the weakest solution due to its exponential complexity and therefore is unable to handle large sized arrays very well. Our binary searching method is definitely better than brute force however, it lags behind in efficiency actually conducting the binary search. The most efficient solution is by using two pointers since finding the pair is in constant time instead of the binary search's logarithmic, meaning that the final approach discussed is the most optimal.