Intersection of two arrays

Internship at OpenGenus

Get FREE domain for 1st year and build your brand new site

In this article, we have explored several approaches to find Intersection of two arrays efficiently. This involve techniques like sorting, binary search, hash map and much more.

Table of contents:

  1. Problem statement: Intersection of two arrays
  2. Method 1: Brute Force
  3. Method 2: Sort both lists
  4. Method 3: Sort one list
  5. Method 4: Use Hash Map
  6. Applications of Intersection of two arrays

Try similar problems based on Array.

Let us get started with Intersection of two arrays.

Problem statement: Intersection of two arrays

In intersection of two arrays (or lists), we find the common elements in two arrays and create a separate array containing those common elements. This concept is the same as calculating the intersection of sets in mathematics, particularly found in set theory. In mathematics, an intersection of two sets A and B is depicted as A ∩ B and is called as A intersection B.

Let A be a list containing elements [1,2,3,4] and B be another list containing elements [3,4,5,6]. Then the intersection of these two lists would be [3,4].
Diagrammatically, we can represent the above example using a Venn diagram as shown below.

intersection-of-sets

We can find intersection of two arrays/lists using multiple approaches:

  • Method 1: Brute Force
  • Method 2: Sort both lists
  • Method 3: Sort one list
  • Method 4: Use Hash Map

Let us get started.

Method 1: Brute Force

The simplest method is to iterate over the items of the first list and for every item in first list, check if the value is present in second list. If it is present, then add the value to the new list using the append() method in Python.

The steps required are:

  1. Create a new empty list.
  2. For every element of either of the list, traverse all elements and check for elements in the other list.
  3. In case element is present, add this element to the new list.

Implementation example

# a function to find intersection of two lists
def intersection(lst1, lst2):
    # a new list to keep track of common elements in lst1 and lst2
    lst3 = []
    #iterate over all elements in lst1
    for value in lst1:
        #check for each element of lst1 if it's present in lst2
        if value in lst2:
            #if true, add the common element to the new list
            lst3.append(value)
    #we can shorten the four lines of code above 
    #using this code - lst3 = [value for value in lst1 if value in lst2]
    return lst3

# example of two lists to test the intersection program
lst1 = [1,2,2,1]
lst2 = [2,2]
# print the final output
print(intersection(lst1, lst2))

Running the above program would give the following result

[2, 2]

Time and Space Complexity

  • In the worst case, this approach would have a runtime of O(m * n), where m,n stand for number of elements in list 1 and list 2 respectively. Even in the best case, at minimum, we would still need to check for each element in list 1 with each element in list 2, so the runtime would still be the same.
  • In the worst case, the space required for this algorithm would be either O(n) or O(m), which means that at most, in the new list, all elements from either list 1 or list 2 would be in the intersection. In the best case scenario, there might be no common elements, and in that case the new list will have zero elements. So the space complexity would be O(1).

Method 2: Sort both lists

The first method is to sort both lists and then scan both lists at the same time. This helps to find the common elements in the lists. For sorting, we will use Merge sort. The following are the steps to perform this method-

  1. First, sort both lists using merge sort-

    • Find the middle point of the array/list to divide it into two parts
    • Sort the left part of the list as per the middle point.
    • Sort the right part from the middle point.
    • Merge both sorted parts.
  2. Scan both lists for common elements (intersection)

    • Create a variable to use as index, for each list (so in total there'd be two variables in this case). Let the two variables be x and y, respectively.
    • Create a new list to add common elements.
    • If lst_1[x] is greater than lst_2[y], increment y.
    • If lst_1[x] is less than lst_2[y], increment x.
    • If both lst_1[x] and lst_2[y] are equal, add this element to new list (for this, we can use either lst_1[x] or lst_2[y]) . Then, increment both x and y.

Implementation Example

The following program shows how to implement this method.

# function to sort lists/arrays
def MergeSort(lst):
    # making sure length of list is 1 , which is the midpoint element
    if len(lst) > 1:
        # find midpoint of the list
        midpoint = len(lst)//2
        # divide the list into two parts (sublists), part to the left and right of midpoint
        lefthalf = lst[:midpoint]
        righthalf = lst[midpoint:]
        
        # keep calling the function till length of list becomes 1
        # (recursion)
        MergeSort(lefthalf)
        MergeSort(righthalf)

        # Assign zero value to all index variables initially
        # k is the index variable for list parameter
        i = j = k = 0
        
        # compare all elements in leftpart and rightpart
        # add elements in sorted order and update the list
        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                lst[k] = lefthalf[i]
                i += 1
            else:
                lst[k] = righthalf[j]
                j += 1
            k += 1

        #add any remaining elements in leftpart to the new list  
        while i < len(lefthalf):
            lst[k] = lefthalf[i]
            i += 1
            k += 1
  
        #add any remaining elements in rightpart to the new list  
        while j < len(righthalf):
            lst[k] = righthalf[j]
            j += 1
            k += 1

    return lst

# function to find intersection of two arrays/lists
def intersection(lst_1, lst_2):    

    # create index variable each for both lists
    x=0
    y=0
    
    # create new list to store common elements of lists (intersection)
    lst3 = []
    
    # traverse all elements in both lists using a while loop
    while x < len(lst_1) and y < len(lst_2):
        # if element in list 1 is less than element in list 2, increment x
        if lst_1[x] < lst_2[y]:
            x+=1
        # otherwise, increment y
        elif lst_1[x] > lst_2[y]:
            y+=1
        # if both elements are equal, append to new list and increment both x and y 
        else:
            lst3.append(lst_1[x])
            y+=1
            x+=1      
    
    return lst3

#example of two lists to test the intersection program
lst1 = [1, 2, 2, 1]
lst2 = [2, 2]

# call mergesort function to sort both lists
MergeSort(lst2)
MergeSort(lst1)

# print the final output
print(intersection(lst1,lst2))

The above program produces the following output-

[2, 2]

Time and Space Complexity

  • In this approach we used merge sort. Merge sort takes a runtime of O(nlogn) in each case, best or worst, for both lists. This is because initially, we divide the list 'log n' number of times to get the midpoint. And at each iteration, we have to traverse all 'n' elements of the list. The space complexity of merge sort is O(n) in each case as well.
  • For the second function of intersection, in the worst case, this approach would have a runtime of O(m + n), where m,n stand for number of elements in list 1 and list 2 respectively. Both lists would now be traversed only once. Even in the best case, at minimum, we would still need to check for each element in list 1 with each element in list 2, so the runtime would still be the same.
  • In terms of space for the second function of the algorithm, in the worst case, the space required for this algorithm would be either O(n) or O(m). This means that at most, in the new list, all elements from either list 1 or list 2 would be in the intersection. In the best case scenario, there might be no common elements, and in that case the new list will have zero elements. So the space complexity would be O(1).

Method 3: Sort one list

The second method also involves sorting but this time, we sort only the second list. After sorting the list, we perform a binary search for the elements in the first list, present in the second list. We usually sort that list which has a greater length relatively.

The steps involved are as follows-

  1. Sort either of the two lists.
  2. Search for every element of unsorted list in sorted list, using binary search.
  3. Add the element to the new list, if element in present (i.e if binary search returns true).

Implementation example

The below program implements the second method to find intersection of two lists/arrays.

# function to sort lists/arrays
def MergeSort(lst):
    # making sure length of list is 1 , which is the midpoint element
    if len(lst) > 1:
        # find midpoint of the list
        midpoint = len(lst)//2
        # divide the list into two parts (sublists), part to the left and right of midpoint
        lefthalf = lst[:midpoint]
        righthalf = lst[midpoint:]
        
        # keep calling the function till length of list becomes 1
        MergeSort(lefthalf)
        MergeSort(righthalf)

        # Assign zero value to all index variables initially
        # k is the index variable for list parameter
        i = j = k = 0
        
        # compare all elements in leftpart and rightpart
        # add elements in sorted order and update the list
        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                lst[k] = lefthalf[i]
                i += 1
            else:
                lst[k] = righthalf[j]
                j += 1
            k += 1

        #add any remaining elements in leftpart to the new list  
        while i < len(lefthalf):
            lst[k] = lefthalf[i]
            i += 1
            k += 1
  
        #add any remaining elements in rightpart to the new list  
        while j < len(righthalf):
            lst[k] = righthalf[j]
            j += 1
            k += 1

    return lst

# function to find intersection of two arrays/lists
def binary_search(lst_1, x):    
    low = 0
    high = len(lst_1)-1
    mid = 0

    while low <= high:
        mid = (high + low)//2
        # ignore the left part of list in case x is greater
        if lst_1[mid] < x:
            low = mid + 1
        # ignore the right part of list if x is smaller
        elif lst_1[mid] > x:
            high = mid - 1
        else:
            #implies element found in list
            return True
        
    #implies element not found in list
    return False
       
# intersection function to find common elements in two lists
def intersection(lst_1,lst_2):

    # sort any one list
    MergeSort(lst_2)

    # create new list
    lst3= []

    # traverse all elements in unsorted list once and  
    for item in lst_1:
        # look for each element of unsorted list in sorted list using binary search
        if (binary_search(lst_2, item)):

            # if true, add element to the list
            lst3.append(item)
    return lst3

#example of two lists to test the intersection program
lst1 = [1, 2, 2, 1]
lst2 = [2, 2]

# print final output
print(intersection(lst1,lst2))

The output of the above program is as follows:

[2, 2]

Time and Space Complexity

  • This method involves binary search which has a time complexity of O(log n) in the worst case and a constant best case running time. Let 'm' be the number of elements in the sorted list. Since, we perform binary search for each element of the sorted list, runtime becomes m(log n). However, we also used merge sort for sorting here, which has a runtime of n(log n), in both best and worst cases.
  • In terms of space, binary search requires O(n) space in worst case and O(1) in the best case. Merge sort's space complexity of merge sort is O(n) in each case as well.
  • So taken together, the overall runtime and space is more than method 1.

Method 4: Use Hash Map

Another approach is to use the dictionary data type in Python. This represents Hash Map.

Dictionary in Python are used to store key, value pairs items where order doesn't matter. To create one, we use curly braces {}. The main algorithm stays the same, but now we traverse the elements of the second list only once because of the use of dictionary. We can convert either of the two lists into a dictionary for this. This improves the runtime.

The steps involved are as follows-

  1. Convert either of the lists into a dictionary.
  2. Traverse all elements in the other list and check for elements in dictionary.
  3. If element is present in dicitonary, add to new list.

Implementation example

# function to calculate intersection of two arrays/lists/sets
def intersection(lst1, lst2):
# new list to include common elements in list 1 and list 2
    lst3=[]
    # a temporary dictionary that stores list 2 after conversion
    #temp_dictionary = {value : True for value in lst2}
    temp_dictionary = dict.fromkeys(lst2, True)

    # traverse through elements in list 1 once
    for item in lst1:
    # traverse through list 2, now converted into a dictionary, once
        if item in temp_dictionary:
            # add common element to new list
            lst3.append(item)
    # return the final new list of common elements (intersection)
    return lst3
    
# example to test the program
lst1 = [3, 3, 5]
lst2 = [1, 3, 2, 3, 3, 4, 5, 5, 6 ]
#print final list
print(intersection(lst1, lst2))

The program above produces the following output-

[3, 3, 5]

Time and Space Complexity

  • In the worst case, this approach would have a runtime of O(m + n), where m,n stand for number of elements in list 1 and list 2 respectively. Both lists would now be traversed only once. Even in the best case, at minimum, we would still need to check for each element in list 1 with each element in list 2, so the runtime would still be the same.
  • In the worst case, the space required for this algorithm would be either O(n) or O(m), which means that at most, in the new list, all elements from either list 1 or list 2 would be in the intersection. In the best case scenario, there might be no common elements, and in that case the new list will have zero elements. So the space complexity would be O(1).

Applications of Intersection of two arrays

Applications of Intersection of two arrays are:

  • Intersection of sets and set theory in general are widely used as foundation for fields like geometry, topology, calculus as well as chemistry, biology and other sciences.
  • In computer science, sets are highly useful in many programming languages, data structures, applications and programming in general.

Try and Practice similar problems based on Array.

With this article at OpenGenus, you must have the complete idea of how to find Intersection of two arrays efficiently.