Parallel Bubble Sort

Internship at OpenGenus

Get this book -> Problems on Array: For Interviews and Competitive Programming

In this article, we have explored how to implement Bubble Sort in parallel on multiple threads / cores. We have presented two approaches: one is slower than the sequential approach but the other approach is significantly faster than all other approaches for large input.

Table of contents:

  1. Introduction to Parallel Bubble Sort
  2. Parallel Bubble Sort
  3. Sequential Bubble Sort Algorithm
  4. Parallel Bubble Sort using Threads
  5. Alternate Parallel bubble sort using threads (with faster runtime)
  6. Comparison of 3 Bubble Sort implementations
  7. Time and Space Complexity

Prerequisites:

Introduction to Parallel Bubble Sort

Parallel bubble sort is a parallel implementation of the classic Bubble sort algorithm. The concept of parallelism involves executing a set of instructions/code simultaneously instead of line by line sequentially (the traditional way the code is executed in most machines and computer programming generally).

The main benefit of this is faster computation. Implementation of parallelism depends on some factors. For example, if there are parts of code that can be implemented independently and executed irrespective of any order, then we can execute at the same time. However, some parts of the program may still require other parts of the program to run first.

Therefore, the degree of parallelism depends on the program and its requirements. Additionally, the capacity of today’s modern processors that run these sequential programs is limited. We can solve this problem using multi-core computers. In addition, even sequential programs involve some degree of parallelism.

Modern-day processors can handle multiple instructions and the data generated from them at the same time. The hardware and the system are designed in such a way so that the data generated, and the previous, current, and future instructions don’t interrupt each other.

This way, even though the programs are sequential, we can implement some degree of parallelism. We can also design our algorithms in such a way that the parts of the program run sequentially.

In parallel sorting, we transform the already existing traditional sorting algorithms in a way such that during execution, instead of sequential execution of algorithm’s instructions.

Parallel Bubble Sort

In Parallel Bubble Sort, we divide sorting of the unsorted into two phases- odd and even. We compare all pairs of elements in the list/array side by side (parallel).

We divide the sorting process into two phases- odd phase and even phase. When it is the odd phase, we compare the element at index 0 with the element at index 1, the element at index 2 with the element at index 3, and so on. In the even phase, we compare index 1 element with index 2 element, index 3 element with index 4 element, and so on, with our last comparison being element at third last index and element at second last index.

When comparing, just like in normal bubble sort, we swap the elements, if the initial element is greater than the next element in comparison.

Both phases occur one after the other and the algorithm stops once no swap occurs in both odd and even phase.

The process of parallel bubble sort is shown below.
Parallel-bubble-sort

In the case of multi-core processors, both phases can occur simultaneously, which is known as parallel implementation.

Sequential Bubble Sort Algorithm

To implement a parallel version of bubble sort, follow the steps mentioned below-

  1. Create a variable called Sorted to keep track of whether swaps occur or not. This way, we can sort the list and stop the algorithm if no swaps occur. If no swaps occur in either phase, it means the list is already sorted. Then create a variable to store the length of the list, so that we can traverse and compare all elements in the list in pairs.
  2. Using a while loop, we compare all elements in the list as pairs and sort them in phases. If the starting condition is Sorted == 0, this implies that the list is not yet sorted. Then we set Sorted equal to 1 to imply that no swaps occurred and the list is already sorted. This way, in the next step, if swaps occur, then we continue inside the loop, otherwise, we break out of it.
  3. Using a for loop, we start with our odd phase. We compare two elements at a time ( by incrementing by 2 at each iteration inside the for loop) till the end of the list. If the current element is greater than the next element, we swap them. Then, we set the Sorted variable equal to 0 to imply that a swap occurred. Likewise, we repeat the same process in the next for loop of the even phase, this time starting from index 1 till the end of the list.
  4. Using the print function, we print the sorted version of the given list in the end.

Implementation

The following code executes the parallel bubble sort algorithm discussed above.

import time
import random

#timer to keep track of performance
start = time.perf_counter()

# a function to implement bubble sort in parallel
def Parallel_bubble_sort(lst):
    # variable to keep track of swaps to end the while loop
	Sorted = 0
    
    # variable to get length of list
	n = len(lst)
    
    #loop to traverse all list elements in phases
	while Sorted == 0:
    
        # set to 1 initially to assume list is sorted
        # and no swaps occurred
		Sorted = 1
        
        # traverse all list elements in pair
        # start at index 0 for odd phase
        # start at index 1 for even phase
		for i in range(0, n-1, 2):
            # check if current element greater than next element
			if lst[i] > lst[i+1]:
                # if so, swap the elements
				lst[i], lst[i+1] = lst[i+1], lst[i]
                
                # set to 0 to imply a swap occurred
				Sorted = 0
		for i in range(1, n-1, 2):
			if lst[i] > lst[i+1]:
				lst[i], lst[i+1] = lst[i+1], lst[i]
				Sorted = 0

	# print final sorted list	
	print(lst)

# an example list to test above program
lst = [(random.randint(0,100)) for i in range(100)]

Parallel_Bubble_Sort(lst)

finish = time.perf_counter()

print(f'Finished in {round(finish-start,2)} second(s)')

The above program produces the following output-

[0, 0, 2, 3, 4, 5, 5, 5, 5, 6, 6, 7, 7, 8, 8, 8, 9, 10, 10, 11, 13, 15, 16, 18, 20, 28, 29, 30, 30, 33, 33, 34, 36, 40, 41, 41, 42, 42, 42, 43, 44, 44, 45, 47, 48, 48, 49, 51, 52, 52, 54, 55, 55, 56, 56, 56, 57, 57, 58, 58, 61, 63, 67, 67, 68, 69, 69, 71, 71, 71, 71, 72, 72, 74, 76, 77, 77, 77, 78, 79, 79, 79, 80, 81, 84, 84, 85, 85, 90, 92, 94, 94, 95, 95, 98, 99, 99, 100, 100, 100]
Finished in 0.01 second(s)

Parallel Bubble Sort using Threads

Using Python's threading module, we can actually implement our algorithm in parallel. This module allows us to create threads, where each thread is considered as a single process that run within our program. We can think of threads of tasks or a set of instructions that make our program. Usually, all tasks in our program are implemented sequentially. However, using threads, we can run these tasks concurrently. This way, our program can run much faster. However, depending on the computer's memory and processor core, creating threads sometimes creates inefficiency as well.

To implement the bubble sort algorithm using threads, follow the steps mentioned below-

  • Import module threading so that we can use threads in our program. We also import time and random modules, to keep track of our program's running time and to generate a random list in the end, to test the program.
  • As mentioned in previously, we sort the list in two phases, odd and even. So, we create a function called odd_phase, which we will use to perform sorting in phases. This function takes in two values, the list and its length. Then we compare two elements at a time in the list and swap them in order. Likewise, we create a function called even_phase.
  • As our last function, we create the final Parallel_Bubble_Sort function. In this function, we again create a variable to get the lenght of the list. We then create a an empty list called threads that we use to keep track of all threads we start, so that later we can stop them. Then using a for loop, we go through the entire list and sort the list alternatively, using odd_phase and even_phase functions. We run both of these functions using threads, so that they run parallelly. Using a variable called t1, we create a thread and as our target, we set the odd_phase function if i id odd. We then pass the list and its length, and later start the thread. After starting the thread, we append it to our threads list. Then using another for loop, we close the thread using the join function. Likewise, we do the same process in case i is even.
  • Lastly, we print out the sorted list.
import time
import random
import threading
start = time.perf_counter()

# create function to sort elements in odd phase
def odd_phase(lst,n):
        for i in range(0, n-1, 2):
                if lst[i] > lst[i+1]:
                        lst[i], lst[i+1] = lst[i+1], lst[i]

# create function to sort elements in even phase
def even_phase(lst,n):
        for i in range(1, n-1, 2):
                if lst[i] > lst[i+1]:
                        lst[i], lst[i+1] = lst[i+1], lst[i]

# create function to for parallel bubble sort
def Parallel_Bubble_Sort(lst):

        n = len(lst)
      
        # create list to keep track of all threads
        threads = []
        for i in range(1,n+1):
                
                # if i is odd, call for odd_phase function
                if (i % 2 == 1):
                        
                        # start threads one by one
                        t1 = threading.Thread(target=odd_phase,args = [lst,n])
                        t1.start()
                        threads.append(t1)
                        
                        # close all threads
                        for thread in threads:
                                thread.join()
                
                # if i is even, call for even_phase function
                else:
                        t2 = threading.Thread(target=even_phase,args = [lst,n])
                        t2.start()
                        threads.append(t2)
                                        
                        for thread in threads:
                                thread.join()
        
        # print final sorted list
        print(lst)

# example list to test above program
lst = [(random.randint(0,100)) for i in range(100)]
Parallel_Bubble_Sort(lst)
finish = time.perf_counter()
print(f'Finished in {round(finish-start,2)} second(s)')

The above program produces the following output.

[0, 5, 6, 7, 8, 9, 9, 10, 10, 11, 14, 14, 16, 16, 17, 19, 21, 21, 24, 25, 25, 26, 26, 28, 28, 29, 29, 30, 31, 33, 33, 34, 34, 35, 35, 37, 39, 39, 40, 41, 41, 43, 45, 47, 48, 49, 52, 52, 52, 53, 54, 55, 56, 56, 57, 57, 58, 59, 60, 61, 63, 66, 67, 68, 68, 71, 72, 73, 73, 76, 76, 77, 80, 82, 84, 84, 85, 85, 87, 88, 89, 89, 90, 90, 91, 91, 93, 94, 94, 94, 95, 95, 96, 96, 97, 99, 99, 99, 100, 100]

Finished in 0.05 second(s)

This approach with threads takes more time because the overheads of creating threads are more. Since in each phase, odd or even, we only compare two elements at a time till the end of the list, and then separately create a thread, the runtime increases. Increase time happens because of starting a new thread and stopping the said thread each time. The costs of creating such threads, on top of comparing two elements, lead to increased runtime. This algorithm requires some calculations and is CPU bound. This means that greater the calculations, the more CPU resources, both in terms of software and hardware, are required.

Alternate Parallel bubble sort using threads (with faster runtime)

Another way we can bubble sort a list using threads is by dividing the list into smaller sublists. The number of sublists we create depends on the number of cores in the system. Each core can ideally run two threads. So in a dual-core system, we can use four threads.

We create sublists based on the biggest item in the list and the number of threads. Each sublist acts like a class interval. For example, if a list has 100 elements ranging from 0 to 100 and we use four threads, we would create four sublists. The first sublist would contain numbers ranging from 0-25, the second sublist would have numbers 26-50, the third would have numbers 51-75, and the last sublist would contain numbers ranging from 76-100.

Then we create a thread for each sublist and run bubble sort parallelly in each sublist. Finally, we merge all sublists into a sorted list.

To implement this algorithm, follow the steps mentioned below-

  • We import the threading module to create threads. In addition, we also import the itertools module, which we use to merge all sublists into a single list at the end of our program. The other two modules, time and random, are used to keep track of the program runtime and generate a list with random numbers in the end to test our program.
  • We then create a thread lock. This helps us to synchronize our usage of threads. Since we are sorting, this makes sure the order of the threads is maintained. It also ensures our list items don't jumble after thread activation, and during calculation in each thread.
  • Next, we create a general bubble sort function, where we first acquire the lock. This helps the thread to acquire the lock and makes sure that access to this lock is locked until the lock is released. The rest of the function for bubble sort is the same as a usual one. We first get the length of the list. Then we go through the list swapping all elements in order till the entire list is sorted. After that, we release the lock, so that other threads can then use the lock.
  • Finally, we create a separate Parallel_bubble_sort function. We first get the biggest element in the list and then set the number of threads to use for sorting parallelly. Then we create separate sublists as per the number of threads. Using the biggest element in the list and the number of threads, we calculate a number that helps us decide the distribution of list items into different sublists. This is called the split_factor in the program.
  • Within the same function, using a for loop, we add list items to each sublist. Every subsequent sublist contains elements that are greater than the elements in the previous sublist. After adding list items into sublists, we remove those items from the main list, to avoid repetition. Finally, if any more items are left in the list, we add those to the last sublist.
  • Later, we handle the threading part of our Parallel_bubble_sort. To activate the thread, we use the thread method. Using a for loop, we pass in as parameters, the bubblesort function, and each sublist one by one. Then we start the thread using the start method. We also create a list of active threads to keep track of them. After activating the thread, we append it to the active_threads list, so that later we can stop them. After activating all the threads, we close them using another for loop using the join method.
  • We merge all the sublists using the itertools module and the chain method. Since the chain method returns an object, we cannot directly print it. So, we convert it into a list in the end.
# Alternate Threading Version
import time
import random
import threading
import itertools

# start time counter to calculate runtime
start_time = time.time()

# create a threading lock
lock = threading.Lock()

# normal bubble sort function
def bubblesort(lst):
        
        # acquire a lock for thread
        lock.acquire()
        
        # get length of list
        n = len(lst)
        
        # perform bubble sort
        for i in range(n):
                swap = False
                for j in range(0, n-i-1):
                    if lst[j]>lst[j+1]:
                        lst[j], lst[j+1] = lst[j+1], lst[j]
                        swap = True
                if swap == False:
                        break
        
        # release the lock after calculation
        lock.release()

# create parallel bubble sort function
# this function uses normal bubble sort function

def Parallel_bubble_sort(lst): 

        # get biggest element in the list
        biggest_item = max(lst)
        
        # set number of threads 
        # as per the number of cores
        # each core can run 2 threads ideally
        no_threads = 4
        
        # create sublists as per number of threads
        lists = [[] for _ in range(no_threads)]
        
        # we use a number to divide the list 
        # into class intervals for each sublist
        
        split_factor = biggest_item//no_threads

        # split list into sublists
        # as per no of threads
        for j in range(1,len(lists)):
            for i in lst:
                if i <= (split_factor*j):
                        lists[j-1].append(i)
                        
                        #remove the element from list
                        # after adding to sublist
                        # to prevent duplication
                        lst = [x for x in lst if x != i]
            # include the remaining elements in list
            # in the last sublist
            lists[-1] = lst

        # start all threads for each sublist
        active_threads = []
        for list_item in lists:
            t = threading.Thread(target=bubblesort, args=(list_item,))
            t.start()
            active_threads.append(t)
            
        # stop all active threads
        for thread in active_threads:
            thread.join()

        # merge all lists
        # into final list
        final_lst = itertools.chain(*lists)
        final_lst = list(final_lst)

# an example list with 10k elements to test 
lst = [(random.randint(0,10000)) for i in range(10000)]
Parallel_bubble_sort(lst)
end_time = time.time() - start_time
print("--- Sorted in %s seconds ---" % (end_time))

We test the program with a list of length equal to 10,000 elements. Using this approach, we can sort a list of over 10,000 elements in around seven to eight seconds.

If the length of the list is small, this approach does run slower as compared to the previously mentioned approaches. However, once the length of the list becomes reasonably large, this approach fares much better in runtime comparatively.

Comparison of 3 Bubble Sort implementations

The image below shows the runtime of all approaches mentioned so far in different scenarios-

PBS-runtime-table-1

Time and Space Complexity

  • Time: Each iteration involves an odd and even phase. Combining the number of comparisons in both phases, we get n-1 comparisons for each iteration. So, the time complexity, considering we have multiple processors to perform each iteration, the original time complexity of O(n^2) becomes O(n).

  • Processor: In the case of multi-processors, both odd and even phase comparisons occur simultaneously. Both phases at max require n/2 processors, so total processors required become n/2 + n/2 = n processors, where n stands for the number of elements in the list. The alternate threading approach used for parallel bubble sort in the end makes use of all the available processors on the system.

  • Space: Since bubble sort is an in-place algorithm, that property stays the same even in parallel implementation. So without extra space use, the space requirement stays constant O(1). In the case of the alternate parallel bubble sort approach using threads, space requirement depends on the number of threads used.

With this article at OpenGenus, you must have the complete idea of Parallel Bubble Sort.