×

Search anything:

Merge Sort in Python [with OOP, Iterative + Recursive]

Binary Tree book by OpenGenus

Open-Source Internship opportunity by OpenGenus for programmers. Apply now.

Table of Contents

  1. Introduction
  2. Algorithm Explanation
  3. Implementation in Python (Recursive and Iterative using OOP)
    • Parent Class
    • Recursive Class
    • Iterative Class
  4. Time and Space Complexity Analysis
  5. Conclusion

Introduction

Sorting is a fundamental computer science operation that allows us to organize data in a specified order for efficient searching, analysis, and processing. There have been several sorting algorithms invented, each with its unique set of benefits and qualities. Merge sort is one such algorithm that stands out for its speed and simplicity. In this post, we'll look into merge sort in the context of Python programming, going over its working principles, implementation, and performance analysis.

Merge sort is a sorting algorithm that uses the divide-and-conquer approach to break down an issue into smaller subproblems, solve them recursively, and then combine the answers to achieve the final result. Merge sort divides the unsorted list into two halves, sorts each half individually, and then merges them.

The elegance and efficacy of merge sort stem from its ability to handle big datasets effectively while maintaining a stable sorting order. Its temporal complexity of O(nlog n) makes it a popular choice for many applications, especially when working with large data sets. Furthermore, merge sort performs well in both average and worst-case circumstances, making it a dependable and adaptable sorting algorithm.

We shall investigate the inner workings of merge sort in this article, presenting a step-by-step analysis of the algorithm's execution. We will also demonstrate a functional implementation of merge sort in Python, together with explanations of the underlying code logic and crucial principles. We will also examine the time and space complexity of merge sort and compare its performance.

This article at OpenGenus will provide you with the information and tools you need to utilize the power of merge sort in Python, whether you are a novice looking for an introduction to sorting algorithms or an experienced programmer looking to improve your grasp of merge sort. So let us begin our examination of the merge sort, unraveling its complexities and discovering the beauty of its simplicity and effectiveness.

Algorithm Explanation

Merge sort is a divide-and-conquer sorting algorithm that operates by recursively dividing the unsorted list into smaller sublists, sorting them individually, and then merging them back together to obtain a sorted output. The key idea behind merge sort is to repeatedly divide the list until we reach sublists of size 1, which are already sorted, and then merge them in a sorted order.

Here is a step-by-step explanation of the merge sort algorithm:

1. Divide: The algorithm begins by dividing the unsorted list into two halves. This step is performed recursively until each sublist contains only one element, which is inherently sorted. This division process continues until we have n sublists, each containing one element.

2. Merge: After dividing the list into sublists, the merging phase starts. This step involves repeatedly merging pairs of adjacent sublists, creating new sorted sublists of twice the size. The merging process continues until a single sorted list, containing all the elements from the original unsorted list, is obtained.

  • To merge two sublists, we compare the elements from both sublists and select the smaller element at each comparison. The selected element is then placed into the merged list.
  • The process of merging continues, selecting the smallest elements from the remaining unmerged portions of the sublists, until all the elements are merged into a single sorted list.

3. Final Result: Once all the sublists have been merged, we obtain a single sorted list that contains all the elements from the original unsorted list. This sorted list represents the final output of the merge sort algorithm.

merge-sort

The key insight behind merge sort lies in the merging phase. By repeatedly merging smaller sorted sublists, merge sort efficiently builds up the final sorted list. The recursive nature of merge sort ensures that the sublists are sorted before merging, guaranteeing a correct and complete sorting of the original list.

Merge sort exhibits excellent performance characteristics due to its balanced nature. The time complexity of merge sort is O(n log n) in all cases, where n represents the number of elements in the list. This makes merge sort particularly efficient for sorting large datasets. Moreover, merge sort is a stable sorting algorithm, meaning it preserves the relative order of equal elements during the sorting process.

In the upcoming sections, we will delve into the implementation of merge sort in Python, exploring the code logic and individual steps involved. By gaining a deeper understanding of the algorithm's inner workings, you will be able to harness the power of merge sort in your own Python programs and tackle sorting challenges with ease and efficiency.

Implementation in Python (Recursive and Iterative using OOP)

In this section, we will explore two implementations of merge sort in Python using object-oriented programming (OOP) principles. We will provide both recursive and iterative approaches to showcase different strategies for implementing merge sort.

Parent Class

from abc import ABC, abstractmethod

class MergeSort(ABC):
    def __init__(self, arr):
        self.arr = arr

    @abstractmethod
    def sort(self):
        pass

    def _merge(self, left, right):
        merged = []
        left_index, right_index = 0, 0

        while left_index < len(left) and right_index < len(right):
            if left[left_index] <= right[right_index]:
                merged.append(left[left_index])
                left_index += 1
            else:
                merged.append(right[right_index])
                right_index += 1

        merged.extend(left[left_index:])
        merged.extend(right[right_index:])

        return merged

Explanation:

  • The MergeSort class serves as the parent class for both the recursive and iterative implementations of merge sort.
  • It is an abstract base class (ABC) as indicated by the inheritance of ABC in the class definition.
  • The __init__ method is the class constructor that initializes the self.arr attribute with the input list arr.
  • The sort method is declared as an abstract method using the @abstractmethod decorator. This means that any child class must implement this method.
  • The _merge method is a helper method responsible for merging two sorted lists (left and right) back into the original input list. It takes the two lists as input and iteratively merges them by comparing elements from each list.

Recursive Class

class RecursiveMergeSort(MergeSort):
    def sort(self):
        if len(self.arr) <= 1:
            return self.arr

        mid = len(self.arr) // 2
        left_half = self.arr[:mid]
        right_half = self.arr[mid:]

        left_merge_sort = RecursiveMergeSort(left_half)
        right_merge_sort = RecursiveMergeSort(right_half)

        left_sorted = left_merge_sort.sort()
        right_sorted = right_merge_sort.sort()

        return self._merge(left_sorted, right_sorted)

Explanation:

  • The RecursiveMergeSort class is a child class of MergeSort and represents the recursive implementation of merge sort.
  • It overrides the sort method from the parent class to provide the specific recursive logic for merge sort.
  • The sort method checks if the length of self.arr is 1 or less. If so, it returns self.arr as it is since it is already sorted.
  • If the length is greater than 1, it divides self.arr into two halves: left_half and right_half.
  • It creates instances of RecursiveMergeSort for each half and calls the sort method on them to recursively sort the halves.
  • The sorted halves (left_sorted and right_sorted) are then merged using the inherited _merge method, resulting in the fully sorted list.

Iterative Class

class IterativeMergeSort(MergeSort):
    def sort(self):
        if len(self.arr) <= 1:
            return self.arr

        queue = []
        for item in self.arr:
            queue.append([item])

        while len(queue) > 1:
            first = queue.pop(0)
            second = queue.pop(0)
            merged = self._merge(first, second)
            queue.append(merged)
        return queue[0]

Explanation:

  • The IterativeMergeSort class is a child class of MergeSort and represents the iterative implementation of merge sort.
  • It overrides the sort method from the parent class to provide the specific iterative logic for merge sort.
  • It first checks if the length of self.arr is 1 or less. If so, it returns self.arr as it is since it is already sorted.
  • It creates an empty queue to hold sublists of individual elements from self.arr.
  • It iterates through each item in self.arr and appends a sublist containing that item to the queue.
  • In the while loop, it continues to merge sublists until only one sorted sublist remains in the queue.
  • In each iteration, it dequeues the first two sublists from the queue, merges them using the _merge method, and appends the merged sublist back into the queue.
  • Finally, it returns the single remaining sublist in the queue, which represents the fully sorted list.

By dividing the code into these three sections and providing explanations for each snippet, we can understand the purpose and functionality of each part of the code more clearly.

To use the merge sort implementations, create an instance of the respective class and call the sort method:

arr = [7, 3, 9, 2, 1, 6]

recursive_sorter = RecursiveMergeSort(arr)
recursive_sorted = recursive_sorter.sort()
print(recursive_sorted)

iterative_sorter = IterativeMergeSort(arr)
iterative_sorted = iterative_sorter.sort()
print(iterative_sorted)

Both implementations will produce the same output: [1, 2, 3, 6, 7, 9], which is the sorted version of the input list.

By leveraging the power of object-oriented programming, we have implemented both recursive and iterative versions of merge sort in Python. These implementations provide flexibility and readability while maintaining the efficiency and accuracy of the merge sort algorithm. You can choose the approach that suits your needs and integrate it into your projects with ease.

Time and Space Complexity Analysis

Analyzing the time and space complexity of algorithms is crucial for understanding their performance characteristics and resource requirements. Let's delve into the time and space complexity of the merge sort algorithm.

  1. Time Complexity:
  • Best Case: The best-case scenario for merge sort occurs when the input list is already sorted. In this case, the algorithm still performs the division and merging steps but does not need to perform any element comparisons during merging. The time complexity in the best case is O(n log n), where n represents the number of elements in the input list.

  • Worst Case: The worst-case scenario for merge sort occurs when the input list is in reverse order. In this case, the algorithm needs to perform the maximum number of element comparisons during merging. The time complexity in the worst case is also O(n log n).

  • Average Case: The average-case time complexity of merge sort is O(n log n). This is because, regardless of the initial order of the elements, merge sort consistently divides the input list into two halves until each sublist has only one element. The merging process then combines the sublists in a sorted manner, requiring O(n) operations.

Merge sort's time complexity of O(n log n) remains consistent across different scenarios, making it an efficient sorting algorithm for large datasets. The logarithmic factor arises from the divide-and-conquer nature of merge sort, where the input size is halved in each recursive step.

  1. Space Complexity:
  • Merge sort has a space complexity of O(n) in the worst case. This is because, during the merging process, the algorithm creates temporary arrays or lists to hold the sorted sublists. As the recursion unfolds, multiple temporary arrays are created until the merging is complete. The total space required is proportional to the number of elements in the input list.

  • However, it's worth noting that merge sort can be implemented to use an in-place merging approach, where the temporary arrays are not created. This can be achieved by carefully managing indices and utilizing the original input array for merging. In such cases, the space complexity can be reduced to O(log n) due to the recursive stack space required for the function calls.

While merge sort's space complexity is higher than some other sorting algorithms, its time efficiency and stability make it a favorable choice for sorting large datasets, especially when memory is not a critical constraint.

Understanding the time and space complexity of merge sort helps in evaluating its performance and determining its suitability for specific applications. Despite the additional space requirements, merge sort's consistent time complexity and stability make it an attractive option for a wide range of sorting needs.

Conclusion

In this article at OpenGenus, we explored the merge sort algorithm and its implementation in Python. Merge sort is a powerful and efficient sorting algorithm based on the divide-and-conquer technique. By dividing the input list into smaller sublists, sorting them recursively, and then merging them back together, merge sort achieves a stable and sorted output.

Through a step-by-step explanation, we gained insights into how the merge sort algorithm operates. We learned about the divide and merge phases, where the input list is divided into sublists and then merged in a sorted manner. This recursive process ensures that the sublists are sorted before merging, resulting in an overall sorted list.

By examining the time complexity, we discovered that merge sort exhibits a consistent performance of O(n log n) in all cases. This makes it highly efficient, especially for sorting large datasets. Additionally, merge sort is a stable sorting algorithm, preserving the relative order of equal elements during the sorting process.

The Python implementation provided a practical demonstration of merge sort's elegance and effectiveness. By leveraging recursion and the merging process, we were able to create a concise and efficient implementation. The merge sort implementation in Python can be easily applied to various sorting tasks, enabling developers to handle sizable collections of data with ease.

Furthermore, we explored the space complexity of merge sort, which typically requires additional memory for creating temporary arrays during the merging phase. However, it is possible to implement merge sort in an in-place manner, reducing the space complexity to O(log n) by carefully managing indices and utilizing the original input array for merging.

Merge sort's efficiency, stability, and ease of implementation make it a popular choice for sorting tasks. Its O(n log n) time complexity, regardless of the input order, provides consistent performance, and its stability ensures that equal elements maintain their original order. These characteristics make merge sort particularly useful in scenarios where sorting accuracy and efficiency are paramount.

In conclusion, merge sort is a powerful sorting algorithm that can significantly simplify sorting tasks in Python. By understanding its inner workings, implementing the algorithm, and analyzing its time and space complexity, you now have the knowledge and tools to leverage merge sort effectively in your own projects. Whether you are sorting a small list or handling massive datasets, merge sort's efficiency and stability will prove invaluable in achieving accurate and timely results.

Merge Sort in Python [with OOP, Iterative + Recursive]
Share this