×

Search anything:

B Tree in Python using OOP Concepts

Internship at OpenGenus

Get this book -> Problems on Array: For Interviews and Competitive Programming

In this article at OpenGenus, we will explore the implementation of a B-tree in Python using the principles of Object-Oriented Programming (OOP). By utilizing classes and objects, we'll create a B-tree structure that supports efficient insertion, searching, and display operations.

Table of contents:

  1. Introduction
  2. Solution
  3. Quiz

Introduction

A B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing for nodes with more than two children. Unlike other self-balancing binary search trees, the B-tree is well suited for storage systems that read and write relatively large blocks of data, such as databases and file systems. A B-tree of order m is a tree which satisfies the following properties:

  • Every node has at most m children.
  • Every internal node has at least ⌈m/2⌉ children.
  • Every non-leaf node has at least two children.
  • All leaves appear on the same level.
  • A non-leaf node with k children contains k−1 keys.

Solution

We will split the implementation into smaller parts for better understanding.

BTreeNode Class

class BTreeNode:
    def __init__(self, leaf=False):
        self.keys = []
        self.child = []
        self.leaf = leaf

The BTreeNode class represents a node in the B-tree. It has the following attributes:

  • keys: A list to store the keys in the node.
  • child: A list to store the child nodes of the current node.
  • leaf: A boolean flag indicating whether the node is a leaf node or not.

BTree Class

class BTree:
    def __init__(self, t):
        self.root = BTreeNode(True)
        self.t = t

The BTree class represents the B-tree itself. It has the following attributes:

  • root: The root node of the B-tree.
  • t: The minimum degree of the B-tree, which determines the maximum number of children a node can have.

Insertion

Here are the steps to insert an element in a B Tree:

  1. Start at the root node of the B-tree.
  2. If the root is full (contains 2t-1 keys), split it into two child nodes and promote the middle key to the parent.
  3. Determine the appropriate child node to descend based on the key to be inserted.
  4. If the selected child node is full, split it recursively as in step 2.
  5. Repeat steps 3-4 until reaching a leaf node.
  6. Insert the key into the leaf node in its appropriate position, maintaining the keys in ascending order.
  7. If the insertion causes the leaf node to become full, split it as in step 2 and promote the middle key to the parent.
  8. Repeat steps 6-7 as necessary to maintain the B-tree properties.
    def insert(self, k):
        root = self.root
        if len(root.keys) == (2 * self.t) - 1:
            temp = BTreeNode()
            self.root = temp
            temp.child.insert(0, root)
            self.split_child(temp, 0)
            self.insert_non_full(temp, k)
        else:
            self.insert_non_full(root, k)

The insert method inserts a key k into the B-tree. It starts by checking if the root node is full. If it is, a new root node is created, and the original root becomes its child. Then, the method split_child is called to split the child and make room for the new key. Finally, the method insert_non_full is called to insert the key into the B-tree.

Non-Full Insertion

    def insert_non_full(self, x, k):
        i = len(x.keys) - 1
        if x.leaf:
            x.keys.append((None, None))
            while i >= 0 and k < x.keys[i]:
                x.keys[i + 1] = x.keys[i]
                i -= 1
            x.keys[i + 1] = k
        else:
            while i >= 0 and k < x.keys[i]:
                i -= 1
            i += 1
            if len(x.child[i].keys) == (2 * self.t) - 1:
                self.split_child(x, i)
                if k > x.keys[i]:
                    i += 1
            self.insert_non_full(x.child[i], k)

The insert_non_full method is called to insert a key into a non-full node. If the node is a leaf node, the key is inserted in the correct position, shifting other keys if necessary. If the node is not a leaf, the method finds the appropriate child node to insert the key recursively. If the child node is full, it is split using the split_child method, and the insertion continues on the appropriate child.

Split Child

    def split_child(self, x, i):
        t = self.t
        y = x.child[i]
        z = BTreeNode(y.leaf)
        x.child.insert(i + 1, z)
        x.keys.insert(i, y.keys[t - 1])
        z.keys = y.keys[t:(2 * t) - 1]
        y.keys = y.keys[0:t - 1]
        if not y.leaf:
            z.child = y.child[t:(2 * t)]
            y.child = y.child[0:t - 1]

The split_child method splits a full child node into two nodes. It creates a new node z, inserts it as a child of x at position i+1, and moves the appropriate keys and child nodes from node y to z to maintain the B-tree properties.

Deletion

Here are the steps to delete an element in a B Tree:

  1. Start at the root node of the B-tree.
  2. Find the node containing the key to be deleted.
  3. If the key is present in a leaf node, remove it.
  4. If the key is present in an internal node, replace it with its predecessor or successor key and recursively delete the predecessor/successor from its original location.
  5. If a child node after the deletion becomes less than half full, perform the following steps:
    a. If the immediate left sibling has more than t-1 keys, borrow a key from it.
    b. If the immediate right sibling has more than t-1 keys, borrow a key from it.
    c. If both siblings have t-1 keys, merge the child node with one of its siblings.
  6. Repeat steps 2-5 until the key is deleted or determined to be absent in the B-tree.
    def delete(self, k):
        self.delete_recursive(self.root, k)

The delete method deletes a key k from the B-tree. It calls the delete_recursive method with the root node and the key to be deleted.

Recursive Deletion

    def delete_recursive(self, x, k):
        i = 0
        while i < len(x.keys) and k > x.keys[i]:
            i += 1

        if i < len(x.keys) and k == x.keys[i]:
            if x.leaf:
                x.keys.pop(i)
            else:
                y = x.child[i]
                z = x.child[i + 1]
                if len(y.keys) >= self.t:
                    predecessor = self.get_predecessor(y)
                    x.keys[i] = predecessor
                    self.delete_recursive(y, predecessor)
                elif len(z.keys) >= self.t:
                    successor = self.get_successor(z)
                    x.keys[i] = successor
                    self.delete_recursive(z, successor)
                else:
                    self.merge_nodes(x, i, y, z)
                    self.delete_recursive(y, k)
        else:
            if x.leaf:
                print(f"Key {k} does not exist in the B-tree.")
            else:
                if len(x.child[i].keys) < self.t:
                    self.fix_child(x, i)

                self.delete_recursive(x.child[i], k)

The delete_recursive method is called to recursively delete a key from the B-tree. It starts by finding the position of the key in the current node. If the key is found in a leaf node, it is removed. If the key is found in an internal node, it is replaced with its predecessor or successor key, and the deletion process continues recursively. If both the predecessor and successor nodes have the minimum number of keys, the method merge_nodes is called to merge two child nodes. If the key is not found in the current node, the method continues the deletion process recursively in the appropriate child node.

Other Methods

def get_predecessor(self, x):
        while not x.leaf:
            x = x.child[-1]
        return x.keys[-1]

    def get_successor(self, x):
        while not x.leaf:
            x = x.child[0]
        return x.keys[0]

    def merge_nodes(self, x, i, y, z):
        y.keys.append(x.keys[i])
        y.keys.extend(z.keys)
        y.child.extend(z.child)
        x.keys.pop(i)
        x.child.pop(i + 1)

        if len(x.keys) == 0:  
            self.root = y

    def fix_child(self, x, i):
        if i > 0 and len(x.child[i - 1].keys) >= self.t:
            self.borrow_from_left(x, i)
        elif i < len(x.child) - 1 and len(x.child[i + 1].keys) >= self.t:
            self.borrow_from_right(x, i)
        else:
            if i > 0:
                self.merge_nodes(x, i - 1, x.child[i - 1], x.child[i])
                i -= 1
            else:
                self.merge_nodes(x, i, x.child[i], x.child[i + 1])

    def borrow_from_left(self, x, i):
        child = x.child[i]
        sibling = x.child[i - 1]

        child.keys.insert(0, x.keys[i - 1])
        x.keys[i - 1] = sibling.keys.pop()
        if not child.leaf:
            child.child.insert(0, sibling.child.pop())

    def borrow_from_right(self, x, i):
        child = x.child[i]
        sibling = x.child[i + 1]

        child.keys.append(x.keys[i])
        x.keys[i] = sibling.keys.pop(0)
        if not child.leaf:
            child.child.append(sibling.child.pop(0))

The code also includes several helper methods such as get_predecessor, get_successor, merge_nodes, fix_child, borrow_from_left, borrow_from_right. These methods assist in finding the predecessor or successor of a key, merging nodes, fixing child nodes, borrowing keys from neighboring nodes, in the B-tree.

Searching

Here are the steps to search an element in a B Tree:

  1. Start at the root node of the B-tree.
  2. Compare the search key with the keys in the current node.
  3. If the search key matches a key in the current node, the key is found.
  4. If the search key is smaller than the current key, recursively search in the left child node.
  5. If the search key is greater than the current key, recursively search in the right child node.
  6. Repeat steps 2-5 until a match is found in a leaf node or it is determined that the key is not present in the B-tree.
def search(self, k, x=None):
        x = x or self.root
        if isinstance(x, BTreeNode):
            i = 0
            while i < len(x.keys) and k > x.keys[i]:
                i += 1
            if i < len(x.keys) and k == x.keys[i]:
                return x, i
            elif x.leaf:
                return None
            else:
                return self.search(k, x.child[i])
        else:
            return self.search(k, self.root)

The method then checks if the current node x is an instance of the BTreeNode class. If it is, the search process proceeds. Otherwise, it recursively calls the search method, this time passing the root node as the argument. Within the search process, the method iterates through the keys in the current node x, comparing them with the target key k. It determines the appropriate position for k within the keys of x and either returns a tuple containing the node x and the index i if the key is found, or continues the search recursively in the appropriate child node.

Display

    def display(self, x=None):
        x = x or self.root
        for i in x.keys:
            if i != (None, None):
                print(i, end=" ")
        if not x.leaf:
            for i in x.child:
                self.display(i)

The method begins by checking if a specific node x is provided as an argument. If not, it defaults to the root node of the B-tree. It iterates through the keys in the current node x. For each key, it checks if the key is not (None, None). If it is not, it prints the key followed by a space. After printing the keys in the current node, the method checks if the current node is not a leaf node (not x.leaf). If it is not a leaf node, it means there are child nodes to explore. It recursively calls the display method on each child node i in x.child. This recursive call continues the display process on the child nodes.

Tree Creation

btree = BTree(3)
keys = [10, 20, 30, 40]
for key in keys:
    btree.insert(key)

btree.display()

key = 20
result = btree.search(key)
if result:
    print(f"\nFound {key}")
else:
    print(f"\nNot Found {key}")

btree.delete(30)
print("After deleting 30")
btree.display()

This code creates a B-tree with a minimum degree of 3. It inserts keys 10, 20, 30, and 40 into the B-tree. Then, it displays the keys in the B-tree. It searches for key 20 and prints whether it is found or not. Finally, it deletes key 30 from the B-tree and displays the keys again to show the updated B-tree structure.

Sample Explanation

B-Tree-Python-OOP

  1. A B-tree of degree of maximum degree 3 is created
  2. Element 10 is inserted
  3. Element 20 is inserted
  4. Element 30 is inserted and only m-1 (3-1=2) can be added the tree is splitted into two parts : a left-child and a right-child
  5. Element 40 is inserted after 30
  6. Element 30 is deleted and it does not affect the balance of the B-tree

Minimum degree (t):
The minimum degree (t) of a B-tree is the minimum number of children a non-root node can have. A non-root node in a B-tree must have at least t-1 keys and t children. In other words, a non-root node can have a minimum of t-1 keys and t-1 pointers to child nodes.

Maximum degree (2t):
The maximum degree of a B-tree is the maximum number of children a node can have. A node in a B-tree can have at most 2t-1 keys and 2t children. In other words, a node can have a maximum of 2t-1 keys and 2t pointers to child nodes.

Complete Implementation

class BTreeNode:
    def __init__(self, leaf=False):
        self.keys = []
        self.child = []
        self.leaf = leaf


class BTree:
    def __init__(self, t):
        self.root = BTreeNode(True)
        self.t = t

    def insert(self, k):
        root = self.root
        if len(root.keys) == (2 * self.t) - 1:
            temp = BTreeNode()
            self.root = temp
            temp.child.insert(0, root)
            self.split_child(temp, 0)
            self.insert_non_full(temp, k)
        else:
            self.insert_non_full(root, k)

    def insert_non_full(self, x, k):
        i = len(x.keys) - 1
        if x.leaf:
            x.keys.append((None, None))
            while i >= 0 and k < x.keys[i]:
                x.keys[i + 1] = x.keys[i]
                i -= 1
            x.keys[i + 1] = k
        else:
            while i >= 0 and k < x.keys[i]:
                i -= 1
            i += 1
            if len(x.child[i].keys) == (2 * self.t) - 1:
                self.split_child(x, i)
                if k > x.keys[i]:
                    i += 1
            self.insert_non_full(x.child[i], k)

    def split_child(self, x, i):
        t = self.t
        y = x.child[i]
        z = BTreeNode(y.leaf)
        x.child.insert(i + 1, z)
        x.keys.insert(i, y.keys[t - 1])
        z.keys = y.keys[t:(2 * t) - 1]
        y.keys = y.keys[0:t - 1]
        if not y.leaf:
            z.child = y.child[t:(2 * t)]
            y.child = y.child[0:t - 1]

    def delete(self, k):
        self.delete_recursive(self.root, k)

    def delete_recursive(self, x, k):
        i = 0
        while i < len(x.keys) and k > x.keys[i]:
            i += 1

        if i < len(x.keys) and k == x.keys[i]:
            if x.leaf:
                x.keys.pop(i)
            else:
                y = x.child[i]
                z = x.child[i + 1]
                if len(y.keys) >= self.t:
                    predecessor = self.get_predecessor(y)
                    x.keys[i] = predecessor
                    self.delete_recursive(y, predecessor)
                elif len(z.keys) >= self.t:
                    successor = self.get_successor(z)
                    x.keys[i] = successor
                    self.delete_recursive(z, successor)
                else:
                    self.merge_nodes(x, i, y, z)
                    self.delete_recursive(y, k)
        else:
            if x.leaf:
                print(f"Key {k} does not exist in the B-tree.")
            else:
                if len(x.child[i].keys) < self.t:
                    self.fix_child(x, i)

                self.delete_recursive(x.child[i], k)

    def get_predecessor(self, x):
        while not x.leaf:
            x = x.child[-1]
        return x.keys[-1]

    def get_successor(self, x):
        while not x.leaf:
            x = x.child[0]
        return x.keys[0]

    def merge_nodes(self, x, i, y, z):
        y.keys.append(x.keys[i])
        y.keys.extend(z.keys)
        y.child.extend(z.child)
        x.keys.pop(i)
        x.child.pop(i + 1)

        if len(x.keys) == 0:  
            self.root = y

    def fix_child(self, x, i):
        if i > 0 and len(x.child[i - 1].keys) >= self.t:
            self.borrow_from_left(x, i)
        elif i < len(x.child) - 1 and len(x.child[i + 1].keys) >= self.t:
            self.borrow_from_right(x, i)
        else:
            if i > 0:
                self.merge_nodes(x, i - 1, x.child[i - 1], x.child[i])
                i -= 1
            else:
                self.merge_nodes(x, i, x.child[i], x.child[i + 1])

    def borrow_from_left(self, x, i):
        child = x.child[i]
        sibling = x.child[i - 1]

        child.keys.insert(0, x.keys[i - 1])
        x.keys[i - 1] = sibling.keys.pop()
        if not child.leaf:
            child.child.insert(0, sibling.child.pop())

    def borrow_from_right(self, x, i):
        child = x.child[i]
        sibling = x.child[i + 1]

        child.keys.append(x.keys[i])
        x.keys[i] = sibling.keys.pop(0)
        if not child.leaf:
            child.child.append(sibling.child.pop(0))

    def search(self, k, x=None):
        x = x or self.root
        if isinstance(x, BTreeNode):
            i = 0
            while i < len(x.keys) and k > x.keys[i]:
                i += 1
            if i < len(x.keys) and k == x.keys[i]:
                return x, i
            elif x.leaf:
                return None
            else:
                return self.search(k, x.child[i])
        else:
            return self.search(k, self.root)

    def display(self, x=None):
        x = x or self.root
        for i in x.keys:
            if i != (None, None):
                print(i, end=" ")
        if not x.leaf:
            for i in x.child:
                self.display(i)


btree = BTree(3)
keys = [10, 20, 30, 40]
for key in keys:
    btree.insert(key)

btree.display()

key = 20
result = btree.search(key)
if result:
    print(f"\nFound {key}")
else:
    print(f"\nNot Found {key}")

btree.delete(30)
print("After deleting 30")
btree.display()

Output

10 20 30 40 
Found 20
After deleting 30
10 20 40

Time and Space Complexity

Insertion

Time Complexity   : O(log n) , n is the total number of keys in the tree
Space Complexity  : O(n) , n is the total number of keys in the tree

Deletion

Time Complexity   : O(log n) , n is the total number of keys in the tree
Space Complexity  : O(n) , n is the total number of keys in the tree

Searching

Time Complexity   : O(log n) , n is the total number of keys in the tree
Space Complexity  : O(n) , n is the total number of keys in the tree

Quiz

Question 1

Time Complexity of B Tree insertion operation is?

O(log n)
O(1)
O(n)
O(n/2)
Time Complexity : O(log n) , n is the total number of keys in the tree

Question 2

The B-tree is well suited for storage systems?

True
False
Unlike other self-balancing binary search trees, the B-tree is well suited for storage systems that read and write relatively large blocks of data, such as databases and file systems.

MATHANKUMAR V

Mathankumar V is the Winner of Smart India Hackathon (2022) and Software Developer, Intern at OpenGenus. He is pursuing BE in Computer Science from Dr. Mahalingam College of Engineering and Technology

Read More

Improved & Reviewed by:


OpenGenus Tech Review Team OpenGenus Tech Review Team
B Tree in Python using OOP Concepts
Share this