Node structure of different trees in Python


Trees are abstract and hierarchial data structures. In this article, we have discussed the implementation of node structures of different trees in Python Programming language. This gives a great opportunity to understand how different tree data structure vary. The trees we have covered are:

  • Binary Tree
  • Binary Search Tree (BST)
  • AVL Tree
  • B Tree
  • B+ Tree
  • Red-Black Tree
  • Treap
  • Splay Tree
  • R Tree

We will go through each node structure in detail.

1. Binary Tree

A binary tree is a tree with the special condition of a maximum of two children nodes. Each node in a binary tree has a left reference, a right reference and a data element. A non-empty tree always has a root node. Root node is either the topmost or the bottom node in the tree, depending on the representation. If the representation is top-down, root is the topmost node, else if the representation is bottom-up, root is the bottom node. Leaf nodes have null values in their right and left link. In a full binary trees, every node except leaf node has two children.

class BinaryNode:

	# has one data value in each node.
	def __init__(self, data):

		self.data = data
		self.right = None
		self.left = None

2. Binary Search Tree (BST)

As the name suggests, we can easily carry out binary search on this tree. A BST is a binary tree with elements stored in a sorted manner; the value stored at the root is more than any value stored in its left subtree and less than any value stored in its right subtree. The BST property must be satisfied for all nodes on tree and their subtrees. The node structure of BST is similar to that of binary tree.

class BSTNode:

	# has one data value in each node.
	def __init__(self, data):

		self.data = data
		self.right = None
		self.left = None

3. AVL Tree

AVL tree is also a self Balancing Binary Search Tree. In an AVL tree, the difference between heights of left and right subtrees cannot be more than one for all nodes. Since the height of the tree remains O(Logn) after every insertion and deletion, we have an upper bound of O(Logn) for all other operations.

The node structure of an AVL tree has an additional parameter namely node height which is used to calculate the balance factor for itself.

class AVLNode:

	# has one data value in each node.
	def __init__(self, data):

		self.data = data
		self.right = None
		self.left = None

		# height signifies its distance from root node.
		self.height = 1

4. B Tree

B-Tree is a self-balancing search tree. The main idea of using B-tree is to reduce the number of disc accesses. All leaves are at the same level in a B-Tree. A B-Tree has a term minimum degree "t" which depends upon disc block size. Every node in a B-Tree has a variable number of keys, between t-1 and 2t-1, except the root node. The number of children of a node is equal to the number of keys in it plus 1.

class BTreeNode:

	# Every node contains minimum t-1 keys and maximum 2t - 1 keys
	# Except root node which can have minimum 1 key.
	def __init__(self, t):

		self.t = t

		# it is true when node is leaf, otherwise false.
		self.isLeaf = True

		# current number of keys.
		self.keys = []
		self.childPointers = []
		self.numberOfKeys = 0

5. B+ Tree

Unlike B Tree, B+Tree does not store data pointers along with its key values in all the nodes; it only stores pointers at the leaf nodes of the tree. Moreover, the leaf nodes are linked to provide ordered access to the records. Therefore, all the nodes form a multilevel index with leaf nodes at the first level.
Thus, the structure of the leaf nodes of a B+ tree is quite different from the structure of the internal nodes of the B tree.

Note: In B Tree, Keys and records both can be stored in the internal as well as leaf nodes. Whereas, in B+ tree, records (data) can only be stored on the leaf nodes while internal nodes can only store the key values.

class BplusTreeNode:

	def __init__(self, t):
		self.t = t
		
		# it is true when node is leaf, otherwise false.
		self.isLeaf = True
		self.next = None

		# current number of keys
		self.keys = []
		self.childPointers = []
		self.numberOfKeys = 0

6. Red-Black Tree

It is a self-balancing Binary Search Tree with every node either red or black, except the root node which is always black. If a node is red, both of its children are black. This means no two nodes on a path can be red nodes. Every path from a root node to a NULL node has the same number of black nodes.

The Node structure is similar to a Binary Search Tree with an extra parameter interpreted as color, red or black. These color bits are used to ensure the tree remains approximately balanced during insertions and deletions.

Note: AVL trees cause more rotations than a Red-Black tree but are more balanced than the latter. Hence if a Red-Black tree is preferred in applications with frequent insertions and deletions. And AVL trees are preferred in applications with frequent search.

class RedBlackTreeNode:

	# similar to Binary search tree
	def __init__(self, data):

		self.data = data

		# 1 for Red and 0 for Black
		self.colour =  1 
		self.left = None
		self.right = None

7. Treap

Formally, a treap is a binary tree whose nodes contain two values, a key and a priority, such that the key keeps the BST property and the priority is a random value that keeps the heap property. It uses randomization and binary heap property to maintain balance with a high probability. The key follows the binary search property while the priority follows the heap property.

class TreapNode:

	def __init__(self, key, priority):
		self.key = key
		self.priority = priority
		self.left = None
		self.right = None

8. Splay Tree

Splay tree is a self-balancing BST. All the operations in a splay tree are performed at its root. Whenever an element is looked up, the tree reorganizes to bring that element to the root, without violating the BST property. Thus, the elements which are heavily used are found near the top and hence can be found quickly. The node structure is similar to that of BST.

class SplayNode:

	# has one data value in each node.
	def __init__(self, data):

		self.data = data
		self.right = None
		self.left = None

9. R Tree

R-tree is used for storing spatial data indexes efficiently i.e. for indexing multi-dimensional information in an efficient manner. Each node of an R-tree has a variable number of entries(up to some pre-defined maximum) and each entry within a non-leaf node stores pointer to a child node, and the bounding box of all entries within this child node. Each entry within a leaf node stores the actual data element and the bounding box of the data element. The node structure is similar to that of B tree.

class RTreeNode:

	def __init__(self, maxEntries):

		self.maxEntries = maxEntries
		self.entries = []
		self.childPointers = []

		# for leaf node
		self.isLeaf = True
		self.data = None

With this article at OpenGenus, you must have the complete idea of different trees and how implementation in Python translates the idea. Consider contrasting the different node structures we have presented among each other. This will bring out some insightful thoughts.