# Y fast trie

#### Data Structures tree data structure trie y fast trie x fast trie

Reading time: 35 minutes | Coding time: 15 minutes

Y-fast trie is a data structure used to store integers from a bounded domain. It is a bitwise trie, i.e. a binary tree where each subtree stores values having binary representations with common prefix.

It is considered as an upgrade to X-fast tries, allowing more efficient memory usage and faster insertion/deletions.Y-fast trie also treats integers as if they were words of w bits which allows storing integers as a trie of words.

It is expected to perform five operations:

• find(x): Find x in the given trie.
• predecessor(x): Return the largest element in domain less than or equal to x.
• successor(x): Return the smallest element in domain less greater or equal to x.
• insert(x, v): Insert an element x into trie pointing to value v.
• delete(x): Delete an element x from trie.

Y-fast tries consists of two data structures:

• X-fast tries
• balanced binary search trees

Keys are divided into logU groups of consecutive elements, with U being the universe size. A balanced bst is created for each group, with each group being of atleast logU / 4 elements and atmost 2logU elements.

For every binary search tree, a representative r is chosen and all these representatives are stored in a X-fast trie. r has to be smaller than it's successor in X-fast trie and all elements in the bst represented by the successor, as well as greater than its predecessor and all elements in its bst. r does not necessarily have to be one of the element in the bst it represents. Image above shows a Y-fast trie with universe size M.
Y-fast trie starts as a X-fast trie. The major change is that we operate on representative values r in X-fast tries, and the leaf nodes points to balanced binary search trees rather than values.

### Algorithm

Y Fast trie supports the following operations:

• find
• successor and predecessor
• insert
• delete

### Find operation

• find(k): Find the element k in trie.
1. Find the predecessor and successor representatives of k in the X-fast
trie, i.e. find largest representative smaller than k and find smallest
representative larger than k in representative X-fast trie.
k can be only in the bst's represented by them.
2. Binary search for k in the binary search trees.

### Successor and Predecessor operations

• successor(k) and predecessor(k): Successor is the node with smallest element greater than or equal to k, while predecessor is largest element smaller than or equal to k in trie. This node will have longest common prefix with k.
1. Find the successor and predecessor representatives of k in the X-fast trie.
2. Traverse both trees to get two potential successor (or predecessor) of k.
Select the appropriate of the two options. 

### Insert operation

• insert(k, v): To insert the key value pair (k, v)
First determine in which balanced bst should k be inserted
1. Find the successor of k. Determine the bst in which the successor is stored.
Let the tree be T.
2. Insert k into tree T, with the node pointing to v.
3. If T has more than 2log2U elements, remove its representative and
split the tree into two balanced bst's. Pick a representative for each tree
and insert them into X-fast trie.


### Delete operation

• delete(k):
1. Find the key k and delete it. Let the tree that contained it be T.
2. If the elements count of T goes lower than (log2U)/4, merge tree
T with its predecessor or successor while removing representatives of merged
trees from X-fast trie.
3. Insert representative of new tree into X-fast trie.
4. If the newly formed tree has more than 2log2U elements, remove its
representative and split the tree into two balanced bst's. Pick a representative
for each tree and insert them into X-fast trie.

### Complexity

• Time complexity:
1. Find: O(1)
2. Successor, Predecessor: O(log2log2U)
3. Insert: O(log2U)
4. Delete: O(log2U)
• Space complexity: O(N log2U)

Where N is count of values stored and U is universe size.

### Implementations

C++ 11

/*
/*
* c++ 11 code to demonstrate a Yfast trie. A proper would be
* humongous, so this code should be taken just as an example.
*/

#include <iostream>
#include <unordered_map>
#include <vector>
#include <map>
#include <iterator>

struct Node{
int level;

int key;

// also used by leaf nodes to simulate linked list
Node *left, *right;

Node(){
level = -1;
left = nullptr;
right = nullptr;
}
};

class XfastTrie{
int w;
std::vector<std::unordered_map<int, Node *>> hash_table;

int getDigitCount(int x){
int count = 0;
for(; x > 0; x >>= 1, ++count);
return count;
}

int leftChild(int x){
return (x << 1);
}

int rightChild(int x){
return ((x << 1) | 1);
}

Node * getLeftmostLeaf(Node *parent){
while(parent->level != w){
if(parent->left != nullptr)
parent = parent->left;
else
parent = parent->right;
}
return parent;
}

Node * getRightmostLeaf(Node *parent){
while(parent->level != w){
if(parent->right != nullptr)
parent = parent->right;
else
parent = parent->left;
}
return parent;
}

public:
XfastTrie(){}

XfastTrie(int U){
w = getDigitCount(U);
hash_table.assign(w + 1, std::unordered_map<int, Node *>());
Node *root = new Node();
root->level = 0;
hash_table = root;

}

Node* find(int k){
if(hash_table[w].find(k) == hash_table[w].end())
return nullptr;
return hash_table[w][k];
}

Node* successor(int k){
int low = 0,
high = w + 1,
mid, prefix;

Node *tmp = nullptr;
while(high - low > 1){
mid = (low + high) >> 1;
prefix = k >> (w - mid);
if(hash_table[mid].find(prefix) == hash_table[mid].end())
high = mid;
else{
low = mid;
tmp  = hash_table[mid][prefix];
}
}

if(tmp == nullptr || tmp->level == 0)
// occours on first insertion
return nullptr;

if(tmp->level == w)
return tmp;

// use descendant node
if((k >> (w - tmp->level - 1)) & 1)
tmp = tmp->right;
else
tmp = tmp->left;

if(tmp->key < k){
return tmp->right;
}
return tmp;
}

Node* predecessor(int k){
// completely same as successor except lst section
int low = 0,
high = w + 1,
mid, prefix;

Node *tmp = nullptr;
while(high - low > 1){
mid = (low + high) >> 1;
prefix = k >> (w - mid);
if(hash_table[mid].find(prefix) == hash_table[mid].end())
high = mid;
else{
low = mid;
tmp  = hash_table[mid][prefix];
}
}

if(tmp == nullptr || tmp->level == 0)
// occours on first insertion
return nullptr;

if(tmp->level == w)
return tmp;

// use descendant node
if((k >> (w - tmp->level - 1)) & 1)
tmp = tmp->right;
else
tmp = tmp->left;

if(tmp->key > k){
return tmp->left;
}
return tmp;
}

void insert(int k){
Node *node = new Node();
node->key = k;
node->level = w;

Node *pre = predecessor(k);
Node *suc = successor(k);
if(pre != nullptr){
if(pre->level != w){
std::cout << "Wierd level " << pre->level << '\n';
}
node->right = pre->right;
pre->right = node;
node->left = pre;
}
if(suc != nullptr){
if(suc->level != w){
std::cout << "Wierd level " << suc->level << '\n';
}
node->left = suc->left;
suc->left = node;
node->right = suc;
}

int lvl = 1, prefix;
while(lvl != w){
prefix = k >> (w - lvl);
if(hash_table[lvl].find(prefix) == hash_table[lvl].end()){
Node *inter = new Node();
inter->level = lvl;
hash_table[lvl][prefix] = inter;
if(prefix & 1)
hash_table[lvl - 1][prefix >> 1]->right = inter;
else
hash_table[lvl - 1][prefix >> 1]->left = inter;
}
++lvl;
}
hash_table[w][k] = node;
if(k & 1)
hash_table[w - 1][k >> 1]->right = node;
else
hash_table[w - 1][k >> 1]->left = node;

// update descendant pointers
prefix = k;
lvl = w - 1;
while(lvl != 0){
prefix = prefix >> 1;
if(hash_table[lvl][prefix]->left == nullptr)
hash_table[lvl][prefix]->left = getLeftmostLeaf(hash_table[lvl][prefix]->right);
else if(hash_table[lvl][prefix]->right == nullptr)
hash_table[lvl][prefix]->right = getRightmostLeaf(hash_table[lvl][prefix]->left);
--lvl;
}
if(hash_table->left == nullptr){
hash_table->left = getLeftmostLeaf(hash_table->right);
}
if(hash_table->right == nullptr){
hash_table->right = getRightmostLeaf(hash_table->left);
}
}
};

class BinarySearchTree{
public:
std::map<int, int> tree;

void insert(int k, int val){
tree[k] = val;
}

int successor(int k){
std::map<int, int> :: iterator tmp = tree.lower_bound(k);
if(tmp == tree.end())
return -1;
else
return tmp->first;
}

int predecessor(int k){
std::map<int, int> :: iterator tmp = tree.upper_bound(k);
if(tmp == tree.begin())
return -1;
tmp = std::prev(tmp);
return tmp->first;
}
};

class YfastTrie{
std::unordered_map<int, BinarySearchTree> bst;
XfastTrie xtrie;
int w;

int getDigitCount(int x){
int count = 0;
for(; x > 0; x >>= 1, ++count);
return count;
}

public:
YfastTrie(int u){
w = getDigitCount(u);
xtrie = XfastTrie(u);
}

int find(int k){
Node *suc = xtrie.successor(k);
Node *pre = xtrie.predecessor(k);

if(bst[suc->key].tree.find(k) != bst[suc->key].tree.end())
return bst[suc->key].tree[k];
if(bst[pre->key].tree.find(k) != bst[pre->key].tree.end())
return bst[pre->key].tree[k];

return -1;
}

int successor(int k){
Node *suc = xtrie.successor(k);
Node *pre = xtrie.predecessor(k);
// used as infinite here
int x = 2 << 2, y = 2 << w;

if(suc != nullptr)
x = bst[suc->key].successor(k);
if(pre != nullptr)
y = bst[pre->key].successor(k);

return (x < y) ? x : y;
}

int predecessor(int k){
Node *suc = xtrie.successor(k);
Node *pre = xtrie.predecessor(k);
int x = -1, y = -1;
if(suc != nullptr)
x = bst[suc->key].predecessor(k);
if(pre != nullptr)
y = bst[pre->key].predecessor(k);

return (x > y) ? x : y;
}

void insert(int k, int val){
Node *suc = xtrie.successor(k);
if(suc == nullptr){
xtrie.insert(k);
// representative can be anything. using first element as
// representative requires length of xfast trie to be u.
bst[k] = BinarySearchTree();
bst[k].tree[k] = val;
}
else{
int succ = suc->key;
std::cout << succ << '\n';
bst[succ].tree[k] = val;
}
}
};

int main(){
YfastTrie trie(1 << 5);
std:: cout << "insert 1, 5, 11, 12\n";
trie.insert(5, 2);
trie.insert(11, 2);
trie.insert(12, 2);
trie.insert(1, 2);

std::cout << "Successor of key 2:\n";
int tmp = trie.successor(2);
if(tmp != -1){
std::cout << tmp << '\n'
<< "value stored = " << trie.find(tmp) << '\n';
}

std::cout << "Predecessor of key 13:\n";
tmp = trie.predecessor(13);
if(tmp != -1){
std::cout << tmp << '\n'
<< "value stored = " << trie.find(tmp) << '\n';
}

return 0;
}


### Applications

• Y-fast tries are used extensively in database systems.
• Y-fast tries are extremely fast associative array.
• Y-fast tries have replaced X-fast tries in almost all applications, for example, IP routing. 