Maximum XOR of two numbers in an array using Trie

Internship at OpenGenus

Get FREE domain for 1st year and build your brand new site

Given a list of numbers we need to identify a pair of numbers in the list such that the XOR of those numbers is the maximum possible over all the pairs. If there are M elements with each element of order N, then brute force approach will take O(M^2 logN) time while Trie solution will take O(M logN) time.

Introduction

Given an array A of N integers, we need to find A[i], A[j] such that A[i] XOR A[j] is the maximum over all i and j, where 1<=i<=|A|, 1<=j<=|A| and iā‰ j.

Here, by XOR operation, x XOR y, we mean the bitwise XOR of each corresponding bits of x and y. The number of integers in the list, N, must be atleast 2.

XOR operation

XOR of two bits b1 and b2 is a binary logical operation that results in 1 iff the bits are different. It results in a 0 if the bits are the same.
xor_tt

Trie

A trie is a data structure that uses the digits in the keys to organize and search the dictionary. Usually these are used for strings where the strings are keys and the digits would be letters. For our problem, we will consider the binary representation of the numbers as keys and the digits thus are 0 and 1.

The root is usually an empty key. The child nodes of any of the parent node have a common prefix associated with that particular parent node.
trie_example-1
An example trie stores keys: 0, 3, and 4

For bitwise operations since we represent bits using the position of nodes, this is sometimes called bitwise trie

O(NlogN) solution using trie

Construct a trie

For each integer key, each node of the trie will represent one bit (0 or 1) starting from most significant bit. Since there are only two possible bits, the number of children of a trie node will be at most 2. The root node will have an arbitrary value.

Query

Now for each element A[i] of A[1, 2, ...N]
Perform query to retrieve the maximum XOR value possible for A[i]. We know that XOR of different type of bits is always 1, i.e., 0^1=1 and 1^0=1.

So during the query for each bit:

  • we try to traverse the node holding the opposite bit. This will thus result in a 1 bit for that particular bit, thus has the effect of maximizing the XOR value.
  • If there is no node with opposite bit, only then traverse the same bit node, which will result in a 0.

Following is the pseudocode:

query( key, otherKey )
    start from root node
    xor = 0;
    for each bit b in key
    {
        if(child node has opposite bit)
        {
            traverse to the child node of opposite bit
            turn ON ith bit of XOR
            if b is OFF
                turn ON ith bit of otherKey
            else
                turn OFF ith bit of otherKey
        }
        else{
            if b is ON
                turn ON ith bit of otherKey
            else
                turn OFF ith bit of otherKey
            traverse to the child node of value b
        }
    }
    output XOR;

Insertion

After query, insert A[i] into trie. Traverse down the trie, corresponding to bit positions of A[i]. If node already exists for a bit then we simply traverse, if it doesn't exist then create a corresponding new node at that position.

For each element, keep track the maximum XOR value possible.
During walking through each node, build the other key for which the XOR is being maximized. Thus, for A[i], we check the maximum possible XOR element and then compare it with our existing maximum and update if it is greater than it.

Following is the pseudocode for Insert operation:

insert( key )
    start from root node
    for each bit b in key
    {
        if current node does not have a child with bit b
            create new child node based on value of b
        traverse to the child node
    }

Insight: a note on bit operations

Turn on/off the ith bit of k
To turn the ith bit of k into 1, we do an OR operation. such that the ith bit will be 1 but the rest are not affected. 0 OR b = b, and 1 OR b = 1 so
k = k | (1<<i) makes the ith bit of k as 1

To turn the ith bit of k into 0, we do an AND operation. such that the ith bit will be 0 but the rest are not affected. 0 AND b = 0, and 1 AND b = b so
k = k & ~(1<<i) makes the ith bit of k as 0.

Algorithm

The steps to find the maximum XOR using Trie is as follows:

  • Step 1: Insert all M elements (bitwise representation) into trie
  • Step 2: For each element N:
    • Step 2.1: Find the maximum XOR in the trie (say L2)
    • Step 2.2: If L1 < L2, then update L1 = L2
  • Step 3: The maximum value of L1 is the answer
maxXorElements( A )
    maxXOR = 0
    a = 0
    b = 1
    trie.insert(0);
    for each element k in A
        elem = 0;
        curr = trie.query(k, elem);
        if curr > maxXOR
            maxXOR = curr;
            a = k
            b = elem
        trie.insert(k);
    output a, b

Time complexity

For M elements where every element is of order O(N), we need one query of (O(logN)) and one insertion of (O(logN)) for each element.

So the overall time complexity is O(M logN) as we need to insert M elements and find the maximum XOR for each element.

Time complexity of different steps:

  • Step 1: O(M logN)
  • Step 2.1: O(logN)
  • Step 2: O(M logN)
  • Step 3: O(1)

Brute force approach would have taken O(N^2) time as we need to check for every pair and there are O(N^2) pairs. Additionally, if we consider the number of bits as it is considered in the trie solution, then the time complexity of our Brute force approach will be O(N^2 logN).

O(logN) is the number of bits in a number N.

Pseudocode

Following is the complete pseudocode with implementation details:

void insert(key) {
    curr = root;
    for(i = MSB; i >= 0; --i) {
        bit = key & (1 << i);
        if((curr->children[bit]) == 0)
            curr->children[bit] = new TrieNode();
        curr = curr->children[bit];
    }
}

int query(int key, int& otherKey) {
    XOR = 0;
    curr = root;
    for(i = MSB; i >= 0; --i) {
        bit = key & (1 << i);
        if(curr->children[!bit]) {
            curr = curr->children[!bit];
            XOR |= (1 << i);
            if(bit == 0) otherKey |= (1 << i);
            else otherKey &= ~(1 << i);
        }
        else{
            if(bit == 1) otherKey |= (1 << i);
            else otherKey &= ~(1 << i);
            curr = curr->children[bit];
        }
    }
    return XOR;
}

findMaximumXorElements(vector<int>& A) {
    maximumXOR = 0;
    result = {  } 
    if(A.size() < 2) return result;

    trie.insert(0);
    for(i = 0; i < A.size(); ++i) {
        elem = 0;
        curr = trie.query(A[i], elem);

        if(curr > maximumXOR) {
            maximumXOR = curr;
            result = { A[i], elem };
        }
        trie.insert(A[i]);
    }
    return result;
}

Implementation

Here is C++ implementation of the above algorithm:

#include<iostream>
#include<utility>
#include<vector>
using namespace std;

const static int SIZE = 2;
const static int MSB = 30;

class Trie {
private:
    struct TrieNode {
        TrieNode* children[SIZE];
        TrieNode(){
            for(int i = 0; i < SIZE; ++i) {
                children[i] = nullptr;
            }
        }
        ~TrieNode() {
            for(int i = 0; i < SIZE; ++i) {
                delete children[i];
                children[i] = nullptr;
            }
        }
    };
    TrieNode* root;

public:
    Trie(): root(new TrieNode()) {}
    
    ~Trie() {
        delete root;
        root = nullptr;
    }

    void insert(int key) {
        TrieNode* curr = root;

        for(int i = MSB; i >= 0; --i) {
            bool bit = (bool)(key & (1 << i));
            if(!curr->children[bit]) {
                curr->children[bit] = new TrieNode();
            }
            curr = curr->children[bit];
        }
    }

    int query(int key, int& otherKey) {

        int XOR = 0;
        TrieNode *curr = root;
        for(int i = MSB; i >= 0; --i) {
            bool bit = (bool)(key & (1 << i));
            if(curr->children[!bit]) {
                curr = curr->children[!bit];
                XOR |= (1 << i);
                if(!bit) { otherKey |= (1 << i); }
                else { otherKey &= ~(1 << i);}
            }
            else{
                if(bit) { otherKey |= (1 << i); }
                else { otherKey &= ~(1 << i); }
                curr = curr->children[bit];
            }
        }
        return XOR;
    }
};

pair<int, int> findMaximumXorElements(vector<int>& A) {
    int n = A.size();
    int maximumXOR = 0;
    pair<int, int> result; 
    if(n < 2) return result;

    Trie trie;
    trie.insert(0);
    for(int i = 0; i < n; ++i) {
        int elem = 0;
        int curr = trie.query(A[i], elem);

        if(curr > maximumXOR) {
            maximumXOR = curr;
            result = make_pair(A[i], elem);
        }
        trie.insert(A[i]);
    }

    return result;
}

int main()
{
    vector<int> v({13,18,19,12});
    pair<int,int> s = findMaximumXorElements(v);
    cout << s.first << ", " << s.second << endl;
    // possible solutions: (13,18), (12,19)

    return 0;
}

Example

For the sake of this example consider MSB = 8, so that it is easier to follow.

Consider the elements {13,18,19,12}. We need to find two elements such that their XOR is maximum over all the element pairs.
Initially a 0 key is inserted into the trie. The global maximum XOR is 0.
We consider each element in the list. The first query is with key = 13. Since the trie has only 0 in it, the highest XOR value is with 0, which is 13. 13 is greater than the global max 0, so we update that. We insert the key = 13 into the trie.

Next 18 is considered and the trie has 0 and 13. 18 is XORed with each of 0 and 13. The highest XOR we get is with 13. The XOR value is 31. Since this is higher than the global highest value of 13, we update that. We insert the key = 18 into the trie.

Next 19 is considered and the trie has 0, 13 and 18. The key 19 is XORed with each of 0, 13, and 18. The highest XOR we get is with 13. The XOR value is 30. This is not greater than the global highest so we don't update that. We insert the key = 19 into the trie.

Next 12 is considered and the trie has 0, 13, 18 and 19. The key 12 is XORed with each of 0, 13, 18 and 19. The highest XOR we get is with 19. The XOR value is 31. This is not greater than the global highest so we don't update that. We insert the key = 12 into the trie.

Thus the maximum value of XOR was 31 and was first obtained with keys 18 and 13.

The final trie for this instance would look something like this
trie

The code above computes one possible solution. There can be multiple solutions, for example in this instance (12,19) also gives a XOR value of 31.

Conclusion

Tries can be used to solve the maximum XOR of two numbers over an array and allows us to do it in O(NlogN) time. The implemented trie supports insertion of keys and query of the maximum XOR of a value with all key values stored in the trie.

Rohit Topi

Intern at OpenGenus | B.Tech Computer Science Student at KLS Gogte Institute Of Technology Belgaum | Contributor to OpenGenus book: "Binary Tree Problems: Must for Interviews and Competitive Coding"

Read More