Range greatest common divisor (GCD) query using Sparse table

Open-Source Internship opportunity by OpenGenus for programmers. Apply now.

In this article at OpenGenus, we will be discussing one of the common problem in computer science that involves finding the GCD of a given range of numbers in an array. One of the efficient solution is to use sparse table data structure to solve this problem.

Table of contents

What is GCD, Range GCD
Brute Force
What is a sparse table and how does it work
Finding the range GCD
Complexities
Advantage and disadvantage of sparse table
Conclusion

What is GCD, Range GCD

The Greatest Common Divisor (GCD) is a basic and a very important mathematical concept that refers to the largest positive integer that divides two or more integers without leaving a remainder. Example: GCD of 9 and 6 is 3 because 3 is the largest positive integer that can divide both 9 and 6 without leaving a remainder.

In computer science, GCD is often used as a basic building block for more complex algorithms. Here we will be discussing one of the common problem that involves GCD, which is Range GCD where we have to compute the GCD of a given range of integers in an array.

The range GCD is one of those problems that arises in many applications, such as signal processing, image processing etc. In this problem we will be finding the GCD of a range of integers in an array, where the range is defined by two indices. If l and r are the indices of the range then we have to find the GCD of A[l], A[l+1], A[l+2], ..., A[r].

Example:
If the array elements are:  8, 64, 80, 88, 196, 200
The answers for the querites should be:
    | Query | GCD of the range |
    | 0, 2  |        8         |
    | 2, 5  |        4         |
    | 0, 5  |        4         |
    | 3, 5  |        4         |

Brute Force

The brute force approach to solve the Range GCD query problem is to simply iterate over all pairs (i, j) in the given range and find the GCD of the elements between them. The time complexity of this approach would be O(q * n^2), where q is the number of queries and n is the size of the array, since we need to compute the GCD for each query and for each pair (i, j). The space complexity would be O(1), since we don't need to store any additional data structures. This approach is not feasible for large values of n, since the time complexity is quadratic. Hence we need a better approach.

What is a sparse table and how does it work

Sparse table is a data structure that is used in computer science to solve range query problems efficiently. It is really useful if the input array is static (meaning it doesn't change after being initialized).

A sparse table is a 2-dimensional array, where each cell (i, j) stores the result of some operation like min, max, GCD etc, over a subrange of length 2^j starting from index i. The key idea here is to pre-compute all the possible subranges of a power of two, and then these values will be used to answer the range queries on the original array.

To make a sparse table we will create a matrix and then copy the elements of the array to the first column. After this for the subsequent column, we use the adjacent pairs of cells from the previous column to do the operation and to calculate the further values. This process is repeated for each column until we have filled the entire table.

The time complexity of constructing a Sparse Table for an array of size n is O(n log n), and the time complexity of answering a single range query is O(1). Therefore, the total time complexity of answering m range queries on an array of size n using a Sparse Table is O(n log n + m). This makes Sparse Tables an efficient solution to many range query problems.

Finding the range GCD

Algorithm

Build a Sparse Table for the input array A using the GCD as the operation.
To answer the range GCD query for a range [L, R] in array A:
- We will compute k=floor(log2(R-L+1))
- Then we will split the range into two sub ranges [L, L+2^k-1] and [R-2^k+1, R]
- Compute the GCD off the two subranges by querying the sparse table
- Take the GCD of the results from the two subranges to get the final answer.

Code

#include <bits/stdc++.h>
using namespace std;

const int MAXN = 1e5 + 5;
const int MAXLOG = 25;

int n, q;
int a[MAXN];
int st[MAXN][MAXLOG];

void buildSparseTable() {
    for (int i = 0; i < n; i++)
        st[i][0] = a[i];
    for (int j = 1; (1 << j) <= n; j++)
        for (int i = 0; i + (1 << j) <= n; i++)
            st[i][j] = __gcd(st[i][j - 1], st[i + (1 << (j - 1))][j - 1]);
}

int rangeGCD(int l, int r) {
    int k = log2(r - l + 1);
    return __gcd(st[l][k], st[r - (1 << k) + 1][k]);
}

int main() {
    cin >> n >> q;
    cout<<"Array elements: ";
    for (int i = 0; i < n; i++)
        cin >> a[i];
    buildSparseTable();
    cout<<"Ranges:"<<endl;
    while (q--) {
        int l, r;
        cin >> l >> r;
        cout<<"GCD of the range is: " << rangeGCD(l, r) << endl;
    }
    return 0;
}

Output

6
4
Array elements: 8
64
80
88
196
200
Ranges:
0
2
GCD of the range is: 8
2
5
GCD of the range is: 4
0
5
GCD of the range is: 4
3
5
GCD of the range is: 4

Explanation

First inside the int main() we will accept the number of elements in the array n, number of queries q and the array itself. Then we will build the sparse table using buildSparseTable().
Inside the buildSparseTable() we will use a for loop to copy the first column to be the elements in the array, then we will use a nested for loops to fill in the remaining values of the table.
Inside that loops we are using 1 << j, which represents the size of the interval we are processing. The idea behind the sparse table is to pre-compute the range GCD for all the intervals of length 2^j in the input array, where j ranges from 0 to log2(n). We can then compute the Range GCD for any interval [L, R] by combining precomputed values for non-overlapping intervals of length 2^j that cover [L, R].
In each iteration of the outer for loop, we are processing intervals of length 2^j. The expression 1 << j is equivalent to 2^j, which is the length of the interval we are currently processing.
For example, when j = 0, we are processing intervals of length 2^0 = 1, which are just individual elements in the array. When j = 1, we are processing intervals of length 2^1 = 2, which are pairs of adjacent elements in the array. And so on, until we reach j = log2(n), where we are processing the entire input array as a single interval.
By using 1 << j inside the for loop, we can easily compute the length of the interval we are processing at each iteration, without having to manually compute 2^j each time.
Inside these loops we use st[i][j] = __gcd(st[i][j - 1], st[i + (1 << (j - 1))][j - 1]); to get the GCD of that particular range. The GCD is of the starting and ending of the range with size 1 << j
After computing the GCD, we will now accept the ranges and then display it using rangeGCD(). Inside the rangeGCD() we will calculate k, using the formula k = log2(r - l + 1); here r-l+1 would be the size of the range, and k is the largest power of 2 so that 2^k<=size of the range.
Now we will calculate the indices of the two subintervals of length 2^j that cover the range a[l,r]. These subintervals do not overlap and together cover the entire range [l,r]. The first interval starts at index l and its end would be l+2^j-1. The second interval starts at index r-2^j+1 and end at index r. Now we will return the GCD of these two numbers we obtain. This would give us the GCD of the entire range we queried.
Now we will print the returned GCD and we will continue this till all queries are accepted and results are displayed.

Advantage and disadvantage of sparse table

Advantages

The Sparse Table provides a simple and efficient way to pre-process an array for Range Minimum/Maximum/GCD queries, allowing these queries to be answered in O(1) time.
The construction of the Sparse Table has a time complexity of O(n log n), which is better than some other data structures, such as segment trees, which have a time complexity of O(n log n) for both construction and query answering.
The Sparse Table requires less memory than a segment tree, since it only needs O(n log n) space, compared to the O(n) space required by a segment tree.

Disadvantages

The Sparse Table is not dynamic, which means that it cannot be updated easily if the input array changes whereas, a segment tree can be updated in O(log n) time.
The Sparse Table cannot be used for other types of queries, such as Range Sum queries, without significant modifications to the code.
The Sparse Table is not as widely known or used as some other data structures, such as segment trees or Fenwick trees, so it may be more difficult to find existing implementations or resources for learning.

Conclusion

To conclude this article at OpenGenus, the Sparse Table data structure provides an efficient way to pre-process an array for Range GCD queries. Building the Sparse Table takes O(n log n) time, while answering each query takes O(1) time. The space complexity of the program is O(n log n), which is less than the space required by some other data structures, such as segment trees. However, the Sparse Table is not dynamic and cannot be updated easily, and it is not as widely known or used as some other data structures. Overall, the choice of data structure depends on the specific constraints of the problem, and the Sparse Table is a viable option for Range GCD queries if memory usage is a concern.