×

Search anything:

# Mean of Array problem solved with Treap

#### Data Structures Algorithms tree data structure Get this book -> Problems on Array: For Interviews and Competitive Programming

In this article at OpenGenus, we have solved the Mean of Array problem using Treap Data Structure.

Pre-requisites

• Maths
• Treaps

### Problem Statement

Here you are given an array, the aim is to find the number of sub-arrays in which the mathematical mean is not less than given `k`. Mean of an array is the average of that array. Mean is equal to the sum of all numbers in an array divided by the number of elements in the array.

In the input you will be given `n` the number of elements in the array, `k` the boundary of the mean and the array itself.

In the output print the number of sub-arrays with mean less than `k`.

All the numbers in the input section are `32-bit` positive integers and `N ≤ 200000`.

Example

``````Input:
5 4
5 2 4 5 1

Output:
5
``````

### Observation

The observations made from the question are:

• We need to find the number of sub-arrays in which the mathematical mean is not less than the given value `k`.
• The mean is defined as the sum of the elements in an array divided by the number of elements.
• The input contains the number of elements in the array as `n`, the value of `k`, and `n` integers representing the elements of the array.
• The size of the array can be as large as `2,00,000`, which implies that the algorithm should be efficient.
• All the numbers in the input section are `32-bit` positive integers.
• We need to use some mathematical concepts to solve the problem efficiently.

### Brute force approach

Here in the brute force approach to solve this problem would be to consider all possible sub-arrays of the given array and calculate the mean of each sub-array. Then count the sub-arrays in which mean is greater than or equal to `k`.

To achieve this we would need two nested loops to generate all the possible sub-arrays of the given array and calculate the sum of the sub-array using another loop and divide the sum by the length of the sub-array to calculate the mean.

The time complexity of this expression would be `O(n^3)` where `n` is the size of the given array as we have three nested loops. The space complexity would be `O(1)` as we are not using any additional data structures.

### Approach

Here to get the optimized approach we will be using treap data structure.
Intuitive steps

The intuitive approach is as follows:

• First, a node structure is defined, which represents a node in the treap.
• We will have a
• function `sz` to get the size of the treap.
• function `upd_sz` to update the size of the treap.
• function `init` to initialize node in the treap.
• function `split` to split the treap `t` into two traps `l` and `r` such that all elements in `l` are less than or equal to key, and all the elements in `r` are greater than key.
• function merge is used to merge two treaps `l` and `r` into a single treap `t`.
• function insert is used to insert a node `it` into the treap.
• function `sma_and equa` funtion is used to get the number of elements in the treap `t` that are less than or equal to `x`.
• Now in the main function the value of `k` and `n` are taken as input
• The root node is initialized to `NULL`.
• A node with the value of `0` is inserted into the treap `root`.
• Variable `pref` and `res` are initialized to `0`.
• Loop is executed to read the values to the array.
• The value of the `val` variable is subtracted from `k`.
The value of `val` is added to `pref`.
The number of elements in the treap `root` that are less than or equal to `pref` is added to `res`.
A new node with a value of `pref` is inserted into the treap `root`.
After the loop, the value of `res` is printed.

Explained
To implement this optimally we will use treaps data structure for this solution.
First, a treap is a binary search tree that is also a heap. It satisfies the binary search tree property in which the left subtree of a node contains only nodes with keys less than the node's key, and the right subtree contains only nodes with keys greater than the node's key. It satisfies the heap property where each node's key is smaller than the keys of its children. A treap is built by associating a random priority with each node and maintaining the heap property based on the priorities. The priority is generated randomly when the node is inserted into the treap and is not changed thereafter.

In the intuitive approach we have discussed about the duties of each function. So now lets look how everything works together

• First we will take in the values `n` and `k` which are the size of the array and the desired minimum mean, respectively. Then it initializes a treap called `root` to `NULL` using the `init()`, which creates a new node with the given value, priority, size, left and right children and returns a pointer to that node.
• After that we insert the new node into the root using the `insert()`, which inserts the node at the correct position in the treap by recursively calling itself on the left or right subtree depending on the value of the node's `val` compared to the current node's `val`. It also updates the size of each node as it is added to the treap.
• Now we will enter the loop to read each element of the array one by one. For each element we subtract `k` from it and add the resultant to the variable `pref`, which is the running prefix sum.
• Now we will search for the number of sub-arrays with a mean greater than or equal to `k` using the `sma_and_equa()`, which recursively splits the treap into two subtrees based on the `val` of each node, and returns the size of the left subtree. It then merges the two subtrees back together and returns the size of the left subtree as the result.
• Then it adds the result to a variable called `res`, which represents the running count of sub-arrays with a mean greater than or equal to `k`. It then inserts a new node into the treap representing the current prefix sum, and repeats the process for the next element of the array.
• Once all elements of the array have been processed, the program outputs the final value of `res`, which represents the total number of sub-arrays with a mean greater than or equal to `k`.

Code
The following is the C++ implementation of the above approach:

``````#include <bits/stdc++.h>
using namespace std;

typedef long long ll;

struct node {
ll val, prior, size;
node *l, *r;
};

ll sz(node* t) {
return t ? t->size : 0;
}

void upd_sz(node* t) {
if (t) t->size = sz(t->l) + 1 + sz(t->r);
}

node* init(ll val) {
node* ret = new node();
ret->val = val;
ret->size = 1;
ret->prior = rand();
ret->l = ret->r = NULL;
return ret;
}

void split(node* t, node* &l, node* &r, ll key) {
if (!t) l = r = NULL;
else if (t->val <= key) split(t->r, t->r, r, key), l = t;
else split(t->l, l, t->l, key), r = t;
upd_sz(t);
}

void merge(node* &t, node* l, node* r) {
if (!l || !r) t = l ? l : r;
else if (l->prior > r->prior) merge(l->r, l->r, r), t = l;
else merge(r->l, l, r->l), t = r;
upd_sz(t);
}

void insert(node* &t, node* it) {
if (!t) t = it;
else if (it->prior > t->prior) split(t, it->l, it->r, it->val), t = it;
else insert(t->val <= it->val ? t->r : t->l, it);
upd_sz(t);
}

ll sma_and_equa(node* &t, ll x) {
node *l, *r;
split(t, l, r, x);
ll nw = sz(l);
merge(t, l, r);
return nw;
}

int main() {
srand(time(NULL));
ll k, n, val;
scanf("%lld%lld", &n, &k);
node* root = NULL;
insert(root, init(0ll));
ll pref = 0, res = 0;
for (int i = 0; i < n; ++i) {
scanf("%lld", &val);
val -= k;
pref += val;
res += sma_and_equa(root, pref);
insert(root, init(pref));
}
printf("%lld\n", res);
return 0;
}

``````

If in the input we give `n = 5` and `k = 4` and if the array is `[5, 2, 4, 5, 1]` then the output will be:

Output

``````5
``````

This is because there are 5 sub arrays which satisfies the condition. Those are

``````| Sub-array | Mean |
--------------------
|        |  5   |
| [5,2,4,5] |  4   |
|        |  4   |
|        |  5   |
|   [4, 5]  |  5   |
``````

All the above means are equal to or greater than `k` which is `4`. The rest of the sub arrays formed have mean less that `4` which is the value of `k` The count of such sub-arrays whose mean is not less than `k` is `5`, hence the output.

### Complexities

The time complexity of the code is `O(n log n)`, where `n` is the number of elements in the input array. This is because we are using a treap to maintain a sorted sequence of prefix sums. The insertion and splitting operations in the treap take `O(log n)` time in the worst case, and we perform them for each of the `n` elements in the input array.

The space complexity is `O(n)` as we are using a treap to store the prefix sums of the input array, and the size of the treap can be as large as the input array itself. We are also using a constant amount of extra space for variables used in the code, such as variables to store the input values and the treap nodes.

Overall, the time complexity is efficient and a reasonable space complexity which makes this a good solution.

#### Aswin Shailajan

Aswin Shailajan is a B. Tech Computer Science student at Vellore Institute of Technology and is a SDE, Intern at OpenGenus.