Open-Source Internship opportunity by OpenGenus for programmers. Apply now.
In this article, we have solved the problem of finding elements that appear more than N/3 times. This involve ideas of Hash Map and Boyer Moore algorithm.
Pre-requisites:
Problem
Let's look at the problem first and understand what we have to do here.
There are $N$ numbers of elements in an array, and we have to find the element or elements that may occur more than $\frac{N}{3}$ times in the array.
For example, if we have the following list:
[1, 1, 1, 1, 2]
you can notice that obviously1
is the element that satisfies the condition of havingcount
more than $N/3$.
Now, let's look at this problem in a little different way, that is if we make a list that may have only three different elements, then we can see a few specific cases it may create.
- Case-I: Perfect 3 Elements
In this case, we'll get the only empty list in return because there are 3 elements and every element has the same number of occurrences.
For example, in the below list $N/3$ is3
and every element has occurred only3
also note this happened because they all are sharing equal space.
[1, 1, 1, 2, 2, 2, 3, 3, 3]
- Case-II: Result gives us one element
If an element occurs4
times in the list of9
then it has surely taken space of other elements.
[1, 1, 1, 1, 2, 2, 3, 3, 3]
- Case-III: Result gives us two elements
Or else, there could be two elements that may take space from other remaining space.
[1, 1, 1, 1, 2, 3, 3, 3, 3]
To conclude what we just saw above, we can see that we're looking for an element that takes up more than $\frac{1}{3}$rd of the space of total space.
Therefore, no matter how much big the space is we can only have two elements which will have a greater count than other ( if available ) third or many other elements.
How to solve
One could use many different ways to solve this,
- Naive Approach
One way could be to look at every element then count it and then check if the element satisfies the check and then add it to another array and then return that "another array" as a result.
This way we will have
Time Complexity of $O(N^2)$
and,
Space Complexity of $O(1)$.
An example, code.
example_list = [1, 2, 2, 2, 4, 5, 1, 1, 1, 1, 2]
## This is a solution one may use when they don't know of
## HashMap and may have learned bubble sort before.
def more_than_N_by_3(l):
length = len(l)
nums = []
for i in range(length):
n = l[i]
count = 0
for j in range(length):
if l[j] == n:
count += 1
if count > length//3 and l[i] not in nums:
nums.append(l[i])
return nums
print(more_than_N_by_3(l))
Output: [1, 2]
- using HashMap
The second way could be to use the smart method of counting and keeping elements in a list, that is to use the HashMap to store the element and counts and then at last look at every count and return the ones that have a higher count thanN/3
.
using HashMap, cause to have
Space Complexity of $O(N)$
and,
Time Complexity of $O(N)$
An example, code.
example_list = [1, 2, 2, 2, 4, 5, 1, 1, 1, 1, 2]
def count(l):
counts = {}
for i in l:
if counts.get(i) is not None:
counts[i] += 1
else:
counts[i] = 1
return counts
def more_than_N_by_3(counts, N):
result = []
for key in counts.keys():
if counts[key] > N//3:
result.append(key)
return result
l = count(example_list)
print(more_than_N_by_3(l, len(example_list)))
Output: [1, 2]
- Boyer Moore's vote Algorithm
Let's first look at how we will implement this, and then we will look at an explanation to understand why this algorithm works?
So, first, we take two pairs of variables, each pair will have a variable to count and a variable to store the previous value we were counting on.
Then we perform three kinds of operations according to the element we currently while parsing the list.
First, If we find the current iteration value in the list, same as the previous element's value then we increment the counter.
Second, if we have the count is at zero then we change the element we're counting for and also set the count of a particular element to
1
.
Third, if the above does not happen then just decrement count for elements.
The above is three kinds, not the right order to perform the operations, for right order look at the code.
After we are done with the above parsing, we will reset the counters and then count for the values we found to be dominant in the previous parsing of the list.
And at last, we will take the value count and if it passes the condition of being greater than $N/3$ and add it to the list, which we will return as a result.
The Code,
example_list = [1, 2, 2, 2, 4, 5, 1, 1, 1, 1, 2]
class Elem:
val = 0
count = 0
def inc(self):
self.count += 1
def dec(self):
self.count -= 1
def more_than_N_by_3(l):
elem1 = Elem()
elem2 = Elem()
for n in l:
if elem1.count == 0:
elem1.val = n
elem1.inc()
elif elem1.val == n:
elem1.inc()
elif elem2.count == 0:
elem2.val = n
elem2.inc()
elif elem2.val == n:
elem2.inc()
else:
elem1.dec()
elem2.dec()
nums = []
N = len(l)
# Reseting the counter
elem1.count = 0
elem2.count = 0
# counting again to confirm
for n in l:
if n == elem1.val:
elem1.inc()
elif n == elem2.val:
elem2.inc()
## adding if counter statisfy our problem's condition
if elem1.count > N//3:
nums.append(elem1.val)
if elem2.count > N//3:
nums.append(elem2.val)
return nums
print(more_than_N_by_3(example_list))
Output: [1, 2]
Why it works?
If you may remember at the very beginning we discussed how there can only be at most two elements that will satisfy the condition of having the count of elements greater than $N/3$.
This algorithm takes advantage of that fact and assumes that there do exist two such elements and starts to count up for every consecutive similar element we find and count down if we find different consecutive elements.
Let's take a simple example,
[1, 1, 1, 1, 2, 3, 4, 5, 5]
We can separate this list into 3 different categories.
- The dominant element, which in the above case occurs for
4
times - The second dominant element, which in this case is
5
and occurs for2
times. - Other non-dominant elements.
Now, ask yourself if we ran the above algorithm then what would elem1
and elem2
will look like?
Note: we're
elif
, so, only one thing will occur at every iteration.
So, therefore, the elem1
will get updated until we reach 2
and its counter will become 4
after this there are three consecutive different numbers, for which elem1
's counter will decrement only when the elem2
counter is not zero.
Let's say after elem1
's counter reaches 4
, then after that, we have 7
continuous different element ( 7
because if we take more than that then 1
will become less than $N/3$. ) then we will have 7
iterations of which there will be 3
iterations where elem1
's counter will be decremented and other times elem2
's counter being 0
will make skip the decrement.
Well for time complexity, since we're parsing the list two times ( separately ) therefore it should be O(N)
.
Time Complexity: O(N)
And for space complexity, we're not creating anything more than four variables and a list of two elements, therefore,
Space Complexity: O(1)
With this article at OpenGenus, you must have the complete idea of finding elements that appear more than N/3 times.