Subset of maximum size with no pair sum divisible by 'K'


Reading time: 20 minutes | Coding time: 10 minutes

We are given an array A[0....N-1] of N elements and an element K. Our focus is to find the subset of largest size in which if we select any two elements, then the sum of the two elements should not be divisble by K.

Example

Input :

arr[] = [3, 17, 12, 9, 11, 15]
K = 5

Output : 4

Valid Subset is:

{17, 12, 9, 15}

Other Invalid Subsets (of size-4) are as follows:

{3, 17, 12, 9}
{3, 17, 12, 11}
{3, 17, 12, 15}

Above subsets are invalid as there is a pair (3,17) in all the subsets which has sum=20 which is divisible by 5

Other Invalid Subsets (of size 5) are:

{17, 12, 9, 11, 15}
{12, 9, 11, 15, 3}

The above two subsets are invalid as there is a pair (11,9) in all the subsets which has sum=20 which is divisible by 5.

The brute force approach will take O(2^N * N^2) time complexity as:

  • there are 2^N subsets
  • For a given subset, we need to check all sums which will take O(N^2) time

In this problem, we have reduced the time complexity to O(N+K) which is linear.

Approach

There is a simple mathematical idea to solve this problem. Let's discuss the algorithm stepwise:

  • Initialize an array freq of size K with 0. It will store frequency of all the modulus values as we take modulus of all array elements with K. The size of 'freq' array is 'K' as the possible modulus values when an element is divided by 'K' lie between '0' and 'K-1' (both inclusive)
freq[] = {0, 0, ..., 0} (size K)

freq[i] = number of elements e such that e % K = i 
  • Now we divide each element of the given array with K and increment the corresponding modulus frequency in the freq array.

  • After that we observe that if the subset has an element which has modulus value i then it should not have element which has modulus value K-i as there sum will then be divisible by K.

  • Now we traverse in the freq and then see for all modulus values i less than K/2 , the max(fre[i],fre[K-i]) and add that to our answer as we have to maximize the size of our subset.

  • For modulus value K/2 we take min(1,fre[K/2]) as if we take more than 1 element with modulus K/2 then given condition will not be satisfied.

Explanation of the given example

Consider this array and value of K:

arr[] = [3, 17, 12, 9, 11, 15]
K = 5
  • Initialize an array freq of size 5 with 0. It will store frequency of all the modulus values as we take modulus of all array elements with K. The size of 'freq' array is '5' as the possible modulus values when an element is divided by '5' lie between '0' and '4' (both inclusive)

  • Now we divide each element of the given array with 5 and increment the corresponding modulus frequency in the freq array.

  • First element is 3 so increase fre[3] by 1 likewise increase fre[2] by 2 for elements 17,12,increase fre[4] by 1 for element 9 ,fre[1] by 1 for element 11,fre[0] by 1 for element 15.

  • Traverse from i=1 to 2. For i=1 we take max(fre[1],fre[4]) which is 1, for i=2 we take max(fre[2],fre[3]) which is 2 for fre[0] we take min(1,1) which is 1 and together they add up to make 4.

Frequency array status of given example at each step

freq[]={0,0,0,0,0}

Element-1=3
Remainder=(3%5)=3

freq[]={0,0,0,1,0}

Element-2=17
Remainder=(17%5)=2

freq[]={0,0,1,1,0}

Element-3=12
Remainder=(12%5)=2

freq[]={0,0,2,1,0}  

Element-4=9
Remainder=(9%5)=4

freq[]={0,0,2,1,1}

Element-5=11
Remainder=(11%5)=1

freq[]={0,1,2,1,1}

Element-6=15
Remainder=(15%5)=0

freq[]={1,1,2,1,1}
  • Now in the final freq array we traverse from i=1 to 2. For i=1 we take max(fre[1],fre[4]) which is 1, for i=2 we take max(fre[2],fre[3]) which is 2 for fre[0] we take min(1,1) which is 1 and together they add up to make 4.

C++ implementation

Following is the C++ implementation of the above approach:

#include<bits/stdc++.h>
using namespace std;
#define ll long long int 
#define SIZE 1000000 + 1
using namespace std;
int main()
{
    ll n,k,i,l=1;
    cout<<"Input :"<<"\n";
    cin>>n;
    ll arr[n+1];
    for(i=1;i<=n;i++)
    {
        cin>>arr[i];
    }
    cin>>k;
    //declaring and initializing frequency array
    //initialize array by 0 
    ll fre[k]={0};
    for(i=1;i<=n;i++)
    {
        fre[arr[i]%k]++;
    }
    // Final answer will be stored
    ll ans=0;
    for(i=1;i<=k/2;i++)
    {
        //adding modulus value which
        //has greater number of elements
        ans+=max(fre[i],fre[k-i]);
    }
    if(k%2==0)
    ans+=min(l,fre[k/2]);
    ans+=min(l,fre[0]);
    cout<<"Output :"<<ans<<endl;
    return 0;
}

Input:

5
3, 17, 12, 9, 11, 15
5

Output:

4

Time Complexity

The time complexity of following approach is O(K + N)

where:

  • K is the divisor.
  • N is the number of elements

This completes the algorithm to solve this problem efficiently.

Enjoy. Happy Coding.