Largest subset with divisible pairs


Reading time: 30 minutes | Coding time: 5 minutes

In this post, we explored the method to get the longest subset within any given array, such that for each pair, the smaller number divides the larger number. We will solve this using a Dynamic Programming approach.

Consider the following array:

1, 3, 9, 15, 27

There are multiple subsets such as:

1, 3, 9, 15, 27
1, 3, 9, 15
1, 3, 9, 27
1, 3, 15, 27
so on

The condition of the problem is that for a given subset, every pair of elements should be such that the smaller number divides the larger number exactly.

Consider the subset 1, 3, 9, 15:

In this case, the pair (9, 15) does not satisfy the condition as 15 % 9 != 0.

The largest subset satisfying this condition is 1, 3, 9, 27. In this case, all pairs satisfy this condition.

Now, that the problem is clear, we will dive deeper into it to solve it completely.

Solution

Now, one easy way to think of would be to check every element and the numbers that divide it, and then get the length, store it in a place, get the maximum among those lengths & output the sequence with that length. This makes it incessantly complicated, as we need to keep track of a lot of things, while going back & forth between all of these. With so many different variables & arrays to track, we end up increasing our space complexity a lot.

Another way to think of this is that if the subset is sorted then every number divides the next number within the sequence. This is the key idea.

Brute Force

Normally, if we were to brute force this, we would choose an element within the array, then iterate through the array and check all the elements that divide it and then check if every element is divisible by the smaller number, which again requires us to iterate over our subsequence multiple times to look over all the elements. This not only is extremely complex to keep track of, but as our array size grows, the time and space complexity grow exponentially. Since we are already iterating over the full array multiple times, our complexity already is bound to be Θ(2n). We also are iterating multiple times over all elements in every subsequence, which makes the complexity rise up to θ(n22) = θ(n4) in the worst case scenario. This seems quite unacceptable even with moderate sized arrays.

Instead, let us explore a much simpler & faster approach with dynamic programming.

Dynamic Programming

Let's try solving this the dynamic programming way. Our aim would be to get various solutions & use them to obtain our final solution. To make things simpler, it's also better to have the array sorted, so we can simply iterate over successive elements to get our solutions. We create a solution by taking all array elements up-to a point along with the length, iterating backwards we reduce the length if the next number does not divide our current one.

What basically we are doing is calculating length of the subsequence that ends with every number in the array. The sorting of array ensures that we can accurately keep tabs on every number in subsequence easily.

Example

Let's take an array a = 1, 3, 9, 15, 27 (sorted)

We make separate arrays taking different elements as the last one. What we get is:

sep

Starting from the last position and going backwards, we remove the next element if it does not divide our current one. Let's try to do this on our last array.
Starting from the last, we first have 27, next to which lies 15. 15 does not divide 27, we reduce the value of length by 1, which gives us 4.
9 divides 27, we move to 9. Next to 9 lies 3, and 9 is divisible by 3. We move to 3 & our next element is 1, which does divide 3 perfectly, so we keep it. So, here the length of subsequence, such that every element divides the next, is 4.
Repeat this procedure on all of the sub-arrays & output the largest value among the lengths.

Algorithm

  1. Get the input array.
  2. Sort the array.
  3. Create sub-arrays with ith element as the last for ith sub-array.
  4. Create another array with length of ith sub-array at ith index.
  5. For every ith sub-array, starting from last, if the next element does not divide current, reduce the value at ith index in the array of lengths.
  6. Output the maximum value from the length array.

Pseudocode

function largest_subset(array):
	length = length(array)
	for i in length:
		length_array[i] = i
	for i from length to 1:
		if array[i] mod array[i-1] != 0:
			length_array[i] - 1
	return max from length_array

Implementation

sorted_data = sorted([18, 1, 2, 3, 3, 6, 13, 17]   
#create new lists with every element of the array
largest_lds = [[sorted_data[i]] for i in range(len(sorted_data))]

def lds(sorted_data):
    for i in range(len(sorted_data)):
        for j in range(i):
            if ((sorted_data[i] % largest_lds[j][0]) == 0) and (largest_lds[j][0] != largest_lds[j-1][0]):
                 #The part after 'and' ensures that there are no duplicates &
                 #we get longest subsequence with unique elements only.
                 tmp = [sorted_data[i]]
                 tmp.extend(largest_lds[j])
                 largest_lds[i] = tmp
     #max is used to output, from largest_lds, the list with the largest length
     #sorted ensures our elements appear in order which we want them to
     print(len(max(largest_lds)))

lds(sorted_data)

Space Optimization

We can reduce the time complexity if we do not create these separate sub-arrays and simply keep track of the lengths by iterating in reverse over our original array and reduce the ith value, with i being the index where we started.

sorted_data = sorted([18, 1, 2, 3, 3, 6, 13, 17])

def largest_subset(array):
	#eliminate any dupliactes from our array
	array = list(dict.fromkeys(array))
	length_array = [i+1 for i in range(len(array))]
	for x in range(len(length_array)):
		current_maximum = length_array[x] -1
		#This eliminates the edge case where an element is divisible by both 2 & 3
		#We need to ensure that every successive element is divible by previous
		#And 3 is not divisible by 2
		if( (array[x] % 6) == 0 ):
				length_array[x] -= 1
		for i in range(current_maximum, 0, -1):
			if( (array[x] % array[i]) != 0 ):
				length_array[x] -= 1
	print(max(length_array))

largest_subset(sorted_data)

Complexity

  • Worst case time complexity: Θ(n2)
  • Average case time complexity: Θ(n2)
  • Best case time complexity: Θ(n2)
  • Space complexity: Θ(n2)
  • Space complexity(optimized): Θ(n)

With this, you have the complete knowledge of solving this problem. Enjoy.