Minimum number of nodes to be removed such that no subtree has more than K nodes

Open-Source Internship opportunity by OpenGenus for programmers. Apply now.

In this article, we have solved the problem of finding the Minimum number of nodes to be removed such that no subtree has more than K nodes. This involves idea of Depth First Search, Graph Algorithms and Dynamic Programming.

Table of contents:

Problem Statement with examples
Approach / Idea
Implementation Approach
Time Complexity and Space Complexity

Problem Statement with examples

We will be given a tree with N Nodes and N-1 edges and the number K. We have to find minimum numbers of Nodes that we have to need to deleted in order to make total numbers of node in every subtree less then or equal to K.

This problem can be solved in O(V^2) time and using the idea of Depth First Search.

Lets understand with few examples

Example 1

consider the following tree with 6 Nodes and 5 Edges

untitled

lets the value of K be 3 so for this case if we want all the subtree having total nodes less then or equal to K then we need to delete anyone from 0 or 1 lets we have deleted 0, then after deleting we will get our result as following

untitled-1-

result will be two subtree contaning total nodes less then or equal to K.

Example 2

consider the following tree with 11 Nodes and 10 Edges

untitled-2-

lets the value of K be 3 so for this case if we want all the subtree having total nodes less then or equal to K then we need to delete two nodes 1 or 5 , after deleting we will get our result as following

untitled-3-

result will be three subtree contaning total nodes less then or equal to K.

After deleting vertices the connected edges with that vertices also get deleted

Approach / Idea

The idea to solve this problem is as follows:

If for a sub-tree rooted at node N has more than K nodes, then we have to add the current node N to the list of removed nodes.
Removing current node is the best option as removing a node connected to N will reduce the number of nodes to be within K but it increase chance that total count of nodes will again cross K.
Dynamic Programming can be used as we will use the count of nodes of a smaller sub-tree to get the count of nodes of a larger sub-tree.
In a tree, a node has only one parent node so we need not stored the count value permanently. Depth First Traversal and recursion can be used to implement a bottom-up approach so there will be no repeated calls.

Try to visualize this approach before going into implementation details.

Implementation Approach

For implementing this we will use Graph data structure and we will apply Depth First Search algorithm with some little modification for creating graph we will use adjency matrix.

In normal dfs function, we use dynamic programming for storing visited nodes so that we can skip those nodes if they get called again. this will improve its time complexity.
In this we will add another container so that we can store removed element in that.
Also instead of returning void we will return total number of child node including parent node for that particular node. if the node is leaf node then it will return:

[1 (count for current node) + 0 (zero child node)]

and if node is not leaf node then it will return:

[1 (count for current node) + count returned by child nodes of particular node ]

If total number of child node get greater then k then we will add that element into removed elements container and set total child nodes equal to 0 for node who called removed node.

Steps

Writing DFS function:

We have to first check that all of the child nodes with that node and call them with the same dfs function with starting vertex as the child node. first we will take our starting vertex as root.
After calling we will compare the returned count value weather it is greater or less than k.
of it is less than k then we will return that by adding count of current node
If count is greater than k then we will set count for current node to 0, and also we will add current node to removed node list because subtree for current node is greater then k.
Also at the start of the function we will check that the current node is leaf node or not, if it is leaf node then we will return 1 (count for that node only).

Code:

Following is the implementation in C++:

#include<bits/stdc++.h>
using namespace std;
// Modified DFS function
int dfs(int **a, int n, int sv, bool *visited, set<int> &ss, int k)
{
    // sv -> starting vertex
    // ss -> set of removed vertices
    int count=1;
    visited[sv] = true;
    // count = number of nodes in subtree rooted at sv
    // DP: Add count of nodes connected to sv
    for(int i=0; i<n; i++)
    {
        if(a[sv][i] == 1 && !visited[i]) 
            count += dfs(a, n, i, visited, ss, k);
    }
    // If count is > K, add current node to 
    // removed set
    if(count>k)
    {
        ss.insert(sv);
        count = 0;
    }
    return count;
}

int main()
{
    int n, e, k;     
    // n = number of nodes, e = number of edges, k = max nodes in subtrees
    cin>>n>>e>>k;
    // creating graph
    int **a = new int*[n];
    for(int i=0; i<n; i++)
    {
        a[i] = new int[n];
        for(int j=0; j<n; j++)
            a[i][j] = 0;
    }
    
    int f, s;
    for(int i=0; i<e; i++)
    {
        cin>>f>>s;
        a[f][s] = 1; a[s][f] = 1;
    }
    
    bool *visited = new bool[n];
    for(int i=0; i<n; i++) visited[i] = false;
    set<int> ss;
    //calling function dfs
    int count = dfs(a, n, 0, visited, ss, k); cout<<endl;
    if(count>k)
        ss.insert(0);
    cout<<"min no of element: "<<ss.size()<<endl;
    cout<<"elements: ";
    for(int i:ss)
        cout<<i<<" "; cout<<endl;

    //clearing data
    for(int i=0;i<n;i++)
        delete [] a[i];
    delete [] a;
    delete [] visited;
    return 0;
}

Input:

below input is graph in Example 2

Output:

min no of element: 2
elements: 1 5

Time Complexity and Space Complexity

If we apply BFS using Adjacency List then Time complexity will be O(V+E).
If we apply BFS using Adjacency Matrix for this the time complexity will be O(V^2).

For the above example

Time Complexity will be O(V * V)
Space Complexity will be O(V * V)

With this article at OpenGenus, you must have the complete idea of finding minimum number of nodes to be removed such that no subtree has more than K nodes.