Finding Diameter of a Tree using DFS

Open-Source Internship opportunity by OpenGenus for programmers. Apply now.

In this article, we will be discussing how to find the diameter of a tree or in general a graph. The algorithm that we will be using to find the diameter of the tree is Depth first search.

if you are unfamiliar with depth first search (DFS), I would first suggest you to go through this article before proceeding: https://iq.opengenus.org/depth-first-search/

What is Diameter of a tree ?

Diameter of tree is defined as the no. of nodes in the longest path between leaf nodes nodes of a tree ( undirected graph ) , it is also called the width of the tree.

for example -
In the tree shown above the diameter of the tree is no. of nodes coming along the path 7-3-2-4-6 or 7-3-2-4-5.

How to find diameter of the tree ?

Novice Approach

1.We can run DFS n times .
2.first we can choose any node and find the farthest node from that node using depth first search .
3.we need to repeat the step 2 with all the nodes of the tree and store the maximum value .
4.This maximum value will then be the diameter of the tree.

Time Complexity

The above approach would work fine but time complexity for the above approach will be O(n*n) which is quite high.

Better Approach

we can find the diameter of any tree using only 2 DFS run.
how ?

Quick Explanation -

1.Take any arbitary node as the root node .
2.Run dfs from that node and find the farthest node.
3.let this node be x .
4.Now run dfs from this node to the farthest away node , let this node be y.
5.now the count of all the nodes that come along the way of x and y ( including them) is the diameter of the tree.

Why this works ?

whenever we run DFS we always reach one of the end points of diameter of the tree and from that node we are just finding the farthest node which turns out to be the diameter of the tree.

Let us proof this fact that on running DFS we always reach the end point of diameter of the tree

Proof

Proof by contradiction

here we are considering x as the root of the tree

Assume that on running DFS first time from x and finding the farthest node we reach on node y .

Let's say y is not the end point of the diameter of the tree
and v-p-y is the actual diameter of the tree

In that case according to the above image we can write the equation

( note that if root of the tree lies on the diameter of the tree then distance xp is considered as 0 )

xp + py < xp + pv
and
xp + py < xp + pu

on simplying the above equations

py < pv
and
py < pu

but this cannot be true as y is at maximum distance from p and according to our assumption y is not on the diameter of the tree both statements cannot be true simultaneously .

so, our first assumption that node y doesnot lie on the diameter doesnot hold true and hence after running the first DFS we reach node y which always lie on the diameter of the tree .

Example

10-1

Suppose we want to find the diameter of the graph shown in the above image.

let us take node 4 as our starting node ( we can choose any node )
Now on running the depth first search for the first time we will be able to find the one end of the diameter which will be 7 for this example .
Now on running DFS from 7 and finding the farthest node we will reach 6 via path
7-3-2-4-6.

4.The count of nodes along the path is 5 which is the diameter of the tree in this example.

( Note - we could have also reached 6 via path 7-3-2-4-5 which would also have been the diameter of the tree as both count as farthest node with same value of distances )

5.In this way we can find the diameter of any tree using only two DFS run .

Code


#include <iostream>
#include<vector>
using namespace std;

// here maxD represent the diameter of the tree
int maxD = -1 , maxNode = -1 ;
// maxNode represents node at maximum distance

// declaring the visited node as we will be using the DFS
int vis[8] ;

void createEdge( int a , int b , vector<int> graph[] ){

     // creating undireted edges between the connected nodes
     graph[a].push_back(b) ;
     graph[b].push_back(a) ;

}


void DFS( vector<int> graph[] , int node , int d )
{
    // marking the node as visited
    vis[node] = 1 ;


    // if the distance from root is greater then maximum Distance then updating the maximum value of distance
    // also storing the maxNode no. as this node is now at the farthest distance
    if( d > maxD )
    {
        maxNode = node ;
        maxD = d ;
    }

     // applying the standard dfs 
    for( auto x : graph[node] )
    {
        if( vis[x] == 0 )
        {
            DFS( graph , x , d+1 ) ;
        }
    }

    


}


int main()
{


    // first creating the nodes of the graph
    vector<int> graph[8] ;

   

    // now creating the edges of the tree as shown in the image of  this article
    createEdge( 1 , 2 , graph ) ;
    createEdge( 2 , 3 , graph ) ;
    createEdge( 3 , 7 , graph ) ;
    createEdge( 2 , 4 , graph ) ;
    createEdge( 4 , 5 , graph ) ;
    createEdge( 4 , 6 , graph ) ;

     
     
     // Applyting DFS from node 1
     DFS( graph , 1 , 1 ) ;
     // we could have choosen any node in the graph but for simplicity we have choosen node 1

     
     // making every node unvisited as we will be applying DFS
     maxD = -1 ;
     for( int i = 1 ; i<=8 ; i++ )
     {
         vis[i] = 0 ;
     }
     
     // applying the dfs for the second time as this will give the diameter of the tree
     DFS( graph , maxNode , 1  ) ;


     // Now printing the maximum diameter of the tree
     cout<<maxD<<" is the diameter of the tree "<<endl;




    return 0;
}

Output


5 is the diameter of the tree

Complexity Analysis

Time taken = 2 * n ( here n units of time is taken to run the depth first search once)
so -
Time Complexity: O(n).

There is no extra space required apart from some variables therefore the space required is constant.
Space Complexity: O(1).