Open-Source Internship opportunity by OpenGenus for programmers. Apply now.
In this article at OpenGenus, we will solve the famous course scheduling problem and it's variations using the graph algorithm known as Topological Sort. The article will guide you through the intuition of how to solve course scheduling problems using topological sort which is frequently asked in coding interviews of various tech companies.
Table of contents:
- Problem Statement
- Intuition behind using Topological Sort with Example
- Implementing the Solution
- Time and Space Complexity
- Other Variations of Problem
Problem Statement
There are a total of n courses offered by some instituion you have to take, labeled from 0 to n - 1. You are given an list of prerequisites where prerequisite[i] = [ai
, bi
] indicates that you must take course bi
first if you want to take course ai
.
You have to tell weather you can finish all the courses or not.
For example if you are given n = 3 courses labelled from 0 to 2 and prerequisite = [ [ 0 , 1 ] , [ 1 , 2 ] ] this implies that :
- You have to take course 1 before taking course 0.
- You have to take course 2 before opting for course 1.
So , as we can clearly see the solution to above problem is that we can schedule courses as follows : 2 --> 1 ---> 0.
Intuition behind using Topological Sort
Now that we know the problem statement , we can move towards an example which will be a little complex to think.
Given , n = 5 and ,
prerequisite = [ [ 1 , 4 ] , [ 2 , 4 ] , [ 3 , 1 ] , [ 3 , 2 ] ]
This is getting tougher to think of solution right ?
Now , as we know when the problem gets tougher to think and it have some kind of connections we try to visualize it using graph or tree. Let's visualize our current example.
As we can visualize the graph can easily present us with the solution. First we can scehdule the course 4 and then schedule the course 2 & 1 and then at last we can schedule the course 3.
Note : The course 3 can only be taken if course 2 & course 1 are done. Till course 2 & 1 are not done we can't start the course 3.
So , till now we know that a graph can easily represent our problem. Therefore , the graph datastructure will be used. Now comes the part why topological sort will be used.
Topological sort is used in graph theory and computer science to linearly order the vertices of a directed acyclic graph (DAG). The goal of topological sort is to arrange the vertices of a DAG in such a way that if there's an edge from vertex A to vertex B, then A comes before B in the ordering. In other words, it satisfies the dependencies between vertices. Therefore , it meets the requirement of our problem. The figure below shows how the topological sort can help us to do so. The figure shows the node containing the indegree of each node.
The figure shows, State 1 show indegree of all the nodes. Indegree is the count of edges which are directed towards the node or coming inward towards the node. Therefore we store the indegree of all the nodes.
The State 2 shows the processing of the nodes. We keep track of the nodes with indegree 0 as the node with indegree 0 are not dependent on any node. We will use this fact to our advantage. As the node will get processed it shows that the course is done. As the course is taken we can opt for the courses which were dependent on the course which is being processed.Therefore , we decrease the indegree of the nodes which are connected to the node being processed.
Therefore , in State 3 we can see that the node 2 & 1 are being having indegree 0 and both nodes can be processed. We process node 2 first and decrease the indegree of node 3 by one and check if we can use it ? No , because the node 3 have indegree 1. So we move on and process the node 1 a shown in State 4. As node 1 is being processed the node 3 's indegree is decreased to 0. Now, Node 3 can also be processed shown in State 5.
At last we check if all the nodes are processed , if yes the courses can be scheduled else not.
Implementing the Solution
The implementation to the problem is divided into functions to make the understanding easier.
bool canFinish(int numCourses, vector<vector<int>>& prerequisites) {
// code godes here
}
The function takes input the number of courses numCourses and prerequisites prerequisites as input.
First of all we will create the indegree and graph of the given prerequisites.
// STORE THE GRAPH IN g
unordered_map<int , vector<int>> g;
// STORE INDEGREE IN indegree
unordered_map<int , int> indegree;
// Function to store graph and indegree
void make_graph(vector<vector<int>>& prerequisites){
// iterating over the prerequisites vector
for(auto i: prerequisites ){
// storing the directed edges [ ai , bi] as bi-> ai
g[i[1]].push_back(i[0]);
// incrementing indegree at ai as it contians inward edge from bi
indegree[i[0]]++;
}
}
Now , as we have the indegree and graph for each edge we can start processing the nodes with indegree 0. The vector added will be maintained to keep track of nodes which are processed. The queue data structure will be used as will use the BFS implementation of Topological sort.
Initialization : The queue will be initalized with the nodes having indegree 0. If no such node is there return false as there is not course to start.
The function for toplogical sort looks like following:
#include <bits/stdc++.h>
using namespace std;
class Solution
{
public:
// STORE THE GRAPH IN g
unordered_map<int, vector<int>> g;
// STORE INDEGREE IN indegree
unordered_map<int, int> indegree;
// Function to store graph and indegree
void make_graph(vector<vector<int>> &prerequisites)
{
// iterating over the prerequisites vector
for (auto i : prerequisites)
{
// storing the directed edges [ ai , bi] as bi-> ai
g[i[1]].push_back(i[0]);
// incrementing indegree at ai as it contians inward edge from bi
indegree[i[0]]++;
}
}
// Function to check if it's possible to finish a set of courses with prerequisites
bool canFinish(int numCourses, vector<vector<int>> &prerequisites)
{
// Create a queue for topological sorting
queue<int> q;
// Call make_graph to build the prerequisites graph and calculate indegrees
make_graph(prerequisites);
// Create a vector to track whether a course has been added to the queue
vector<int> added(numCourses, 0);
// Step 1: Initialize the queue with courses that have no prerequisites
for (int i = 0; i < numCourses; i++)
{
if (indegree[i] == 0)
{
q.push(i); // Add the course to the queue
added[i] = 1; // Mark it as added
}
}
// Step 2: If the queue is empty at the beginning, it means there are no courses with 0 prerequisites, and it's impossible to finish any course.
if (q.empty())
{
return false;
}
// Step 3: Perform topological sorting using BFS
while (!q.empty())
{
int root = q.front(); // Get the course with 0 prerequisites from the queue
q.pop();
// Traverse the courses that depend on the current course
for (int i = 0; i < g[root].size(); i++)
{
int nextnode = g[root][i];
// If the next course has already been added, it means there's a cycle, so return false
if (added[nextnode] == 1)
{
return false;
}
// Decrement the prerequisite count for the next course
indegree[nextnode]--;
// If the next course now has 0 prerequisites, add it to the queue
if (indegree[nextnode] == 0)
{
q.push(nextnode);
added[nextnode] = 1; // Mark it as added
}
}
}
// Step 4: After processing all courses, check if all courses have been added. If not, it means there's an isolated course, so return false.
for (int i = 0; i < added.size(); i++)
{
if (added[i] != 1)
{
return false;
}
}
// If all courses have been added without encountering cycles, return true.
return true;
}
};
int main()
{
int n = 5;
vector<vector<int>> prerequisites = { { 1,4} , {2 , 4} , {3 , 1} , {3 , 2}};
Solution * s = new Solution();
if(s->canFinish(n , prerequisites)){
cout<<"True\n";
}
else{
cout<<"False\n";
}
return 0;
}
Compile & Run :
g++ filename.cpp -o desiredname
./desiredname.exe
Output:
True
Time and Space Complexity
So now that we have solved the problem we will analyse efficiency using the time and space complexity of code.
The time and space complexity of the provided code are as follows:
Time Complexity
The Time complexity is analysed below :
- Building the graph and calculating indegrees in the
make_graph
function takes O(E) time, where E is the number of edges (prerequisites). - Initializing the queue in the
canFinish
function takes O(V) time, where V is the number of courses. - The main while loop in the
canFinish
function performs a BFS traversal of the graph, and it visits each course and its prerequisites once. Therefore, the time complexity of this loop is O(E). - The final loop in the
canFinish
function iterates over theadded
vector, which has V elements, to check if all courses have been added. This loop takes O(V) time.
So, the overall time complexity of the code is O(E + V), where E is the number of prerequisites (edges) and V is the number of courses (vertices).
Space Complexity
The Space complexity is analysed below :
- The space complexity of the graph representation (adjacency list) is O(E + V), as it stores the edges and vertices.
- The
added
vector stores information about whether a course has been added to the queue, and it requires O(V) space. - The queue (
q
) can potentially store all courses, so its space complexity is also O(V).
So, the overall space complexity of the code is O(E + V).
In summary, the code's time complexity is O(E + V), and its space complexity is O(E + V).
Other Variations of Problem
There are many possible variations of the problem , the variations are mostly based on either the output or contraints.
In output variations we add some variation like instead of the giving true or false give the order of scheduling of courses which is nothing but the order in which courses are being scheduled or processed in the queue. The problem can be tried on leetcode given by the name Course Schedule II.
One other variant of the problem involves answering the queries which are asked by regarding can we some course before other which is approachable using the DFS implementation of Topological sort easily. The Function will look like below you can try the problem here.
// Function to perform a depth-first search to find a path from node 's' to node 'e' in a graph
bool search(int s, int e, vector<int> &visited) {
// Mark the current node 's' as visited
visited[s] = true;
// Iterate through all adjacent nodes of 's'
for (auto i : g[s]) {
// If we reach node 'e', return true (path found)
if (i == e) {
return true;
}
// If the adjacent node 'i' has not been visited and we find a path to 'e' through 'i', return true
if (!visited[i] && search(i, e, visited)) {
return true;
}
}
// If no path is found, return false
return false;
}
Last but not least is the variation in which we need to return the list of courses required to be taken before taking any course. This variation can also be solved using the topological sort very easily. Below is the modified implementation :
// Function to get ancestors for each node in a directed acyclic graph
vector<vector<int>> getAncestors(int n, vector<vector<int>>& edges) {
// Initialize data structures and containers
inorder.resize(n, 0);
make_graph(edges);
queue<int> q;
vector<set<int>> ans(n, set<int>());
// Initialize the queue with nodes that have no incoming edges
for (int i = 0; i < n; i++) {
if (inorder[i] == 0) {
q.push(i);
}
}
// Perform a BFS traversal to compute ancestors for each node
while (!q.empty()) {
int root = q.front();
q.pop();
for (auto next : g[root]) {
// Add ancestors of 'root' to the ancestors of 'next'
ans[next].insert(ans[root].begin(), ans[root].end());
ans[next].insert(root);
// Decrement the incoming edge count for 'next'
inorder[next]--;
// If 'next' has no more incoming edges, add it to the queue
if (inorder[next] == 0) {
q.push(next);
}
}
}
// Convert the result to the desired output format
vector<vector<int>> answer(n, vector<int>());
int i = 0;
for (auto v : ans) {
for (auto e : v) {
answer[i].push_back(e);
}
i++;
}
return answer;
}
Key Takeaways
- The course scheduling problem involves graph-based scheduling of courses with prerequisites.
- Topological sorting is an essential tool for solving course scheduling problems by satisfying dependencies.
- We provided an implementation using topological sorting for course scheduling.
- Understanding the time and space complexity is crucial for assessing code efficiency.
- Course scheduling problems have various real-world variations, including ordering , output variation , and queries.