Find minimum number of deletions to make a string palindrome


Reading time: 25 minutes | Coding time: 10 minutes

Given a string S, we have to find the minimum number of characters that must be deleted to make it a palindrome. A palindrome is a string that reads the same from left to right and from right to left.

This can be done in O(N^2) time with the help of the Dynamic Programming approach for the Longest Palindromic Subsequence.

Some examples of palindromes are: madam, refer, Malayalam (ignoring letter case), etc.
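The definition can be checked directly with two indices moving towards each other. This is only a minimal sketch; the helper name `is_palindrome` is an assumption made for illustration, and the comparison is case-sensitive, so a word such as Malayalam has to be lowercased first.

```python
def is_palindrome(s: str) -> bool:
    """Return True if s reads the same in both directions (case-sensitive)."""
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            return False
        left += 1
        right -= 1
    return True


print(is_palindrome("madam"))              # True
print(is_palindrome("Malayalam".lower()))  # True
print(is_palindrome("ABBABBD"))            # False
```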

Example Strings

Input : ABBABBD
Output: 2

Because if the first A and the last D are removed, the resulting string is BBABB, which is a palindrome.

Input : NITIN
Output: 0

Because it is already a palindrome, no characters need to be deleted.
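For short strings, the two examples can be confirmed by brute force: try to keep as many characters as possible and stop at the first selection that forms a palindrome. The function name `min_deletions_brute_force` is hypothetical, and this sketch takes exponential time, so it is only meant as a cross-check and not as the approach used in the rest of the article.

```python
from itertools import combinations


def min_deletions_brute_force(s: str) -> int:
    """Try every way of keeping k characters, largest k first."""
    n = len(s)
    for keep in range(n, 0, -1):
        for idx in combinations(range(n), keep):
            candidate = "".join(s[i] for i in idx)
            if candidate == candidate[::-1]:
                return n - keep
    return 0  # only reached for the empty string


print(min_deletions_brute_force("ABBABBD"))  # 2
print(min_deletions_brute_force("NITIN"))    # 0
```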

Algorithm Explanation

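There are multiple ways to do this task, but the optimized way is to find the longest palindromic subsequence that can be formed from the given string. The characters that remain after the deletions always form a palindromic subsequence, so keeping the longest one deletes the fewest characters. Subtracting its length from the length of the original string gives the number of characters to be deleted to make the string a palindrome.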

The longest palindromic subsequence can be found using a Dynamic Programming approach in O(N^2) time. You should go through this article at OpenGenus to understand the approach. This is fundamental.

  1. First, we read the string from the user.
  2. Second, we find the length of the longest palindromic subsequence of the string. A subsequence keeps the characters in their original order but need not be contiguous, so we do not enumerate substrings. Instead we fill a table whose entry for each pair (i, j) holds the answer for the part of the string from position i to position j, starting with short ranges and moving to longer ones.
  3. A string of length one is already a palindrome, and some given strings are palindromes as a whole; in such cases the output is zero.
  4. Finally, we subtract the length of this longest palindromic subsequence from the length of the original string, which gives the required output.

Time Complexity of this algorithm is: O(N^2). The table also takes O(N^2) space.
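The recurrence behind the table can also be written top-down, which some readers find easier to follow. The sketch below is not the implementation given later in this article; the names `lps_length` and `solve` are assumptions, and memoisation through `functools.lru_cache` ensures that each (i, j) pair is solved only once, which keeps the number of subproblems quadratic. For very long strings the recursion depth may become a limit, which the bottom-up table avoids.

```python
from functools import lru_cache


def lps_length(s: str) -> int:
    """Length of the longest palindromic subsequence of s (top-down DP)."""

    @lru_cache(maxsize=None)
    def solve(i: int, j: int) -> int:
        if i > j:        # empty range
            return 0
        if i == j:       # a single character is a palindrome
            return 1
        if s[i] == s[j]:  # matching ends extend the inner answer by two
            return solve(i + 1, j - 1) + 2
        # otherwise drop the first or the last character
        return max(solve(i + 1, j), solve(i, j - 1))

    return solve(0, len(s) - 1)


print(lps_length("ABBABBD"))                   # 5
print(len("ABBABBD") - lps_length("ABBABBD"))  # 2
```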

Example

Consider the String

OPENGENUS

Next steps,

  1. First, we find the length of the given string; in this case it is nine (9).
  2. Now we find the length of the longest palindromic subsequence of the string.
  3. To do this, we compare the character at the start of a range with the character at its end. The outer loop runs over the range lengths from 2 up to 9, and the inner loop runs over the starting positions. If the two end characters match, they extend the best answer for the range between them by two; otherwise we take the better of the two answers obtained by dropping either the first or the last character.
  4. For "OPENGENUS" we get a maximum length of three (3); the longest palindromic subsequences are ENE, NGN, NEN and EGE.
  5. Now subtract this from the total length of the string: 9 - 3 = 6 is the minimum number of characters to be deleted to form a palindrome.
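To see which characters survive in this example, the filled table can be walked back from the full range towards the middle. This is a sketch under assumptions: the function name `longest_palindromic_subsequence` is made up for illustration, and when several answers of the same length exist (as with OPENGENUS) it returns just one of them, depending on how ties are broken.

```python
def longest_palindromic_subsequence(s: str) -> str:
    """Return one longest palindromic subsequence of s."""
    n = len(s)
    if n == 0:
        return ""
    # L[i][j] = LPS length of s[i..j]; entries with i > j stay 0
    L = [[0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                L[i][j] = L[i + 1][j - 1] + 2
            else:
                L[i][j] = max(L[i + 1][j], L[i][j - 1])

    # Walk back through the table, collecting the kept characters
    i, j = 0, n - 1
    left, right = [], []
    while i <= j:
        if i == j:
            left.append(s[i])
            break
        if s[i] == s[j]:
            left.append(s[i])
            right.append(s[j])
            i += 1
            j -= 1
        elif L[i + 1][j] >= L[i][j - 1]:
            i += 1
        else:
            j -= 1
    return "".join(left) + "".join(reversed(right))


kept = longest_palindromic_subsequence("OPENGENUS")
print(kept, len("OPENGENUS") - len(kept))
```

The printed string is one of the length-3 answers listed in step 4, followed by the deletion count 6.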

Code in Python

Following is the implementation of the above approach in Python:

# Gives the length of the longest palindromic subsequence in string 's'
def lps(s):
    n = len(s)
    # An empty string has no palindromic subsequence
    if n == 0:
        return 0
    # Create a table to store results of subproblems:
    # L[i][j] is the answer for the substring s[i..j]
    L = [[0 for x in range(n)] for y in range(n)]
    # Strings of length 1 are palindromes of length 1
    for i in range(n):
        L[i][i] = 1
    # Build the table.
    # Note that the lower diagonal values of the table are
    # never filled in and stay 0.
    # cl is the length of the substring
    for cl in range(2, n + 1):
        for i in range(n - cl + 1):
            j = i + cl - 1
            if s[i] == s[j] and cl == 2:
                L[i][j] = 2
            elif s[i] == s[j]:
                L[i][j] = L[i + 1][j - 1] + 2
            else:
                L[i][j] = max(L[i][j - 1], L[i + 1][j])
    # Length of the longest palindromic subsequence of the whole string
    return L[0][n - 1]

# Function to calculate the minimum number of deletions
def minimumNumberOfDeletions(s):
    n = len(s)
    # Find the length of the longest palindromic subsequence
    l = lps(s)
    # Subtract it from the original length of the string
    return n - l

if __name__ == "__main__":
    s = input("Enter the String: ")
    print("Minimum number of deletions required =",
          minimumNumberOfDeletions(s))

Input and Output

Enter the String: OPENGENUS
Minimum number of deletions required = 6

Thoughts

Though computations with palindromes may not seem useful, in real life they can be used in some compression algorithms and in other cases with repetitive data.
Palindromes also occur in DNA, where they mark sites at which the strand can be cut. They also help fold a one-dimensional chain into a two- or three-dimensional structure, and there are studies on biological sequence compression algorithms that use this property.

Code in C++

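Following is the implementation in C++. Unlike the Python version, it fills the table with deletion counts directly instead of longest palindromic subsequence lengths, so no final subtraction is needed: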

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Returns the minimum number of deletions needed to make 'str' a palindrome.
// The table holds deletion counts directly, so nothing has to be
// subtracted from the string length at the end.
int minimumNumberOfDeletions(const string& str)
{
	int n = static_cast<int>(str.size());
	if (n == 0)
		return 0;

	// L[i][j] = minimum deletions to make str[i..j] a palindrome.
	// Zero-initialised, so single characters (i == j) and empty
	// ranges (i > j) correctly cost 0.
	vector<vector<int>> L(n, vector<int>(n, 0));

	// cl is the length of the substring
	for (int cl = 2; cl <= n; cl++)
	{
		for (int i = 0; i < n - cl + 1; i++)
		{
			int j = i + cl - 1;
			if (str[i] == str[j])
				// Matching ends are kept; only the inside matters
				L[i][j] = L[i + 1][j - 1];
			else
				// Delete one of the two ends, whichever is cheaper
				L[i][j] = min(L[i][j - 1], L[i + 1][j]) + 1;
		}
	}

	// Minimum number of deletions for the whole string
	return L[0][n - 1];
}

int main()
{
	string str = "opengenus";
	cout << "Minimum number of deletions required = "
		<< minimumNumberOfDeletions(str) << "\n";
	return 0;
}

With this, you have the complete idea of this problem. Enjoy.