Get this book -> Problems on Array: For Interviews and Competitive Programming

In this problem, we will see how we can check whether a string is a subsequence of another string. Now we will consider an example to understand the given problem.

Let's say we have a string 'Welcome to OpenGenus' and a substring OGenus.

Now we need to find whether we have a substring or not.

```
mainString = S1 = "Welcome to OpenGenus";
Subsequences (S2) are:
OpenGenus
OGenus
Wlcomepen [Notice we have deleted few characters in mainString]
```

Hence, S2 is a subsequence of S1 if we can convert S1 to S2 by deleting a few characters of S1. Subsequence is a generalization of substring where substring is a continuous set of characters while in subsequence, characters need not to continuous.

We will use these approaches to solve this problem:

- Naive approach O(N * 2^N) time
- Optimised approach O(N) time

# Naive Method

We will make all possible subsequences of the string we have (mainString) in the length of the string we are given to find (stringToFind).

The steps are:

- Create all subsequences of string S1
- For each subsequence P, check if P == S2
- If P == S2, then S2 is a subsequence of S1.
- If no subsequence matches, S2 is not a subsequence of S1.

**PseudoCode:**

**How to generate all subsequences of a string S1?**

- The key idea is to use a number N of B bits where B is the lenght of S1.
- If the i-th bit of N is 1, then i-th character of S1 is included in the subsequence.
- If the i-th bit of N is 0, then i-th character of S1 is not included in the subsequence.
- We can get all possible values of N by traversing from 0 to 2^B-1.

```
Function genrateSubstr(mainString)
array subseq[]
int j, counter, lenReq=length(mainString)
powSize = 2^lenReq // Number of subsequences
for counter = 0 to counter = powSize :
string x="";
for j = 0 to j = lenReq:
if (counter&(1<<j) !=0:
x= x + mainString[j]
subseq.add(x);
return subseq;
Function isSubstring(subs,stringToFind)
for(string x:subs):
if(x==stringToFind):
return "Match Found"
return "Not Found"
subseq = genrateSubstr(mainString)
for each S in subseq:
if isSubstring(S, stringToFind)
then stringToFind is subsequence of mainString
```

### Algorithm

We will create 2 functions First one will generate all subsequences possible.

The second one will match and try to find the right subsequence.

**Algorithm for Subsequence Generation:**

Step 1:First we will calculate the number of time it would require to run the process. Since we haveNsize string it will generate about 2^{N}strings.

Step 2:Now we will have two nested loops where the first one runs 2^{N}times and the second one just runs to the length of our string and implements logic.

Step 3 (a)The logic involved is bit manipulation. Consider as we have a string required to give subsequence saymars. Now we will require 2^{4}which is 16 subsequences, as we can see in any subsequence ofmarswe will have either one value present or not. This means there are only 2 possibilities for a character to exist in a subsequence.

Hence we can safely represent them as in Binary form like 1111 which is "mars" and 0000 which will be "". Thus we can utilize this technique to generate 2

^{4}subsequence.

Step 3 (b)Now we have

```
0000 - ""
0001 - "m"
0011 - "ma"
0111 - "mar"
.
.
1001 - "ms"
.
1111 - "mars"
```

This is done by a simple formula

counter & ( 1<<j ) ! = 0

Here this is a special case meaning we left Shift 0001,

jtimes and then we take the AND of both counter and 1<<j to get the present characters till 0 comes in. When that happens we restart the loop.

Step 4:This step has repeated a total of 2^{N}times over a length of mainString.

Step 5:Finally we traverse the array of these constructed subsequences and lastly through linear search verify the presence of the string.

### Example:

Let's take these strings

```
string mainStr = "Welcome to OpenGenus";
string str = "OpenGenus";
```

So now we pass mainStr to the generator function. Now we observe the status of variable string x (*consider the pseudo-code*)

so initially:

Here as the loop iterates we try to append the value of x. We continiously iterate the mainString to select the characters for making a subsequence.

So here,

```
x: j: 0 counter: 0 (1<<j): 1 (counter&(1<<j)):0
x: j: 1 counter: 0 (1<<j): 2 (counter&(1<<j)):0
```

1<<j implies that we create 001 as 010 in binary or simply we multiply power of 2 j times. Simply 1* pow(2,j).

Now we observe the status at the mid of the internal loop

Here with further iteration x still hasn't taken a value as counter&(1<<j)=0

Now it happens as we try to seek a value where counter is location of main loop here 0 and (1<<j) is 16 = 10000.

Opearing & we get 00000 & 10000 = 000000.

```
x: j: 4 counter: 0 (1<<j): 16 (counter&(1<<j)):0
x: j: 5 counter: 0 (1<<j): 32 (counter&(1<<j)):0
x: j: 6 counter: 0 (1<<j): 64 (counter&(1<<j)):0
```

At the end of the first iteration we have value:

As we can see it is not suitable for operation (counter&(1<<j)) to give non-zero we seek the end of loop by pure traversal throughout the string.

```
x: j: 18 counter: 0 (1<<j): 262144 (counter&(1<<j)):0
x: j: 19 counter: 0 (1<<j): 524288 (counter&(1<<j)):0
```

which is a good result as "" is also a subsequence.

Now the loop restarts and set's the value as

```
x: j: 0 counter: 1 (1<<j): 1 (counter&(1<<j)):1
```

Here we add it to the array of strings finally.

For better understanding, here's how our algorithm takes in the non-empty strings.

```
x: W j: 1 counter: 9 (1<<j): 2 (counter&(1<<j)):0
x: W j: 2 counter: 9 (1<<j): 4 (counter&(1<<j)):0
```

Now for the iteration in the mid.As we can infer from here it has (counter&(1<<j)) becoming non-zero at multiple places. It happens as we set counter as 9 which is 1001 and 1<<j becomes anything other then 0XX0.( X= 1 or 0 )

so when we see this we can append that x and contune throughout the string to see if we can get more subsequence.

```
x: Wc j: 11 counter: 9 (1<<j): 2048 (counter&(1<<j)):0
x: Wc j: 12 counter: 9 (1<<j): 4096 (counter&(1<<j)):0
```

Lastly, before we add the string to our array

```
x: Wc j: 18 counter: 9 (1<<j): 262144 (counter&(1<<j)):0
x: Wc j: 19 counter: 9 (1<<j): 524288 (counter&(1<<j)):0
x: j: 0 counter: 10 (1<<j): 1 (counter&(1<<j)):0
```

Thus in this way, we add subsequences to our array and do the linear search over the array.

For detail over Linear Seach read this article

### Code in C++

```
#include <iostream>
#include <bits/stdc++.h>
using namespace std;
#define pb push_back
vector<string> subseqGen(string main){
int lenReq=main.size();
int powSize = pow(2,lenReq);
vector<string> subseq;
for (int counter = 0; counter < powSize; counter++) {
string x="";
for (int j = 0; j < lenReq; j++) {
if((counter&(1<<j))!=0)x+=main[j];
}
subseq.push_back(x);
}
return subseq;
}
string matchStr(vector<string> subs,string req){
for(string x:subs){
if(x==req){
return "Match Found";
}
}
return "Not Found";
}
int main(){
string main,sub;
getline(cin,main);
getline(cin,sub);
cout<<main<<endl;
cout<<sub<<endl;
vector<string> subseq = subseqGen(main);
string result=matchStr(subseq,sub);
cout<<result<<"\n";
return 0;
}
```

### Output

```
Match Found
```

## Time Complexity

Here we use O(N) to compare the generated strings with our stringToFind

and use O( 2^{N} ) to create an array of different substrings.

So, Time complexity is **O(N * 2 ^{N})**

# Optimised Algorithm

Here we iterate each character of the mainString to match it with our stringToFind and finally print the result.

The steps are as follows:

- Specify the main string (S1) and sub string (S2)
- Initialize iterator for S1 (say i) and S2 (say j) at the first character
- If S1[i] == S[j], then increment both i and j
- If S1[i] != S[j], then increment i only
- If i has reached the end of S1 but j has NOT reached the end of S2, then S2 is not a subsequence of S1.
- Else S2 is a subsequence of S1.

The time complexity is O(N).

The space complexity is O(1).

**PseudoCode:**

```
Function findSubSeq(mainString,subString):
while i<length(mainString) && j<length(subString):
i++
if mainString[i]==subString[j]:
j++
if j==length(subString):
return "Match Found"
return "Not found"
```

### Working

Here rather then genrating we utilize the already available string by simply counting the appearance of each character and if it matches we have our result.

Steps:

First:We take in two variable i and j. One for iterating mainString and next for counting the character matched.

Second:We iterate till we get i equal to mainString size and j is also about same size as subString.

Third:For the steps involved we increment our variable i for finding a match of string as we compre it with subString at j^{th}position. If found we increment j else we continue.

Lastly:We compare the length of subString with the value of j.Lastly we report the result.

### Code in working:

For this consider the pseudocode first.

Now we have first iteration with i all the way to size of mainString. Now we have j which is initialized as 0. Now we take the first iteration result.

```
i: 0 j: 0 main[i]=W sub[j]=O
```

Here we have a clear understanding of how every variable is responding to the first call. Now we study how it affects the next iterations.

Now the variables j remains same as increment condition not satisfied and consequnetly mainString[i] and subString[j] changes

```
i: 1 j: 0 main[i]=e sub[j]=O
i: 2 j: 0 main[i]=l sub[j]=O
i: 10 j: 0 main[i]= sub[j]=O
i: 11 j: 0 main[i]=O sub[j]=O
```

This happens till we encounter 12th element or 11th index which matches out subString[0] so now condition is satisfied and j is incremented.

```
i: 12 j: 1 main[i]=p sub[j]=p
i: 13 j: 2 main[i]=e sub[j]=e
i: 14 j: 3 main[i]=n sub[j]=n
i: 15 j: 4 main[i]=G sub[j]=G
i: 16 j: 5 main[i]=e sub[j]=e
i: 17 j: 6 main[i]=n sub[j]=n
i: 18 j: 7 main[i]=u sub[j]=u
i: 19 j: 8 main[i]=s sub[j]=s
```

Similar changes occours as we find matches for subsequence and thus we accomplish the task in mere 19 iteration which is exact length of our string.

### Code in C++

```
#include <iostream>
#include <bits/stdc++.h>
using namespace std;
string findSubSeq(string main,string sub){
int i=0,count=0;
int msize=main.size();
int ssize=sub.size();
while(i<msize && count<ssize){
if (main[i]==sub[count])count++;
i++;
}
if(ssize==count)return "Match Found";
return "Not Found";
}
int main(){
string main,sub;
getline(cin,main);
getline(cin,sub);
string result = findSubSeq(main,sub);
cout<<result<<"\n";
return 0;
}
```

### Output

```
Match Found
```

### Time Complexity

In this solution, we go through the string only once, hence we have **O(N)** as Time complexity.

The space complexity is O(1).

With this article at OpenGenus, we have the complete knowledge of how to check if a string is a subsequence of another string efficiently in linear time O(N).