Shift OR algorithm for String Matching

Internship at OpenGenus

Get FREE domain for 1st year and build your brand new site

The Shift OR algorithm uses bitwise techniques to check whether the given pattern is present in the string or not. It is efficient if the pattern length is lesser than the memory-word size of the machine (In this article at OPENGENUS, we consider the memory-word size to be 64bits). We are given string, string length, and the pattern. Our job is to return the starting index of the pattern if the pattern exists in the string and -1 if it does not exist.

For Example:

Input:

  • Text: Opengenus
  • Pattern: genus
  • Output: Pattern found at index: 4

Input:

  • Text: Youareawesome
  • Pattern: amazing
  • Output: No Match.

Algorithm

Following are the steps of Shift OR algorithm for String Matching:

Step 1: Take string and pattern as input.
Step 2: Create an array called pattern_mask of size 256 (total no of ASCII characters) and initialize it to ~0.
Step 3: Now traverse the pattern and initialize the ith bit of pattern_mask[pattern[i]] from right to 0.
Step 4: Now initialize a variable called R which contains ~1.
Step 5: Traverse the string from left to right.
Step 6: R is equal to R shift pattern_mask[test[i]].
Step 7: Shift R towards left by 1.
Step 8: If the mth(length of pattern) index in R from right is equal to 0 then the string is found at index i - m + 1.
Step 9: If no such m exists then return -1.

Time and space complexity of Shift OR algorithm

  • The time complexity is O(m + n) where m is the length of the pattern and n is the length of text. Since it mostly uses bitwise operations to perform tasks therefore it is very fast.
  • The space complexity is O(1) because we are not using any extra space.

Example

The given string is opengenus.
The given pattern is genus.
The length of the pattern is 5(genus).

                   s u n e g  
pattern_mask[g] =  1 1 1 1 0 
pattern_mask[e] =  1 1 1 0 1 
pattern_mask[n] =  1 1 0 1 1
pattern_mask[u] =  1 0 1 1 1 
pattern_mask[s] =  0 1 1 1 1 
pattern_mask[p] =  1 1 1 1 1 
pattern_mask[O] =  1 1 1 1 1 

R is equal to 1 1 1 1 0

traverse opengenus from left to right.
R is equal to R | pattern_mask[o]
1 1 1 1 0 | 1 1 1 1 1
1 1 1 1 1
R << 1 is equal to
1 1 1 1 0
i is equal to 0.

R is equal to R | pattern_mask[p]
1 1 1 1 0 | 1 1 1 1 1
1 1 1 1 1
R << 1 is equal to
1 1 1 1 0
i is equal to 1 ( 0 + 1)

R is equal to R | pattern_mask[e]
1 1 1 1 0 | 1 1 1 0 1
1 1 1 1 1
R << 1 is equal to
1 1 1 1 0
i is equal to 2 (1 + 1)

R is equal to R | pattern_mask[n]
1 1 1 1 0 | 1 1 0 1 1
1 1 1 1 1
R << 1 is equal to
1 1 1 1 0
i is equal to 3 (2 + 1)

R is equal to R | pattern_mask[g]
1 1 1 1 0 | 1 1 1 1 0
1 1 1 1 0
R << 1 is equal to
1 1 1 0 0
i is equal to 4 (3 + 1)

R is equal to R | pattern_mask[e]
1 1 1 0 0 | 1 1 1 0 1
1 1 1 0 1
R << 1 is equal to
1 1 0 1 0
i is equal to 5 (4 + 1)

R is equal to R | pattern_mask[n]
1 1 0 1 0 | 1 1 0 1 1
1 1 0 1 1
R << 1 is equal to
1 0 1 1 0
i is equal to 6 (5 + 1)

R is equal to R | pattern_mask[u]
1 0 1 1 0 | 1 0 1 1 1
1 0 1 1 1
R << 1 is equal to
0 1 1 1 0
i is equal to 7 (6 +1)

R is equal to R | pattern_mask[s]
0 1 1 1 0 | 0 1 1 1 1
0 1 1 1 1
R << 1 is equal to
0 1 1 1 1 0
i is equal to 8 (7 + 1)

R & 1 << 5 is equal to 0.
Therefore the pattern has been found.
return i - m + 1.
8 - 5 + 1.
return 4.

The pattern is found at index 4.

Implementing Shift OR algorithm for string matching

Following is the implementation of Shift OR algorithm in C++:

#include <iostream>
#include <string>
#include <map>
using namespace std;

int shift_or(string t, string p) {
   int m = p.length();//length of pattern.
   long pattern_mask[256];//creation of array.
   long R = ~1;

   if (m == 0)
      return -1;//If no pattern has been given.
   if (m >63) {
      cout<<"Pattern is too long!";//if pattern is too long
      return -1;
   }
   for (int i = 0; i <=255; ++i){
      pattern_mask[i] = ~0;//initializing pattern mask array.
   }
   for (int i = 0; i < m; ++i){
      pattern_mask[p[i]] &= ~(1L << i);
   }
   for (int i = 0; i < t.length(); ++i) {
      R |= pattern_mask[t[i]];
      R <<= 1;
      if ((R & (1L << m)) == 0)
         return i - m + 1;
   }
   return -1;
}

void findPattern(string t, string p) {
   int position = shift_or(t, p);//invoking shift_or function.
   if (position == -1)
      cout << "\nNo Match\n";
   else
      cout << "\nPattern found at position: " << position;
}

int main() {
   string t;
   t="opengenus";
   string p;
   p="genus";
   findPattern(t, p);
}

Output

Pattern found at position: 4

With this article at OPENGENUS, you must have the complete idea of Shift OR algorithm for String Matching.