×

Search anything:

# Generate all sub-strings of a string

#### Algorithms String Algorithms

Get this book -> Problems on Array: For Interviews and Competitive Programming

In this article, we have explored the approach of generating / printing all sub-strings of a given String and have provided the implementation to do so. The time complexity to generate all substring is O(N^3) time and there are O(N^2) sub-strings for a string of length N.

Table of content:

1. What is a Sub-string?
2. Generation of substrings
3. Algorithm / steps
4. Implementation in C
5. Time and Space Complexity
6. Uses

Pre-requisites: Arrays, Strings in C

# What is a Sub-string?

A substring is a contiguous set of characters in a given string/set of characters. For example, 'string' is a substring of 'substring'. The set of substrings will also cover the complete input string; each-and-every character/letter in the input string too.

Example: Substrings of hello are = {h, e, l, l, o, he, el, ll, lo, hel, ell, llo, hell, ello, hello}

Hence number of substrings for a given string is same as sum of len natural numbers i.e., len (len + 1) / 2 where len is number of characters in the input string.

# Generation of substrings

We start with the brute force approach where we print all substrings of length M and then, move on to generate substrings of length M+1.

So, the steps/ approach is:

1. String S of length N
2. For I from 1 to N
3. Generate all sub-strings of length I
4. For sub-strings of length I, consider P from 0 to N-I
5. One sub-string is "S[P], S[P+1], ..., S[P+I-1]"

We will go through the steps and finetune the approach.

Easiest approach is to print each character from the string, as they're also substrings.

``````// len refers to length of string or number of characters in the string
// s is the character array or a C-string
for (unsigned int i = 0; i < len; ++i) {
printf("%s\n", s[i]);
}
``````

The next piece of code that can be scribbled is for substrings whose length is 2 characters long. Above code can be modified as below

``````// The boundary condition ensures
// no single character gets printed in the last iteration
for (unsigned int i = 0; i < len - 1; ++i) {
printf("%c", s[i]);
printf("%c\n", s[i + 1]);
// OR
// printf("%c%c\n", s[i], s[i + 1]);
}
``````

We may continue writing for 3 character length substrings. But we can actually nest another loop inside the existing loop which does the job of many print statements.

``````for (unsigned int i = 0; i < len - 2; ++i) {
for (unsigned int j = 0; j < 3; ++j)
printf("%c", s[i + j]);
puts("");
}
// OR
for (unsigned int i = 0, k = 2; i < len - k; ++i) {
for (unsigned int j = 0; j < k + 1; ++j)
printf("%c", s[i + j]);
puts("");
}
``````

In the second piece of above code, `k` acts as controller on printing number of character a sub-string would contain. We can re-write code piece which would print substrings with 1 character length.

``````for (unsigned int i = 0, k = 0; i < len - k; ++i) {
for (unsigned int j = 0; j < k + 1; ++j)
printf("%c", s[i + j]);
puts("");
}
``````

Of course, we can have another loop which would vary `k`. Lowest value of `k` will be `0` (as in above code). Note that, value of `k` is one less than number of characters that generating substrings would contain. We know that, string itself would be a substring or in other words, maximum value of `k` will be one less than the length of input string.

``````for (unsigned int k = 0; k < len; ++k) {
for (unsigned int i = 0; i < len - k; ++i) {
for (unsigned int j = 0; j < k + 1; ++j)
printf("%c", s[i + j]);
puts("");
}
}
``````

Combining all the pieces:

# Algorithm / steps

The pseudocode steps to generate all sub-strings are:

1. Input a string and cache it's length in a variable
2. Vary the controller variable `k` from 0 till length
3. Vary an index `i` from 0 till `len - k`
4. Vary another index `j` from 0 till `k + 1` and display the character indexed by the value `i + j`
5. Display a newline to differentiate between generated substrings and continue with loop in step 3

# Implementation in C

Implementation in C Programming Language:

``````#include <stdio.h>
#include <string.h>

int main()
{
char *s = "Hello";
unsigned int len = strlen(s);

for (unsigned int k = 0; k < len; ++k) {
for (unsigned int i = 0; i < len - k; ++i) {
for (unsigned int j = 0; j < k + 1; ++j)
printf("%c", s[i + j]);
puts("");
}
}

return 0;
}
``````

Output:

H
e
l
l
o
He
el
ll
lo
Hel
ell
llo
Hell
ello
Hello

Following are the sub-strings of the word "OpenGenus":

• O
• p
• e
• n
• G
• e
• n
• u
• s
• Op
• pe
• en
• nG
• Ge
• en
• nu
• us
• Ope
• pen
• enG
• nGe
• Gen
• enu
• nus
• Open
• penG
• enGe
• nGen
• Genu
• enus
• OpenG
• penGe
• enGen
• nGenu
• Genus
• OpenGe
• penGen
• enGenu
• nGenus
• OpenGen
• penGenu
• enGenus
• OpenGenu
• penGenus
• OpenGenus

# Time and Space Complexity

Space: Constant space O(1) - loop indexing variables are the only cost

Time: O(N^3)

There are O(N^2) substrings of a string and printing a substring of length M takes O(M) time. Hence, the overall time will be O(N^3).

All the loops directly or indirectly depends on length of input string.

# Uses

Substrings are used in pattern-matching.

Many Algorithmic Problems involve dealing with sub-strings. As a brute force approach, all sub-strings can be generated. Note: In most problems, generating sub-strings can be avoided in an efficient approach.

With this article at OpenGenus, you must have the complete idea of generating all sub-strings of a string S.

Generate all sub-strings of a string