# Generate all sub-strings of a string

Get this book -> Problems on Array: For Interviews and Competitive Programming

In this article, we have explored the approach of **generating / printing all sub-strings of a given String** and have provided the implementation to do so. The time complexity to generate all substring is O(N^3) time and there are O(N^2) sub-strings for a string of length N.

**Table of content**:

- What is a Sub-string?
- Generation of substrings
- Algorithm / steps
- Implementation in C
- Time and Space Complexity
- Uses

Pre-requisites: Arrays, Strings in C

# What is a Sub-string?

A substring is a contiguous set of characters in a given string/set of characters. For example, 'string' is a substring of 'substring'. The set of substrings will also cover the complete input string; each-and-every character/letter in the input string too.

**Example**: Substrings of *hello* are = {*h*, *e*, *l*, *l*, *o*, *he*, *el*, *ll*, *lo*, *hel*, *ell*, *llo*, *hell*, *ello*, *hello*}

Hence number of substrings for a given string is same as *sum of len natural numbers* i.e., **len (len + 1) / 2** where **len** is number of characters in the input string.

# Generation of substrings

We start with the **brute force approach** where we print all substrings of length M and then, move on to generate substrings of length M+1.

So, the steps/ approach is:

- String S of length N
- For I from 1 to N

3. Generate all sub-strings of length I

4. For sub-strings of length I, consider P from 0 to N-I

5. One sub-string is "S[P], S[P+1], ..., S[P+I-1]"

We will go through the steps and finetune the approach.

Easiest approach is to **print each character from the string**, as they're also substrings.

```
// len refers to length of string or number of characters in the string
// s is the character array or a C-string
for (unsigned int i = 0; i < len; ++i) {
printf("%s\n", s[i]);
}
```

The next piece of code that can be scribbled is for substrings whose **length is 2** characters long. Above code can be modified as below

```
// The boundary condition ensures
// no single character gets printed in the last iteration
for (unsigned int i = 0; i < len - 1; ++i) {
printf("%c", s[i]);
printf("%c\n", s[i + 1]);
// OR
// printf("%c%c\n", s[i], s[i + 1]);
}
```

We may continue writing for **3 character length substrings**. But we can actually nest another loop inside the existing loop which does the job of many print statements.

```
for (unsigned int i = 0; i < len - 2; ++i) {
for (unsigned int j = 0; j < 3; ++j)
printf("%c", s[i + j]);
puts("");
}
// OR
for (unsigned int i = 0, k = 2; i < len - k; ++i) {
for (unsigned int j = 0; j < k + 1; ++j)
printf("%c", s[i + j]);
puts("");
}
```

In the second piece of above code, `k`

acts as controller on **printing number of character a sub-string would contain**. We can re-write code piece which would print substrings with 1 character length.

```
for (unsigned int i = 0, k = 0; i < len - k; ++i) {
for (unsigned int j = 0; j < k + 1; ++j)
printf("%c", s[i + j]);
puts("");
}
```

Of course, we can have another loop which would vary `k`

. Lowest value of `k`

will be `0`

(as in above code). Note that, value of `k`

is one less than number of characters that generating substrings would contain. We know that, string itself would be a substring or in other words, maximum value of `k`

will be one less than the length of input string.

```
for (unsigned int k = 0; k < len; ++k) {
for (unsigned int i = 0; i < len - k; ++i) {
for (unsigned int j = 0; j < k + 1; ++j)
printf("%c", s[i + j]);
puts("");
}
}
```

**Combining all the pieces**:

# Algorithm / steps

The pseudocode steps to **generate all sub-strings** are:

- Input a string and cache it's length in a variable
- Vary the controller variable
`k`

from 0 till length - Vary an index
`i`

from 0 till`len - k`

- Vary another index
`j`

from 0 till`k + 1`

and display the character indexed by the value`i + j`

- Display a newline to differentiate between generated substrings and continue with loop in step 3

# Implementation in C

Implementation in C Programming Language:

```
#include <stdio.h>
#include <string.h>
int main()
{
char *s = "Hello";
unsigned int len = strlen(s);
for (unsigned int k = 0; k < len; ++k) {
for (unsigned int i = 0; i < len - k; ++i) {
for (unsigned int j = 0; j < k + 1; ++j)
printf("%c", s[i + j]);
puts("");
}
}
return 0;
}
```

*Output*:

H

e

l

l

o

He

el

ll

lo

Hel

ell

llo

Hell

ello

Hello

Following are the sub-strings of the word "OpenGenus":

- O
- p
- e
- n
- G
- e
- n
- u
- s
- Op
- pe
- en
- nG
- Ge
- en
- nu
- us
- Ope
- pen
- enG
- nGe
- Gen
- enu
- nus
- Open
- penG
- enGe
- nGen
- Genu
- enus
- OpenG
- penGe
- enGen
- nGenu
- Genus
- OpenGe
- penGen
- enGenu
- nGenus
- OpenGen
- penGenu
- enGenus
- OpenGenu
- penGenus
- OpenGenus

# Time and Space Complexity

**Space**: Constant space O(1) - loop indexing variables are the only cost

**Time**: O(N^3)

There are O(N^2) substrings of a string and printing a substring of length M takes O(M) time. Hence, the overall time will be O(N^3).

All the loops directly or indirectly depends on length of input string.

# Uses

Substrings are used in pattern-matching.

Many Algorithmic Problems involve dealing with sub-strings. As a brute force approach, all sub-strings can be generated. Note: In most problems, generating sub-strings can be avoided in an efficient approach.

With this article at OpenGenus, you must have the complete idea of generating all sub-strings of a string S.