Strings in Python [Complete Guide]


In this article, we will learn how to create strings in Python and explore the various operations that we can perform on them.

Introduction to Strings

A string is a data type in Python that represents a sequence of characters. Strings are immutable, which means that once a string is created, it cannot be changed (or modified). There are multiple ways to define a string in Python -

  • The sequence of characters can be enclosed in single quotation marks ' ', or in double quotation marks " ".
  • A string can be created with the str() function.
  • We can also use triple quotes """ """ to define a string. When we use triple quotes, the sequence can span several lines (without the use of escape characters).
>>> # Initialization
>>> greeting1 = 'Hello World' 
>>> greeting2 = "Hello World"
>>> 
>>> # Simple Declaration
>>> declaration = str()
>>> 
>>> # Initialization
>>> greeting3 = str('Hello World')
>>> greeting4 = str("Hello World")
>>> 
>>> #Triple Quotes
>>> greeting5 = """Hello
World
!"""
>>> print(greeting5)
Hello
World
!
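
The immutability mentioned in the introduction can be illustrated with a minimal sketch. The variable names (changed, message) are illustrative only; the point is that assigning to an index fails, while building a new string works.

```python
greeting = "Hello World"

# Strings are immutable: assigning to an index raises a TypeError.
try:
    greeting[0] = "J"
except TypeError as error:
    message = str(error)
    print("Cannot modify in place:", message)

# To "change" a string we build a new one; the original is untouched.
changed = "J" + greeting[1:]
print(changed)   # Jello World
print(greeting)  # Hello World
```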

We can observe the type of a string using the following command.

>>> type(str())
<class 'str'>

You can also retrieve the length of a string with the built-in len() function.

>>> greeting = "Hello World"
>>> print(len(greeting))
11

Basic String Operations

Although strings are immutable, they behave like arrays when it comes to accessing values: each character can be read by its index. In Python, indexing starts from 0. If we want to count from the end of the string, the last character is at index -1. The image below displays the indices for the string "Hello World".
[Image: indices of the string "Hello World"]

To access the value at a particular index, we write the index in square brackets after the string's identifier.

>>> greeting = "Hello World"
>>> val = greeting[6]
>>> val2 = greeting[-5]
>>> print('val: ', val,'    val2:', val2) # Both indices are for the same character
val:  W     val2: W
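
As a small sketch of the two numbering schemes, the snippet below lists every character of "Hello World" with its positive and its negative index. The name index_table is illustrative only.

```python
greeting = "Hello World"
length = len(greeting)

# Pair every character with its positive and its negative index.
index_table = [(i, i - length, ch) for i, ch in enumerate(greeting)]

for positive, negative, ch in index_table:
    print("{:>3} {:>4}  {!r}".format(positive, negative, ch))

# Both numbering schemes address the same character.
same_character = greeting[6] == greeting[-5]
print(same_character)  # True
```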

We access a character by writing the string's identifier followed by an index in square brackets. We can also slice the string, that is, extract the characters from one index to another, using the slice operator [ start : stop : step ].
We give the start index (inclusive), the stop index (exclusive), and optionally a step that sets how many positions to advance between characters. The default values, if omitted, are:
- Start - 0
- Stop - end of string
- Step - 1 (no characters skipped)
With a negative step the string is traversed backwards, so the omitted start and stop default to the end and the beginning of the string instead.

>>> greeting = "Hello World"
>>> # Basic print
>>> print(greeting)
Hello World

>>> # Using Slice to print from one index to another
>>> greeting[1:6]
'ello '
>>> greeting[1:6:1] # Same as above, as step default is 1
'ello '
>>> greeting[::2] # Prints every other character (in steps of 2)
'HloWrd'
>>> greeting[::-1] # Prints in reverse
'dlroW olleH'
>>> greeting[:-5:] # Stops at the value at index -5 (refer above)
'Hello '
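
A few more slices, as a hedged sketch of how the defaults behave. The variable names are illustrative only. Note that, unlike indexing, slicing does not raise an error when a bound lies outside the string.

```python
greeting = "Hello World"

first_word = greeting[:5]      # start omitted -> begins at index 0
second_word = greeting[6:]     # stop omitted -> runs to the end
last_three = greeting[-3:]     # negative start counts from the end
every_third = greeting[::3]    # characters at indices 0, 3, 6, 9
reversed_first = greeting[4::-1]  # from index 4 back to the beginning

# Out-of-range bounds are clipped instead of raising IndexError.
clipped = greeting[6:100]
empty = greeting[20:]

print(first_word, second_word, last_three, every_third, reversed_first)
print(repr(clipped), repr(empty))
```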

To combine two strings, we can use the join() string method or the + operator.

>>> greeting = "Hello"
>>> greeting = greeting + " John!"
>>> print(greeting)
Hello John!
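
The join() alternative can be sketched as follows. The list contents and variable names are illustrative only. join() is called on the separator string and expects an iterable of strings, so other values have to be converted first.

```python
words = ["Hello", "John", "and", "Mary"]

# The separator string goes in front; the pieces are passed as an iterable.
sentence = " ".join(words)
print(sentence)  # Hello John and Mary

# The same result with +, built up one piece at a time.
combined = ""
for word in words:
    combined = combined + word + " "
combined = combined.strip()
print(combined == sentence)  # True

# join() only accepts strings, so numbers are converted with str() first.
numbers = ", ".join(str(n) for n in [1, 2, 3])
print(numbers)  # 1, 2, 3
```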

in and not in are membership operators that we can use to check if a string is a substring of another. They return a boolean value. We can also use string methods such as count(), which returns the number of occurrences instead.

>>> greeting = "Hello Python!"
>>> 'py' in greeting
False
>>> 'Py' in greeting
True
>>> 'Hello' not in greeting
False
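
Because the membership test is case sensitive, one common approach (a sketch, not the only option) is to lowercase the string before testing. The example also shows count(), which returns an integer rather than a boolean.

```python
greeting = "Hello Python!"

# 'py' is not found as-is, but it is found after lowercasing the string.
found_exact = "py" in greeting
found_ignoring_case = "py" in greeting.lower()
print(found_exact, found_ignoring_case)  # False True

# count() reports how many times the substring occurs (0 if it is absent).
count_o = greeting.count("o")
count_missing = greeting.count("xyz")
print(count_o, count_missing)  # 2 0
```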

String Methods

Now, let us discuss some of the methods we can use to operate on strings. You can list all the available methods with either of the following commands.

>>> dir(str())
>>> # Or
>>> dir(greeting)
  • str.capitalize(): This method capitalizes the first character of the string and converts the remaining characters to lowercase and returns the string.
  • str.lower(): This method returns the string after converting all the characters of the string to lowercase.
  • str.upper(): This method returns the string after converting all characters of the string to uppercase.
  • str.count(substring,start,stop): This method counts the non-overlapping occurrences of the substring between the start (included) and stop (not included) indices and returns that number.
  • str.endswith(substring,start,stop): This method checks if the string (or the slice from start to stop) ends with the substring given in the parameters. It returns a boolean value (True or False, based on whether it ends with the substring or not).
  • str.find(substring,start,stop): This method searches the string for the substring and returns the starting index of the first match, or -1 if the substring is not found.
  • str.index(substring,start,stop): This method returns the index of the first matching substring. It is the same as the find() method, but if the substring is not found, the index() method will raise a ValueError.
  • str.isalnum(): This method checks if all the characters of the string are alphanumeric, and returns True if they are. Otherwise it returns False.
  • str.replace(substring,newsubstring,count): This method returns a copy of the string after replacing all occurrences of the old substring with the new substring. The count parameter is optional; if it is given, only the first 'count' occurrences of the substring are replaced.
  • str.split(separator, maxsplit=-1): This method returns a list of strings that were separated at the locations of the separator. The parameter maxsplit has a default value of -1, which means that the string is split at all occurrences of the separator.
  • str.strip(chars): This method returns a copy of the string with the leading and trailing characters removed. The chars argument is a string specifying the set of characters to be removed from the beginning or end of the string. If omitted or None, the chars argument defaults to removing whitespace.
  • str.title(): This method returns a titlecased version of the string where the first letter of each word is capitalized and the remaining characters are converted to lowercase.
  • str.join(iterable): This method returns a string which is a concatenation of all the strings in the iterable. All values in the iterable must be strings.
  • str.format(*args, **kwargs): This method performs a string formatting operation. The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument. It returns a copy of the string where each replacement field is replaced with the string value of the corresponding argument.
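
The optional parameters described in the list above (count, maxsplit, chars, and the start/stop range) can be sketched as follows. The sample strings and variable names are illustrative only.

```python
text = "a-b-c-d"

# replace(): with count=2 only the first two separators are replaced.
replaced_all = text.replace("-", "+")
replaced_two = text.replace("-", "+", 2)
print(replaced_all, replaced_two)  # a+b+c+d a+b+c-d

# split(): maxsplit=1 splits once and leaves the rest of the string intact.
split_all = text.split("-")
split_once = text.split("-", 1)
print(split_all, split_once)

# strip(): chars is treated as a set of characters, not as a prefix/suffix.
stripped = "xyHelloyx".strip("xy")
print(stripped)  # Hello

# endswith(): start and stop restrict the check to a slice of the string.
ends_in_range = "Hello World".endswith("Hello", 0, 5)
print(ends_in_range)  # True
```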

Now let's try these methods on our strings

>>> greeting = "hello"

>>> #Capitalize
>>> greeting = greeting.capitalize()
>>> print(greeting)
Hello

>>> # Lowercase
>>> print(greeting.lower())
hello

>>> # Uppercase
>>> print(greeting.upper())
HELLO

>>> # Let's count how many times O is repeated
>>> print(greeting.count('o'))
1
>>> # Let's count how many Ls are present from the beginning till the 3rd index
>>> print(greeting.count('l',0,3))
1

>>> # Let's check if our string ends with O
>>> print(greeting.endswith('O'))
False
>>> # As observed above, Python string comparisons are case sensitive
>>> print(greeting.endswith('o'))
True

>>> # Let's locate 'e' from the string
>>> print(greeting.find('e'))
1
>>> # The index() method also acts similarly
>>> print(greeting.index('e'))
1
>>> # The difference between the two is, index raises an error if the substring is not found
>>> print(greeting.find('M'))
-1
>>> print(greeting.index('M'))
Traceback (most recent call last):
  File "<pyshell#35>", line 1, in <module>
    print(greeting.index('M'))
ValueError: substring not found

>>> # Let's check if our string is alphanumeric
>>> print(greeting.isalnum())
True
>>> print('08932745(*#*(&@*!&#'.isalnum())
False
>>> # The above string contained special characters, hence the method returned False

>>> # Let's create a new greeting by replacing a few characters in our string
>>> greetings = greeting.replace('ello','ola')
>>> print(greetings)
Hola

>>> #Let's define a new string
>>> newGreeting = 'Hi john, hi mary, Hi susan, welcome'

>>> # Using split(), we can return a list of string split at the commas
>>> newGreetingList = newGreeting.split(',')
>>> print(newGreetingList)
['Hi john', ' hi mary', ' Hi susan', ' welcome']

>>> # In the list, ' hi mary' and ' welcome' have a leading space.
>>> # We can remove trailing and leading characters using strip()
>>> hiMary = newGreetingList[1]
>>> print(hiMary)
 hi mary
>>> hiMary = hiMary.strip()
>>> print(hiMary)
hi mary

>>> # Let's convert this to title case
>>> hiMary = hiMary.title()
>>> print(hiMary)
Hi Mary

>>> # Strip the last list item as well, then join the two strings using join()
>>> welcome = newGreetingList[3].strip()
>>> newgreeting = ''.join([hiMary,', ',welcome])
>>> print(newgreeting)
Hi Mary, welcome

>>> # Let's try the format method
>>> newString = "Hi did you know {} + {} = {}"
>>> 
>>> newString.format(5,9,5+9)
'Hi did you know 5 + 9 = 14'
>>> print(newString.format(5,9,5+9))
Hi did you know 5 + 9 = 14
>>> a = 12
>>> b = 20
>>> print(newString.format(a,b,a+b))
Hi did you know 12 + 20 = 32
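
The replacement fields can also name a positional index or a keyword, as described in the method list. A minimal sketch, with illustrative template strings and variable names:

```python
# Numeric indices let an argument be reused or reordered.
template_by_index = "{0} + {1} = {2}, and {1} + {0} = {2}"
by_index = template_by_index.format(5, 9, 5 + 9)
print(by_index)  # 5 + 9 = 14, and 9 + 5 = 14

# Keyword names make longer templates easier to read.
template_by_name = "Hi {name}, you are {age} years old."
by_name = template_by_name.format(name="Jack", age=15)
print(by_name)  # Hi Jack, you are 15 years old.
```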

You can find all the string methods described in detail in the official Python documentation.

String Formatting

String formatting in Python is similar to that of C. The "%" operator is used to format a set of variables enclosed in a tuple. Using the appropriate argument specifiers, we can format strings the way we wish to.
Here are some basic argument specifiers:

  • %s - String
  • %d - Integers
  • %f - Floating point numbers
  • %.<number of digits>f - Floating point numbers with a fixed number of digits to the right of the dot.
  • %x or %X - Integers in hex representation (lowercase/uppercase)

The demonstration below shows how string formatting works in Python.

formatString.py

name = input("Enter your Name: ")
age = int(input("Enter your Age: "))
pi = 3.141592

ourString = "Hi %s, you are %d years old. Did you know pi is approximately %f?"
print(ourString % (name,age,pi))

Output:

Enter your Name: Jack
Enter your Age: 15
Hi Jack, you are 15 years old. Did you know pi is approximately 3.141592?
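
The remaining specifiers from the list (fixed precision and hexadecimal) can be sketched in the same way. The values and variable names are illustrative only; a single value may be passed without wrapping it in a tuple.

```python
pi = 3.141592
value = 255

default_float = "%f" % pi    # six digits after the dot by default
two_digits = "%.2f" % pi     # rounded to two digits after the dot
four_digits = "%.4f" % pi    # rounded to four digits after the dot
hex_lower = "%x" % value     # lowercase hexadecimal
hex_upper = "%X" % value     # uppercase hexadecimal

# Several values are passed together as a tuple.
summary = "%s is %d in decimal and %x in hex" % ("value", value, value)

print(default_float, two_digits, four_digits)  # 3.141592 3.14 3.1416
print(hex_lower, hex_upper)                    # ff FF
print(summary)
```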