Different ways to implement Softmax in Python

Introduction

Before looking at implementations, it helps to understand what the Softmax function is and what it is used for. Softmax is an important function in neural networks, a family of Machine Learning models loosely inspired by the human brain. A neural network contains many connected nodes that work together to analyze data and make decisions, and it is trained on many examples so that its predictions become more accurate. The raw outputs of a network are arbitrary real numbers, which can be difficult to interpret directly. That is where the Softmax function comes in. Given an input vector of numbers, Softmax produces a vector of the same length in which every entry lies between 0 and 1 and all entries sum to one, so each entry can be read as the probability of a specific outcome. This makes it easy to see how likely the input is to belong to each class.

The Actual Formula

The formula for the Softmax function is as follows:

Softmax(Xi) = e^(Xi) / (e^(X1) + e^(X2) + ... + e^(Xn))

Xi represents the ith element of the input vector and n represents the length of the vector. The class with the highest probability is taken as the predicted class.
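
To make the formula concrete, here is a small sketch that applies it term by term to the vector [4, 5, 6], using only Python's standard math module. The variable names are illustrative and do not come from any library.

```python
import math

x = [4, 5, 6]

# Numerators: e^(Xi) for every element of the input vector
numerators = [math.exp(xi) for xi in x]

# Denominator: e^(X1) + e^(X2) + ... + e^(Xn)
denominator = sum(numerators)

probabilities = [num / denominator for num in numerators]

# The predicted class is the index with the highest probability
predicted_class = probabilities.index(max(probabilities))

print([round(p, 4) for p in probabilities])  # [0.09, 0.2447, 0.6652]
print(predicted_class)                       # 2
```

The largest input (6, at index 2) receives the largest probability, and the three probabilities add up to one.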

Different methods to implement the Softmax function in Python

Method one: NumPy Library
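This method uses Python's NumPy library to compute the Softmax vector. It relies on two main functions: exp for the exponentiation and sum for the normalization. Subtracting np.max(x) before exponentiating does not change the result, because the numerator and the denominator are scaled by the same factor, but it prevents overflow for large inputs. The "axis = 0" argument makes the sum run over the first axis, which for a one-dimensional input is the whole vector. The code for this implementation is as follows: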

import numpy as np
def softmax(x):
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis = 0)

Sample input and output for NumPy approach

x = np.array([4,5,6])
print(softmax(x))
# output
[0.09003057 0.24472847 0.66524096]
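
The choice of axis matters once the input has more than one dimension. With "axis = 0", a 2-D array is normalized down each column, so a batch that stores one sample per row needs the last axis instead. The sketch below contrasts the two; softmax_rows is a hypothetical variant written for this illustration, not part of NumPy.

```python
import numpy as np

def softmax(x):
    # Same function as above: normalizes along the first axis
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

def softmax_rows(x):
    # Hypothetical variant: normalizes along the last axis.
    # keepdims=True keeps the reduced axis so the division broadcasts per row.
    e_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e_x / e_x.sum(axis=-1, keepdims=True)

# Two samples (rows), three classes (columns)
batch = np.array([[4.0, 5.0, 6.0],
                  [1.0, 1.0, 1.0]])

print(softmax(batch).sum(axis=0))        # every column sums to 1 (up to rounding)
print(softmax_rows(batch).sum(axis=-1))  # every row sums to 1 (up to rounding)
print(softmax_rows(batch)[1])            # equal inputs give equal probabilities
```

For a one-dimensional vector both functions return the same result, so the difference only shows up with batched input.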

Method two: TensorFlow Library
This method uses the TensorFlow library to compute the Softmax vector. The library has a built-in Softmax function, tf.nn.softmax, so not much code is necessary. It accepts both Tensors and NumPy arrays as input, provided the values are floating point. The code is shown below:

import tensorflow as tf
def softmax(x):
    return tf.nn.softmax(x)

Sample input and output for TensorFlow approach

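x = tf.constant([4.0, 5.0, 6.0])  # softmax expects floating-point values
# TensorFlow 2.x runs eagerly, so no session is needed
# (in TensorFlow 1.x, evaluate with sess.run inside a tf.Session instead)
array = softmax(x).numpy()  # runs the computation and stores the result in a NumPy array
print(array)
# output
[0.09003057 0.24472848 0.66524094]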

Method three: Keras Library
This method uses the Keras library to compute the Softmax vector. Keras has a "backend" module through which the Softmax computation can be done. The "axis = 1" argument specifies the axis to normalize over, so the input is expected to be two-dimensional, with one row per sample. The code can be seen below:

from keras import backend as K
def softmax(x):
    return K.softmax(x, axis = 1)

Sample input and output for Keras approach

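# assumes Keras 2.x, where the backend module exposes constant, softmax and eval
x = K.constant([[4.0, 5.0, 6.0]])  # one row per sample, so axis = 1 is valid
value = K.softmax(x, axis = 1)
evaluation = K.eval(value)  # evaluates the tensor and returns a NumPy array
print(evaluation)
# output
[[0.09003057 0.24472848 0.66524094]]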

Method four: Naive approach
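This method does not use any external library to compute the Softmax vector; it only needs the math module from Python's standard library. Each element is exponentiated with math.exp and then divided by the sum of all the exponentials. The code is below:

import math
def softmax(x):
    # exponentiate every element of the input
    exps = [math.exp(xi) for xi in x]
    # normalize so that the results sum to one
    total = sum(exps)
    return [exp / total for exp in exps]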

Sample input & output

x = [4,5,6]
print([round(p, 8) for p in softmax(x)])  # rounded for readability
# output
[0.09003057, 0.24472847, 0.66524096]
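
One caveat for a pure-Python version: if math.exp is called directly on large raw inputs, it raises an OverflowError instead of returning a value. Shifting every input by the maximum first avoids this without changing the probabilities. The following is a minimal sketch of that idea; stable_softmax is a hypothetical name chosen for this example.

```python
import math

def stable_softmax(x):
    # Shift by the maximum so the largest exponent is e^0 = 1
    m = max(x)
    exps = [math.exp(xi - m) for xi in x]
    total = sum(exps)
    return [e / total for e in exps]

try:
    math.exp(1000)  # too large for a double-precision float
except OverflowError as err:
    print("math.exp(1000) failed:", err)

probs = stable_softmax([1000, 1001, 1002])
print([round(p, 8) for p in probs])  # [0.09003057, 0.24472847, 0.66524096]
```

The result matches the one for [4, 5, 6] because Softmax depends only on the differences between the inputs.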

Conclusion

These are simply four possible implementations of the Softmax function in Python. Other implementations exist as well, and the best choice depends on what the user needs.
