# Softmax Function

#### What is the Softmax Function?

The softmax function transforms a vector of K real values into a vector of K real values that each lie between 0 and 1 and sum to 1. It is also called softargmax or the normalized exponential function, and it is the basis of multi-class logistic regression.

The advantage of applying this function is that the transformed values can be interpreted as probabilities. The inputs may be positive, negative, or zero: a small or negative input is mapped to a small probability and a large input to a large one, but every output stays positive.

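Mathematically, the softmax function can be expressed as:

$$\sigma(\mathbf{z})_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K$$

where $\mathbf{z} = (z_1, \dots, z_K)$ is the vector being transformed and $K$ is its dimension.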
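The definition translates almost directly into code. The following is a minimal sketch in plain Python, not a reference implementation; the function name `softmax` and the example input are chosen here for illustration. It subtracts the largest input before exponentiating, a common trick that leaves the result unchanged but avoids overflow for large values.

```python
import math

def softmax(z):
    """Return the softmax of a sequence of real numbers as a list of floats."""
    # Subtracting the maximum does not change the result mathematically,
    # but it keeps math.exp() from overflowing on large inputs.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative input: two positive values and one negative value.
probs = softmax([2.0, 1.0, -1.0])
print(probs)       # three positive values; the negative input gets the smallest share
print(sum(probs))  # 1.0 up to floating-point rounding
```

For this input the outputs are roughly 0.705, 0.259, and 0.035: the negative input is not discarded, it simply receives a small probability.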

When $K = 2$ and the input vector is $(x, 0)$, the first output of the softmax function becomes the logistic function:

$$\sigma(x, 0)_1 = \frac{e^{x}}{e^{x} + e^{0}} = \frac{1}{1 + e^{-x}}$$

The softmax function was first formulated by the Austrian physicist Ludwig Boltzmann in 1868, and it is known in physics and statistics as the Boltzmann or Gibbs distribution. In 1959, Robert Duncan Luce derived it in decision theory through his choice axiom, work that later informed its use in reinforcement learning. In recent years, it has become widely applied in neural networks.

#### Why is the softmax function important?

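In addition to its importance in physics and statistics, the softmax function has become widely applied in machine learning because it preserves the ordering of its inputs (a larger input always receives a larger output) and is differentiable, so models that use it can be trained with gradient-based methods.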
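Both properties can be read off the definition. As a short worked derivation, write $\sigma_i = e^{z_i} / \sum_{j=1}^{K} e^{z_j}$ for the $i$-th output and $\delta_{ij}$ for the Kronecker delta (1 if $i = j$, 0 otherwise); this notation is assumed here for illustration. Because every output shares the same positive denominator and the exponential is strictly increasing, the ordering is preserved, and applying the quotient rule gives the partial derivatives:

$$
z_i > z_j \iff e^{z_i} > e^{z_j} \iff \sigma_i > \sigma_j,
\qquad
\frac{\partial \sigma_i}{\partial z_j} = \sigma_i \left( \delta_{ij} - \sigma_j \right)
$$

The derivative exists for every real input and is expressed entirely in terms of the outputs themselves, which is one reason the function is convenient in gradient-based training.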

In neural networks, the softmax function is frequently used as the final layer of a multi-layer classifier, where it normalizes the raw outputs of the preceding layer into class probabilities. In reinforcement learning, it is used to balance choosing the action with the highest estimated reward against taking an exploratory step.
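The reinforcement learning use can be sketched as follows. This is an illustrative example of softmax (Boltzmann) action selection, not a description of any particular library: the names `softmax_policy`, `choose_action`, and the `temperature` parameter are assumptions introduced here, and the reward estimates are made up. A low temperature makes the choice nearly greedy, while a high temperature makes it close to uniform and therefore more exploratory.

```python
import math
import random

def softmax_policy(action_values, temperature=1.0):
    """Turn estimated action values into selection probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [q / temperature for q in action_values]
    m = max(scaled)  # shift by the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def choose_action(action_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_policy(action_values, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical reward estimates for three actions.
q_values = [1.0, 2.0, 0.5]

greedy_like = softmax_policy(q_values, temperature=0.1)   # almost all mass on action 1
exploratory = softmax_policy(q_values, temperature=10.0)  # close to uniform
action = choose_action(q_values, temperature=1.0, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the sampling reproducible, which is useful when comparing temperatures.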

#### Softmax function + LogicPlum

The softmax function is very useful in science and neural networks. However, its application requires sound statistical and mathematical knowledge, which many researchers and business analysts don’t possess.

LogicPlum’s platform helps by providing its users with a tool that manages the mathematical complexity required in modeling. As a result, businesses and organizations can bring together all the stakeholders involved in a modeling problem and benefit from their specific expertise, knowing that the resulting mathematical model is based on the latest proven research available.