A probability distribution is a function that describes the values a random variable can take and the probability associated with each of them. A distribution can be summarized by its mean, standard deviation, skewness, and kurtosis.
There are many probability distributions, each with a different shape. The best known is the normal, or Gaussian, distribution, also called the “bell curve” because of its shape. It is fully described by its mean and standard deviation alone: because it is symmetric, its skewness is zero, and its kurtosis is a fixed value (3, that is, zero excess kurtosis) rather than a free parameter.
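The four summary measures mentioned above can be estimated from data. The following is a minimal sketch using only the Python standard library; the function name `sample_moments`, the sample size, and the chosen mean and standard deviation are illustrative assumptions, and the population (divide-by-n) formulas are used for simplicity.

```python
import math
import random

def sample_moments(data):
    """Return (mean, std, skewness, excess kurtosis) using population formulas."""
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    std = math.sqrt(m2)
    skewness = m3 / m2 ** 1.5
    # Subtracting 3 gives "excess" kurtosis, which is 0 for a normal distribution.
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, std, skewness, excess_kurtosis

# Draw a sample from a normal distribution with mean 10 and standard deviation 2.
rng = random.Random(0)
normal_sample = [rng.gauss(10.0, 2.0) for _ in range(50_000)]

mean, std, skew, ex_kurt = sample_moments(normal_sample)
print(f"mean={mean:.3f} std={std:.3f} skew={skew:.3f} excess_kurtosis={ex_kurt:.3f}")
```

For a normal sample of this size, the estimated skewness and excess kurtosis should both come out close to zero, while the mean and standard deviation should be close to the values used to generate the data.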
Other well-known probability distributions are the binomial distribution, the Poisson distribution (see Figure 1 below), and the chi-square distribution. Each represents a different data-generating process. For example, the binomial distribution describes the probability of observing a given number of successes in a fixed number of independent trials, each with the same probability of success.
Figure 1: An example of a probability distribution, the Poisson distribution.
Source: Skbkekas, CC BY 3.0, via Wikimedia Commons
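As an illustration of two of the distributions named above, the sketch below evaluates the binomial and Poisson probability mass functions directly from their textbook formulas. The function names and the example parameters (10 coin flips, a Poisson rate of 4) are assumptions chosen for illustration only.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when events occur at an average rate lam per interval."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Probability of exactly 3 heads in 10 flips of a fair coin.
print(binomial_pmf(3, 10, 0.5))

# Probabilities of 0..9 events for a Poisson distribution with rate 4.
for k in range(10):
    print(k, round(poisson_pmf(k, 4.0), 4))
```

Both functions return values that sum to 1 over all possible counts, which is the defining property of a probability mass function.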
Probability distributions are a core concept in statistics because they provide a way to describe random variables. As such, they are used in every field that relies on statistics.
In finance, for example, probability distributions are used to estimate the expected returns of stocks and to hedge risk. In the social sciences, they are widely used in experiments, surveys, and modeling. In physics and chemistry, they are used to describe the properties of physical systems, and in astronomy to describe waves.
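To make the finance example concrete, the sketch below computes an expected return and its standard deviation from a small discrete distribution of scenarios. The scenario probabilities and returns are made-up numbers for illustration, not market data, and the function names are hypothetical.

```python
import math

def expected_return(scenarios):
    """Probability-weighted average return over (probability, return) pairs."""
    return sum(p * r for p, r in scenarios)

def return_std(scenarios):
    """Standard deviation of the return, a simple measure of risk."""
    mu = expected_return(scenarios)
    variance = sum(p * (r - mu) ** 2 for p, r in scenarios)
    return math.sqrt(variance)

# Hypothetical scenarios: (probability, annual return). Probabilities sum to 1.
scenarios = [
    (0.25, -0.10),  # downturn
    (0.50, 0.05),   # baseline
    (0.25, 0.20),   # upturn
]

print(f"expected return: {expected_return(scenarios):.4f}")
print(f"standard deviation: {return_std(scenarios):.4f}")
```

The expected value summarizes the center of the distribution, while the standard deviation indicates how widely outcomes are spread around it.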
Probability distributions are also used in several other branches of mathematics. For example, stochastic differential equations are differential equations in which one or more terms is a random (stochastic) process. In game theory, Poisson games model situations in which the number of players is random and follows a Poisson distribution.
LogicPlum’s platform relies heavily on statistics and therefore uses probability distributions in many places. For example, they are used in data extraction during feature engineering and to estimate performance during model evaluation.
Probability distributions are also part of the neural network technologies included in the platform. For example, through Mixture Density Networks (MDNs), the system can predict not just a single value but the probability distribution underlying it. The advantage of this approach is that an MDN models its output as a weighted combination (mixture) of distributions, so it can represent a single distribution as well as combinations of several.
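The idea of predicting a combination of distributions can be sketched as follows. This is not LogicPlum's implementation; it assumes a common MDN convention in which the network emits, for each mixture component, a weight logit, a mean, and a log standard deviation, and the components are Gaussian. The raw output values below are hypothetical stand-ins for what a trained network might produce for one input.

```python
import math

def softmax(logits):
    """Turn unconstrained logits into mixture weights that are positive and sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(y, weight_logits, means, log_sigmas):
    """Density at y of the Gaussian mixture described by the raw network outputs."""
    weights = softmax(weight_logits)
    sigmas = [math.exp(s) for s in log_sigmas]  # exp keeps each sigma positive
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sigmas))

def mixture_mean(weight_logits, means):
    """Point prediction: the weighted average of the component means."""
    weights = softmax(weight_logits)
    return sum(w * m for w, m in zip(weights, means))

# Hypothetical raw outputs of an MDN head with two components.
weight_logits = [0.0, 0.0]
means = [-2.0, 3.0]
log_sigmas = [0.0, math.log(0.5)]

print("point prediction:", mixture_mean(weight_logits, means))
print("density at y=3:", mixture_pdf(3.0, weight_logits, means, log_sigmas))
```

With two well-separated components, the predicted density has two peaks, something a single normal distribution could not represent; the weighted mean alone would fall between the peaks, which is why the full distribution is often more informative than a point estimate.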