What is the Five Number Summary?

The five-number summary is a set of five statistics used to describe a set of quantitative observations. These statistics correspond to five key percentiles:

  1. 0th percentile or sample minimum: the smallest observation.
  2. 25th percentile or first quartile or lower quartile.
  3. 50th percentile or median.
  4. 75th percentile or third quartile or upper quartile.
  5. 100th percentile or maximum: the largest observation.

All observations must come from a single variable that is measurable on an ordinal, interval, or ratio scale.
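As a rough illustration, the five values listed above can be computed with Python's standard library. This is a minimal sketch, not a definitive implementation: the function name five_number_summary is hypothetical, and the choice of the "inclusive" quantile method (linear interpolation between sorted observations) is an assumption, since several quartile conventions exist and they can give slightly different results on small samples.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of numbers.

    Assumption: quartiles use the "inclusive" method, which interpolates
    linearly between sorted observations. Other conventions exist.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


summary = five_number_summary([9, 1, 8, 2, 7, 3, 6, 4, 5])
print(summary)
```

For the nine observations 1 through 9, this convention places the quartiles on actual data points (3, 5, and 7), with 1 and 9 as the minimum and maximum.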

The five-number summary is usually displayed as a box plot. The first and third quartiles form the ends of the rectangle (the box), the median is drawn as a line inside the box, and the minimum and maximum are marked by the ends of the whiskers. When outliers are drawn separately, as dots or asterisks, the whiskers end at the most extreme observations that are not flagged as outliers.

Figure 1: five-number summary representation in a box plot
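The elements of a box plot can be derived from the data along the following lines. This is a sketch under stated assumptions: the text does not say how outliers are identified, so the code assumes the common convention of flagging points more than 1.5 interquartile ranges beyond the quartiles. The function name box_plot_parts and the parameter k are hypothetical, and the quartiles again use the "inclusive" interpolation method.

```python
from statistics import quantiles


def box_plot_parts(values, k=1.5):
    """Split data into the parts a box plot draws.

    Assumption: a point is an outlier if it lies more than k * IQR
    below Q1 or above Q3 (k = 1.5 is a common convention, not the
    only one).
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    # Whiskers reach the most extreme observations inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "box": (q1, q3),
        "median": median,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


parts = box_plot_parts([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(parts)
```

In this example the value 30 lies above the upper fence (7 + 1.5 × 4 = 13), so it would be drawn as a separate point and the upper whisker would stop at 8.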



Why is the Five Number Summary Important?

A five-number summary is handy when dealing with a large amount of data, as it provides a concise overview of the distribution. In a single figure, the quartiles convey the spread of the data, the median its location, and the minimum and maximum its range. The diagram can also show outliers, which makes for a fairly complete description of a dataset.

The five-number summary can also be used to compare different sets of observations and to compute L-estimators such as the interquartile range, midhinge, range, mid-range, and trimean.
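Each of the L-estimators named above is a simple linear combination of the five summary values. The sketch below uses their standard definitions; the function name l_estimators and the dictionary keys are hypothetical and chosen only for illustration.

```python
def l_estimators(minimum, q1, median, q3, maximum):
    """Compute common L-estimators from a five-number summary."""
    return {
        "range": maximum - minimum,               # total spread
        "mid_range": (minimum + maximum) / 2,     # midpoint of the extremes
        "interquartile_range": q3 - q1,           # spread of the middle half
        "midhinge": (q1 + q3) / 2,                # midpoint of the quartiles
        "trimean": (q1 + 2 * median + q3) / 4,    # weighted quartile average
    }


estimates = l_estimators(1, 3, 4, 7, 11)
print(estimates)
```

For the summary (1, 3, 4, 7, 11), the range is 10, the mid-range 6, the interquartile range 4, the midhinge 5, and the trimean 4.5.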


Five Number Summary and LogicPlum

The LogicPlum platform’s main advantages are speed and accuracy. Additionally, LogicPlum helps users analyze datasets, compare them, and organize them into better-structured data samples, all in an automated manner. Once the engine has found an optimal model, it provides a comprehensive explanation of the findings. Among the tools used in these tasks, five-number summaries offer a clear picture of the datasets involved, one that experts and non-experts alike can easily understand.
