What Is Manhattan Distance?
The Manhattan distance measures the distance between two real-valued vectors or points as the sum of the absolute differences of their Cartesian coordinates.
Mathematically:
Given two points p and q in a Euclidean space of dimension n, with coordinates (p1, p2, … , pn) and (q1, q2, … , qn), the Manhattan distance between them is defined as:
$$d_1(p, q) = \sum_{i=1}^{n} |p_i - q_i|$$
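As a minimal sketch, the definition can be computed directly in Python. The function name manhattan_distance is illustrative and does not come from any particular library.

```python
from typing import Sequence


def manhattan_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Return the sum of absolute coordinate differences between p and q."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same dimension")
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


# |1 - 4| + |2 - 0| + |3 - 3| = 3 + 2 + 0
print(manhattan_distance([1, 2, 3], [4, 0, 3]))  # 5
```

The function checks the dimensions first because zip would otherwise silently ignore any extra coordinates in the longer vector.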
Figure 1 shows three Manhattan paths (red, yellow, and blue) and the Euclidean path (green) between two points. All three Manhattan paths have a length of 12, whereas the straight Euclidean path is shorter, with a length of 6√2 ≈ 8.49.
Figure 1 – Comparison of Manhattan and Euclidean distances.
Source: Psychonaut, Public domain, via Wikimedia Commons
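The comparison in Figure 1 can be reproduced with a short sketch. It assumes the two points sit at opposite corners of a 6 × 6 grid, at (0, 0) and (6, 6). These coordinates are an assumption chosen to match the Manhattan distance of 12, and the helper path_length is hypothetical.

```python
import math

# Assumed coordinates: opposite corners of a 6 x 6 street grid.
start, end = (0, 0), (6, 6)

manhattan = sum(abs(a - b) for a, b in zip(start, end))
euclidean = math.dist(start, end)  # straight-line distance (Python 3.8+)


def path_length(moves):
    """Total length of a path made of axis-aligned (dx, dy) moves."""
    return sum(abs(dx) + abs(dy) for dx, dy in moves)


# Two different grid routes from start to end.
l_shaped = [(6, 0), (0, 6)]        # all the way east, then all the way north
staircase = [(1, 0), (0, 1)] * 6   # alternate one block east, one block north

print(manhattan)               # 12
print(path_length(l_shaped))   # 12
print(path_length(staircase))  # 12
print(round(euclidean, 2))     # 8.49
```

Any route that only moves toward the destination along the grid has the same length, which is why several distinct paths in the figure share one Manhattan distance.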
The name derives from the grid layout of most streets on the island of Manhattan, where the shortest route between two points follows the street grid. For this reason, the Manhattan distance is also called the taxicab distance or the city block distance.
This distance is closely related to other quantities: it equals the L1 norm of the difference between the two vectors, and it underlies error metrics such as the sum of absolute errors and the mean absolute error.
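The following sketch illustrates how these quantities relate, using made-up target and prediction values. The function names are illustrative only.

```python
def l1_norm(v):
    """L1 norm: the sum of absolute values of the components of v."""
    return sum(abs(x) for x in v)


def sum_absolute_error(y_true, y_pred):
    """Manhattan distance between the target and prediction vectors."""
    return l1_norm([t - p for t, p in zip(y_true, y_pred)])


def mean_absolute_error(y_true, y_pred):
    """Sum of absolute errors divided by the number of observations."""
    return sum_absolute_error(y_true, y_pred) / len(y_true)


# Made-up values, for illustration only.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

print(sum_absolute_error(y_true, y_pred))   # 0.5 + 0.0 + 1.5 + 1.0 = 3.0
print(mean_absolute_error(y_true, y_pred))  # 3.0 / 4 = 0.75
```

The sum of absolute errors is the Manhattan distance between the two vectors, and the mean absolute error is that same distance scaled by the number of observations.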
Why Is The Manhattan Distance Valuable?
Taxicab geometry has been used in regression analysis since the 18th century, in the form of least absolute deviations regression. In modern regression, the L1 norm also appears as the penalty term in LASSO.
The geometric interpretation of this distance is due to the mathematician Hermann Minkowski, who studied non-Euclidean geometries in the 19th century.
This metric is also used to measure differences between discrete frequency distributions. One example is the positional distribution of hexamers in RNA splicing: each distribution gives the probability of a hexamer appearing at a given nucleotide position near a splice site, and two or more such distributions can be compared using their L1 distances.
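A comparison of two discrete distributions might look like the sketch below. The probabilities are hypothetical values invented for illustration, not real splicing data, and the function name l1_distance is an assumption.

```python
import math


def l1_distance(dist_a, dist_b):
    """L1 distance between two discrete distributions over the same positions."""
    if len(dist_a) != len(dist_b):
        raise ValueError("distributions must cover the same positions")
    for dist in (dist_a, dist_b):
        if not math.isclose(sum(dist), 1.0):
            raise ValueError("each distribution must sum to 1")
    return sum(abs(a - b) for a, b in zip(dist_a, dist_b))


# Hypothetical probabilities of two hexamers occurring at each of five
# nucleotide positions near a splice site (illustrative values only).
hexamer_a = [0.10, 0.20, 0.40, 0.20, 0.10]
hexamer_b = [0.30, 0.30, 0.20, 0.10, 0.10]

print(l1_distance(hexamer_a, hexamer_b))  # approximately 0.6
```

For probability distributions, the L1 distance ranges from 0 (identical distributions) to 2 (distributions with no overlapping positions), which makes the values easy to compare across pairs.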
Manhattan Distance and LogicPlum
LogicPlum’s platform provides its users with an automated tool for machine learning. The Manhattan distance is one of the platform’s components and is used by some of its algorithms.
The main advantage of the platform is the ability to train and test hundreds of candidate solutions in a short time. Additionally, because all operations are automated, it requires almost no mathematical knowledge from the user.
Additional Resources
- Sammut, C. and Webb, G. (Eds.) (2017). Encyclopedia of Machine Learning and Data Mining. 2nd ed. Springer, New York.