What Is The Euclidean Distance?
The Euclidean distance between two points in a Euclidean space is the length of the straight line segment that joins those two points.
Mathematically:
Given two points p and q in a Euclidean space of dimension n, with coordinates (p1, p2, … , pn) and (q1, q2, … , qn), the Euclidean distance between them is defined as:
$$d(p, q) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \cdots + (q_n - p_n)^2}$$
As this formula and the figure below show, the Euclidean distance can be calculated via the Pythagorean theorem.
Figure 1: Pythagorean theorem.
Source: Kmhkmh, CC BY 4.0, via Wikimedia Commons
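The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name euclidean_distance is chosen here for the example and does not come from any particular library.

```python
import math

def euclidean_distance(p, q):
    """Return the Euclidean distance between two equal-length coordinate sequences."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

# A 3-4-5 right triangle: the legs are 3 and 4, so the hypotenuse is 5.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

In two dimensions this is exactly the Pythagorean theorem: the coordinate differences are the legs of a right triangle and the distance is its hypotenuse.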
Euclidean distance has many important properties. Among them are:
- Symmetry:
Given two points p and q, then d(p,q) = d(q,p)
- Non-negativity:
The Euclidean distance between two points is never negative, and it is zero only when the two points coincide.
- Triangle Inequality:
Given three points p, q and r, then d(p,q) + d(q,r) ≥ d(p,r)
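The three properties listed above can be checked numerically for any particular choice of points. The sketch below uses math.dist from the Python standard library (available from Python 3.8); the helper name check_distance_properties and the sample points are assumptions made only for this illustration. A check on a few points is not a proof, only a sanity test.

```python
import math

def check_distance_properties(p, q, r):
    """Return three booleans: symmetry, non-negativity, triangle inequality."""
    d_pq = math.dist(p, q)
    d_qr = math.dist(q, r)
    d_pr = math.dist(p, r)
    symmetric = d_pq == math.dist(q, p)
    # The distance is never negative, and it is zero from a point to itself.
    non_negative = d_pq >= 0 and math.dist(p, p) == 0.0
    # Going from p to r through q is never shorter than going directly.
    triangle = d_pq + d_qr >= d_pr
    return symmetric, non_negative, triangle

print(check_distance_properties((1.0, 2.0), (4.0, 6.0), (7.0, 1.0)))
# (True, True, True)
```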
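In machine learning, the Euclidean distance is one of four distance measures commonly used between a pair of samples p and q in an n-dimensional feature space. The other three are the Manhattan distance, the Minkowski distance, and the Hamming distance. The Minkowski distance is a general family that includes the Manhattan distance (order 1) and the Euclidean distance (order 2) as special cases. Because it measures the straight-line path, the Euclidean distance between two points is never larger than the Manhattan distance between them.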
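To make these measures concrete, the following sketch implements them in plain Python. The function names and the parameter name order are chosen for this example only; the Manhattan and Euclidean distances are written as the Minkowski distance of order 1 and order 2, and the Hamming distance simply counts the positions at which two equal-length sequences differ.

```python
def minkowski_distance(p, q, order):
    """Minkowski distance of the given order between two coordinate sequences."""
    return sum(abs(a - b) ** order for a, b in zip(p, q)) ** (1 / order)

def manhattan_distance(p, q):
    # Minkowski distance of order 1: the sum of absolute differences.
    return minkowski_distance(p, q, 1)

def euclidean_distance(p, q):
    # Minkowski distance of order 2: the straight-line distance.
    return minkowski_distance(p, q, 2)

def hamming_distance(p, q):
    # Number of positions at which the two sequences differ.
    return sum(a != b for a, b in zip(p, q))

p, q = (0, 0), (3, 4)
print(manhattan_distance(p, q))  # 7.0
print(euclidean_distance(p, q))  # 5.0
print(hamming_distance("karolin", "kathrin"))  # 3
```

For the same pair of points, the Manhattan distance (7.0) is larger than the Euclidean distance (5.0), because it follows the axes instead of the straight line.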
Why Is Distance Effective?
Many machine learning algorithms, both supervised and unsupervised, use the Euclidean distance to measure the similarity between observations. Well-known examples are K-nearest neighbors (classification), which finds the k training points closest to a query point, and K-means (clustering), which assigns each point to its nearest cluster centroid. It is also used in hierarchical clustering, including agglomerative clustering, to calculate the distances between clusters.
Deep neural networks (DNNs) can also be used to estimate Euclidean distances in cases where the direct evaluation process becomes very complicated.
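The role of the distance in these algorithms can be illustrated with two small helpers. This is a simplified sketch, not how any specific library implements K-nearest neighbors or K-means; the names k_nearest and nearest_centroid and the sample data are assumptions made for the example, and math.dist requires Python 3.8 or later.

```python
import math

def k_nearest(query, samples, k):
    """Return the k samples closest to the query point (the core step of K-nearest neighbors)."""
    return sorted(samples, key=lambda s: math.dist(query, s))[:k]

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to the point (the assignment step of K-means)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

samples = [(1, 1), (2, 2), (8, 8), (9, 9)]
print(k_nearest((0, 0), samples, 2))  # [(1, 1), (2, 2)]
print(nearest_centroid((7, 7), [(1.5, 1.5), (8.5, 8.5)]))  # 1
```

A full K-nearest neighbors classifier would then take a majority vote over the labels of the returned samples, and a full K-means run would alternate this assignment step with recomputing each centroid.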
Euclidean Distance and LogicPlum
LogicPlum’s platform is used to find the best-fitting model for a given dataset. This task is done automatically by the platform’s engine, ensuring that all potential algorithms are tried and tested.
Many of these algorithms use Euclidean distance and other mathematical tools. However, as these calculations are done automatically by the platform, users don’t need in-depth scientific knowledge to find optimal models.
Additional Resources
- Sharma, P. (2020). 4 Types of Distance Metrics in Machine Learning. Available at https://www.analyticsvidhya.com/blog/2020/02/4-types-of-distance-metrics-in-machine-learning/