Python – Working example of Mahalanobis distance measurement

I need to measure the distance between two n-dimensional vectors. It seems that Mahalanobis Distance is a good choice here, so I wanted to give it a try.

My code looks like this:

import numpy as np
import scipy.spatial.distance

x = [19, 8, 0, 0, 2, 1, 0, 0, 18, 0, 1673, 9, 218]
y = [17, 6, 0, 0, 1, 2, 0, 0, 8, 0, 984, 9, 30]
scipy.spatial.distance.mahalanobis(x,y,np.linalg.inv(np.cov(x,y)))

But I get this error message:

/usr/lib/python2.7/dist-packages/scipy/spatial/distance.pyc in mahalanobis(u, v, VI)
    498     v = np.asarray(v, order='c')
    499     VI = np.asarray(VI, order='c')
--> 500     return np.sqrt(np.dot(np.dot((u-v),VI),(u-v).T).sum())
    501 
    502 def chebyshev(u, v):

ValueError: matrices are not aligned

The SciPy documentation says that VI is the inverse of the covariance matrix. As I understand it, np.cov computes the covariance matrix and np.linalg.inv inverts it.

I think I understand where the misalignment comes from: np.cov(x, y) treats x and y as two variables with 13 observations each, so VI has the wrong dimensions (2×2 instead of 13×13).
A possible fix is this:

VI = np.linalg.inv(np.cov(np.vstack((x,y)).T))

Unfortunately, np.cov(np.vstack((x,y)).T) has a determinant of 0, which means that the inverse matrix does not exist.

So how can I use the Mahalanobis distance metric when I can't even invert the covariance matrix?

Solution

Are you sure the Mahalanobis distance is right for your application? According to Wikipedia, you need a set of points to estimate the covariance matrix, not just two vectors. You can then calculate the distance of a vector from the center of that set.
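
As a rough sketch of what that could look like: the code below first checks the rank of the covariance matrix built from only the two vectors in the question, then estimates a covariance matrix from a larger sample and measures the distance of one point from the sample mean. The sample data (`samples`, 200 random 3-dimensional points) is made up purely for illustration and is not from the question; substitute your own observations.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

# The two vectors from the question.
x = [19, 8, 0, 0, 2, 1, 0, 0, 18, 0, 1673, 9, 218]
y = [17, 6, 0, 0, 1, 2, 0, 0, 8, 0, 984, 9, 30]

# With only two observations, the 13x13 covariance matrix has rank at
# most 1, so it is singular and cannot be inverted.
cov_xy = np.cov(np.vstack((x, y)).T)
rank = np.linalg.matrix_rank(cov_xy)
print(cov_xy.shape, rank)

# Hypothetical data set (an assumption for this example):
# 200 observations of a 3-dimensional variable with correlated components.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.3, 0.5]])
samples = rng.normal(size=(200, 3)) @ mixing

# rowvar=False: rows are observations, columns are variables -> 3x3 matrix.
cov = np.cov(samples, rowvar=False)
VI = np.linalg.inv(cov)

# Distance of one point from the center (mean) of the set.
center = samples.mean(axis=0)
point = samples[0]
d = mahalanobis(point, center, VI)

# The same value computed directly from the definition.
diff = point - center
d_manual = np.sqrt(diff @ VI @ diff)
print(d, d_manual)
```

The key point is that the number of observations has to exceed the number of dimensions (and the observations must not be degenerate) for the estimated covariance matrix to be invertible.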
