Working example of Mahalanobis distance measurement
I need to measure the distance between two n-dimensional vectors. It seems that Mahalanobis Distance is a good choice here, so I wanted to give it a try.
My code looks like this:
import numpy as np
import scipy.spatial.distance
x = [19, 8, 0, 0, 2, 1, 0, 0, 18, 0, 1673, 9, 218]
y = [17, 6, 0, 0, 1, 2, 0, 0, 8, 0, 984, 9, 30]
scipy.spatial.distance.mahalanobis(x,y,np.linalg.inv(np.cov(x,y)))
But I get this error message:
/usr/lib/python2.7/dist-packages/scipy/spatial/distance.pyc in mahalanobis(u, v, VI)
498 v = np.asarray(v, order='c')
499 VI = np.asarray(VI, order='c')
--> 500 return np.sqrt(np.dot(np.dot((u-v),VI),(u-v).T).sum())
501
502 def chebyshev(u, v):
ValueError: matrices are not aligned
The SciPy docs say that VI is the inverse of the covariance matrix. As far as I can tell, np.cov computes the covariance matrix and np.linalg.inv computes the inverse of a matrix.
I understand what the problem is here (matrix misalignment): np.cov(x,y) treats x and y as two variables with 13 observations each, so VI has the wrong dimension (2×2 instead of 13×13).
So the possible solution is to do this:
VI = np.linalg.inv(np.cov(np.vstack((x,y)).T))
But unfortunately, np.cov(np.vstack((x,y)).T) has determinant 0, which means that the inverse matrix does not exist.
So how can I use the Mahalanobis distance metric when the covariance matrix cannot even be inverted?
Solution
Are you sure Mahalanobis distance is right for your application? According to Wikipedia, you need a set of points to generate the covariance matrix, not just two vectors. You can then calculate the distance of a vector from the center of the set.
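To make that concrete, here is a minimal sketch of how it could look once you have a set of observations. The reference set `data` below is randomly generated and purely hypothetical (200 samples of 13 features); in practice it would be your own collection of vectors, and it needs more linearly independent samples than dimensions for the covariance matrix to be invertible.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

# Hypothetical reference set: 200 observations (rows) of 13 features (columns).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 13)) * np.arange(1, 14)  # different scale per feature

# rowvar=False tells np.cov that columns are variables, giving a 13x13 matrix.
cov = np.cov(data, rowvar=False)
VI = np.linalg.inv(cov)

x = data[0]
y = data[1]

# Distance between two vectors, measured with the covariance of the whole set.
d_xy = mahalanobis(x, y, VI)

# Distance of one vector from the center of the set.
d_center = mahalanobis(x, data.mean(axis=0), VI)

# With only two observations the covariance matrix is rank deficient,
# which is why np.linalg.inv fails in the question.
two = np.vstack((x, y))
rank_two = np.linalg.matrix_rank(np.cov(two, rowvar=False))

print(cov.shape, d_xy, d_center, rank_two)
```

If all you really have is two vectors, there is no covariance to estimate from them, so a different metric (for example Euclidean distance on normalized features) is probably a better fit.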