# Python: for each row of a 2D NumPy array, get the index of an equal row in a second 2D array

## For each row of a 2D NumPy array, get the index of an equal row in a second 2D array

I have two huge 2D NumPy integer arrays, X and U, where U is assumed to contain only unique rows. For each row in X, I want the index of the matching row in U, or -1 if there is no match. For example, given the following arrays as input:

``````
U = array([[1, 4],
           [2, 5],
           [3, 6]])

X = array([[1, 4],
           [3, 6],
           [7, 8],
           [1, 4]])
``````

The output should be:

``````
array([0, 2, -1, 0])
``````

Is there an efficient way to do this (or something similar) with NumPy?

@Divakar:

``````
print(type(rows), rows.dtype, rows.shape)
print(rows[:10])
print(search2D_indices(rows[:10], rows[:10]))

<class 'numpy.ndarray'> int32 (47398019, 5)
[[65536     1     1     1    17]
 [65536     1     1     1   153]
 [65536     1     1     2   137]
 [65536     1     1     3   153]
 [65536     1     1     9   124]
 [65536     1     1    13   377]
 [65536     1     1    13   134]
 [65536     1     1    13   137]
 [65536     1     1    13   153]
 [65536     1     1    13   439]]
[ 0  1  2  3  4 -1 -1 -1 -1  9]
``````

### Solution

Method #1

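Inspired by this solution to `Find the row indexes of several values in a numpy array`, here is a vectorized approach built on `np.searchsorted`. Each row is collapsed to a single integer key with `np.ravel_multi_index`, and the keys are then matched with a sorted search. It requires non-negative integers: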

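``````
import numpy as np

def search2D_indices(X, searched_values, fillval=-1):
    # Per-column sizes large enough to cover both arrays (values must be >= 0).
    dims = np.maximum(X.max(0), searched_values.max(0)) + 1
    # Collapse each row to a single integer key.
    X1D = np.ravel_multi_index(X.T, dims)
    searched_valuesID = np.ravel_multi_index(searched_values.T, dims)
    # Binary-search the keys of searched_values in the sorted keys of X.
    sidx = X1D.argsort()
    idx = np.searchsorted(X1D, searched_valuesID, sorter=sidx)
    idx[idx == len(sidx)] = 0  # clamp positions past the end
    idx_out = sidx[idx]
    # Keep an index only where the keys really match.
    return np.where(X1D[idx_out] == searched_valuesID, idx_out, fillval)
``````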

Sample run:

``````
In [121]: U
Out[121]:
array([[1, 4],
       [2, 5],
       [3, 6]])

In [122]: X
Out[122]:
array([[1, 4],
       [3, 6],
       [7, 8],
       [1, 4]])

In [123]: search2D_indices(U, X, fillval=-1)
Out[123]: array([ 0,  2, -1,  0])
``````
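To make the key construction concrete, the sketch below walks through the intermediate values for the sample arrays. The variable names (`U_keys`, `X_keys`, `pos`, `candidates`) are chosen here for illustration and do not appear in the answer.

```python
import numpy as np

U = np.array([[1, 4], [2, 5], [3, 6]])
X = np.array([[1, 4], [3, 6], [7, 8], [1, 4]])

# Column sizes that cover both arrays: [8, 9] for this input.
dims = np.maximum(U.max(0), X.max(0)) + 1

# A row [r, c] becomes the scalar r * dims[1] + c.
U_keys = np.ravel_multi_index(U.T, dims)  # [13, 23, 33]
X_keys = np.ravel_multi_index(X.T, dims)  # [13, 33, 71, 13]

# Position of each X key in the sorted U keys; 71 lands past the end (3).
sidx = U_keys.argsort()
pos = np.searchsorted(U_keys, X_keys, sorter=sidx)  # [0, 2, 3, 0]
pos[pos == len(sidx)] = 0
candidates = sidx[pos]

# A candidate counts only if its key equals the searched key.
result = np.where(U_keys[candidates] == X_keys, candidates, -1)
print(result)  # [ 0  2 -1  0]
```

The final equality check is what turns "nearest position in sorted order" into an exact match, so rows that are absent from `U` fall back to `-1`.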

Method #2

Extending to the case with negative integers, we need to size `dims` from each column's range (maximum minus minimum, taken over both arrays) and convert to `1D` with a dot product, like this:

``````
def search2D_indices_v2(X, searched_values, fillval=-1):
    # Per-column range over both arrays, so all keys share one encoding.
    lo = np.minimum(X.min(0), searched_values.min(0)).astype(np.int64)
    hi = np.maximum(X.max(0), searched_values.max(0)).astype(np.int64)
    dims = hi - lo + 1
    # Mixed-radix weights 1, d0, d0*d1, ...; keys are unique but may be negative.
    s = np.concatenate(([1], dims[:-1].cumprod()))

    X1D = X.dot(s)
    searched_valuesID = searched_values.dot(s)
    sidx = X1D.argsort()
    idx = np.searchsorted(X1D, searched_valuesID, sorter=sidx)
    idx[idx == len(sidx)] = 0
    idx_out = sidx[idx]

    return np.where(X1D[idx_out] == searched_valuesID, idx_out, fillval)
``````

Sample run:

``````
In [142]: U
Out[142]:
array([[-1, -4],
       [ 2,  5],
       [ 3,  6]])

In [143]: X
Out[143]:
array([[-1, -4],
       [ 3,  6],
       [ 7,  8],
       [-1, -4]])

In [144]: search2D_indices_v2(U, X, fillval=-1)
Out[144]: array([ 0,  2, -1,  0])
``````
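Because the key-based functions encode whole rows as single integers, it can be worth checking them against a direct row-by-row comparison on small inputs. The helper below is a hypothetical reference written for that purpose, not part of the answer; it uses broadcasting and needs memory proportional to `len(X) * len(U)`, so it is only suitable for tests.

```python
import numpy as np

def search2D_indices_bruteforce(U, X, fillval=-1):
    """Reference lookup (hypothetical helper, for small test inputs only)."""
    # eq[i, j] is True when row i of X equals row j of U.
    eq = (X[:, None, :] == U[None, :, :]).all(axis=-1)
    first = eq.argmax(axis=1)  # first matching row of U, or 0 if none
    return np.where(eq.any(axis=1), first, fillval)

U = np.array([[-1, -4], [2, 5], [3, 6]])
X = np.array([[-1, -4], [3, 6], [7, 8], [-1, -4]])
expected = search2D_indices_bruteforce(U, X)
print(expected)  # [ 0  2 -1  0]
```

Comparing a fast implementation with this reference on random small arrays, including negative values and rows whose column sums coincide, is a cheap way to catch key collisions.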

Method #3

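Another approach is based on `views`: each row is reinterpreted as a single void-dtype element, so no integer keys have to be computed and there is no arithmetic that could overflow: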

``````
# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b):  # a, b are 2D arrays with the same dtype and column count
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    # One void element spans the bytes of a whole row.
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

def search2D_indices_views(X, searched_values, fillval=-1):
    X1D, searched_valuesID = view1D(X, searched_values)
    sidx = X1D.argsort()
    idx = np.searchsorted(X1D, searched_valuesID, sorter=sidx)
    idx[idx == len(sidx)] = 0
    idx_out = sidx[idx]
    return np.where(X1D[idx_out] == searched_valuesID, idx_out, fillval)
``````
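The following sketch shows what the void view produces for the sample arrays. To keep the illustration independent of how a given NumPy version sorts and compares void elements, it matches rows through a plain Python dictionary keyed by each row's raw bytes; that dictionary lookup is a swapped-in technique for demonstration, not the `searchsorted` approach above, and `rows_as_void` is a hypothetical helper name. Both arrays are assumed to share the same dtype and column count.

```python
import numpy as np

U = np.array([[1, 4], [2, 5], [3, 6]], dtype=np.int64)
X = np.array([[1, 4], [3, 6], [7, 8], [1, 4]], dtype=np.int64)

def rows_as_void(a):
    # Hypothetical helper: view each row of a 2D array as one void element.
    a = np.ascontiguousarray(a)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel()

U1D, X1D = rows_as_void(U), rows_as_void(X)
print(U1D.shape, U1D.dtype.itemsize)  # (3,) 16

# Equal rows have identical bytes, so the bytes can serve as dictionary keys.
lookup = {row.tobytes(): i for i, row in enumerate(U1D)}
result = np.array([lookup.get(row.tobytes(), -1) for row in X1D])
print(result)  # [ 0  2 -1  0]
```

The Python-level loop makes this far slower than a vectorized search on arrays with millions of rows, so it is better suited to understanding the view than to production use.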