# Python – 3D distance vectorization

3D distance vectorization… here is a solution to the problem.

## 3D distance vectorization

I need help vectorizing this code. Now, N=100, it takes a minute or so to run. I want to speed things up. I’ve done something like this for double loops, but never used 3D loops, and I’m having a hard time.

``````import numpy as np
N = 100
n = 12
r = np.sqrt(2)

x = np.arange(-N,N+1)
y = np.arange(-N,N+1)
z = np.arange(-N,N+1)

C = 0

for i in x:
for j in y:
for k in z:
if (i+j+k)%2==0 and (i*i+j*j+k*k!=0):
p = np.sqrt(i*i+j*j+k*k)
p = p/r
q = (1/p)**n
C += q

print '\n'
print C
``````

### Solution

The meshgrid/where/indexing solution is already very fast. I made it about 65% faster. It’s not excessive, but I’ll explain it step by step :

I’m easiest to solve this problem because all the 3D vectors in the grid are columns in a large 2D `3 x M` array. The meshgrid is the right tool to create all combinations (note that numpy version >= 1.7 is required for 3D `meshgrid`), `vstack` + `reshape` to transform the data into the desired form. Example:

``````>>> np.vstack(np.meshgrid(*[np.arange(0, 2)]*3)).reshape(3,-1)
array([[0, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 1, 1, 1, 1],
[0, 1, 0, 1, 0, 1, 0, 1]])
``````

Each column is a 3D vector. Each of these eight vectors represents a corner of a `1x1x1` cube (a 3D mesh with a length of 1 in step 1 and length 1 in all dimensions).

Let’s call this array `vectors` (which contains all the 3D vectors that represent points in the mesh). Then, prepare a bool mask to select those vectors that meet your mod2 criteria:

``````    mod2bool = np.sum(vectors, axis=0) % 2 == 0
``````

`np.sum(vectors, axis=0)` creates a `1 x M` array containing the sum of elements for each column vector. Thus, `mod2bool` is a `1 x M` array with a bool value for each column vector. Now use this bool mask:

``````    vectorsubset = vectors[:,mod2bool]
``````

This selects all rows (:) And use bool indexes to filter columns, both are quick operations in numpy. Calculate the length of the remaining vector using the native numpy method:

``````    lengths = np.sqrt(np.sum(vectorsubset**2, axis=0))
``````

This is fairly fast — however, `scipy.stats.ss` and `bottleneck.ss` can perform sum-squared calculations faster than that.

Convert the length using your instructions:

``````    with np.errstate(divide='ignore'):
p = (r/lengths)**n
``````

This involves dividing a finite number by zero, resulting in an `INF` in the output array. That’s totally fine. We use numpy’s `errstate` context manager to ensure that these zero divisions do not throw exceptions or runtime warnings.

Now sum the finite elements (ignoring infs) and return the sum:

``````    return  np.sum(p[np.isfinite(p)])
``````

I’ve implemented this method twice below. Once exactly as just explained, once involving the `ss` and `nansum` functions of Bottleneck. I’ve also added your comparison method, as well as skipping the modified version of the `np.where((x*x+y*y+z*z)!=0)` index, but instead creating `an inf` and finally summarizing `the isfinite` way.

``````import sys
import numpy as np
import bottleneck as bn

N = 100
n = 12
r = np.sqrt(2)

x,y,z = np.meshgrid(*[np.arange(-N, N+1)]*3)
gridvectors = np.vstack((x,y,z)).reshape(3, -1)

def measure_time(func):
import time
def modified_func(*args, **kwargs):
t0 = time.time()
result = func(*args, **kwargs)
duration = time.time() - t0
print("%s duration: %.3f s" % (func.__name__, duration))
return result
return modified_func

@measure_time
def method_columnvecs(vectors):
mod2bool = np.sum(vectors, axis=0) % 2 == 0
vectorsubset = vectors[:,mod2bool]
lengths = np.sqrt(np.sum(vectorsubset**2, axis=0))
with np.errstate(divide='ignore'):
p = (r/lengths)**n
return  np.sum(p[np.isfinite(p)])

@measure_time
def method_columnvecs_opt(vectors):
# On my system, bn.nansum is even slightly faster than np.sum.
mod2bool = bn.nansum(vectors, axis=0) % 2 == 0
# Use ss from bottleneck or scipy.stats (axis=0 is default).
lengths = np.sqrt(bn.ss(vectors[:,mod2bool]))
with np.errstate(divide='ignore'):
p = (r/lengths)**n
return  bn.nansum(p[np.isfinite(p)])

@measure_time
def method_original(x,y,z):
ind = np.where((x+y+z)%2==0)
x = x[ind]
y = y[ind]
z = z[ind]
ind = np.where((x*x+y*y+z*z)!=0)
x = x[ind]
y = y[ind]
z = z[ind]
p=np.sqrt(x*x+y*y+z*z)/r
return np.sum((1/p)**n)

@measure_time
def method_original_finitesum(x,y,z):
ind = np.where((x+y+z)%2==0)
x = x[ind]
y = y[ind]
z = z[ind]
lengths = np.sqrt(x*x+y*y+z*z)
with np.errstate(divide='ignore'):
p = (r/lengths)**n
return  np.sum(p[np.isfinite(p)])

print method_columnvecs(gridvectors)
print method_columnvecs_opt(gridvectors)
print method_original(x,y,z)
print method_original_finitesum(x,y,z)
``````

Here is the output:

``````\$ python test.py
method_columnvecs duration: 1.295 s
12.1318801965
method_columnvecs_opt duration: 1.162 s
12.1318801965
method_original duration: 1.936 s
12.1318801965
method_original_finitesum duration: 1.714 s
12.1318801965
``````

All methods produce the same result. When performing `isfinite` style summation, your method becomes a little faster. My method is faster, but I would say that this is an exercise of an academic nature, not an important improvement 🙂

I have another question: you mean that for N=3, the calculation should be 12. Even yours won’t do it. For N=3, all of the above methods generate 12.1317530867. Is this expected?