Python – 3D distance vectorization

3D distance vectorization… here is a solution to the problem.

3D distance vectorization

I need help vectorizing this code. Now, N=100, it takes a minute or so to run. I want to speed things up. I’ve done something like this for double loops, but never used 3D loops, and I’m having a hard time.

import numpy as np
N = 100
n = 12
r = np.sqrt(2)

x = np.arange(-N,N+1)
y = np.arange(-N,N+1)
z = np.arange(-N,N+1)

C = 0

for i in x:
    for j in y:
        for k in z:
            if (i+j+k)%2==0 and (i*i+j*j+k*k!=0):
                p = np.sqrt(i*i+j*j+k*k)
                p = p/r
                q = (1/p)**n
                C += q

print '\n'
print C

Solution

The meshgrid/where/indexing solution is already very fast. I made it about 65% faster. It’s not excessive, but I’ll explain it step by step :

I’m easiest to solve this problem because all the 3D vectors in the grid are columns in a large 2D 3 x M array. The meshgrid is the right tool to create all combinations (note that numpy version >= 1.7 is required for 3D meshgrid), vstack + reshape to transform the data into the desired form. Example:

>>> np.vstack(np.meshgrid(*[np.arange(0, 2)]*3)).reshape(3,-1)
array([[0, 0, 1, 1, 0, 0, 1, 1],
       [0, 0, 0, 0, 1, 1, 1, 1],
       [0, 1, 0, 1, 0, 1, 0, 1]])

Each column is a 3D vector. Each of these eight vectors represents a corner of a 1x1x1 cube (a 3D mesh with a length of 1 in step 1 and length 1 in all dimensions).

Let’s call this array vectors (which contains all the 3D vectors that represent points in the mesh). Then, prepare a bool mask to select those vectors that meet your mod2 criteria:

    mod2bool = np.sum(vectors, axis=0) % 2 == 0

np.sum(vectors, axis=0) creates a 1 x M array containing the sum of elements for each column vector. Thus, mod2bool is a 1 x M array with a bool value for each column vector. Now use this bool mask:

    vectorsubset = vectors[:,mod2bool]

This selects all rows (:) And use bool indexes to filter columns, both are quick operations in numpy. Calculate the length of the remaining vector using the native numpy method:

    lengths = np.sqrt(np.sum(vectorsubset**2, axis=0))

This is fairly fast — however, scipy.stats.ss and bottleneck.ss can perform sum-squared calculations faster than that.

Convert the length using your instructions:

    with np.errstate(divide='ignore'):
        p = (r/lengths)**n

This involves dividing a finite number by zero, resulting in an INF in the output array. That’s totally fine. We use numpy’s errstate context manager to ensure that these zero divisions do not throw exceptions or runtime warnings.

Now sum the finite elements (ignoring infs) and return the sum:

    return  np.sum(p[np.isfinite(p)])

I’ve implemented this method twice below. Once exactly as just explained, once involving the ss and nansum functions of Bottleneck. I’ve also added your comparison method, as well as skipping the modified version of the np.where((x*x+y*y+z*z)!=0) index, but instead creating an inf and finally summarizing the isfinite way.

import sys
import numpy as np
import bottleneck as bn

N = 100
n = 12
r = np.sqrt(2)

x,y,z = np.meshgrid(*[np.arange(-N, N+1)]*3)
gridvectors = np.vstack((x,y,z)).reshape(3, -1)

def measure_time(func):
    import time
    def modified_func(*args, **kwargs):
        t0 = time.time()
        result = func(*args, **kwargs)
        duration = time.time() - t0
        print("%s duration: %.3f s" % (func.__name__, duration))
        return result
    return modified_func

@measure_time
def method_columnvecs(vectors):
    mod2bool = np.sum(vectors, axis=0) % 2 == 0
    vectorsubset = vectors[:,mod2bool]
    lengths = np.sqrt(np.sum(vectorsubset**2, axis=0))
    with np.errstate(divide='ignore'):
        p = (r/lengths)**n
    return  np.sum(p[np.isfinite(p)])

@measure_time
def method_columnvecs_opt(vectors):
    # On my system, bn.nansum is even slightly faster than np.sum.
    mod2bool = bn.nansum(vectors, axis=0) % 2 == 0
    # Use ss from bottleneck or scipy.stats (axis=0 is default).
    lengths = np.sqrt(bn.ss(vectors[:,mod2bool]))
    with np.errstate(divide='ignore'):
        p = (r/lengths)**n
    return  bn.nansum(p[np.isfinite(p)])

@measure_time
def method_original(x,y,z):
    ind = np.where((x+y+z)%2==0)
    x = x[ind]
    y = y[ind]
    z = z[ind]
    ind = np.where((x*x+y*y+z*z)!=0)
    x = x[ind]
    y = y[ind]
    z = z[ind]
    p=np.sqrt(x*x+y*y+z*z)/r
    return np.sum((1/p)**n)

@measure_time
def method_original_finitesum(x,y,z):
    ind = np.where((x+y+z)%2==0)
    x = x[ind]
    y = y[ind]
    z = z[ind]
    lengths = np.sqrt(x*x+y*y+z*z)
    with np.errstate(divide='ignore'):
        p = (r/lengths)**n
    return  np.sum(p[np.isfinite(p)])

print method_columnvecs(gridvectors)
print method_columnvecs_opt(gridvectors)
print method_original(x,y,z)
print method_original_finitesum(x,y,z)

Here is the output:

$ python test.py
method_columnvecs duration: 1.295 s
12.1318801965
method_columnvecs_opt duration: 1.162 s
12.1318801965
method_original duration: 1.936 s
12.1318801965
method_original_finitesum duration: 1.714 s
12.1318801965

All methods produce the same result. When performing isfinite style summation, your method becomes a little faster. My method is faster, but I would say that this is an exercise of an academic nature, not an important improvement 🙂

I have another question: you mean that for N=3, the calculation should be 12. Even yours won’t do it. For N=3, all of the above methods generate 12.1317530867. Is this expected?

Related Problems and Solutions