Python – ‘numpy.random.normal’ generates different numbers on different systems

I’m comparing the numbers generated by np.random.normal on two different systems (details below), using the following code. (I’m using the legacy np.random.seed because it is used by another program whose output I eventually want to verify.) (1)

import numpy as np

np.random.seed(0)
x = np.random.normal(scale=1e-3, size=10**5)
np.save('test.npy', x)

Then I copied test.npy from one system to another and compared the two versions:

>>> other = np.load('test.npy')
>>> (x != other).sum(), len(x)
(29, 100000)
>>> mask = x != other
>>> np.abs(x[mask] - other[mask])
array([5.42101086e-20, 1.35525272e-20, 2.71050543e-20, 5.42101086e-20,
       1.08420217e-19, 1.08420217e-19, 2.16840434e-19, 2.16840434e-19,
       1.35525272e-20, 1.08420217e-19, 1.08420217e-19, 5.42101086e-20,
       2.71050543e-20, 1.08420217e-19, 2.16840434e-19, 5.42101086e-20,
       2.71050543e-20, 2.16840434e-19, 2.16840434e-19, 2.71050543e-20,
       2.71050543e-20, 1.08420217e-19, 1.08420217e-19, 1.08420217e-19,
       5.42101086e-20, 1.08420217e-19, 1.08420217e-19, 5.42101086e-20,
       2.71050543e-20])
>>> x[mask]
array([ 4.52489093e-04,  9.78961454e-05, -1.47113076e-04, -3.67859222e-04,
       -5.33279620e-04,  8.40794952e-04, -7.75987295e-04,  1.34205479e-03,
        6.34459482e-05,  5.07109360e-04, -7.68363366e-04,  3.33350262e-04,
       -2.19367067e-04,  6.11402140e-04, -1.30486526e-03, -4.42699624e-04,
        1.45463287e-04, -1.22491651e-03,  1.05226781e-03, -2.43032730e-04,
       -2.40551279e-04,  4.95396595e-04, -7.25454745e-04, -8.50779215e-04,
       -2.66274662e-04,  7.28854386e-04,  8.38515107e-04,  3.36152654e-04,
       -1.26550328e-04])

So 29 out of 100,000 elements differ, and only by a tiny amount. However, I don’t understand where this difference comes from. I confirmed that both systems have the same versions of Python and NumPy installed: python==3.9.4 and numpy==1.20.2 (installed via python -m pip install numpy==1.20.2; I also checked the latest version, numpy==1.23.0, and the result is exactly the same). I verified that the RNG state (via np.random.get_state()) was the same on both systems before and after calling np.random.normal. I saved and copied the test.npy file several times and also verified it with an MD5 checksum, so the difference must stem from the random number generation itself (1). However, I don’t see how this is possible, since both systems start from the same random state.

System information

System A (the one that holds test.npy):

$ uname -a
Linux SystemA 3.10.0-1160.31.1.el7.x86_64 #1 SMP Thu Jun 10 13:32:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

(I also tested another system, A2, which has the same OS version as A but a different CPU. The result did not change from A to A2, so I suspect the OS version.)

System B (the system on which test.npy is loaded):

$ uname -a
Linux SystemB 5.4.0-113-generic #127-Ubuntu SMP Wed May 18 14:30:56 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Footnote (1): When I use the approach recommended in the documentation for np.random.seed, i.e. rs = RandomState(MT19937(SeedSequence(0))), the differences between the two systems still exist. However, when I use np.random.default_rng(seed=0) instead, i.e. the new PCG64-based generator, the difference disappears.

Solution

Given that the differences are so small, the underlying bit generator is most likely producing the same output on both systems. The discrepancy probably comes from differences between the underlying math libraries.
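One way to check that the bit generator itself agrees on both systems is to hash the raw uniform draws, which involve no libm calls. The following is a minimal sketch; the helper name uniform_digest is hypothetical, and it assumes both systems use the same byte order (true for the two x86_64 machines in the question).

```python
import hashlib

import numpy as np


def uniform_digest(seed=0, n=10**5):
    """Return a SHA-256 digest of n raw uniform doubles from the legacy generator."""
    rs = np.random.RandomState(seed)
    u = rs.random_sample(n)  # uniform doubles in [0, 1), no log/sqrt involved
    return hashlib.sha256(u.tobytes()).hexdigest()


# Run this on each system and compare the printed digests by eye.
print(uniform_digest(seed=0))
```

If the digests match on both systems while the normal variates differ, the difference has to be introduced after the uniform draws, i.e. in the transformation to normal variates.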

NumPy’s legacy generator (RandomState) uses sqrt and log from libm. You can see that it pulls in these symbols by first finding the shared object that provides the generator:

import numpy as np

print(np.random.mtrand.__file__)

Then dump its dynamic symbols:

nm -C -gD mtrand.*.so | grep GLIBC

The mtrand file name comes from the output above.

The output contains many other symbols as well, but the dependency on libm’s log and sqrt might explain the difference.
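To illustrate where log and sqrt enter, here is a sketch of the polar (Marsaglia) variant of Box-Muller, which is how the legacy normal sampler works as far as I can tell from the NumPy source. The function name legacy_style_normal_pair is hypothetical, and Python’s math.log and math.sqrt stand in for the C library calls, so treat this as an illustration and not as NumPy’s actual code.

```python
import math

import numpy as np


def legacy_style_normal_pair(rs):
    """Draw two standard normals with the polar method (illustrative sketch)."""
    while True:
        # Two uniforms mapped to (-1, 1)
        x1 = 2.0 * rs.random_sample() - 1.0
        x2 = 2.0 * rs.random_sample() - 1.0
        r2 = x1 * x1 + x2 * x2
        # Accept only points strictly inside the unit circle, excluding the origin
        if 0.0 < r2 < 1.0:
            break
    # This is the only step that depends on the platform's log and sqrt
    f = math.sqrt(-2.0 * math.log(r2) / r2)
    return f * x2, f * x1


pair = legacy_style_normal_pair(np.random.RandomState(0))
reference = np.random.RandomState(0).standard_normal(2)
print(pair)
print(reference)
```

Since sqrt is correctly rounded under IEEE 754 while log is not required to be, a last-bit difference in log between two libm versions is the more plausible source of the mismatch.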

I’m guessing this is related to the log implementation, so you can test with the following approach:

import numpy as np

np.random.seed(0)

x = 2 * np.random.rand(2, 10**5) - 1
r2 = np.sum(x * x, axis=0)

np.save('test-log.npy', np.log(r2))

Then compare the resulting test-log.npy files from the two systems.
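The comparison can be sketched as follows. The helper name compare_arrays is hypothetical, and since the second system’s file is not available here, the demo simulates a one-ULP mismatch with np.nextafter. In practice you would pass the locally computed array and the one loaded with np.load from the other system.

```python
import numpy as np


def compare_arrays(a, b):
    """Return (number of differing elements, largest difference in ULPs of a)."""
    mask = a != b
    n_diff = int(mask.sum())
    if n_diff == 0:
        return 0, 0.0
    # np.spacing gives the gap to the next representable float64
    ulps = np.abs(a[mask] - b[mask]) / np.spacing(np.abs(a[mask]))
    return n_diff, float(ulps.max())


# Simulated example: nudge one element by a single representable step.
a = np.array([0.5, 1.5, -2.25])
b = a.copy()
b[1] = np.nextafter(b[1], np.inf)
n_diff, max_ulps = compare_arrays(a, b)
print(n_diff, max_ulps)

# The float64 spacing at the first mismatching value from the question.
spacing_at_first = np.spacing(4.52489093e-04)
print(spacing_at_first)
```

The differences listed in the question are consistent with a single ULP: the first one, 5.42101086e-20, is 2**-64, which is the float64 spacing at 4.52489093e-04. That is the kind of discrepancy one would expect from two libm implementations rounding the last bit differently.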
