Python – Unable to determine the shape of numpy array in loop containing transpose operation

I've been trying to build a small neural network to learn the softmax function, following the article at https://mlxai.github.io/2017/01/09/implementing-softmax-classifier-with-vectorized-operations.html

It works for a single iteration. However, when I wrap it in a loop to train the network with updated weights, I get the following error: ValueError: operands could not be broadcast together with shapes (5,10) (1,5) (5,10).

While debugging, I found that np.max() returns arrays of shape (5,1) and (1,5) in different iterations, even though axis is set to 1 in both cases. Please help me work out what is wrong with the following code.

import numpy as np

N = 5
D = 10
C = 10

W = np.random.rand(D, C)
X = np.random.randint(255, size=(N, D))
X = X / 255
y = np.random.randint(C, size=(N))
#print(y)
lr = 0.1
reg = 0.0  # regularization strength; not defined in the original post, value assumed here

for i in range(100):
  print(i)
  loss = 0.0
  dW = np.zeros_like(W)
  N = X.shape[0]
  C = W.shape[1]

  f = X.dot(W)
  #print(f)

  print(np.matrix(np.max(f, axis=1)))
  print(np.matrix(np.max(f, axis=1)).T)
  f -= np.matrix(np.max(f, axis=1)).T  # the ValueError is raised here on the second iteration
  #print(f)

  term1 = -f[np.arange(N), y]
  sum_j = np.sum(np.exp(f), axis=1)
  term2 = np.log(sum_j)
  loss = term1 + term2
  loss /= N
  loss += 0.5 * reg * np.sum(W * W)
  #print(loss)

  coef = np.exp(f) / np.matrix(sum_j).T
  coef[np.arange(N), y] -= 1
  dW = X.T.dot(coef)
  dW /= N
  dW += reg * W

  W = W - lr * dW

Solution

In your first iteration, W is an np.ndarray of shape (D, C). f = X.dot(W) is therefore also an ndarray, of shape (N, C), so np.max(f, axis=1) returns an ndarray of shape (N,), which np.matrix() turns into shape (1, N). The .T then transposes it to (N, 1), which broadcasts correctly against f.

But in the next iteration, W is an instance of np.matrix (it inherits the type from dW in W = W - lr*dW, and dW became a matrix through coef). f then inherits np.matrix as well, and np.max(f, axis=1) returns an np.matrix of shape (N, 1), because a matrix always stays 2-D. np.matrix() leaves it unchanged, and after the .T it becomes shape (1, N), which cannot be broadcast against f of shape (N, C).
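The shape difference described above can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the names f_arr, f_mat, dW_mat and W_next are made up for illustration, and the array contents are random.

```python
import numpy as np

N, D, C = 5, 10, 10
f_arr = np.random.rand(N, C)   # plain ndarray, as in the first iteration
f_mat = np.matrix(f_arr)       # np.matrix, as in every later iteration

max_arr = np.max(f_arr, axis=1)   # ndarray reduction drops the axis: (N,)
max_mat = np.max(f_mat, axis=1)   # np.matrix always stays 2-D: (N, 1)

col_from_arr = np.matrix(max_arr).T   # (N,) -> (1, N) -> (N, 1)
row_from_mat = np.matrix(max_mat).T   # (N, 1) -> (N, 1) -> (1, N)

print(max_arr.shape, col_from_arr.shape)   # (5,) (5, 1)
print(max_mat.shape, row_from_mat.shape)   # (5, 1) (1, 5)

# The matrix type spreads through arithmetic: ndarray minus matrix is a matrix.
W = np.random.rand(D, C)
dW_mat = np.matrix(np.zeros((D, C)))
W_next = W - 0.1 * dW_mat
print(type(W_next).__name__)   # matrix
```

The last three lines show why W stops being an ndarray after the first weight update.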

To resolve this issue, make sure you don't mix np.ndarray with np.matrix. Either define everything as np.matrix from the beginning (i.e. W = np.matrix(np.random.rand(D,C))), or use keepdims to preserve the reduced axis, like this:

f -= np.max(f, axis = 1, keepdims = True)

This keeps everything 2-D without any conversion to np.matrix. (Do the same for sum_j.)
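For reference, here is one way the whole loop might look with keepdims applied to both reductions. It is a sketch under a few assumptions: reg is set to 0.0 because the question never defines it, the seed is arbitrary, the loss is summed to a scalar (the question's code leaves it as a per-sample vector), and the losses list exists only to make the result easy to inspect.

```python
import numpy as np

np.random.seed(0)          # arbitrary seed, only for reproducibility
N, D, C = 5, 10, 10
lr = 0.1
reg = 0.0                  # assumed value; not given in the question

W = np.random.rand(D, C)
X = np.random.randint(255, size=(N, D)) / 255
y = np.random.randint(C, size=N)

losses = []
for i in range(100):
    f = X.dot(W)                                      # (N, C) ndarray
    f -= np.max(f, axis=1, keepdims=True)             # subtract an (N, 1) column
    sum_j = np.sum(np.exp(f), axis=1, keepdims=True)  # (N, 1)

    # Cross-entropy loss, reduced to a scalar here
    loss = np.sum(-f[np.arange(N), y] + np.log(sum_j[:, 0])) / N
    loss += 0.5 * reg * np.sum(W * W)
    losses.append(loss)

    coef = np.exp(f) / sum_j                          # (N, C) / (N, 1) broadcasts
    coef[np.arange(N), y] -= 1
    dW = X.T.dot(coef) / N + reg * W

    W = W - lr * dW                                   # W stays a plain ndarray

print(type(W).__name__, W.shape)
print(losses[0], losses[-1])
```

Since no np.matrix is ever created, every intermediate keeps the same type and shape from one iteration to the next.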
