Python – How do I use fit_generator with sequential data divided into batches?

I’m trying to write a generator for my Keras LSTM model, to use with the fit_generator method.
My first question is: what should my generator return? A batch? A sequence?
The example in the Keras documentation returns x, y for each data entry, but what if my data is sequential and I want to divide it into batches?

This is a Python function that creates one batch from a given input:

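import numpy as np

def get_batch(data, batch_num, batch_size, seq_length):
    # data is indexed with .iloc, so it is expected to be a pandas DataFrame (one row per time step)
    i_start = batch_num * batch_size
    batch_sequences = []
    batch_labels = []
    # rows needed for this batch: batch_size overlapping windows plus their label rows
    batch_chunk = data.iloc[i_start:i_start + batch_size + seq_length].values
    for i in range(batch_size):
        # batch_chunk already starts at i_start, so index it relative to the chunk
        sequence = batch_chunk[i:i + seq_length]
        # the label is the row that immediately follows the window
        label = batch_chunk[i + seq_length]
        batch_sequences.append(sequence)
        batch_labels.append(label)
    return np.array(batch_sequences), np.array(batch_labels)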

For example, when data has a single column that starts with 1, 2, 3, 4, the call:

get_batch(data, batch_num=0, batch_size=2, seq_length=3)

returns this x:

x = [
      [[1],[2],[3]],
      [[2],[3],[4]]
    ]
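
The same batch can be reproduced without pandas, which also shows the labels that go with it. This is a minimal sketch, assuming a toy array holding the values 1 to 10 in one column; the array and the helper name window_batch are illustrative and not part of the original post:

```python
import numpy as np

def window_batch(values, batch_num, batch_size, seq_length):
    """Hypothetical NumPy-only counterpart of get_batch."""
    start = batch_num * batch_size
    # one window of seq_length rows per sample, each shifted by one time step
    x = np.stack([values[start + i:start + i + seq_length] for i in range(batch_size)])
    # the label of each window is the row right after it
    y = np.stack([values[start + i + seq_length] for i in range(batch_size)])
    return x, y

values = np.arange(1, 11).reshape(-1, 1)  # assumed toy data: 10 time steps, 1 feature
x, y = window_batch(values, batch_num=0, batch_size=2, seq_length=3)
print(x.shape)     # (2, 3, 1)
print(y.tolist())  # [[4], [5]]
```

The x array has shape (batch_size, seq_length, num_features), which is the layout the LSTM input_shape expects once the batch dimension is added.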

This is the model I have in mind:

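from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

# seq_length, num_features and num_classes are defined elsewhere
model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(seq_length, num_features)))
model.add(Dropout(0.2))
model.add(LSTM(256))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')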

My question is: how do I turn my function into a generator?

Solution

Here is a solution that uses keras.utils.Sequence, which plays the role of a generator in Keras but is indexed by batch number:

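from keras.utils import Sequence

class MySequence(Sequence):
    def __init__(self, num_batches, batch_size):
        self.num_batches = num_batches
        self.batch_size = batch_size  # used by __getitem__

    def __len__(self):
        return self.num_batches  # the length is the number of batches

    def __getitem__(self, batch_id):
        # data and seq_length are taken from the enclosing scope, as in the question
        return get_batch(data, batch_id, self.batch_size, seq_length)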

I think this is more concise and doesn’t change your original function. Now pass an instance of MySequence to model.fit_generator. In newer Keras versions, where fit_generator is deprecated, model.fit accepts the same object.
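
If a plain Python generator is preferred over a Sequence, the batching function can be wrapped in a loop that never ends, since fit_generator expects the generator to keep yielding (inputs, targets) tuples indefinitely. The sketch below is only an illustration under assumptions: make_batch is a NumPy stand-in for get_batch, and count_batches and batch_generator are hypothetical helper names. It also shows one way to choose the number of batches so that the last label never runs past the end of the data:

```python
import numpy as np

def make_batch(values, batch_num, batch_size, seq_length):
    """Hypothetical NumPy stand-in for get_batch: sliding windows plus next-step labels."""
    start = batch_num * batch_size
    x = np.stack([values[start + i:start + i + seq_length] for i in range(batch_size)])
    y = np.stack([values[start + i + seq_length] for i in range(batch_size)])
    return x, y

def count_batches(num_rows, batch_size, seq_length):
    # Each sample needs seq_length input rows plus one label row,
    # so only num_rows - seq_length samples exist; keep full batches only.
    return max((num_rows - seq_length) // batch_size, 0)

def batch_generator(values, batch_size, seq_length):
    num_batches = count_batches(len(values), batch_size, seq_length)
    if num_batches == 0:
        raise ValueError("not enough rows for a single full batch")
    while True:  # loop forever; Keras ends each epoch after steps_per_epoch batches
        for batch_num in range(num_batches):
            yield make_batch(values, batch_num, batch_size, seq_length)

values = np.arange(1, 11).reshape(-1, 1)  # assumed toy data: 10 time steps, 1 feature
steps = count_batches(len(values), batch_size=2, seq_length=3)
gen = batch_generator(values, batch_size=2, seq_length=3)
x, y = next(gen)
print(steps, x.shape, y.shape)  # 3 (2, 3, 1) (2, 1)
```

The steps value is what would be passed as num_batches to MySequence, or as steps_per_epoch to fit_generator when a plain generator is used.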
