How do I use fit_generator with sequential data divided into batches?
I'm trying to write a generator for my Keras LSTM model, to use with the fit_generator method.
My first question: what should my generator return? A batch? A sequence?
The example in the Keras documentation returns x, y for each individual data entry, but my data is sequential and I want to divide it into batches. How do I do that?
This is the Python function I use to build one batch from the input data:
import numpy as np

def get_batch(data, batch_num, batch_size, seq_length):
    i_start = batch_num * batch_size
    batch_sequences = []
    batch_labels = []
    # grab enough rows to build batch_size overlapping windows
    batch_chunk = data.iloc[i_start:i_start + batch_size + seq_length].values
    for i in range(batch_size):
        # batch_chunk already starts at i_start, so index relative to it
        sequence = batch_chunk[i:i + seq_length]
        # the label is the value that follows the window
        label = data.iloc[i_start + i + seq_length].values
        batch_labels.append(label)
        batch_sequences.append(sequence)
    return np.array(batch_sequences), np.array(batch_labels)
For a single-feature DataFrame whose values are 1, 2, 3, 4, 5, …, calling
get_batch(data, batch_num=0, batch_size=2, seq_length=3)
returns x as:
x = [
[[1],[2],[3]],
[[2],[3],[4]]
]
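To sanity-check that shape, here is a minimal, self-contained sketch of the same windowing logic in plain NumPy (no pandas); `sample_data` and `make_windows` are illustrative names, not part of the original code:

```python
import numpy as np

# Toy single-feature series: 1, 2, 3, ..., 10 (illustrative data)
sample_data = np.arange(1, 11).reshape(-1, 1)

def make_windows(data, batch_num, batch_size, seq_length):
    """Build one batch of overlapping windows plus next-step labels."""
    i_start = batch_num * batch_size
    xs, ys = [], []
    for i in range(batch_size):
        xs.append(data[i_start + i : i_start + i + seq_length])
        ys.append(data[i_start + i + seq_length])
    return np.array(xs), np.array(ys)

x, y = make_windows(sample_data, batch_num=0, batch_size=2, seq_length=3)
print(x.shape)     # (2, 3, 1) -- i.e. (batch_size, seq_length, num_features)
print(x.tolist())  # [[[1], [2], [3]], [[2], [3], [4]]]
print(y.tolist())  # [[4], [5]] -- the value after each window
```

Note the trailing feature dimension: the LSTM's input_shape=(seq_length, num_features) expects exactly this (batch_size, seq_length, num_features) layout.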
This is how I envision the model:
model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(seq_length, num_features)))
model.add(Dropout(0.2))
model.add(LSTM(256))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
My question is how do I turn my method into a generator?
Solution
Here is a solution that uses keras.utils.Sequence, Keras's class-based alternative to a plain generator:
from keras.utils import Sequence

class MySequence(Sequence):
    def __init__(self, data, num_batches, batch_size, seq_length):
        self.data = data
        self.num_batches = num_batches
        self.batch_size = batch_size
        self.seq_length = seq_length

    def __len__(self):
        return self.num_batches  # the length is the number of batches

    def __getitem__(self, batch_id):
        return get_batch(self.data, batch_id, self.batch_size, self.seq_length)
I think this is concise and doesn't modify your original functionality. Now you pass an instance of MySequence to model.fit_generator instead of a generator.
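If you would rather use a plain Python generator than a Sequence, one option is to wrap the batching logic in an endless loop, since fit_generator expects its generator to yield indefinitely (epoch boundaries come from steps_per_epoch). A minimal self-contained sketch; `batch_generator` and `next_batch` are illustrative names, and the slicing assumes a plain NumPy array rather than a DataFrame:

```python
import numpy as np

def next_batch(data, batch_num, batch_size, seq_length):
    """Same windowing idea as get_batch, applied to a NumPy array."""
    i_start = batch_num * batch_size
    xs = [data[i_start + i : i_start + i + seq_length] for i in range(batch_size)]
    ys = [data[i_start + i + seq_length] for i in range(batch_size)]
    return np.array(xs), np.array(ys)

def batch_generator(data, batch_size, seq_length):
    """Yield batches forever, as fit_generator expects from a generator."""
    num_batches = (len(data) - seq_length) // batch_size
    while True:
        for batch_num in range(num_batches):
            yield next_batch(data, batch_num, batch_size, seq_length)

# Usage sketch (assumed call, not from the original post):
# model.fit_generator(batch_generator(data, 2, 3),
#                     steps_per_epoch=num_batches, epochs=10)
gen = batch_generator(np.arange(1, 11).reshape(-1, 1), batch_size=2, seq_length=3)
x, y = next(gen)
print(x.shape)  # (2, 3, 1)
```

A Sequence is still preferable when you train with use_multiprocessing=True, because Keras can index it safely from multiple workers, whereas a plain generator cannot be shared.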