Create custom datasets
I guess this is a broadly applicable question, but I'm trying to create a dataset for a specific competition that involves flying a drone over a field containing cardboard geometric shapes painted with alphanumeric characters. The goal is to detect and classify the shapes and characters.
Currently, I'm using SURF to detect the shapes, K-means to separate each shape from its character, and convolutional neural networks to classify them. However, I've hit a bottleneck: I don't have training data that lets the models perform well on real footage.
I have tried:
Using Keras' ImageDataGenerator to generate random rotations, scalings, and skews of template images for each alphanumeric character (in a typed font) and each geometric shape in the dataset: the results are too different from real data.
Using the MNIST dataset: decent, but digits only.
Using the EMNIST ByClass dataset (which differs from MNIST in that it also contains letters): it is hard to train on because of its size, and even when trained to a reasonable level it does not reach high accuracy. In the dataset itself, many images bear little resemblance to their labeled classes, and some classes are rotated differently from others.
Using Tesseract OCR for the characters: this did not give good results.
Things I haven't tried yet:
Making several flyover flights with the real cardboard cutouts we created and using multiple frames from each video for the dataset. Cons: this would require a lot of flights and cardboard cutouts, and would not provide much variation in the data.
Using ImageDataGenerator, but with several different fonts instead of one.
Does anyone have any suggestions on how to create a custom dataset for such a task?
Solution
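My dataSetGenerator function might help you build your own dataset. It expects one sub-folder per class, reads the .tif images in each, and returns the images, one-hot labels, and class names: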
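import os
from glob import glob

import cv2
import numpy as np


def dataSetGenerator(path, resize=False, resize_to=224, percentage=100):
    """
    Load a dataset laid out as one sub-folder per class:

    DataSetsFolder
    |
    |----------class-1
    |        .   |-------image-1
    |        .   |         .
    |        .   |         .
    |        .   |-------image-n
    |        .
    |-------class-n

    :param path: <path>/DataSetsFolder
    :param resize: if True, resize every image to resize_to x resize_to
    :param resize_to: side length in pixels used when resize is True
    :param percentage: percentage of the shuffled images to return (0-100)
    :return: images, labels (one-hot), classes
    """
    # Keep only sub-folders, sorted so that label indices are reproducible
    classes = sorted(d for d in os.listdir(path)
                     if os.path.isdir(os.path.join(path, d)))
    image_list = []
    labels = []
    for class_index, class_name in enumerate(classes):
        for filename in glob(os.path.join(path, class_name, '*.tif')):
            image = cv2.imread(filename)
            if image is None:  # cv2.imread returns None for unreadable files
                continue
            if resize:
                image = cv2.resize(image, (resize_to, resize_to))
            image_list.append(image)
            label = np.zeros(len(classes))  # one-hot label for this class
            label[class_index] = 1
            labels.append(label)
    # Shuffle, then keep the requested percentage of the samples.
    # Note: without resize, images of different sizes will not stack into one array.
    indice = np.random.permutation(len(image_list))[:int(len(image_list) * percentage / 100)]
    return (np.array([image_list[x] for x in indice]),
            np.array([labels[x] for x in indice]),
            np.array(classes))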
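The arrays returned by dataSetGenerator would typically be split into training and validation sets before being fed to a network. A minimal sketch of that step follows; train_val_split is a hypothetical helper, not part of the function above, and the stand-in arrays only mimic the shapes you would get with resize=True and two classes.

```python
import numpy as np


def train_val_split(images, labels, val_fraction=0.2, seed=0):
    """Shuffle the samples and hold out val_fraction of them for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(len(images) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])


# Stand-in arrays (assumption): 10 images of 224x224x3 with one-hot labels for 2 classes
images = np.zeros((10, 224, 224, 3), dtype=np.uint8)
labels = np.eye(2)[np.arange(10) % 2]

(train_x, train_y), (val_x, val_y) = train_val_split(images, labels)
print(train_x.shape, val_x.shape)
```

If the images come from consecutive video frames, a purely random split like this may put near-identical frames in both sets, so splitting by flight could be a safer choice.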