ImageDataGenerator

keras.preprocessing.image.ImageDataGenerator(featurewise_center=True,
    samplewise_center=False,
    featurewise_std_normalization=True,
    samplewise_std_normalization=False,
    zca_whitening=False,
    rotation_range=0.,
    width_shift_range=0.,
    height_shift_range=0.,
    shear_range=0.,
    horizontal_flip=False,
    vertical_flip=False)

Generate batches of tensor image data with real-time data augmentation. The data will be looped over (in batches) indefinitely.

  • Arguments:

    • featurewise_center: Boolean. Set input mean to 0 over the dataset.
    • samplewise_center: Boolean. Set each sample mean to 0.
    • featurewise_std_normalization: Boolean. Divide inputs by std of the dataset.
    • samplewise_std_normalization: Boolean. Divide each input by its std.
    • zca_whitening: Boolean. Apply ZCA whitening.
    • rotation_range: Int. Degree range for random rotations.
    • width_shift_range: Float (fraction of total width). Range for random horizontal shifts.
    • height_shift_range: Float (fraction of total height). Range for random vertical shifts.
    • shear_range: Float. Shear intensity (shear angle in radians, counter-clockwise).
    • horizontal_flip: Boolean. Randomly flip inputs horizontally.
    • vertical_flip: Boolean. Randomly flip inputs vertically.
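
The difference between the featurewise and samplewise options above can be illustrated with a small NumPy sketch. This is not the library's implementation: the toy array `X`, the reduction axes, and the small epsilon added to the divisor are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy dataset: 4 single-channel "images" of 2x2 pixels.
rng = np.random.RandomState(0)
X = rng.uniform(0.0, 255.0, size=(4, 1, 2, 2))

# Featurewise: statistics are computed once over the whole dataset
# (axis 0) and then applied to every sample.
feat_mean = X.mean(axis=0)
feat_std = X.std(axis=0)
X_featurewise = (X - feat_mean) / (feat_std + 1e-7)  # epsilon is an assumption

# Samplewise: each image is normalized using only its own statistics.
sample_mean = X.mean(axis=(1, 2, 3), keepdims=True)
sample_std = X.std(axis=(1, 2, 3), keepdims=True)
X_samplewise = (X - sample_mean) / (sample_std + 1e-7)
```

Because the featurewise statistics depend on the whole dataset, they have to be computed ahead of time from sample data, whereas the samplewise variants need nothing beyond the image being processed.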
  • Methods:

    • fit(X): Compute the data-dependent quantities from some sample data. Required if featurewise_center, featurewise_std_normalization, or zca_whitening is enabled.
      • Arguments:
        • X: sample data.
        • augment: Boolean (default: False). Whether to fit on randomly augmented samples.
        • rounds: int (default: 1). If augment, how many augmentation passes over the data to use.
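    • flow(X, y): Return a generator that yields batches of augmented data and labels, looping over the data indefinitely.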
      • Arguments:
        • X: data.
        • y: labels.
        • batch_size: int (default: 32).
        • shuffle: Boolean (default: False).
        • save_to_dir: None or str. Optionally specifies a directory in which to save the augmented pictures being generated (useful for visualizing what you are doing).
        • save_prefix: str. Prefix to use for filenames of saved pictures.
        • save_format: one of "png", "jpeg".
  • Example:

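from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import np_utils

# nb_classes, nb_epoch and a compiled `model` are assumed to be defined elsewhere
(X_train, y_train), (X_test, y_test) = cifar10.load_data(test_split=0.1)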
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)

# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)

# fit the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=32),
                    samples_per_epoch=len(X_train), nb_epoch=nb_epoch)

# here's a more "manual" example
for e in range(nb_epoch):
    print('Epoch %d' % e)
    batches = 0
    for X_batch, Y_batch in datagen.flow(X_train, Y_train, batch_size=32):
        loss = model.train_on_batch(X_batch, Y_batch)
        batches += 1
        if batches >= len(X_train) / 32:
            # we need to break the loop by hand because
            # the generator loops indefinitely
            break
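
The manual break above can also be expressed by slicing the endless generator. The following is a minimal, library-free sketch of that idea: `endless_batches` is a hypothetical stand-in for the generator returned by `datagen.flow`, and the toy data and batch size are assumptions for illustration.

```python
from itertools import islice


def endless_batches(data, batch_size):
    """Hypothetical stand-in for datagen.flow: yields batches forever."""
    while True:
        for start in range(0, len(data), batch_size):
            yield data[start:start + batch_size]


data = list(range(10))
batch_size = 4

# Ceiling division: number of batches needed to cover the data once,
# including a final partial batch.
batches_per_epoch = (len(data) + batch_size - 1) // batch_size

# islice stops pulling from the endless generator after one pass.
epoch = list(islice(endless_batches(data, batch_size), batches_per_epoch))
```

Computing the batch count with ceiling division keeps the last, smaller batch in the epoch, which plain division by the batch size may skip or miscount when the dataset size is not a multiple of it.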