Image Preprocessing

ImageDataGenerator

keras.preprocessing.image.ImageDataGenerator(featurewise_center=True,
    samplewise_center=False,
    featurewise_std_normalization=True,
    samplewise_std_normalization=False,
    zca_whitening=False,
    rotation_range=0.,
    width_shift_range=0.,
    height_shift_range=0.,
    shear_range=0.,
    horizontal_flip=False,
    vertical_flip=False)

Generate batches of tensor image data with real-time data augmentation. The data will be looped over (in batches) indefinitely.

Arguments:
- featurewise_center: Boolean. Set input mean to 0 over the dataset.
- samplewise_center: Boolean. Set each sample mean to 0.
- featurewise_std_normalization: Boolean. Divide inputs by std of the dataset.
- samplewise_std_normalization: Boolean. Divide each input by its std.
- zca_whitening: Boolean. Apply ZCA whitening.
- rotation_range: Int. Degree range for random rotations.
- width_shift_range: Float (fraction of total width). Range for random horizontal shifts.
- height_shift_range: Float (fraction of total height). Range for random vertical shifts.
- shear_range: Float. Shear intensity (shear angle in radians, counter-clockwise direction).
- horizontal_flip: Boolean. Randomly flip inputs horizontally.
- vertical_flip: Boolean. Randomly flip inputs vertically.
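
The difference between the samplewise and featurewise options can be sketched in a few lines. This is an illustration, not the library's implementation: plain Python lists stand in for image arrays, the helper names and the `EPS` guard are hypothetical, and it assumes featurewise statistics are taken per pixel position across the dataset.

```python
from statistics import mean, pstdev

# Three tiny "images", each flattened to four pixel values (illustrative data).
dataset = [
    [0.0, 2.0, 4.0, 6.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 6.0, 8.0, 10.0],
]

EPS = 1e-7  # hypothetical guard against division by zero


def samplewise(image):
    """Center and scale one image using only its own mean and std."""
    m = mean(image)
    s = pstdev(image)
    return [(p - m) / (s + EPS) for p in image]


def featurewise_stats(images):
    """Mean and std per pixel position, computed across the whole dataset."""
    columns = list(zip(*images))
    return [mean(c) for c in columns], [pstdev(c) for c in columns]


def featurewise(image, means, stds):
    """Center and scale one image using the dataset-wide statistics."""
    return [(p - m) / (s + EPS) for p, m, s in zip(image, means, stds)]


# Dataset-wide statistics must be computed up front, which is the role of fit(X).
means, stds = featurewise_stats(dataset)
fw = [featurewise(img, means, stds) for img in dataset]

# Samplewise normalization needs no fitting step.
sw = [samplewise(img) for img in dataset]
```

The samplewise variant depends only on the image at hand, which is why only the featurewise options (and ZCA whitening) need statistics gathered from the data beforehand.
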
Methods:
- fit(X, augment=False, rounds=1): Required if featurewise_center, featurewise_std_normalization or zca_whitening is enabled. Computes the necessary statistics from sample data.
  - Arguments:
    - X: sample data.
    - augment: Boolean (default: False). Whether to fit on randomly augmented samples.
    - rounds: int (default: 1). If augment, how many augmentation passes over the data to use.
- flow(X, y):
  - Arguments:
    - X: data.
    - y: labels.
    - batch_size: int (default: 32).
    - shuffle: boolean (default: False).
    - save_to_dir: None or str. Optionally specifies a directory in which to save the augmented pictures being generated (useful for visualizing what you are doing).
    - save_prefix: str. Prefix to use for filenames of saved pictures.
    - save_format: one of "png", "jpeg".
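
The indefinite looping of flow can be pictured with a plain Python generator. The sketch below is not the library's code: `flow_sketch` and its `seed` parameter are hypothetical, no augmentation is applied, and it assumes the last batch of a pass may be shorter than `batch_size`.

```python
import random


def flow_sketch(X, y, batch_size=32, shuffle=False, seed=None):
    """Yield (X_batch, y_batch) pairs forever, restarting after each pass."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    while True:  # never terminates on its own
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            yield [X[i] for i in chunk], [y[i] for i in chunk]


X = list(range(10))
y = [v * v for v in X]

batches = []
for X_batch, y_batch in flow_sketch(X, y, batch_size=4):
    batches.append((X_batch, y_batch))
    if len(batches) == 4:  # stop by hand; the generator will not
        break
```

With ten samples and a batch size of four, the fourth batch starts a second pass over the data, so any consumer has to decide for itself when an epoch ends.
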
Example:

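from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import np_utils

# `model`, `nb_classes` and `nb_epoch` are assumed to be defined elsewhere
(X_train, y_train), (X_test, y_test) = cifar10.load_data(test_split=0.1)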
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)

# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)

# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=32),
                    samples_per_epoch=len(X_train), nb_epoch=nb_epoch)

# here's a more "manual" example
for e in range(nb_epoch):
    print('Epoch', e)
    batches = 0
    for X_batch, Y_batch in datagen.flow(X_train, Y_train, batch_size=32):
        loss = model.train_on_batch(X_batch, Y_batch)
        batches += 1
        if batches >= len(X_train) / 32:
            # we need to break the loop by hand because
            # the generator loops indefinitely
            break
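
The stopping condition in the manual loop amounts to rounding the number of batches up. A small sketch of that arithmetic follows, assuming Python 3 true division; both function names are hypothetical and the sample counts are only illustrative.

```python
import math


def batches_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample at least once."""
    return math.ceil(n_samples / batch_size)


def manual_loop_batches(n_samples, batch_size):
    """Count iterations until the loop's `batches >= n / batch_size` test fires."""
    batches = 0
    while True:
        batches += 1
        if batches >= n_samples / batch_size:
            return batches


print(batches_per_epoch(50000, 32))    # 1563
print(manual_loop_batches(50000, 32))  # 1563
```

Under Python 2 integer division the same test would fire one batch earlier whenever the sample count is not a multiple of the batch size.
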