Keras backends
What is a "backend"?
Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products, convolutions, and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the "backend engine" of Keras. Rather than picking a single tensor library and tying the implementation of Keras to that library, Keras handles the problem in a modular way, and several different backend engines can be plugged seamlessly into Keras.
At this time, Keras has two backend implementations available: the Theano backend and the TensorFlow backend.
- Theano is an open-source symbolic tensor manipulation framework developed by the LISA/MILA Lab at Université de Montréal.
- TensorFlow is an open-source symbolic tensor manipulation framework developed by Google, Inc.
Switching from one backend to another
If you have run Keras at least once, you will find the Keras configuration file at:
~/.keras/keras.json
If it isn't there, you can create it.
It probably looks like this:
{"epsilon": 1e-07, "floatx": "float32", "backend": "theano"}
Simply change the field backend to either "theano" or "tensorflow", and Keras will use the new configuration next time you run any Keras code.
You can also define the environment variable KERAS_BACKEND, and this will override what is defined in your config file:
KERAS_BACKEND=tensorflow python -c "from keras import backend; print(backend._BACKEND)"
Using TensorFlow backend.
tensorflow
Using the abstract Keras backend to write new code
If you want the Keras modules you write to be compatible with both Theano and TensorFlow, you have to write them via the abstract Keras backend API. Here's an intro.
You can import the backend module via:
from keras import backend as K
The code below instantiates an input placeholder. It's equivalent to tf.placeholder() or T.matrix(), T.tensor3(), etc.
input = K.placeholder(shape=(2, 4, 5))
# also works:
input = K.placeholder(shape=(None, 4, 5))
# also works:
input = K.placeholder(ndim=3)
The code below instantiates a shared variable. It's equivalent to tf.Variable() or theano.shared().
import numpy as np
val = np.random.random((3, 4, 5))
var = K.variable(value=val)
# all-zeros variable:
var = K.zeros(shape=(3, 4, 5))
# all-ones:
var = K.ones(shape=(3, 4, 5))
Most tensor operations you will need can be done as you would in TensorFlow or Theano:
a = b + c * K.abs(d)
c = K.dot(a, K.transpose(b))
a = K.sum(b, axis=2)
a = K.softmax(b)
a = K.concatenate([b, c], axis=-1)
# etc...
Backend functions
epsilon
epsilon()
Returns the value of the fuzz factor used in numeric expressions.
set_epsilon
set_epsilon(e)
Sets the value of the fuzz factor used in numeric expressions.
floatx
floatx()
Returns the default float type, as a string (e.g. 'float16', 'float32', 'float64').
cast_to_floatx
cast_to_floatx(x)
Cast a Numpy array to floatx.
image_dim_ordering
image_dim_ordering()
Returns the image dimension ordering convention ('th' or 'tf').
set_image_dim_ordering
set_image_dim_ordering(dim_ordering)
Sets the value of the image dimension ordering convention ('th' or 'tf').
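For illustration, here is a quick tour of these configuration functions; the values shown in the comments assume the default configuration.

```python
from keras import backend as K
import numpy as np

K.epsilon()                          # 1e-07 by default
K.set_epsilon(1e-05)                 # change the fuzz factor globally
K.floatx()                           # 'float32' by default
K.cast_to_floatx(np.array([1, 2]))   # array([1., 2.], dtype=float32)
K.image_dim_ordering()               # 'th' or 'tf', per your config file
K.set_image_dim_ordering('tf')
```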
variable
variable(value, dtype='float32', name=None)
Instantiate a tensor variable.
placeholder
placeholder(shape=None, ndim=None, dtype='float32', name=None)
Instantiate an input data placeholder variable.
shape
shape(x)
Return the shape of a tensor.
- Warning: type returned will be different for Theano backend (Theano tensor type) and TF backend (TF TensorShape).
eval
eval(x)
Evaluate the value of a tensor; returns a Numpy array.
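For example, a variable can be evaluated back to a Numpy array; note that shape returns a backend-specific object, as warned above.

```python
from keras import backend as K
import numpy as np

var = K.variable(np.array([[1, 2], [3, 4]]))
K.eval(var)    # array([[1., 2.], [3., 4.]], dtype=float32)
K.shape(var)   # symbolic shape on Theano, TensorShape on TensorFlow
```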
zeros
zeros(shape, dtype='float32', name=None)
Instantiate an all-zeros variable.
ones
ones(shape, dtype='float32', name=None)
Instantiate an all-ones variable.
eye
eye(size, dtype='float32', name=None)
Instantiate an identity matrix.
count_params
count_params(x)
Return the number of scalars in a tensor.
- Return: numpy integer.
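For example:

```python
from keras import backend as K

var = K.zeros(shape=(3, 4, 5))
K.count_params(var)   # 60
K.eval(K.eye(3))      # 3x3 identity matrix as a Numpy array
```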
batch_dot
batch_dot(x, y, axes=None)
Batchwise dot product.
batch_dot results in a tensor with fewer dimensions than the input. If the number of dimensions is reduced to 1, we use expand_dims to make sure that ndim is at least 2.
Example
Assume x = [[1, 2], [3, 4]] and y = [[5, 6], [7, 8]]. Then batch_dot(x, y, axes=1) = [[17], [53]], which is the main diagonal of x.dot(y.T), although we never have to calculate the off-diagonal elements.
Arguments
- x, y: tensors with ndim >= 2
- axes: list (or single) int with target dimensions
Returns
Tensor with ndim >= 2
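The example above, as runnable code; note that the reduced dimension is restored by expand_dims, so the result has shape (2, 1).

```python
from keras import backend as K
import numpy as np

x = K.variable(np.array([[1, 2], [3, 4]]))
y = K.variable(np.array([[5, 6], [7, 8]]))
K.eval(K.batch_dot(x, y, axes=1))   # array([[17.], [53.]], dtype=float32)
```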
gather
gather(reference, indices)
- reference: a tensor.
- indices: an integer tensor of indices.
- Return: a tensor of the same type as reference.
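A minimal sketch, retrieving rows of a 2D variable by index:

```python
from keras import backend as K
import numpy as np

reference = K.variable(np.array([[1, 2], [3, 4], [5, 6]]))
indices = K.variable(np.array([0, 2]), dtype='int32')
K.eval(K.gather(reference, indices))   # rows 0 and 2: [[1., 2.], [5., 6.]]
```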
sum
sum(x, axis=None, keepdims=False)
Sum of the values in a tensor, alongside the specified axis.
prod
prod(x, axis=None, keepdims=False)
Multiply the values in a tensor, alongside the specified axis.
any
any(x, axis=None, keepdims=False)
Bitwise reduction (logical OR).
all
all(x, axis=None, keepdims=False)
Bitwise reduction (logical AND).
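For example, reducing a 2D variable along each axis:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.array([[1, 2], [3, 4]]))
K.eval(K.sum(x, axis=1))    # [3., 7.]
K.eval(K.prod(x, axis=0))   # [3., 8.]
K.eval(K.sum(x))            # 10.0 (no axis: reduce over all dimensions)
```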
normalize_batch_in_training
normalize_batch_in_training(x, gamma, beta, reduction_axes, epsilon=0.0001)
Compute the mean and std for the batch, then apply batch_normalization to the batch.
batch_normalization
batch_normalization(x, mean, std, beta, gamma, epsilon=0.0001)
Apply batch normalization on x given mean, std, beta and gamma.
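A minimal sketch of batch_normalization, computing the batch statistics with K.mean and K.std (both part of the backend API, though not listed in this section):

```python
from keras import backend as K
import numpy as np

x = K.variable(np.random.random((100, 4)))
mean = K.mean(x, axis=0)
std = K.std(x, axis=0)
beta = K.zeros(shape=(4,))    # offset, zero-initialized here for illustration
gamma = K.ones(shape=(4,))    # scale, one-initialized here for illustration
normed = K.batch_normalization(x, mean, std, beta, gamma, epsilon=0.0001)
```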
permute_dimensions
permute_dimensions(x, pattern)
Transpose dimensions.
pattern should be a tuple or list of dimension indices, e.g. [0, 2, 1].
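For example, swapping the last two axes of a 3D tensor:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.zeros((2, 3, 4)))
y = K.permute_dimensions(x, (0, 2, 1))
K.eval(y).shape   # (2, 4, 3)
```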
repeat_elements
repeat_elements(x, rep, axis)
Repeat the elements of a tensor along an axis, like np.repeat.
If x has shape (s1, s2, s3) and axis=1, the output will have shape (s1, s2 * rep, s3).
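For example:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.array([[1, 2], [3, 4]]))
K.eval(K.repeat_elements(x, 2, axis=1))   # [[1., 1., 2., 2.], [3., 3., 4., 4.]]
```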
resize_images
resize_images(X, height_factor, width_factor, dim_ordering)
Resize the images contained in a 4D tensor of shape
- [batch, channels, height, width] (for 'th' dim_ordering)
- [batch, height, width, channels] (for 'tf' dim_ordering)
by a factor of (height_factor, width_factor). Both factors should be positive integers.
resize_volumes
resize_volumes(X, depth_factor, height_factor, width_factor, dim_ordering)
Resize the volume contained in a 5D tensor of shape
- [batch, channels, depth, height, width] (for 'th' dim_ordering)
- [batch, depth, height, width, channels] (for 'tf' dim_ordering)
by a factor of (depth_factor, height_factor, width_factor). All three factors should be positive integers.
repeat
repeat(x, n)
Repeat a 2D tensor.
If x has shape (samples, dim) and n=2, the output will have shape (samples, 2, dim).
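For example:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.array([[1, 2], [3, 4]]))   # shape (2, 2)
K.eval(K.repeat(x, 3)).shape                 # (2, 3, 2)
```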
batch_flatten
batch_flatten(x)
Turn an n-D tensor into a 2D tensor where the first dimension is conserved.
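For example:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.zeros((10, 3, 4)))
K.eval(K.batch_flatten(x)).shape   # (10, 12)
```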
expand_dims
expand_dims(x, dim=-1)
Add a 1-sized dimension at index "dim".
squeeze
squeeze(x, axis)
Remove a 1-dimension from the tensor at index "axis".
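These two operations are inverses of each other:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.zeros((2, 3)))
y = K.expand_dims(x, dim=-1)   # shape (2, 3, 1)
z = K.squeeze(y, 2)            # back to shape (2, 3)
```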
temporal_padding
temporal_padding(x, padding=1)
Pad the middle dimension of a 3D tensor with "padding" zeros left and right.
Apologies for the inane API, but Theano makes this really hard.
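For example:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.zeros((2, 3, 4)))
K.eval(K.temporal_padding(x, padding=1)).shape   # (2, 5, 4)
```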
spatial_2d_padding
spatial_2d_padding(x, padding=(1, 1), dim_ordering='th')
Pad the 2nd and 3rd dimensions of a 4D tensor with "padding[0]" and "padding[1]" (resp.) zeros left and right.
spatial_3d_padding
spatial_3d_padding(x, padding=(1, 1, 1), dim_ordering='th')
Pad the 2nd, 3rd and 4th dimensions of a 5D tensor with "padding[0]", "padding[1]" and "padding[2]" (resp.) zeros left and right.
batch_get_value
batch_get_value(xs)
Returns the value of more than one tensor variable, as a list of Numpy arrays.
stop_gradient
stop_gradient(variables)
Returns variables but with zero gradient with respect to every other variable.
rnn
rnn(step_function, inputs, initial_states, go_backwards=False, mask=None, constants=None, unroll=False, input_length=None)
Iterates over the time dimension of a tensor.
Arguments
- inputs: tensor of temporal data of shape (samples, time, ...) (at least 3D).
- step_function:
- Parameters:
- input: tensor with shape (samples, ...) (no time dimension), representing input for the batch of samples at a certain time step.
- states: list of tensors.
- Returns:
- output: tensor with shape (samples, ...) (no time dimension),
- new_states: list of tensors, same length and shapes as 'states'.
- initial_states: list of tensors with shape (samples, ...) (no time dimension), containing the initial values for the states used in the step function.
- go_backwards: boolean. If True, do the iteration over the time dimension in reverse order.
- mask: binary tensor with shape (samples, time), with a zero for every element that is masked.
- constants: a list of constant values passed at each step.
- unroll: whether to unroll the RNN or to use a symbolic loop (scan).
- input_length: must be specified if using unroll.
Returns
A tuple (last_output, outputs, new_states).
- last_output: the latest output of the rnn, of shape (samples, ...)
- outputs: tensor with shape (samples, time, ...) where each entry outputs[s, t] is the output of the step function at time t for sample s.
- new_states: list of tensors, latest states returned by the step function, of shape (samples, ...).
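As an illustration, here is a minimal sketch of K.rnn computing a running sum over the time dimension; the step function and shapes are illustrative, not part of the API above:

```python
from keras import backend as K
import numpy as np

def step(x, states):
    # x: (samples, dim) input at the current time step
    # states: [previous running sum, shape (samples, dim)]
    output = x + states[0]
    return output, [output]

inputs = K.variable(np.ones((2, 5, 3)))          # (samples, time, dim)
initial_states = [K.variable(np.zeros((2, 3)))]  # one state tensor
last_output, outputs, new_states = K.rnn(step, inputs, initial_states)
K.eval(last_output)   # every entry is 5.0: the sum of 5 time steps of ones
```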
switch
switch(condition, then_expression, else_expression)
Switch between two expressions depending on a scalar condition.
- condition: scalar tensor.
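A minimal sketch, assuming the backend supports Python comparison operators on tensors (both Theano and TensorFlow do):

```python
from keras import backend as K

a = K.variable(3.0)
b = K.variable(5.0)
result = K.switch(a > b, a * 10, b * 10)
K.eval(result)   # 50.0, since the condition is false
```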
conv2d
conv2d(x, kernel, strides=(1, 1), border_mode='valid', dim_ordering='th', image_shape=None, filter_shape=None, filter_dilation=(1, 1))
2D convolution.
Arguments
- kernel: kernel tensor.
- strides: strides tuple.
- border_mode: string, "same" or "valid".
- dim_ordering: "tf" or "th". Whether to use Theano or TensorFlow dimension ordering in inputs/kernels/ouputs.
conv3d
conv3d(x, kernel, strides=(1, 1, 1), border_mode='valid', dim_ordering='th', volume_shape=None, filter_shape=None)
3D convolution. Runs on cuDNN if available.
- border_mode: string, "same" or "valid".