Keras backends

What is a "backend"?

Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products, convolutions, and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, which serves as the "backend engine" of Keras. Rather than picking a single tensor library and tying the implementation of Keras to it, Keras handles the problem in a modular way, and several different backend engines can be plugged seamlessly into Keras.

At this time, Keras has two backend implementations available: the TensorFlow backend and the Theano backend.

TensorFlow is an open-source symbolic tensor manipulation framework developed by Google, Inc.
Theano is an open-source symbolic tensor manipulation framework developed by LISA/MILA Lab at Université de Montréal.

In the future, we are likely to add more backend options. If you are interested in developing a new backend, get in touch!

Switching from one backend to another

If you have run Keras at least once, you will find the Keras configuration file at:

~/.keras/keras.json

If it isn't there, you can create it.

The default configuration file looks like this:

{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}

Simply change the field backend to either "theano" or "tensorflow", and Keras will use the new configuration next time you run any Keras code.

You can also define the environment variable KERAS_BACKEND, which overrides what is defined in your config file:

KERAS_BACKEND=tensorflow python -c "from keras import backend"
Using TensorFlow backend.
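The precedence described above (environment variable first, then the config file) can be sketched in a few lines of Python. This is only an illustration and not Keras's actual loading code: the helper name resolve_backend and its parameters are hypothetical, and it assumes the config file is plain JSON with a "backend" field as shown above.

```python
import json
import os
import tempfile


def resolve_backend(config_path, environ):
    """Return the backend name, letting KERAS_BACKEND override the config file.

    Hypothetical helper for illustration; not part of Keras.
    """
    backend = "tensorflow"  # assumed fallback, matching the default config above
    if os.path.exists(config_path):
        with open(config_path) as f:
            backend = json.load(f).get("backend", backend)
    # The environment variable, when set, wins over the file.
    return environ.get("KERAS_BACKEND", backend)


# Usage: write a throwaway config and resolve it with and without the override.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "keras.json")
    with open(path, "w") as f:
        json.dump({"backend": "theano", "floatx": "float32"}, f)
    from_file = resolve_backend(path, {})
    from_env = resolve_backend(path, {"KERAS_BACKEND": "tensorflow"})

print(from_file)  # theano
print(from_env)   # tensorflow
```

Passing the environment as a plain dictionary keeps the sketch testable; in practice one would pass os.environ.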

keras.json details

{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}

You can change these settings by editing ~/.keras/keras.json.

image_dim_ordering: string, either "tf" or "th". It specifies which dimension ordering convention Keras will follow. (keras.backend.image_dim_ordering() returns it.)
For 2D data (e.g. image), "tf" assumes (rows, cols, channels) while "th" assumes (channels, rows, cols).
For 3D data, "tf" assumes (conv_dim1, conv_dim2, conv_dim3, channels) while "th" assumes (channels, conv_dim1, conv_dim2, conv_dim3).
epsilon: float, a numeric fuzzing constant used to avoid dividing by zero in some operations.
floatx: string, "float16", "float32", or "float64". Default float precision.
backend: string, "tensorflow" or "theano".
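To make the two dimension ordering conventions concrete, the sketch below reorders a per-sample shape tuple between them. The helper convert_image_shape is hypothetical (it is not a Keras function) and only manipulates shape tuples, not tensors.

```python
def convert_image_shape(shape, source, target):
    """Reorder a per-sample shape between the "tf" and "th" conventions.

    Hypothetical helper: "tf" puts channels last, "th" puts channels first.
    Works for 2D data (3 entries) and 3D data (4 entries).
    """
    if source not in ("tf", "th") or target not in ("tf", "th"):
        raise ValueError("orderings must be 'tf' or 'th'")
    if source == target:
        return tuple(shape)
    if source == "tf":
        # (dims..., channels) -> (channels, dims...)
        return (shape[-1],) + tuple(shape[:-1])
    # (channels, dims...) -> (dims..., channels)
    return tuple(shape[1:]) + (shape[0],)


print(convert_image_shape((224, 224, 3), "tf", "th"))      # (3, 224, 224)
print(convert_image_shape((3, 16, 112, 112), "th", "tf"))  # (16, 112, 112, 3)
```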

Using the abstract Keras backend to write new code

If you want the Keras modules you write to be compatible with both Theano and TensorFlow, you have to write them via the abstract Keras backend API. Here's an intro.

You can import the backend module via:

from keras import backend as K

The code below instantiates an input placeholder. It's equivalent to tf.placeholder() or T.matrix(), T.tensor3(), etc.

input = K.placeholder(shape=(2, 4, 5))
# also works:
input = K.placeholder(shape=(None, 4, 5))
# also works:
input = K.placeholder(ndim=3)

The code below instantiates a shared variable. It's equivalent to tf.Variable() or theano.shared().

import numpy as np
val = np.random.random((3, 4, 5))
var = K.variable(value=val)

# all-zeros variable:
var = K.zeros(shape=(3, 4, 5))
# all-ones:
var = K.ones(shape=(3, 4, 5))

Most tensor operations you will need can be done as you would in TensorFlow or Theano:

a = b + c * K.abs(d)
c = K.dot(a, K.transpose(b))
a = K.sum(b, axis=2)
a = K.softmax(b)
a = K.concatenate([b, c], axis=-1)
# etc...

Backend functions

set_epsilon

set_epsilon(e)

Sets the value of the fuzz factor used in numeric expressions.

Arguments

e: float. New value of epsilon.

Example

>>> from keras import backend as K
>>> K.epsilon()
1e-07
>>> K.set_epsilon(1e-05)
>>> K.epsilon()
1e-05

floatx

floatx()

Returns the default float type, as a string (e.g. 'float16', 'float32', 'float64').

Returns

String, the current default float type.

Example

>>> keras.backend.floatx()
'float32'

set_floatx

set_floatx(floatx)

Sets the default float type.

Arguments

floatx: string, 'float16', 'float32', or 'float64'.

Example

>>> from keras import backend as K
>>> K.floatx()
'float32'
>>> K.set_floatx('float16')
>>> K.floatx()
'float16'

cast_to_floatx

cast_to_floatx(x)

Cast a Numpy array to the default Keras float type.

Arguments

x: Numpy array.

Returns

The same Numpy array, cast to its new type.

Example

>>> import numpy
>>> from keras import backend as K
>>> K.floatx()
'float32'
>>> arr = numpy.array([1.0, 2.0], dtype='float64')
>>> arr.dtype
dtype('float64')
>>> new_arr = K.cast_to_floatx(arr)
>>> new_arr
array([ 1.,  2.], dtype=float32)
>>> new_arr.dtype
dtype('float32')

image_dim_ordering

image_dim_ordering()

Returns the default image dimension ordering convention ('th' or 'tf').

Returns

A string, either 'th' or 'tf'

Example

>>> keras.backend.image_dim_ordering()
'th'

set_image_dim_ordering

set_image_dim_ordering(dim_ordering)

Sets the value of the image dimension ordering convention ('th' or 'tf').

Arguments

dim_ordering: string. 'th' or 'tf'.

Example

>>> from keras import backend as K
>>> K.image_dim_ordering()
'th'
>>> K.set_image_dim_ordering('tf')
>>> K.image_dim_ordering()
'tf'

get_uid

get_uid(prefix='')

Provides a unique UID given a string prefix.

Arguments

prefix: string.

Returns

An integer.

Example

>>> keras.backend.get_uid('dense')
1
>>> keras.backend.get_uid('dense')
2

is_keras_tensor

is_keras_tensor(x)

Returns whether x is a Keras tensor.

Arguments

x: a potential tensor.

Returns

A boolean: whether the argument is a Keras tensor.

Examples

>>> import numpy
>>> from keras import backend as K
>>> np_var = numpy.array([1, 2])
>>> K.is_keras_tensor(np_var)
False
>>> keras_var = K.variable(np_var)
>>> K.is_keras_tensor(keras_var)  # A variable is not a Tensor.
False
>>> keras_placeholder = K.placeholder(shape=(2, 4, 5))
>>> K.is_keras_tensor(keras_placeholder)  # A placeholder is a Tensor.
True

epsilon

epsilon()

Returns the value of the fuzz factor used in numeric expressions.

Returns

A float.

Example

>>> keras.backend.epsilon()
1e-07

variable

variable(value, dtype=None, name=None)

Instantiates a variable and returns it.

Arguments

value: Numpy array, initial value of the tensor.
dtype: Tensor type.
name: Optional name string for the tensor.

Returns

A variable instance (with Keras metadata included).

placeholder

placeholder(shape=None, ndim=None, dtype=None, sparse=False, name=None)

Instantiate an input data placeholder variable.

shape

shape(x)

Returns the shape of a tensor.

Warning: type returned will be different for Theano backend (Theano tensor type) and TF backend (TF TensorShape).

int_shape

int_shape(x)

Returns the shape of a Keras tensor or a Keras variable as a tuple of integers or None entries.

Arguments

x: Tensor or variable.

Returns

A tuple of integers (or None entries).

eval

eval(x)

Returns the value of a tensor.

zeros

zeros(shape, dtype=None, name=None)

Instantiates an all-zeros variable.

ones

ones(shape, dtype=None, name=None)

Instantiates an all-ones variable.

eye

eye(size, dtype=None, name=None)

Instantiates an identity matrix.

count_params

count_params(x)

Returns the number of scalars in a tensor.

Return: numpy integer.

batch_dot

batch_dot(x, y, axes=None)

Batchwise dot product.

batch_dot results in a tensor with fewer dimensions than the input. If the number of dimensions is reduced to 1, we use expand_dims to make sure that ndim is at least 2.

Arguments

x, y: tensors with ndim >= 2.
axes: int or list of ints with the target dimensions.

Returns

A tensor with shape equal to the concatenation of x's shape (less the dimension that was summed over) and y's shape (less the batch dimension and the dimension that was summed over). If the final rank is 1, we reshape it to (batch_size, 1).

Examples

Assume x = [[1, 2], [3, 4]] and y = [[5, 6], [7, 8]]. Then batch_dot(x, y, axes=1) = [[17], [53]], which is the main diagonal of x.dot(y.T), although the off-diagonal elements never have to be computed.

Shape inference: let x's shape be (100, 20) and y's shape be (100, 30, 20). If axes is (1, 2), the output shape of the resulting tensor is found by looping through each dimension of x's shape and y's shape:

x.shape[0] : 100 : append to output shape
x.shape[1] : 20 : do not append to output shape, dimension 1 of x has been summed over (axes[0] = 1)
y.shape[0] : 100 : do not append to output shape, always ignore the first dimension of y
y.shape[1] : 30 : append to output shape
y.shape[2] : 20 : do not append to output shape, dimension 2 of y has been summed over (axes[1] = 2)

output_shape = (100, 30)
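The shape inference rules above can be written down as a small pure-Python sketch. Both helpers below are hypothetical stand-ins that operate on shape tuples and nested lists rather than backend tensors; the 2D example follows the (batch_size, 1) reshaping rule stated in the Returns section.

```python
def batch_dot_output_shape(x_shape, y_shape, axes):
    """Infer the batch_dot output shape following the rules listed above.

    Hypothetical helper, for illustration only.
    """
    if isinstance(axes, int):
        axes = (axes, axes)
    # Keep every dimension of x except the one summed over.
    out = [d for i, d in enumerate(x_shape) if i != axes[0]]
    # Keep every dimension of y except the batch dimension and the summed one.
    out += [d for i, d in enumerate(y_shape) if i != 0 and i != axes[1]]
    if len(out) == 1:
        out.append(1)  # rank-1 results are reshaped to (batch_size, 1)
    return tuple(out)


def batch_dot_2d(x, y):
    """Row-by-row dot product of two 2D nested lists (the axes=1 case)."""
    return [[sum(a * b for a, b in zip(row_x, row_y))]
            for row_x, row_y in zip(x, y)]


print(batch_dot_output_shape((100, 20), (100, 30, 20), (1, 2)))  # (100, 30)
print(batch_dot_2d([[1, 2], [3, 4]], [[5, 6], [7, 8]]))          # [[17], [53]]
```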

gather

gather(reference, indices)

reference: a tensor.
indices: an int tensor of indices.

Return: a tensor of same type as reference.

sum

sum(x, axis=None, keepdims=False)

Sum of the values in a tensor, along the specified axis.

prod

prod(x, axis=None, keepdims=False)

Multiplies the values in a tensor, along the specified axis.

mean

mean(x, axis=None, keepdims=False)

Mean of a tensor, along the specified axis.

any

any(x, axis=None, keepdims=False)

Bitwise reduction (logical OR).

all

all(x, axis=None, keepdims=False)

Bitwise reduction (logical AND).

normalize_batch_in_training

normalize_batch_in_training(x, gamma, beta, reduction_axes, epsilon=0.001)

Computes the mean and std of the batch, then applies batch_normalization to the batch.

batch_normalization

batch_normalization(x, mean, var, beta, gamma, epsilon=0.001)

Apply batch normalization on x given mean, var, beta and gamma.

permute_dimensions

permute_dimensions(x, pattern)

Transpose dimensions.

pattern should be a tuple or list of dimension indices, e.g. [0, 2, 1].

repeat_elements

repeat_elements(x, rep, axis)

Repeat the elements of a tensor along an axis, like np.repeat.

If x has shape (s1, s2, s3) and axis=1, the output will have shape (s1, s2 * rep, s3).
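As a rough illustration of these semantics, here is a pure-Python version that works on nested lists. The name repeat_elements_sketch is hypothetical; it mimics the np.repeat behaviour mentioned above and is not the backend implementation.

```python
def repeat_elements_sketch(x, rep, axis):
    """Repeat each element of a nested list `rep` times along `axis`."""
    if axis == 0:
        # Emit every item `rep` times in a row.
        return [item for item in x for _ in range(rep)]
    # Otherwise descend one level and repeat along the next axis down.
    return [repeat_elements_sketch(item, rep, axis - 1) for item in x]


print(repeat_elements_sketch([[1, 2], [3, 4]], rep=2, axis=1))
# [[1, 1, 2, 2], [3, 3, 4, 4]]
print(repeat_elements_sketch([[1, 2], [3, 4]], rep=2, axis=0))
# [[1, 2], [1, 2], [3, 4], [3, 4]]
```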

resize_images

resize_images(X, height_factor, width_factor, dim_ordering)

Resizes the images contained in a 4D tensor by a factor of (height_factor, width_factor). Both factors should be positive integers. The expected tensor shape is:

[batch, channels, height, width] (for 'th' dim_ordering)
[batch, height, width, channels] (for 'tf' dim_ordering)

resize_volumes

resize_volumes(X, depth_factor, height_factor, width_factor, dim_ordering)

Resizes the volume contained in a 5D tensor by a factor of (depth_factor, height_factor, width_factor). All three factors should be positive integers. The expected tensor shape is:

[batch, channels, depth, height, width] (for 'th' dim_ordering)
[batch, depth, height, width, channels] (for 'tf' dim_ordering)

repeat

repeat(x, n)

Repeat a 2D tensor.

If x has shape (samples, dim) and n=2, the output will have shape (samples, 2, dim).
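Unlike repeat_elements, repeat inserts a new axis. A minimal nested-list sketch of that shape change, using a hypothetical helper name:

```python
def repeat_sketch(x, n):
    """Turn a (samples, dim) nested list into (samples, n, dim)."""
    # Each row is copied n times under a new middle axis.
    return [[list(row) for _ in range(n)] for row in x]


result = repeat_sketch([[1, 2], [3, 4]], 2)
print(result)  # [[[1, 2], [1, 2]], [[3, 4], [3, 4]]]
```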

arange

arange(start, stop=None, step=1, dtype='int32')

Creates a 1-D tensor containing a sequence of integers.

The function arguments use the same convention as Theano's arange: if only one argument is provided, it is in fact the "stop" argument.

The default type of the returned tensor is 'int32' to match TensorFlow's default.

batch_flatten

batch_flatten(x)

Turns an n-D tensor into a 2D tensor where the first dimension is preserved.

expand_dims

expand_dims(x, dim=-1)

Add a 1-sized dimension at index "dim".

squeeze

squeeze(x, axis)

Removes a dimension of size 1 from the tensor at index "axis".

temporal_padding

temporal_padding(x, padding=1)

Pad the middle dimension of a 3D tensor with "padding" zeros left and right.

Apologies for the inane API, but Theano makes this really hard.

asymmetric_temporal_padding

asymmetric_temporal_padding(x, left_pad=1, right_pad=1)

Pad the middle dimension of a 3D tensor with "left_pad" zeros left and "right_pad" right.

Apologies for the inane API, but Theano makes this really hard.
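The padding behaviour of both temporal padding functions can be pictured with nested lists. The helper below is a hypothetical pure-Python stand-in, assuming x has shape (samples, time, dim); symmetric padding is the case where left_pad equals right_pad.

```python
def pad_time_axis(x, left_pad=1, right_pad=1):
    """Zero-pad the time (middle) axis of a 3D nested list.

    Pure-Python illustration, not the backend implementation.
    """
    padded = []
    for sample in x:
        dim = len(sample[0])
        zeros = [0] * dim
        padded.append([list(zeros) for _ in range(left_pad)]
                      + [list(step) for step in sample]
                      + [list(zeros) for _ in range(right_pad)])
    return padded


x = [[[1, 2], [3, 4]]]  # shape (1, 2, 2)
print(pad_time_axis(x, left_pad=1, right_pad=2))
# [[[0, 0], [1, 2], [3, 4], [0, 0], [0, 0]]]
```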

spatial_2d_padding

spatial_2d_padding(x, padding=(1, 1), dim_ordering='default')

Pad the 2nd and 3rd dimensions of a 4D tensor with "padding[0]" and "padding[1]" (resp.) zeros left and right.

asymmetric_spatial_2d_padding

asymmetric_spatial_2d_padding(x, top_pad=1, bottom_pad=1, left_pad=1, right_pad=1, dim_ordering='default')

Pad the rows and columns of a 4D tensor with "top_pad", "bottom_pad", "left_pad", "right_pad" (resp.) zeros rows on top, bottom; cols on left, right.

spatial_3d_padding

spatial_3d_padding(x, padding=(1, 1, 1), dim_ordering='default')

Pad the 2nd, 3rd and 4th dimensions of a 5D tensor with "padding[0]", "padding[1]" and "padding[2]" (resp.) zeros left and right.

one_hot

one_hot(indices, nb_classes)

Input: nD integer tensor of shape (batch_size, dim1, dim2, ... dim(n-1)).
Output: (n + 1)D one-hot representation of the input, with shape (batch_size, dim1, dim2, ... dim(n-1), nb_classes).
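The extra trailing axis is easiest to see on nested lists. The recursive helper below is a hypothetical sketch and not the backend function; it assumes the indices are plain Python integers.

```python
def one_hot_nested(indices, nb_classes):
    """One-hot encode a (possibly nested) list of integer class indices."""
    if isinstance(indices, int):
        # A single index becomes a vector of length nb_classes.
        return [1 if i == indices else 0 for i in range(nb_classes)]
    return [one_hot_nested(item, nb_classes) for item in indices]


print(one_hot_nested([0, 2], 3))      # [[1, 0, 0], [0, 0, 1]]
print(one_hot_nested([[1], [0]], 2))  # [[[0, 1]], [[1, 0]]]
```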

reverse

reverse(x, axes)

Reverses a tensor along the specified axes.

batch_get_value

batch_get_value(xs)

Returns the value of more than one tensor variable, as a list of Numpy arrays.

print_tensor

print_tensor(x, message='')

Print the message and the tensor when evaluated and return the same tensor.

stop_gradient

stop_gradient(variables)

Returns variables, but with zero gradient with respect to every other variable.

rnn

rnn(step_function, inputs, initial_states, go_backwards=False, mask=None, constants=None, unroll=False, input_length=None)

Iterates over the time dimension of a tensor.

Arguments

inputs: tensor of temporal data of shape (samples, time, ...) (at least 3D).
step_function:
Parameters:
- input: tensor with shape (samples, ...) (no time dimension), representing input for the batch of samples at a certain time step.
- states: list of tensors.
Returns:
- output: tensor with shape (samples, ...) (no time dimension),
- new_states: list of tensors, same length and shapes as 'states'.
initial_states: tensor with shape (samples, ...) (no time dimension), containing the initial values for the states used in the step function.
go_backwards: boolean. If True, do the iteration over the time dimension in reverse order.
mask: binary tensor with shape (samples, time), with a zero for every element that is masked.
constants: a list of constant values passed at each step.
unroll: whether to unroll the RNN or to use a symbolic loop (scan).
input_length: must be specified if using unroll.

Returns

A tuple (last_output, outputs, new_states).
last_output: the latest output of the rnn, of shape (samples, ...).
outputs: tensor with shape (samples, time, ...) where each entry outputs[s, t] is the output of the step function at time t for sample s.
new_states: list of tensors, latest states returned by the step function, of shape (samples, ...).
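The contract between rnn and its step function can be sketched without any backend. The code below is a simplified illustration on nested lists: rnn_sketch and running_sum_step are hypothetical names, masking, constants, unrolling and go_backwards are left out, and the toy input uses one scalar per time step for brevity even though the backend function expects inputs that are at least 3D.

```python
def rnn_sketch(step_function, inputs, initial_states):
    """Iterate step_function over the time axis of `inputs`.

    `inputs` is a nested list of shape (samples, time, ...).
    Returns (last_output, outputs, new_states) as described above.
    """
    time_steps = len(inputs[0])
    states = initial_states
    per_step_outputs = []
    for t in range(time_steps):
        # Slice out time step t for every sample: shape (samples, ...).
        x_t = [sample[t] for sample in inputs]
        output, states = step_function(x_t, states)
        per_step_outputs.append(output)
    last_output = per_step_outputs[-1]
    # Transpose (time, samples, ...) back to (samples, time, ...).
    outputs = [list(sample_outputs) for sample_outputs in zip(*per_step_outputs)]
    return last_output, outputs, states


def running_sum_step(x_t, states):
    """Toy step: the single state holds a per-sample running sum."""
    totals = [prev + x for prev, x in zip(states[0], x_t)]
    return totals, [totals]


inputs = [[1, 2, 3], [10, 20, 30]]  # 2 samples, 3 time steps
last_output, outputs, new_states = rnn_sketch(running_sum_step, inputs, [[0, 0]])
print(last_output)  # [6, 60]
print(outputs)      # [[1, 3, 6], [10, 30, 60]]
print(new_states)   # [[6, 60]]
```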

switch

switch(condition, then_expression, else_expression)

condition: scalar tensor.

elu

elu(x, alpha=1.0)

Exponential linear unit

Arguments

x: Tensor to compute the activation function for.
alpha: scalar

dropout

dropout(x, level, noise_shape=None, seed=None)

Sets entries in x to zero at random, while scaling the entire tensor.

Arguments

x: tensor
level: fraction of the entries in the tensor that will be set to 0.
noise_shape: shape for randomly generated keep/drop flags, must be broadcastable to the shape of x
seed: random seed to ensure determinism.
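A minimal sketch of the idea on a flat list of floats follows. It assumes the common "inverted dropout" convention, where kept entries are divided by (1 - level) so the expected sum is unchanged; the text above only says the tensor is scaled, so treat that factor as an assumption. The helper name is hypothetical and noise_shape is not modelled.

```python
import random


def dropout_sketch(x, level, seed=None):
    """Zero each entry with probability `level`, rescaling the kept entries.

    Assumes inverted-dropout scaling by 1 / (1 - level); requires level < 1.
    """
    rng = random.Random(seed)  # seeded generator for reproducible masks
    retain_prob = 1.0 - level
    return [value / retain_prob if rng.random() < retain_prob else 0.0
            for value in x]


x = [1.0, 2.0, 3.0, 4.0]
dropped = dropout_sketch(x, level=0.5, seed=0)
print(dropped)  # every entry is either 0.0 or twice the original value
```

Calling the function twice with the same seed yields the same mask, which is what the seed argument is for.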

in_top_k

in_top_k(predictions, targets, k)

Returns whether the targets are in the top k predictions

Arguments

predictions: A tensor of shape batch_size x classes and type float32.
targets: A tensor of shape batch_size and type int32 or int64.
k: An int, number of top elements to consider.

Returns

A tensor of shape batch_size and type int. output_i is 1 if targets_i is within the top-k values of predictions_i.
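The check can be illustrated on plain lists. In this hypothetical sketch a target counts as "in the top k" when fewer than k classes score strictly higher than it; how ties are broken is an assumption, and the sketch returns Python booleans rather than a tensor.

```python
def in_top_k_sketch(predictions, targets, k):
    """For each sample, report whether the target class is in the top k."""
    results = []
    for scores, target in zip(predictions, targets):
        # Count classes that score strictly higher than the target class.
        higher = sum(1 for s in scores if s > scores[target])
        results.append(higher < k)
    return results


predictions = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
targets = [2, 1]
print(in_top_k_sketch(predictions, targets, k=1))  # [False, False]
print(in_top_k_sketch(predictions, targets, k=2))  # [True, True]
```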

conv1d

conv1d(x, kernel, stride=1, border_mode='valid', image_shape=None, filter_shape=None)

1D convolution.

Arguments

kernel: kernel tensor.
stride: stride integer.
border_mode: string, "same" or "valid".

conv2d

conv2d(x, kernel, strides=(1, 1), border_mode='valid', dim_ordering='default', image_shape=None, filter_shape=None, filter_dilation=(1, 1))

2D convolution.

Arguments

kernel: kernel tensor.
strides: strides tuple.
border_mode: string, "same" or "valid".
dim_ordering: "tf" or "th". Whether to use TensorFlow or Theano dimension ordering in inputs/kernels/outputs.

deconv2d

deconv2d(x, kernel, output_shape, strides=(1, 1), border_mode='valid', dim_ordering='default', image_shape=None, filter_shape=None)

2D deconvolution (transposed convolution).

Arguments

kernel: kernel tensor.
output_shape: desired dimensions of output.
strides: strides tuple.
border_mode: string, "same" or "valid".
dim_ordering: "tf" or "th". Whether to use TensorFlow or Theano dimension ordering in inputs/kernels/outputs.

conv3d

conv3d(x, kernel, strides=(1, 1, 1), border_mode='valid', dim_ordering='default', volume_shape=None, filter_shape=None, filter_dilation=(1, 1, 1))

3D convolution.

Arguments

kernel: kernel tensor.
strides: strides tuple.
border_mode: string, "same" or "valid".
dim_ordering: "tf" or "th". Whether to use TensorFlow or Theano dimension ordering in inputs/kernels/outputs.

ctc_batch_cost

ctc_batch_cost(y_true, y_pred, input_length, label_length)

Runs CTC loss algorithm on each batch element.

Arguments

y_true: tensor (samples, max_string_length) containing the truth labels
y_pred: tensor (samples, time_steps, num_categories) containing the prediction, or output of the softmax
input_length: tensor (samples,1) containing the sequence length for each batch item in y_pred
label_length: tensor (samples,1) containing the sequence length for each batch item in y_true

Returns

Tensor with shape (samples,1) containing the CTC loss of each element

map_fn

map_fn(fn, elems, name=None)

Map the function fn over the elements elems and return the outputs.

Arguments

fn: Callable that will be called upon each element in elems
elems: tensor, at least 2 dimensional
name: A string name for the map node in the graph

Returns

Tensor whose first dimension equals that of elems and whose remaining dimensions depend on fn

foldl

foldl(fn, elems, initializer=None, name=None)

Reduce elems using fn to combine them from left to right.

Arguments

fn: Callable that will be called upon each element in elems and an accumulator, for instance lambda acc, x: acc + x
elems: tensor
initializer: The first value used (elems[0] in case of None)
name: A string name for the foldl node in the graph

Returns

Same type and shape as initializer

foldr

foldr(fn, elems, initializer=None, name=None)

Reduce elems using fn to combine them from right to left.

Arguments

fn: Callable that will be called upon each element in elems and an accumulator, for instance lambda acc, x: acc + x
elems: tensor
initializer: The first value used (elems[-1] in case of None)
name: A string name for the foldr node in the graph

Returns

Same type and shape as initializer
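The difference between foldl and foldr, including the choice of starting value when initializer is None, can be sketched with functools.reduce on plain lists. The helper names are hypothetical, and the accumulator is passed as the first argument of fn, as in the lambda shown above.

```python
from functools import reduce


def foldl_sketch(fn, elems, initializer=None):
    """Combine elems from left to right; starts from elems[0] if no initializer."""
    elems = list(elems)
    if initializer is None:
        initializer, elems = elems[0], elems[1:]
    return reduce(fn, elems, initializer)


def foldr_sketch(fn, elems, initializer=None):
    """Combine elems from right to left; starts from elems[-1] if no initializer."""
    elems = list(elems)
    if initializer is None:
        initializer, elems = elems[-1], elems[:-1]
    return reduce(fn, reversed(elems), initializer)


def subtract(acc, x):
    return acc - x


print(foldl_sketch(subtract, [10, 3, 2]))  # (10 - 3) - 2 = 5
print(foldr_sketch(subtract, [10, 3, 2]))  # (2 - 3) - 10 = -11
```

A non-commutative function such as subtraction makes the direction of the fold visible; with addition both folds would give the same result.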

backend

backend()

Publicly accessible method for determining the current backend.