Keras backends
What is a "backend"?
Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products, convolutions, and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the "backend engine" of Keras. Rather than picking a single tensor library and tying the implementation of Keras to that library, Keras handles the problem in a modular way, and several different backend engines can be plugged seamlessly into Keras.
At this time, Keras has two backend implementations available: the Theano backend and the TensorFlow backend.
- Theano is an open-source symbolic tensor manipulation framework developed by the LISA/MILA Lab at Université de Montréal.
- TensorFlow is an open-source symbolic tensor manipulation framework developed by Google, Inc.
Switching from one backend to another
If you have run Keras at least once, you will find the Keras configuration file at:
~/.keras/keras.json
If it isn't there, you can create it.
It probably looks like this:
{"epsilon": 1e-07, "floatx": "float32", "backend": "theano"}
Simply change the field backend to either "theano" or "tensorflow", and Keras will use the new configuration next time you run any Keras code.
You can also define the environment variable KERAS_BACKEND, and this will override what is defined in your config file:
KERAS_BACKEND=tensorflow python -c "from keras import backend; print(backend._BACKEND)"
Using TensorFlow backend.
tensorflow
Using the abstract Keras backend to write new code
If you want the Keras modules you write to be compatible with both Theano and TensorFlow, you have to write them via the abstract Keras backend API. Here's an intro.
You can import the backend module via:
from keras import backend as K
The code below instantiates an input placeholder. It's equivalent to tf.placeholder() or T.matrix(), T.tensor3(), etc.
input = K.placeholder(shape=(2, 4, 5))
# also works:
input = K.placeholder(shape=(None, 4, 5))
# also works:
input = K.placeholder(ndim=3)
The code below instantiates a shared variable. It's equivalent to tf.Variable() or theano.shared().
import numpy as np
val = np.random.random((3, 4, 5))
var = K.variable(value=val)
# all-zeros variable:
var = K.zeros(shape=(3, 4, 5))
# all-ones:
var = K.ones(shape=(3, 4, 5))
Most tensor operations you will need can be done as you would in TensorFlow or Theano:
a = b + c * K.abs(d)
c = K.dot(a, K.transpose(b))
a = K.sum(b, axis=2)
a = K.softmax(b)
a = K.concatenate([b, c], axis=-1)
# etc...
Backend functions
epsilon
epsilon()
Returns the value of the fuzz factor used in numeric expressions.
set_epsilon
set_epsilon(e)
Sets the value of the fuzz factor used in numeric expressions.
floatx
floatx()
Returns the default float type, as a string (e.g. 'float16', 'float32', 'float64').
cast_to_floatx
cast_to_floatx(x)
Cast a Numpy array to floatx.
image_dim_ordering
image_dim_ordering()
Returns the image dimension ordering convention ('th' or 'tf').
set_image_dim_ordering
set_image_dim_ordering(dim_ordering)
Sets the value of the image dimension ordering convention ('th' or 'tf').
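For illustration, here is a quick tour of these configuration functions; the values shown in the comments assume the default configuration.

```python
from keras import backend as K
import numpy as np

K.epsilon()                          # 1e-07 by default
K.set_epsilon(1e-05)                 # change the fuzz factor globally
K.floatx()                           # 'float32' by default
K.cast_to_floatx(np.array([1, 2]))   # array([1., 2.], dtype=float32)
K.image_dim_ordering()               # 'th' or 'tf', per your config file
K.set_image_dim_ordering('tf')
```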
variable
variable(value, dtype='float32', name=None)
Instantiate a tensor variable.
placeholder
placeholder(shape=None, ndim=None, dtype='float32', name=None)
Instantiate an input data placeholder variable.
shape
shape(x)
Return the shape of a tensor.
- Warning: type returned will be different for Theano backend (Theano tensor type) and TF backend (TF TensorShape).
eval
eval(x)
Evaluate the value of a tensor; returns a Numpy array.
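For example, a variable can be evaluated back to a Numpy array; note that shape returns a backend-specific object, as warned above.

```python
from keras import backend as K
import numpy as np

var = K.variable(np.array([[1, 2], [3, 4]]))
K.eval(var)    # array([[1., 2.], [3., 4.]], dtype=float32)
K.shape(var)   # symbolic shape on Theano, TensorShape on TensorFlow
```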
zeros
zeros(shape, dtype='float32', name=None)
Instantiate an all-zeros variable.
ones
ones(shape, dtype='float32', name=None)
Instantiate an all-ones variable.
eye
eye(size, dtype='float32', name=None)
Instantiate an identity matrix.
count_params
count_params(x)
Return the number of scalars in a tensor.
- Return: numpy integer.
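For example:

```python
from keras import backend as K

var = K.zeros(shape=(3, 4, 5))
K.count_params(var)   # 60
K.eval(K.eye(3))      # 3x3 identity matrix as a Numpy array
```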
batch_dot
batch_dot(x, y, axes=None)
Batchwise dot product.
batch_dot results in a tensor with fewer dimensions than the input. If the number of dimensions is reduced to 1, we use expand_dims to make sure that ndim is at least 2.
Example
Assume x = [[1, 2], [3, 4]] and y = [[5, 6], [7, 8]]. Then batch_dot(x, y, axes=1) = [[17], [53]], which is the main diagonal of x.dot(y.T), although we never have to calculate the off-diagonal elements.
Arguments
- x, y: tensors with ndim >= 2
- axes: list (or single) int with target dimensions
Returns
Tensor with ndim >= 2
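The example above, as runnable code; note that the reduced dimension is restored by expand_dims, so the result has shape (2, 1).

```python
from keras import backend as K
import numpy as np

x = K.variable(np.array([[1, 2], [3, 4]]))
y = K.variable(np.array([[5, 6], [7, 8]]))
K.eval(K.batch_dot(x, y, axes=1))   # array([[17.], [53.]], dtype=float32)
```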
gather
gather(reference, indices)
- reference: a tensor.
- indices: an integer tensor of indices.
- Return: a tensor of the same type as reference.
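A minimal sketch, retrieving rows of a 2D variable by index:

```python
from keras import backend as K
import numpy as np

reference = K.variable(np.array([[1, 2], [3, 4], [5, 6]]))
indices = K.variable(np.array([0, 2]), dtype='int32')
K.eval(K.gather(reference, indices))   # rows 0 and 2: [[1., 2.], [5., 6.]]
```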
sum
sum(x, axis=None, keepdims=False)
Sum of the values in a tensor, alongside the specified axis.
prod
prod(x, axis=None, keepdims=False)
Multiply the values in a tensor, alongside the specified axis.
any
any(x, axis=None, keepdims=False)
Bitwise reduction (logical OR).
all
all(x, axis=None, keepdims=False)
Bitwise reduction (logical AND).
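For example, reducing a 2D variable along each axis:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.array([[1, 2], [3, 4]]))
K.eval(K.sum(x, axis=1))    # [3., 7.]
K.eval(K.prod(x, axis=0))   # [3., 8.]
K.eval(K.sum(x))            # 10.0 (no axis: reduce over all dimensions)
```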
normalize_batch_in_training
normalize_batch_in_training(x, gamma, beta, reduction_axes, epsilon=0.0001)
Compute the mean and std for the batch, then apply batch_normalization to the batch.
batch_normalization
batch_normalization(x, mean, std, beta, gamma, epsilon=0.0001)
Apply batch normalization on x given mean, std, beta and gamma.
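A minimal sketch of batch_normalization, computing the batch statistics with K.mean and K.std (both part of the backend API, though not listed in this section):

```python
from keras import backend as K
import numpy as np

x = K.variable(np.random.random((100, 4)))
mean = K.mean(x, axis=0)
std = K.std(x, axis=0)
beta = K.zeros(shape=(4,))    # offset, zero-initialized here for illustration
gamma = K.ones(shape=(4,))    # scale, one-initialized here for illustration
normed = K.batch_normalization(x, mean, std, beta, gamma, epsilon=0.0001)
```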
permute_dimensions
permute_dimensions(x, pattern)
Transpose dimensions.
pattern should be a tuple or list of dimension indices, e.g. [0, 2, 1].
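For example, swapping the last two axes of a 3D tensor:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.zeros((2, 3, 4)))
y = K.permute_dimensions(x, (0, 2, 1))
K.eval(y).shape   # (2, 4, 3)
```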
repeat_elements
repeat_elements(x, rep, axis)
Repeat the elements of a tensor along an axis, like np.repeat.
If x has shape (s1, s2, s3) and axis=1, the output will have shape (s1, s2 * rep, s3).
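For example:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.array([[1, 2], [3, 4]]))
K.eval(K.repeat_elements(x, 2, axis=1))   # [[1., 1., 2., 2.], [3., 3., 4., 4.]]
```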
resize_images
resize_images(X, height_factor, width_factor, dim_ordering)
Resize the images contained in a 4D tensor of shape
- [batch, channels, height, width] (for 'th' dim_ordering)
- [batch, height, width, channels] (for 'tf' dim_ordering)
by a factor of (height_factor, width_factor). Both factors should be positive integers.
resize_volumes
resize_volumes(X, depth_factor, height_factor, width_factor, dim_ordering)
Resize the volume contained in a 5D tensor of shape
- [batch, channels, depth, height, width] (for 'th' dim_ordering)
- [batch, depth, height, width, channels] (for 'tf' dim_ordering)
by a factor of (depth_factor, height_factor, width_factor). All three factors should be positive integers.
repeat
repeat(x, n)
Repeat a 2D tensor.
If x has shape (samples, dim) and n=2, the output will have shape (samples, 2, dim).
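For example:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.array([[1, 2], [3, 4]]))   # shape (2, 2)
K.eval(K.repeat(x, 3)).shape                 # (2, 3, 2)
```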
batch_flatten
batch_flatten(x)
Turn an n-D tensor into a 2D tensor where the first dimension is conserved.
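For example:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.zeros((10, 3, 4)))
K.eval(K.batch_flatten(x)).shape   # (10, 12)
```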
expand_dims
expand_dims(x, dim=-1)
Add a 1-sized dimension at index "dim".
squeeze
squeeze(x, axis)
Remove a 1-dimension from the tensor at index "axis".
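These two operations are inverses of each other:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.zeros((2, 3)))
y = K.expand_dims(x, dim=-1)   # shape (2, 3, 1)
z = K.squeeze(y, 2)            # back to shape (2, 3)
```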
temporal_padding
temporal_padding(x, padding=1)
Pad the middle dimension of a 3D tensor with "padding" zeros left and right.
Apologies for the inane API, but Theano makes this really hard.
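For example:

```python
from keras import backend as K
import numpy as np

x = K.variable(np.zeros((2, 3, 4)))
K.eval(K.temporal_padding(x, padding=1)).shape   # (2, 5, 4)
```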
spatial_2d_padding
spatial_2d_padding(x, padding=(1, 1), dim_ordering='th')
Pad the 2nd and 3rd dimensions of a 4D tensor with "padding[0]" and "padding[1]" (resp.) zeros left and right.
spatial_3d_padding
spatial_3d_padding(x, padding=(1, 1, 1), dim_ordering='th')
Pad the 2nd, 3rd and 4th dimensions of a 5D tensor with "padding[0]", "padding[1]" and "padding[2]" (resp.) zeros left and right.
batch_get_value
batch_get_value(xs)
Returns the value of more than one tensor variable, as a list of Numpy arrays.
stop_gradient
stop_gradient(variables)
Returns variables but with zero gradient with respect to every other variable.
rnn
rnn(step_function, inputs, initial_states, go_backwards=False, mask=None, constants=None, unroll=False, input_length=None)
Iterates over the time dimension of a tensor.
Arguments
- inputs: tensor of temporal data of shape (samples, time, ...) (at least 3D).
- step_function:
- Parameters:
- input: tensor with shape (samples, ...) (no time dimension), representing input for the batch of samples at a certain time step.
- states: list of tensors.
- Returns:
- output: tensor with shape (samples, ...) (no time dimension),
- new_states: list of tensors, same length and shapes as 'states'.
- initial_states: list of tensors with shape (samples, ...) (no time dimension), containing the initial values for the states used in the step function.
- go_backwards: boolean. If True, do the iteration over the time dimension in reverse order.
- mask: binary tensor with shape (samples, time), with a zero for every element that is masked.
- constants: a list of constant values passed at each step.
- unroll: whether to unroll the RNN or to use a symbolic loop (scan).
- input_length: must be specified if using unroll.
Returns
A tuple (last_output, outputs, new_states).
- last_output: the latest output of the rnn, of shape (samples, ...)
- outputs: tensor with shape (samples, time, ...) where each entry outputs[s, t] is the output of the step function at time t for sample s.
- new_states: list of tensors, latest states returned by the step function, of shape (samples, ...).
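As an illustration, here is a minimal sketch of K.rnn computing a running sum over the time dimension; the step function and shapes are illustrative, not part of the API above:

```python
from keras import backend as K
import numpy as np

def step(x, states):
    # x: (samples, dim) input at the current time step
    # states: [previous running sum, shape (samples, dim)]
    output = x + states[0]
    return output, [output]

inputs = K.variable(np.ones((2, 5, 3)))          # (samples, time, dim)
initial_states = [K.variable(np.zeros((2, 3)))]  # one state tensor
last_output, outputs, new_states = K.rnn(step, inputs, initial_states)
K.eval(last_output)   # every entry is 5.0: the sum of 5 time steps of ones
```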
switch
switch(condition, then_expression, else_expression)
Switch between two expressions depending on a scalar condition.
- condition: scalar tensor.
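A minimal sketch, assuming the backend supports Python comparison operators on tensors (both Theano and TensorFlow do):

```python
from keras import backend as K

a = K.variable(3.0)
b = K.variable(5.0)
result = K.switch(a > b, a * 10, b * 10)
K.eval(result)   # 50.0, since the condition is false
```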
conv2d
conv2d(x, kernel, strides=(1, 1), border_mode='valid', dim_ordering='th', image_shape=None, filter_shape=None, filter_dilation=(1, 1))
2D convolution.
Arguments
- kernel: kernel tensor.
- strides: strides tuple.
- border_mode: string, "same" or "valid".
- dim_ordering: "tf" or "th". Whether to use Theano or TensorFlow dimension ordering in inputs/kernels/ouputs.
conv3d
conv3d(x, kernel, strides=(1, 1, 1), border_mode='valid', dim_ordering='th', volume_shape=None, filter_shape=None)
3D convolution. Runs on cuDNN if available.
- border_mode: string, "same" or "valid".