Keras backends
What is a "backend"?
Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products and convolutions. Instead, it relies on a specialized, well-optimized tensor manipulation library, which serves as the "backend engine" of Keras. Rather than picking a single tensor library and tying the implementation of Keras to it, Keras handles the problem in a modular way, so that several different backend engines can be plugged seamlessly into Keras.
At this time, Keras has two backend implementations available: the TensorFlow backend and the Theano backend.
- TensorFlow is an open-source symbolic tensor manipulation framework developed by Google, Inc.
- Theano is an open-source symbolic tensor manipulation framework developed by LISA/MILA Lab at Université de Montréal.
In the future, we are likely to add more backend options. Go ask Microsoft about how their CNTK backend project is doing.
Switching from one backend to another
If you have run Keras at least once, you will find the Keras configuration file at:
$HOME/.keras/keras.json
If it isn't there, you can create it.
NOTE for Windows users: replace $HOME with %USERPROFILE%.
The default configuration file looks like this:
{
"image_data_format": "channels_last",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
Simply change the backend field to either "theano" or "tensorflow", and Keras will use the new configuration the next time you run any Keras code.
You can also define the environment variable KERAS_BACKEND, which will override what is defined in your config file:
KERAS_BACKEND=tensorflow python -c "from keras import backend"
Using TensorFlow backend.
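The precedence between the config file and the environment variable can be sketched with the standard library alone. This is only an illustration, not Keras's actual loading code: the helper name resolve_backend and its arguments are hypothetical, and the fallback value simply mirrors the default config file shown above.

```python
import json

def resolve_backend(config_text, environ):
    """Return the backend name, letting KERAS_BACKEND override the config file."""
    config = json.loads(config_text) if config_text else {}
    # Fall back to "tensorflow", the value shown in the default config file.
    backend = config.get("backend", "tensorflow")
    # The environment variable, when set, wins over the file.
    return environ.get("KERAS_BACKEND", backend)

config_text = '{"backend": "theano", "floatx": "float32"}'
print(resolve_backend(config_text, {}))                               # theano
print(resolve_backend(config_text, {"KERAS_BACKEND": "tensorflow"}))  # tensorflow
```

In real use the second argument would be os.environ and the first the contents of keras.json.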
keras.json details
{
"image_data_format": "channels_last",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
You can change these settings by editing $HOME/.keras/keras.json.
- image_data_format: string, either "channels_last" or "channels_first". It specifies which data format convention Keras will follow. (keras.backend.image_data_format() returns it.)
  - For 2D data (e.g. image), "channels_last" assumes (rows, cols, channels) while "channels_first" assumes (channels, rows, cols).
  - For 3D data, "channels_last" assumes (conv_dim1, conv_dim2, conv_dim3, channels) while "channels_first" assumes (channels, conv_dim1, conv_dim2, conv_dim3).
- epsilon: float, a numeric fuzzing constant used to avoid dividing by zero in some operations.
- floatx: string, "float16", "float32", or "float64". Default float precision.
- backend: string, "tensorflow" or "theano".
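To make the two image_data_format conventions concrete, here is a small sketch in plain Python. The helper image_shape is hypothetical (it is not part of Keras); it only shows where the channels axis goes for a single 2D image without a batch axis.

```python
def image_shape(rows, cols, channels, data_format="channels_last"):
    """Shape of one 2D image (no batch axis) under the given convention."""
    if data_format == "channels_last":
        return (rows, cols, channels)
    if data_format == "channels_first":
        return (channels, rows, cols)
    raise ValueError("Unknown data_format: " + repr(data_format))

print(image_shape(32, 48, 3, "channels_last"))   # (32, 48, 3)
print(image_shape(32, 48, 3, "channels_first"))  # (3, 32, 48)
```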
Using the abstract Keras backend to write new code
If you want the Keras modules you write to be compatible with both Theano (th) and TensorFlow (tf), you have to write them via the abstract Keras backend API. Here's an intro.
You can import the backend module via:
from keras import backend as K
The code below instantiates an input placeholder. It's equivalent to tf.placeholder() or th.tensor.matrix(), th.tensor.tensor3(), etc.
input = K.placeholder(shape=(2, 4, 5))
# also works:
input = K.placeholder(shape=(None, 4, 5))
# also works:
input = K.placeholder(ndim=3)
The code below instantiates a shared variable. It's equivalent to tf.Variable() or th.shared().
import numpy as np
val = np.random.random((3, 4, 5))
var = K.variable(value=val)
# all-zeros variable:
var = K.zeros(shape=(3, 4, 5))
# all-ones:
var = K.ones(shape=(3, 4, 5))
Most tensor operations you will need can be done as you would in TensorFlow or Theano:
# Initializing tensors with random numbers
b = K.random_uniform_variable(shape=(3, 4), low=0, high=1)  # Uniform distribution
c = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)  # Gaussian distribution
d = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)
# Tensor arithmetic
a = b + c * K.abs(d)
c = K.dot(a, K.transpose(b))
a = K.sum(b, axis=1)
a = K.softmax(b)
a = K.concatenate([b, c], axis=-1)
# etc...
Backend functions
epsilon
epsilon()
Returns the value of the fuzz factor used in numeric expressions.
Returns
A float.
Example
>>> keras.backend.epsilon()
1e-07
set_epsilon
set_epsilon(e)
Sets the value of the fuzz factor used in numeric expressions.
Arguments
- e: float. New value of epsilon.
Example
>>> from keras import backend as K
>>> K.epsilon()
1e-07
>>> K.set_epsilon(1e-05)
>>> K.epsilon()
1e-05
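As an illustration of what a fuzz factor is for, the sketch below adds a small constant to a denominator so that a division never hits zero. It is plain Python rather than backend code: EPSILON is a stand-in for whatever K.epsilon() returns, and safe_ratio is a hypothetical helper.

```python
EPSILON = 1e-7  # stand-in for the value K.epsilon() would return

def safe_ratio(numerator, denominator, eps=EPSILON):
    """Divide while guarding against a zero denominator."""
    return numerator / (denominator + eps)

print(safe_ratio(1.0, 0.0))  # large but finite, instead of ZeroDivisionError
print(safe_ratio(2.0, 1.0))  # very close to 2.0
```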
floatx
floatx()
Returns the default float type, as a string (e.g. 'float16', 'float32', 'float64').
Returns
String, the current default float type.
Example
>>> keras.backend.floatx()
'float32'
set_floatx
set_floatx(floatx)
Sets the default float type.
Arguments
- floatx: String, 'float16', 'float32', or 'float64'.
Example
>>> from keras import backend as K
>>> K.floatx()
'float32'
>>> K.set_floatx('float16')
>>> K.floatx()
'float16'
cast_to_floatx
cast_to_floatx(x)
Cast a Numpy array to the default Keras float type.
Arguments
- x: Numpy array.
Returns
The same Numpy array, cast to its new type.
Example
>>> import numpy
>>> from keras import backend as K
>>> K.floatx()
'float32'
>>> arr = numpy.array([1.0, 2.0], dtype='float64')
>>> arr.dtype
dtype('float64')
>>> new_arr = K.cast_to_floatx(arr)
>>> new_arr
array([ 1., 2.], dtype=float32)
>>> new_arr.dtype
dtype('float32')
image_data_format
image_data_format()
Returns the default image data format convention ('channels_first' or 'channels_last').
Returns
A string, either 'channels_first' or 'channels_last'.
Example
>>> keras.backend.image_data_format()
'channels_first'
set_image_data_format
set_image_data_format(data_format)
Sets the value of the data format convention.
Arguments
- data_format: string. 'channels_first' or 'channels_last'.
Example
>>> from keras import backend as K
>>> K.image_data_format()
'channels_first'
>>> K.set_image_data_format('channels_last')
>>> K.image_data_format()
'channels_last'
is_keras_tensor
is_keras_tensor(x)
Returns whether x is a Keras tensor.
Arguments
- x: a potential tensor.
Returns
A boolean: whether the argument is a Keras tensor.
Examples
>>> import numpy
>>> from keras import backend as K
>>> np_var = numpy.array([1, 2])
>>> K.is_keras_tensor(np_var)
False
>>> keras_var = K.variable(np_var)
>>> K.is_keras_tensor(keras_var) # A variable is not a Tensor.
False
>>> keras_placeholder = K.placeholder(shape=(2, 4, 5))
>>> K.is_keras_tensor(keras_placeholder) # A placeholder is a Tensor.
True
set_image_dim_ordering
set_image_dim_ordering(dim_ordering)
Legacy setter for image_data_format.
Arguments
- dim_ordering: string. 'tf' or 'th'.
Example
>>> from keras import backend as K
>>> K.image_data_format()
'channels_first'
>>> K.set_image_dim_ordering('tf')
>>> K.image_data_format()
'channels_last'
image_dim_ordering
image_dim_ordering()
Legacy getter for image_data_format.
learning_phase
learning_phase()
set_learning_phase
set_learning_phase(value)
get_uid
get_uid(prefix='')
Provides a unique UID given a string prefix.
Arguments
- prefix: string.
Returns
An integer.
Example
>>> keras.backend.get_uid('dense')
1
>>> keras.backend.get_uid('dense')
2
reset_uids
reset_uids()
is_sparse
is_sparse(tensor)
to_dense
to_dense(tensor)
name_scope
name_scope()
variable
variable(value, dtype=None, name=None)
Instantiates a variable and returns it.
Arguments
- value: Numpy array, initial value of the tensor.
- dtype: Tensor type.
- name: Optional name string for the tensor.
Returns
A variable instance (with Keras metadata included).
constant
constant(value, dtype=None, shape=None, name=None)
placeholder
placeholder(shape=None, ndim=None, dtype=None, sparse=False, name=None)
Instantiate an input data placeholder variable.
shape
shape(x)
Returns the shape of a tensor.
- Warning: type returned will be different for Theano backend (Theano tensor type) and TF backend (TF TensorShape).
int_shape
int_shape(x)
Returns the shape of a Keras tensor or a Keras variable as a tuple of integers or None entries.
Arguments
- x: Tensor or variable.
Returns
A tuple of integers (or None entries).
ndim
ndim(x)
dtype
dtype(x)
eval
eval(x)
Returns the value of a tensor.
zeros
zeros(shape, dtype=None, name=None)
Instantiates an all-zeros variable.
ones
ones(shape, dtype=None, name=None)
Instantiates an all-ones variable.
eye
eye(size, dtype=None, name=None)
Instantiates an identity matrix.
ones_like
ones_like(x, dtype=None, name=None)
zeros_like
zeros_like(x, dtype=None, name=None)
random_uniform_variable
random_uniform_variable(shape, low, high, dtype=None, name=None)
random_normal_variable
random_normal_variable(shape, mean, scale, dtype=None, name=None)
count_params
count_params(x)
Returns the number of scalars in a tensor.
- Return: numpy integer.
cast
cast(x, dtype)
update
update(x, new_x)
update_add
update_add(x, increment)
update_sub
update_sub(x, decrement)
moving_average_update
moving_average_update(variable, value, momentum)
dot
dot(x, y)
batch_dot
batch_dot(x, y, axes=None)
Batchwise dot product.
batch_dot results in a tensor with fewer dimensions than the input. If the number of dimensions is reduced to 1, we use expand_dims to make sure that ndim is at least 2.
Arguments
- x, y: tensors with ndim >= 2
- axes: list (or single) int with target dimensions
Returns
A tensor with shape equal to the concatenation of x's shape (less the dimension that was summed over) and y's shape (less the batch dimension and the dimension that was summed over). If the final rank is 1, we reshape it to (batch_size, 1).
Examples
Assume x = [[1, 2], [3, 4]] and y = [[5, 6], [7, 8]]. Then batch_dot(x, y, axes=1) = [[17], [53]], which is the main diagonal of x.dot(y.T) reshaped to (batch_size, 1), although we never have to calculate the off-diagonal elements.
Shape inference: Let x's shape be (100, 20) and y's shape be (100, 30, 20). If dot_axes is (1, 2), to find the output shape of the resultant tensor, loop through each dimension in x's shape and y's shape:
- x.shape[0] : 100 : append to output shape
- x.shape[1] : 20 : do not append to output shape, dimension 1 of x has been summed over. (dot_axes[0] = 1)
- y.shape[0] : 100 : do not append to output shape, always ignore first dimension of y
- y.shape[1] : 30 : append to output shape
- y.shape[2] : 20 : do not append to output shape, dimension 2 of y has been summed over. (dot_axes[1] = 2)
output_shape = (100, 30)
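The shape-inference walk-through and the small numeric example can be reproduced in plain Python. This is a sketch of the rules as stated here, not the backend implementation; batch_dot_output_shape and batch_dot_2d are hypothetical helper names, and the numeric result is shown with shape (batch_size, 1), following the reshape rule in the Returns note.

```python
def batch_dot_output_shape(x_shape, y_shape, axes):
    """Apply the shape-inference rules above (pure Python, no tensors)."""
    x_axis, y_axis = axes
    # Keep every dimension of x except the one summed over.
    out = [d for i, d in enumerate(x_shape) if i != x_axis]
    # Keep every dimension of y except the batch dimension and the summed one.
    out += [d for i, d in enumerate(y_shape) if i != 0 and i != y_axis]
    # A rank-1 result is expanded to (batch_size, 1).
    if len(out) == 1:
        out.append(1)
    return tuple(out)

def batch_dot_2d(x, y):
    """Row-by-row dot product of two 2D lists, i.e. axes=1 for 2D inputs."""
    return [[sum(a * b for a, b in zip(row_x, row_y))]
            for row_x, row_y in zip(x, y)]

print(batch_dot_output_shape((100, 20), (100, 30, 20), (1, 2)))  # (100, 30)
print(batch_dot_2d([[1, 2], [3, 4]], [[5, 6], [7, 8]]))           # [[17], [53]]
```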
transpose
transpose(x)
gather
gather(reference, indices)
- reference: a tensor.
- indices: an int tensor of indices.
- Return: a tensor of same type as reference.
max
max(x, axis=None, keepdims=False)
min
min(x, axis=None, keepdims=False)
sum
sum(x, axis=None, keepdims=False)
Sum of the values in a tensor, along the specified axis.
prod
prod(x, axis=None, keepdims=False)
Multiply the values in a tensor, along the specified axis.
mean
mean(x, axis=None, keepdims=False)
Mean of a tensor, along the specified axis.
std
std(x, axis=None, keepdims=False)
var
var(x, axis=None, keepdims=False)
any
any(x, axis=None, keepdims=False)
Bitwise reduction (logical OR).
all
all(x, axis=None, keepdims=False)
Bitwise reduction (logical AND).
argmax
argmax(x, axis=-1)
argmin
argmin(x, axis=-1)
square
square(x)
abs
abs(x)
sqrt
sqrt(x)
exp
exp(x)
log
log(x)
round
round(x)
sign
sign(x)
pow
pow(x, a)
clip
clip(x, min_value, max_value)
equal
equal(x, y)
not_equal
not_equal(x, y)
greater
greater(x, y)
greater_equal
greater_equal(x, y)
less
less(x, y)
less_equal
less_equal(x, y)
maximum
maximum(x, y)
minimum
minimum(x, y)
sin
sin(x)
cos
cos(x)
normalize_batch_in_training
normalize_batch_in_training(x, gamma, beta, reduction_axes, epsilon=0.001)
Computes the mean and std for a batch, then applies batch_normalization to that batch.
batch_normalization
batch_normalization(x, mean, var, beta, gamma, epsilon=0.001)
Apply batch normalization on x given mean, var, beta and gamma.
concatenate
concatenate(tensors, axis=-1)
reshape
reshape(x, shape)
permute_dimensions
permute_dimensions(x, pattern)
Transpose dimensions.
pattern should be a tuple or list of dimension indices, e.g. [0, 2, 1].
repeat_elements
repeat_elements(x, rep, axis)
Repeat the elements of a tensor along an axis, like np.repeat.
If x has shape (s1, s2, s3) and axis=1, the output will have shape (s1, s2 * rep, s3).
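The shape rule and the np.repeat-like element order can be sketched without any tensor library. Both helpers below are hypothetical illustrations, not Keras functions: one only computes the output shape, the other repeats the elements of a flat list.

```python
def repeat_elements_shape(shape, rep, axis):
    """Output shape when each element along `axis` is repeated `rep` times."""
    out = list(shape)
    out[axis] = out[axis] * rep
    return tuple(out)

def repeat_elements_1d(values, rep):
    """Repeat each element in place, keeping the original order."""
    return [v for v in values for _ in range(rep)]

print(repeat_elements_shape((2, 3, 4), rep=2, axis=1))  # (2, 6, 4)
print(repeat_elements_1d([1, 2, 3], rep=2))             # [1, 1, 2, 2, 3, 3]
```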
resize_images
resize_images(X, height_factor, width_factor, data_format)
Resize the images contained in a 4D tensor of shape
- [batch, channels, height, width] (for 'channels_first' data_format)
- [batch, height, width, channels] (for 'channels_last' data_format)
by a factor of (height_factor, width_factor). Both factors should be positive integers.
resize_volumes
resize_volumes(X, depth_factor, height_factor, width_factor, data_format)
Resize the volume contained in a 5D tensor of shape
- [batch, channels, depth, height, width] (for 'channels_first' data_format)
- [batch, depth, height, width, channels] (for 'channels_last' data_format)
by a factor of (depth_factor, height_factor, width_factor). All three factors should be positive integers.
repeat
repeat(x, n)
Repeat a 2D tensor.
If x has shape (samples, dim) and n=2, the output will have shape (samples, 2, dim).
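In nested-list terms, the shape change from (samples, dim) to (samples, n, dim) looks like the sketch below. The helper repeat_2d is hypothetical and works on Python lists rather than backend tensors.

```python
def repeat_2d(x, n):
    """Copy each row n times along a new middle axis."""
    return [[list(row) for _ in range(n)] for row in x]

print(repeat_2d([[1, 2], [3, 4]], 2))
# [[[1, 2], [1, 2]], [[3, 4], [3, 4]]]
```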
arange
arange(start, stop=None, step=1, dtype='int32')
Creates a 1-D tensor containing a sequence of integers.
The function arguments use the same convention as Theano's arange: if only one argument is provided, it is in fact the "stop" argument.
The default type of the returned tensor is 'int32' to match TensorFlow's default.
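The single-argument convention is the same one Python's built-in range uses, which makes it easy to sketch. The helper arange_values is a hypothetical illustration that returns a plain list instead of a tensor.

```python
def arange_values(start, stop=None, step=1):
    """Integer sequence following the argument convention described above."""
    # With a single argument, that argument is treated as "stop" and start is 0.
    if stop is None:
        start, stop = 0, start
    return list(range(start, stop, step))

print(arange_values(4))        # [0, 1, 2, 3]
print(arange_values(2, 8, 3))  # [2, 5]
```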
tile
tile(x, n)
flatten
flatten(x)
batch_flatten
batch_flatten(x)
Turn an n-D tensor into a 2D tensor where the first dimension is conserved.
expand_dims
expand_dims(x, axis=-1)
Add a 1-sized dimension at index "axis".
squeeze
squeeze(x, axis)
Remove a 1-dimension from the tensor at index "axis".
temporal_padding
temporal_padding(x, padding=(1, 1))
Pad the middle dimension of a 3D tensor with "padding" zeros left and right.
Apologies for the inane API, but Theano makes this really hard.
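What padding the middle dimension means can be shown on a nested list shaped (samples, time, features). The helper pad_time_axis below is a hypothetical pure-Python sketch, not the backend function, and it assumes every sample has at least one time step.

```python
def pad_time_axis(x, padding=(1, 1)):
    """Zero-pad the middle (time) axis of a (samples, time, features) list."""
    left, right = padding
    padded = []
    for sample in x:
        width = len(sample[0])  # number of features per time step
        before = [[0] * width for _ in range(left)]
        after = [[0] * width for _ in range(right)]
        padded.append(before + [list(step) for step in sample] + after)
    return padded

x = [[[1, 2], [3, 4]]]  # shape (1, 2, 2)
print(pad_time_axis(x, padding=(1, 2)))
# [[[0, 0], [1, 2], [3, 4], [0, 0], [0, 0]]]
```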
spatial_2d_padding
spatial_2d_padding(x, padding=((1, 1), (1, 1)), data_format=None)
Pad the 2nd and 3rd dimensions of a 4D tensor with "padding[0]" and "padding[1]" (resp.) zeros left and right.
spatial_3d_padding
spatial_3d_padding(x, padding=((1, 1), (1, 1), (1, 1)), data_format=None)
Pad the 2nd, 3rd and 4th dimensions of a 5D tensor with "padding[0]", "padding[1]" and "padding[2]" (resp.) zeros left and right.
stack
stack(x, axis=0)
one_hot
one_hot(indices, num_classes)
- Input: nD integer tensor of shape (batch_size, dim1, dim2, ... dim(n-1))
- Output: (n + 1)D one hot representation of the input with shape (batch_size, dim1, dim2, ... dim(n-1), num_classes)
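The way one-hot encoding adds a trailing axis of size num_classes can be sketched on nested Python lists. The helper one_hot_nested is hypothetical and recursive; it is an illustration of the input and output shapes, not the backend implementation.

```python
def one_hot_nested(indices, num_classes):
    """Replace every integer in a nested list by a one-hot list."""
    if isinstance(indices, int):
        return [1 if k == indices else 0 for k in range(num_classes)]
    return [one_hot_nested(item, num_classes) for item in indices]

print(one_hot_nested([0, 2], 3))       # [[1, 0, 0], [0, 0, 1]]
print(one_hot_nested([[1], [0]], 2))   # [[[0, 1]], [[1, 0]]]
```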
reverse
reverse(x, axes)
Reverse a tensor along the specified axes
pattern_broadcast
pattern_broadcast(x, broatcastable)
get_value
get_value(x)
batch_get_value
batch_get_value(xs)
Returns the value of more than one tensor variable, as a list of Numpy arrays.
set_value
set_value(x, value)
batch_set_value
batch_set_value(tuples)
get_variable_shape
get_variable_shape(x)
print_tensor
print_tensor(x, message='')
Print the message and the tensor when evaluated and return the same tensor.
function
function(inputs, outputs, updates=[])
gradients
gradients(loss, variables)
stop_gradient
stop_gradient(variables)
Returns variables but with zero gradient with respect to every other variable.
rnn
rnn(step_function, inputs, initial_states, go_backwards=False, mask=None, constants=None, unroll=False, input_length=None)
Iterates over the time dimension of a tensor.
Arguments
- inputs: tensor of temporal data of shape (samples, time, ...) (at least 3D).
- step_function:
  - Parameters:
    - input: tensor with shape (samples, ...) (no time dimension), representing input for the batch of samples at a certain time step.
    - states: list of tensors.
  - Returns:
    - output: tensor with shape (samples, ...) (no time dimension).
    - new_states: list of tensors, same length and shapes as 'states'.
- initial_states: tensor with shape (samples, ...) (no time dimension), containing the initial values for the states used in the step function.
- go_backwards: boolean. If True, do the iteration over the time dimension in reverse order.
- mask: binary tensor with shape (samples, time), with a zero for every element that is masked.
- constants: a list of constant values passed at each step.
- unroll: whether to unroll the RNN or to use a symbolic loop (while_loop or scan depending on backend).
- input_length: must be specified if using unroll.
Returns
A tuple (last_output, outputs, new_states).
- last_output: the latest output of the rnn, of shape (samples, ...)
- outputs: tensor with shape (samples, time, ...) where each entry outputs[s, t] is the output of the step function at time t for sample s.
- new_states: list of tensors, latest states returned by the step function, of shape (samples, ...).
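The contract between rnn and its step function can be sketched with plain Python lists. This is a simplified illustration under several assumptions: rnn_sketch and running_sum_step are hypothetical names, inputs are shaped (samples, time) with scalar features rather than the 3D-or-more tensors the real function expects, and masking, constants and unrolling are ignored.

```python
def rnn_sketch(step_function, inputs, initial_states, go_backwards=False):
    """Iterate a step function over the time axis of a (samples, time) list."""
    time_steps = len(inputs[0])
    order = range(time_steps - 1, -1, -1) if go_backwards else range(time_steps)
    states = list(initial_states)
    per_step_outputs = []
    for t in order:
        batch_at_t = [sample[t] for sample in inputs]  # one value per sample
        output, states = step_function(batch_at_t, states)
        per_step_outputs.append(output)
    last_output = per_step_outputs[-1]
    # Regroup so that outputs[s] lists the outputs for sample s,
    # in the order the steps were processed.
    outputs = [list(per_sample) for per_sample in zip(*per_step_outputs)]
    return last_output, outputs, states

def running_sum_step(batch, states):
    """Toy step function: the state is a running sum per sample."""
    totals = [prev + x for prev, x in zip(states[0], batch)]
    return totals, [totals]

inputs = [[1, 2, 3], [10, 20, 30]]  # 2 samples, 3 time steps
last, outs, final_states = rnn_sketch(running_sum_step, inputs, [[0, 0]])
print(last)          # [6, 60]
print(outs)          # [[1, 3, 6], [10, 30, 60]]
print(final_states)  # [[6, 60]]
```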
switch
switch(condition, then_expression, else_expression)
condition: scalar tensor.
in_train_phase
in_train_phase(x, alt, training=None)
Selects x in train phase, and alt otherwise.
Note that alt should have the same shape as x.
Returns
Either x or alt based on the training flag. The training flag defaults to K.learning_phase().
in_test_phase
in_test_phase(x, alt, training=None)
Selects x in test phase, and alt otherwise.
Note that alt should have the same shape as x.
Returns
Either x or alt based on K.learning_phase.
elu
elu(x, alpha=1.0)
Exponential linear unit
Arguments
- x: Tensor to compute the activation function for.
- alpha: scalar
relu
relu(x, alpha=0.0, max_value=None)
softmax
softmax(x)
softplus
softplus(x)
softsign
softsign(x)
categorical_crossentropy
categorical_crossentropy(output, target, from_logits=False)
sparse_categorical_crossentropy
sparse_categorical_crossentropy(output, target, from_logits=False)
binary_crossentropy
binary_crossentropy(output, target, from_logits=False)
sigmoid
sigmoid(x)
hard_sigmoid
hard_sigmoid(x)
tanh
tanh(x)
dropout
dropout(x, level, noise_shape=None, seed=None)
Sets entries in x to zero at random, while scaling the entire tensor.
Arguments
- x: tensor
- level: fraction of the entries in the tensor that will be set to 0.
- noise_shape: shape for randomly generated keep/drop flags, must be broadcastable to the shape of x
- seed: random seed to ensure determinism.
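A minimal sketch of the idea, on a flat Python list. It assumes that "scaling" means dividing the kept entries by the keep probability 1 - level (often called inverted dropout), which is an assumption on our part rather than something stated here. The helper dropout_sketch is hypothetical, ignores noise_shape, and requires 0 <= level < 1.

```python
import random

def dropout_sketch(values, level, seed=None):
    """Zero each entry with probability `level`; rescale the survivors."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    keep_prob = 1.0 - level
    # Assumption: kept entries are divided by keep_prob so the expected
    # value of each entry is unchanged.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

dropped = dropout_sketch([1.0] * 10, level=0.5, seed=0)
print(dropped)  # a mix of 0.0 and 2.0 entries
```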
l2_normalize
l2_normalize(x, axis)
in_top_k
in_top_k(predictions, targets, k)
Returns whether the targets are in the top k predictions.
Arguments
- predictions: A tensor of shape batch_size x classes and type float32.
- targets: A tensor of shape batch_size and type int32 or int64.
- k: An int, number of top elements to consider.
Returns
A tensor of shape batch_size and type int. output_i is 1 if targets_i is within top-k values of predictions_i
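The top-k test can be sketched on plain lists. The helper in_top_k_sketch is hypothetical; how ties are broken is an assumption here (a target counts as within the top k when fewer than k classes score strictly higher), and the backends may treat ties differently.

```python
def in_top_k_sketch(predictions, targets, k):
    """Return 1 per sample if its target class is among the k best scores."""
    results = []
    for scores, target in zip(predictions, targets):
        # Count classes scoring strictly higher than the target class.
        higher = sum(1 for s in scores if s > scores[target])
        results.append(1 if higher < k else 0)
    return results

predictions = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
print(in_top_k_sketch(predictions, [2, 1], k=1))  # [0, 0]
print(in_top_k_sketch(predictions, [2, 1], k=2))  # [1, 1]
```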
conv1d
conv1d(x, kernel, strides=1, padding='valid', data_format=None, dilation_rate=1)
1D convolution.
Arguments
- kernel: kernel tensor.
- strides: stride integer.
- padding: string, "same", "causal" or "valid".
- data_format: string, one of "channels_last", "channels_first"
- dilation_rate: integer.
conv2d
conv2d(x, kernel, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1))
2D convolution.
Arguments
- kernel: kernel tensor.
- strides: strides tuple.
- padding: string, "same" or "valid".
- data_format: "channels_last" or "channels_first". Whether to use Theano or TensorFlow data format in inputs/kernels/outputs.
conv2d_transpose
conv2d_transpose(x, kernel, output_shape, strides=(1, 1), padding='valid', data_format=None)
2D deconvolution (transposed convolution).
Arguments
- kernel: kernel tensor.
- output_shape: desired dimensions of output.
- strides: strides tuple.
- padding: string, "same" or "valid".
- data_format: "channels_last" or "channels_first". Whether to use Theano or TensorFlow data format in inputs/kernels/outputs.
separable_conv2d
separable_conv2d(x, depthwise_kernel, pointwise_kernel, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1))
conv3d
conv3d(x, kernel, strides=(1, 1, 1), padding='valid', data_format=None, dilation_rate=(1, 1, 1))
3D convolution.
Arguments
- kernel: kernel tensor.
- strides: strides tuple.
- padding: string, "same" or "valid".
- data_format: "channels_last" or "channels_first". Whether to use Theano or TensorFlow data format in inputs/kernels/outputs.
pool2d
pool2d(x, pool_size, strides=(1, 1), padding='valid', data_format=None, pool_mode='max')
pool3d
pool3d(x, pool_size, strides=(1, 1, 1), padding='valid', data_format=None, pool_mode='max')
bias_add
bias_add(x, bias, data_format=None)
random_normal
random_normal(shape, mean=0.0, stddev=1.0, dtype=None, seed=None)
random_uniform
random_uniform(shape, minval=0.0, maxval=1.0, dtype=None, seed=None)
random_binomial
random_binomial(shape, p=0.0, dtype=None, seed=None)
truncated_normal
truncated_normal(shape, mean=0.0, stddev=1.0, dtype=None, seed=None)
ctc_interleave_blanks
ctc_interleave_blanks(Y)
ctc_create_skip_idxs
ctc_create_skip_idxs(Y)
ctc_update_log_p
ctc_update_log_p(skip_idxs, zeros, active, log_p_curr, log_p_prev)
ctc_path_probs
ctc_path_probs(predict, Y, alpha=0.0001)
ctc_cost
ctc_cost(predict, Y)
ctc_batch_cost
ctc_batch_cost(y_true, y_pred, input_length, label_length)
Runs CTC loss algorithm on each batch element.
Arguments
- y_true: tensor (samples, max_string_length) containing the truth labels
- y_pred: tensor (samples, time_steps, num_categories) containing the prediction, or output of the softmax
- input_length: tensor (samples,1) containing the sequence length for each batch item in y_pred
- label_length: tensor (samples,1) containing the sequence length for each batch item in y_true
Returns
Tensor with shape (samples,1) containing the CTC loss of each element
map_fn
map_fn(fn, elems, name=None)
Map the function fn over the elements elems and return the outputs.
Arguments
- fn: Callable that will be called upon each element in elems
- elems: tensor, at least 2 dimensional
- name: A string name for the map node in the graph
Returns
Tensor whose first dimension equals that of elems and whose second dimension depends on fn
foldl
foldl(fn, elems, initializer=None, name=None)
Reduce elems using fn to combine them from left to right.
Arguments
- fn: Callable that will be called upon each element in elems and an accumulator, for instance lambda acc, x: acc + x
- elems: tensor
- initializer: The first value used (elems[0] in case of None)
- name: A string name for the foldl node in the graph
Returns
Same type and shape as initializer
foldr
foldr(fn, elems, initializer=None, name=None)
Reduce elems using fn to combine them from right to left.
Arguments
- fn: Callable that will be called upon each element in elems and an accumulator, for instance lambda acc, x: acc + x
- elems: tensor
- initializer: The first value used (elems[-1] in case of None)
- name: A string name for the foldr node in the graph
Returns
Same type and shape as initializer
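The difference between the two fold directions, and the role of the initializer, can be sketched with functools.reduce on plain lists. foldl_sketch and foldr_sketch are hypothetical helpers that mirror the argument descriptions above; they are not the backend implementations.

```python
from functools import reduce

def foldl_sketch(fn, elems, initializer=None):
    """Combine elems from left to right; fn is called as fn(acc, x)."""
    items = list(elems)
    if initializer is None:
        initializer, items = items[0], items[1:]
    return reduce(fn, items, initializer)

def foldr_sketch(fn, elems, initializer=None):
    """Combine elems from right to left; fn is called as fn(acc, x)."""
    items = list(elems)
    if initializer is None:
        initializer, items = items[-1], items[:-1]
    return reduce(fn, reversed(items), initializer)

subtract = lambda acc, x: acc - x
print(foldl_sketch(subtract, [10, 2, 3]))  # (10 - 2) - 3 = 5
print(foldr_sketch(subtract, [10, 2, 3]))  # (3 - 2) - 10 = -9
```

A non-commutative function such as subtraction is used so that the two directions give visibly different results.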
backend
backend()
Publicly accessible method for determining the current backend.
Returns
String, the name of the backend Keras is currently using.
Example
>>> keras.backend.backend()
'tensorflow'