Usage of metrics
A metric is a function that is used to judge the performance of your model. Metric functions are supplied via the `metrics` parameter when a model is compiled.
A metric function is similar to an objective function, except that the results from evaluating a metric are not used when training the model.
You can either pass the name of an existing metric, or pass a Theano/TensorFlow symbolic function (see Custom metrics).
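For example, a built-in metric can be passed by name at compile time (a minimal sketch; `model` is assumed to be an already-defined Keras model):

```python
# `model` is assumed to be an existing Keras model.
model.compile(loss='mean_squared_error',
              optimizer='sgd',
              metrics=['accuracy'])
```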
Arguments
- y_true: True labels. Theano/TensorFlow tensor.
- y_pred: Predictions. Theano/TensorFlow tensor of the same shape as y_true.
Returns
Single tensor value representing the mean of the output array across all datapoints.
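For illustration, binary accuracy follows exactly this contract; a minimal sketch using the Keras backend:

```python
import keras.backend as K

# Takes (y_true, y_pred) tensors and returns the batch-wise mean accuracy.
def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)))
```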
Available metrics
matthews_correlation
matthews_correlation(y_true, y_pred)
Matthews correlation metric.
Computes the Matthews correlation coefficient, a measure of the quality of binary classifications. It is only computed as a batch-wise average, not globally.
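A minimal sketch of such a batch-wise implementation, assuming binary 0/1 labels and using the Keras backend:

```python
import keras.backend as K

def matthews_correlation(y_true, y_pred):
    # Round predictions to 0/1 and derive the confusion-matrix counts.
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pred_neg = 1 - y_pred_pos
    y_pos = K.round(K.clip(y_true, 0, 1))
    y_neg = 1 - y_pos

    tp = K.sum(y_pos * y_pred_pos)
    tn = K.sum(y_neg * y_pred_neg)
    fp = K.sum(y_neg * y_pred_pos)
    fn = K.sum(y_pos * y_pred_neg)

    numerator = tp * tn - fp * fn
    denominator = K.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # K.epsilon() guards against division by zero on degenerate batches.
    return numerator / (denominator + K.epsilon())
```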
precision
precision(y_true, y_pred)
Precision metric.
Computes the precision, a metric for multi-label classification indicating how many selected items are relevant. It is only computed as a batch-wise average, not globally.
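A minimal sketch of a batch-wise precision, assuming binary 0/1 labels:

```python
import keras.backend as K

def precision(y_true, y_pred):
    # True positives divided by all predicted positives.
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    return true_positives / (predicted_positives + K.epsilon())
```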
recall
recall(y_true, y_pred)
Recall metric.
Computes the recall, a metric for multi-label classification indicating how many relevant items are selected. It is only computed as a batch-wise average, not globally.
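A minimal sketch of a batch-wise recall, assuming binary 0/1 labels:

```python
import keras.backend as K

def recall(y_true, y_pred):
    # True positives divided by all actual positives.
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    return true_positives / (possible_positives + K.epsilon())
```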
fbeta_score
fbeta_score(y_true, y_pred, beta=1)
Computes the F score.
The F score is the weighted harmonic mean of precision and recall. Here it is only computed as a batch-wise average, not globally.
This is useful for multi-label classification, where input samples can be tagged with a set of labels. A model could achieve a perfect recall score simply by assigning every class to every input; to avoid rewarding this, a metric should also penalize incorrect class assignments, which is what precision measures. The F-beta score (ranging from 0.0 to 1.0) combines the two as a weighted harmonic mean of precision and recall.
With beta = 1, this is equivalent to the F-measure. With beta < 1, the score is weighted towards precision, so assigning only correct classes becomes more important; with beta > 1, it is weighted towards recall, so missing fewer class assignments becomes more important.
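A minimal sketch of such a batch-wise F-beta score, assuming binary 0/1 labels, with the precision and recall from the sketches above inlined so the block is self-contained:

```python
import keras.backend as K

def fbeta_score(y_true, y_pred, beta=1):
    # Batch-wise precision and recall, as in the sketches above.
    tp = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    p = tp / (K.sum(K.round(K.clip(y_pred, 0, 1))) + K.epsilon())
    r = tp / (K.sum(K.round(K.clip(y_true, 0, 1))) + K.epsilon())
    bb = beta ** 2
    # Weighted harmonic mean; beta ** 2 shifts the weight between p and r.
    return (1 + bb) * (p * r) / (bb * p + r + K.epsilon())
```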
fmeasure
fmeasure(y_true, y_pred)
Computes the F-measure, the harmonic mean of precision and recall.
Here it is only computed as a batch-wise average, not globally.
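Equivalently, a one-line sketch on top of the fbeta_score sketch above:

```python
# Assumes the fbeta_score sketch above is in scope.
def fmeasure(y_true, y_pred):
    return fbeta_score(y_true, y_pred, beta=1)
```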
Custom metrics
Custom metrics can be defined and passed via the compilation step. The function would need to take `(y_true, y_pred)` as arguments and return either a single tensor value or a dict mapping metric_name -> metric_value.
```python
# For custom metrics
import keras.backend as K

def mean_pred(y_true, y_pred):
    # Mean prediction over the batch.
    return K.mean(y_pred)

def false_rates(y_true, y_pred):
    # A possible batch-wise definition, assuming binary 0/1 labels.
    false_neg = K.mean(K.round(K.clip(y_true * (1 - y_pred), 0, 1)))
    false_pos = K.mean(K.round(K.clip((1 - y_true) * y_pred, 0, 1)))
    return {
        'false_neg': false_neg,
        'false_pos': false_pos,
    }

# `model` is assumed to be an already-defined Keras model.
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred, false_rates])
```