
label smoothing


 

An overconfident model is poorly calibrated: its predicted probabilities are consistently higher than its actual accuracy.

For example, it may predict a probability of 0.9 for inputs on which its accuracy is only 0.6.

Note that a model with a small test error can still be overconfident, and can therefore still benefit from label smoothing.
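
To see the gap concretely, one can compare the model's average confidence with its accuracy on held-out predictions. A minimal NumPy sketch (the probabilities and labels below are made up for illustration):

```
import numpy as np

# Hypothetical predicted class probabilities for 4 samples, 3 classes.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.80, 0.10, 0.10],
                  [0.95, 0.03, 0.02],
                  [0.85, 0.10, 0.05]])
labels = np.array([0, 1, 0, 2])  # ground-truth classes

confidence = probs.max(axis=1).mean()               # average predicted probability ~0.875
accuracy = (probs.argmax(axis=1) == labels).mean()  # fraction correct = 0.5

print(confidence, accuracy)  # confidence well above accuracy: overconfident
```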

 

Label smoothing replaces the one-hot encoded label vector y_hot with a mixture of y_hot and the uniform distribution:

 

  y_ls = (1 - α) * y_hot + α / K

 

where K is the number of label classes, and α is a hyperparameter that determines the amount of smoothing.

If α = 0, we obtain the original one-hot encoded y_hot. If α = 1, we get the uniform distribution.
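
The formula is a one-liner in code. A minimal NumPy sketch (the label vector and α values below are illustrative):

```
import numpy as np

def smooth_labels(y_hot, alpha):
    """Mix one-hot labels with the uniform distribution over K classes."""
    K = y_hot.shape[-1]
    return (1 - alpha) * y_hot + alpha / K

y_hot = np.array([0., 1., 0.])
print(smooth_labels(y_hot, 0.0))  # [0.  1.  0. ]  -> original one-hot
print(smooth_labels(y_hot, 0.3))  # [0.1 0.8 0.1]
print(smooth_labels(y_hot, 1.0))  # [0.333 0.333 0.333] -> uniform
```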

 

ref:

Wanshun Wong, "What is Label Smoothing? A technique to make your model less…", Towards Data Science.

 

def label_smoothing(inputs, epsilon=0.1):
    '''Applies label smoothing. See 5.4 and https://arxiv.org/abs/1512.00567.

    inputs: 3d tensor. [N, T, V], where V is the vocabulary size.
    epsilon: Smoothing rate.

    For example,

    ```
    import tensorflow as tf
    inputs = tf.convert_to_tensor([[[0, 0, 1],
                                    [0, 1, 0],
                                    [1, 0, 0]],

                                   [[1, 0, 0],
                                    [1, 0, 0],
                                    [0, 1, 0]]], tf.float32)

    outputs = label_smoothing(inputs)

    with tf.Session() as sess:  # TensorFlow 1.x session API
        print(sess.run([outputs]))

    >>
    [array([[[ 0.03333334,  0.03333334,  0.93333334],
             [ 0.03333334,  0.93333334,  0.03333334],
             [ 0.93333334,  0.03333334,  0.03333334]],

            [[ 0.93333334,  0.03333334,  0.03333334],
             [ 0.93333334,  0.03333334,  0.03333334],
             [ 0.03333334,  0.93333334,  0.03333334]]], dtype=float32)]
    ```
    '''
    V = inputs.get_shape().as_list()[-1]  # number of classes (vocabulary size)
    return ((1 - epsilon) * inputs) + (epsilon / V)
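
For reference, newer TensorFlow releases expose the same smoothing directly through the Keras loss API, so the targets need not be smoothed by hand. A minimal sketch assuming TensorFlow 2.x (the example tensors are illustrative):

```
import tensorflow as tf

# The loss applies (1 - ε) * y_hot + ε / K to y_true internally
# when label_smoothing is set.
loss_fn = tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1)

y_true = tf.constant([[0., 1., 0.]])
y_pred = tf.constant([[0.1, 0.8, 0.1]])
print(loss_fn(y_true, y_pred).numpy())
```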

 
