
L1, L2, and Smooth L1 Loss

2021-04-05 22:31:14 Author: Internet

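Mean Squared Error, MSE (L2 Loss)

The mean squared error (MSE) is the mean of the squared differences between the model predictions f(x) and the true sample values y:

$$
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2
$$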

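Mean Absolute Error, MAE (L1 Loss)

The mean absolute error (MAE) is the mean of the absolute differences between the model predictions f(x) and the true sample values y:

$$
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \bigl|y_i - f(x_i)\bigr|
$$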
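As a minimal sketch of the two losses introduced so far, the snippet below computes MSE and MAE in plain Python (no PyTorch dependency, so the arithmetic stays visible). The function names `mse` and `mae` and the sample values are assumptions made for this illustration, not part of the original post.

```python
def mse(preds, targets):
    """Mean squared error: average of the squared differences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def mae(preds, targets):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


# Illustrative values (assumed for this sketch).
preds = [2.0, 4.0, 7.0]
targets = [1.0, 4.0, 4.0]

print(mse(preds, targets))  # errors 1, 0, 3 -> (1 + 0 + 9) / 3
print(mae(preds, targets))  # errors 1, 0, 3 -> (1 + 0 + 3) / 3
```

The largest error (3) contributes 9 of the 10 units in the MSE sum but only 3 of the 4 units in the MAE sum, which is why MSE reacts more strongly to large errors.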

Choosing between MSE and MAE

Smooth L1 Loss

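Both Faster R-CNN and SSD use Smooth L1 as the loss function for bounding-box regression:

$$
\mathrm{smooth}_{L_1}(x) = \begin{cases} 0.5\,x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}
$$

For comparison, the L1 and L2 losses are

$$
L_1(x) = |x|, \qquad L_2(x) = x^2
$$

where x is the element-wise difference between the predicted box and the ground truth.

As the definition shows, Smooth L1 is a piecewise function. For |x| < 1 it behaves like the L2 loss (scaled by 0.5), which removes the non-smoothness of L1 at zero. For |x| >= 1 it behaves like the L1 loss (shifted by 0.5), so its gradient stays bounded at ±1 and outliers no longer cause exploding gradients.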
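The piecewise definition can be sketched in plain Python for a single scalar difference x. The helper names `smooth_l1` and `smooth_l1_grad` are assumptions made for this illustration, and the gradient is written out by hand from the piecewise form rather than taken from any library.

```python
def smooth_l1(x):
    """Smooth L1 for a scalar difference x (threshold fixed at 1)."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1 else ax - 0.5


def smooth_l1_grad(x):
    """Derivative of smooth_l1: x inside (-1, 1), sign(x) outside."""
    if abs(x) < 1:
        return x
    return 1.0 if x > 0 else -1.0


for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(x, smooth_l1(x), smooth_l1_grad(x))
```

The two branches meet at |x| = 1 (both give 0.5), and the gradient never exceeds 1 in magnitude, which is the property that keeps outliers from dominating the update.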

Implementation (PyTorch)

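import torch

def _smooth_l1_loss(input, target, reduction='none'):
    # type: (Tensor, Tensor, str) -> Tensor
    # Element-wise absolute error between prediction and target.
    t = torch.abs(input - target)
    # Quadratic for |error| < 1, linear (shifted by 0.5) otherwise.
    ret = torch.where(t < 1, 0.5 * t ** 2, t - 0.5)
    if reduction != 'none':
        ret = torch.mean(ret) if reduction == 'mean' else torch.sum(ret)
    return ret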

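The implementation that is now most common:

A beta parameter (here beta = 1/9) can also be added. It controls the range of errors that is handled by the quadratic (MSE-like) branch and the range that is handled by the linear (MAE-like) branch.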
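Written as a formula, the beta-parameterized variant can be read as follows. This is reconstructed from the code in this section, with x denoting the element-wise difference and beta assumed to be positive:

```latex
\mathrm{smooth}_{L_1}(x;\beta) =
\begin{cases}
0.5\,x^2 / \beta, & |x| < \beta \\
|x| - 0.5\,\beta, & \text{otherwise}
\end{cases}
```

At |x| = beta both branches evaluate to 0.5 beta, so the function stays continuous, and beta = 1 recovers the original definition.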

import torch

def smooth_l1_loss(input, target, beta=1. / 9, reduction='none'):
    """
    Very similar to the smooth_l1_loss from PyTorch, but with
    the extra beta parameter (assumed > 0).
    """
    # Element-wise absolute error.
    n = torch.abs(input - target)
    # Quadratic branch below beta, linear branch at or above it.
    cond = n < beta
    ret = torch.where(cond, 0.5 * n ** 2 / beta, n - 0.5 * beta)
    if reduction != 'none':
        ret = torch.mean(ret) if reduction == 'mean' else torch.sum(ret)
    return ret

See: https://github.com/rbgirshick/py-faster-rcnn/issues/89
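To see what beta changes in practice, the following sketch evaluates a scalar version of the loss for the same error under two settings. The function name `smooth_l1_beta` and the error value 0.2 are assumptions chosen for illustration; the sketch mirrors the tensor code in plain Python rather than calling PyTorch.

```python
def smooth_l1_beta(x, beta=1.0 / 9):
    """Scalar Smooth L1 with the transition point at |x| = beta (beta > 0)."""
    ax = abs(x)
    if ax < beta:
        return 0.5 * ax * ax / beta
    return ax - 0.5 * beta


# The same error falls into different branches depending on beta.
err = 0.2
print(smooth_l1_beta(err, beta=1.0))      # quadratic branch: 0.5 * 0.2**2
print(smooth_l1_beta(err, beta=1.0 / 9))  # linear branch: 0.2 - 0.5 / 9
```

With beta = 1/9 the quadratic region shrinks to errors smaller than about 0.11, so most residuals are treated linearly, as in an L1 loss.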

Summary

For most CNNs, L2 loss is generally preferred over L1 loss because it typically converges much faster.

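For bounding-box regression, the squared loss (L2 loss) is also a common choice, but its drawback is that outliers, when present, dominate the loss. For example, if the true value is 1 and, out of 10 predictions, one is 1000 while the rest are around 1, the loss is clearly determined mainly by the 1000. Fast R-CNN therefore adopts a somewhat gentler absolute-value loss (Smooth L1), which grows linearly with the error rather than quadratically.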
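The outlier example can be checked numerically. The sketch below assumes the nine ordinary predictions are all 1.1 (the text only says "around 1") and compares the loss and gradient magnitude of the outlier under the squared loss and under Smooth L1. All function names are illustrative.

```python
def l2(x):
    return x * x


def smooth_l1(x):
    ax = abs(x)
    return 0.5 * x * x if ax < 1 else ax - 0.5


def l2_grad(x):
    return 2.0 * x


def smooth_l1_grad(x):
    return x if abs(x) < 1 else (1.0 if x > 0 else -1.0)


target = 1.0
# Assumption: "around 1" is modelled as 1.1 for the nine ordinary predictions.
preds = [1.1] * 9 + [1000.0]
errors = [p - target for p in preds]

outlier = errors[-1]
print("L2 loss on outlier:           ", l2(outlier))              # 999**2
print("Smooth L1 loss on outlier:    ", smooth_l1(outlier))       # 999 - 0.5
print("L2 gradient on outlier:       ", l2_grad(outlier))         # 2 * 999
print("Smooth L1 gradient on outlier:", smooth_l1_grad(outlier))  # capped at 1
```

The outlier still produces the largest Smooth L1 loss value, but its gradient is capped at 1, so a single bad sample cannot dominate the parameter update the way it does with the squared loss (gradient 1998 here).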

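The difference between Smooth L1 and L1 loss is that the derivative of L1 is not unique at 0 (the function is not differentiable there), which can hurt convergence. Smooth L1 addresses this by using a quadratic function near 0, which makes the loss smooth.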

Advantages of Smooth L1

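Compared with the L1 loss, it converges faster and is differentiable at 0, which helps convergence.
Compared with the L2 loss, it is less sensitive to outliers and anomalous values; its gradient magnitude stays comparatively small, so training is less likely to diverge.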
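A toy gradient-descent run illustrates the comparison with L1. The setup is an assumption made for this sketch: a single parameter w, a target of 1.0, a fixed learning rate of 0.3, and the subgradient of L1 at zero taken as 0.

```python
def l1_grad(x):
    # Subgradient of |x|; the value at x == 0 is chosen as 0 by convention.
    return (x > 0) - (x < 0)


def smooth_l1_grad(x):
    # x inside (-1, 1), sign(x) outside.
    return x if abs(x) < 1 else (1.0 if x > 0 else -1.0)


def descend(grad_fn, w=0.0, target=1.0, lr=0.3, steps=50):
    """Plain gradient descent on loss(w - target) with a fixed step size."""
    for _ in range(steps):
        w -= lr * grad_fn(w - target)
    return w


w_l1 = descend(l1_grad)
w_smooth = descend(smooth_l1_grad)
print("L1 final error:       ", abs(w_l1 - 1.0))
print("Smooth L1 final error:", abs(w_smooth - 1.0))
```

With a fixed step size, the L1 run keeps hopping back and forth around the target because its gradient magnitude never shrinks. The Smooth L1 gradient shrinks with the error, so the iterate settles onto the target.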

Reposted from:
https://www.cnblogs.com/wangguchangqing/p/12021638.html

Source: https://blog.csdn.net/W1995S/article/details/115448270