
Deeplearning: Knowledge Distillation



merge

paddleslim.dist.merge(teacher_program, student_program, data_name_map, place, scope=fluid.global_scope(), name_prefix='teacher_')

merge fuses teacher_program into student_program. In the merged program, distillation loss functions can be added between suitable teacher and student feature maps, so that the teacher model's dark knowledge guides the student model's learning.

Parameters:

- teacher_program (Program): the teacher model's program
- student_program (Program): the student model's program
- data_name_map (dict): mapping from teacher input variable names (keys) to student input variable names (values)
- place (fluid.CPUPlace | fluid.CUDAPlace): the device the program runs on
- scope (Scope): the variable scope the program uses; defaults to fluid.global_scope()
- name_prefix (str): prefix that merge adds to all teacher variable names; defaults to 'teacher_'

Returns: None

Note

data_name_map maps teacher_var names to student_var names. If the direction is reversed, merge may not work correctly.

Example:

import paddle.fluid as fluid
import paddleslim.dist as dist

# Build the student network: two conv layers on input 'x'
student_program = fluid.Program()
with fluid.program_guard(student_program):
    x = fluid.layers.data(name='x', shape=[1, 28, 28])
    conv = fluid.layers.conv2d(x, 32, 1)
    out = fluid.layers.conv2d(conv, 64, 3, padding=1)

# Build the teacher network: three conv layers on input 'y'
teacher_program = fluid.Program()
with fluid.program_guard(teacher_program):
    y = fluid.layers.data(name='y', shape=[1, 28, 28])
    conv = fluid.layers.conv2d(y, 32, 1)
    conv = fluid.layers.conv2d(conv, 32, 3, padding=1)
    out = fluid.layers.conv2d(conv, 64, 3, padding=1)

# Map the teacher's input name to the student's input name
data_name_map = {'y': 'x'}
USE_GPU = False
place = fluid.CUDAPlace(0) if USE_GPU else fluid.CPUPlace()
dist.merge(teacher_program, student_program, data_name_map, place)

fsp_loss

paddleslim.dist.fsp_loss(teacher_var1_name, teacher_var2_name, student_var1_name, student_var2_name, program=fluid.default_main_program())

fsp_loss adds an FSP loss between a pair of teacher vars and a pair of student vars in the program. It comes from the paper A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning.

Parameters:

- teacher_var1_name (str): name of the first teacher variable, a 4-D feature-map tensor of shape [batch_size, channel, height, width] (after merge, teacher names carry the name_prefix, e.g. 'teacher_' + original name)
- teacher_var2_name (str): name of the second teacher variable; only its channel count may differ from teacher_var1, all other dimensions must match
- student_var1_name (str), student_var2_name (str): names of the corresponding student variables, with the same shape constraints
- program (Program): the merged program used for distillation training; defaults to fluid.default_main_program()

Returns: the fsp_loss built from teacher_var1, teacher_var2, student_var1, and student_var2

Example:
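The original snippet did not survive here. As a conceptual sketch only (not the paddleslim API itself), the FSP matrix from the paper and the resulting loss can be written in NumPy; the function name fsp_matrix and the feature shapes below are illustrative assumptions:

```python
import numpy as np

def fsp_matrix(feat_a, feat_b):
    # feat_*: (C, H, W) feature maps taken from two layers of one network.
    # Flatten the spatial dims and average the channel-wise inner products,
    # giving a (C_a, C_b) "flow of solution procedure" (FSP) matrix.
    c_a, h, w = feat_a.shape
    c_b = feat_b.shape[0]
    a = feat_a.reshape(c_a, h * w)
    b = feat_b.reshape(c_b, h * w)
    return a @ b.T / (h * w)

# fsp_loss is then the mean squared difference between the teacher's and
# the student's FSP matrices (channel counts must match across networks).
rng = np.random.default_rng(0)
t1, t2 = rng.standard_normal((32, 28, 28)), rng.standard_normal((64, 28, 28))
s1, s2 = rng.standard_normal((32, 28, 28)), rng.standard_normal((64, 28, 28))
loss = np.mean((fsp_matrix(t1, t2) - fsp_matrix(s1, s2)) ** 2)
```

The actual fsp_loss call only takes the variable names; the framework looks the feature maps up in the merged program and builds the equivalent graph operations.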

l2_loss

paddleslim.dist.l2_loss(teacher_var_name, student_var_name, program=fluid.default_main_program()) [source](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/dist/single_distiller.py#L118)

l2_loss adds an l2 loss between a teacher var and a student var in the program.

Parameters:

- teacher_var_name (str): name of the teacher variable
- student_var_name (str): name of the student variable; must have the same shape as the teacher variable
- program (Program): the merged program used for distillation training; defaults to fluid.default_main_program()

Returns: the l2_loss built from teacher_var and student_var

Example:
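The snippet here is also missing. As a conceptual sketch (not the paddleslim call), the l2 loss is just the mean squared error between matched teacher and student feature maps; the function name l2_distill_loss below is an illustrative assumption:

```python
import numpy as np

def l2_distill_loss(teacher_feat, student_feat):
    # Mean squared error between matched teacher/student feature maps.
    return np.mean((teacher_feat - student_feat) ** 2)

t = np.array([[1.0, 2.0], [3.0, 4.0]])
s = np.array([[1.0, 2.0], [3.0, 2.0]])
# one of four elements differs by 2, so the loss is 2**2 / 4
loss = l2_distill_loss(t, s)   # → 1.0
```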

soft_label_loss

paddleslim.dist.soft_label_loss(teacher_var_name, student_var_name, program=fluid.default_main_program(), teacher_temperature=1., student_temperature=1.) [source](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/dist/single_distiller.py#L136)

soft_label_loss adds a soft label loss between a teacher var and a student var in the program. It comes from the paper Distilling the Knowledge in a Neural Network.

Parameters:

- teacher_var_name (str): name of the teacher variable
- student_var_name (str): name of the student variable
- program (Program): the merged program used for distillation training; defaults to fluid.default_main_program()
- teacher_temperature (float): temperature applied when softening the teacher var; larger values give smoother distributions; defaults to 1.
- student_temperature (float): temperature applied when softening the student var; defaults to 1.

Returns: the soft_label_loss built from teacher_var and student_var

Example:
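With the original snippet missing, the idea behind the soft label loss can be sketched in NumPy: soften both logit sets with a temperature-scaled softmax, then take the cross-entropy of the student's distribution against the teacher's soft targets. The function names below are illustrative assumptions, not the paddleslim API:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_label_loss(teacher_logits, student_logits,
                    teacher_temperature=1.0, student_temperature=1.0):
    # Cross-entropy of the student's softened distribution against the
    # teacher's softened distribution (the "soft targets").
    t = softmax(teacher_logits, teacher_temperature)
    s = softmax(student_logits, student_temperature)
    return -np.mean(np.sum(t * np.log(s + 1e-12), axis=-1))

teacher_logits = np.array([[2.0, 1.0, 0.1]])
student_logits = np.array([[1.5, 1.2, 0.3]])
loss = soft_label_loss(teacher_logits, student_logits, 2.0, 2.0)
```

A higher temperature flattens the distributions, exposing more of the teacher's relative rankings among the non-argmax classes.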

loss

paddleslim.dist.loss(loss_func, program=fluid.default_main_program(), **kwargs) [source](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/dist/single_distiller.py#L165)

loss supports applying a custom loss function to any number of teacher_var and student_var pairs.

Parameters:

- loss_func (callable): the custom loss function; its inputs are teacher vars and student vars, and its output is the custom loss
- program (Program): the merged program used for distillation training; defaults to fluid.default_main_program()
- **kwargs: keyword arguments mapping loss_func's parameter names to variable names in the program

Returns: the loss produced by the custom loss function

Example:
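The original snippet is missing here as well. The contract is that loss_func receives the variables named by the keyword arguments; as a sketch, here is a hypothetical custom L1 loss applied directly to arrays (the names l1_pair_loss, t_var, and s_var are illustrative assumptions):

```python
import numpy as np

# Hypothetical custom loss: mean L1 distance between one teacher/student pair.
def l1_pair_loss(t_var, s_var):
    return np.mean(np.abs(t_var - s_var))

# In paddleslim, dist.loss(l1_pair_loss, program, t_var='teacher_out',
# s_var='student_out') would look those variables up in the merged program
# and call loss_func with them. Here we call it directly to show the shape
# of the contract.
t = np.array([1.0, 2.0, 3.0])
s = np.array([1.0, 2.0, 5.0])
loss = l1_pair_loss(t_var=t, s_var=s)
```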

Note

Adding a distillation loss introduces new variables; take care that they do not collide with existing student variable names. Two approaches are recommended (either one is sufficient):

  1. Use the same name scope as student_program, so that auto-named variables (e.g. tmp_0, tmp_1, ...) are not defined twice under the same name.
  2. Specify a naming prefix when adding the distillation loss.

Source: https://www.cnblogs.com/wujianming-110117/p/14424091.html