
Section 6: Deep Learning Model Training Tricks — Factorized Convolution Kernels, Multi-Channel Convolution, and Batch Normalization

Author: 互联网

1. Optimizing Convolution Kernels (Kernel Factorization)

In practice, a common trick for speeding up convolution training is to split the kernel apart. For example, a 3x3 kernel can be factorized into a 3x1 kernel followed by a 1x3 kernel (a rank-1 3x3 kernel is exactly the outer product of those two factors), each applied to the input in turn. This can substantially speed up the computation. Note that the factorization is only exactly equivalent for separable (rank-1) kernels; in a trainable network, the two factorized layers simply learn within that restricted family.

Rationale: in floating-point arithmetic, multiplication is the expensive operation, so the goal is to reduce the number of multiplications.
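As a sanity check (a standalone NumPy sketch, not from the original post): a rank-1 3x3 kernel built as the outer product of a 3x1 and a 1x3 factor produces exactly the same "valid" convolution output whether applied directly or as two one-dimensional passes:

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid'-mode cross-correlation of a 2-D input with a 2-D kernel."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))

col = rng.standard_normal((3, 1))   # 3x1 factor
row = rng.standard_normal((1, 3))   # 1x3 factor
k = col @ row                       # rank-1 3x3 kernel = outer product of the factors

direct = conv2d_valid(x, k)                         # one 3x3 convolution
factored = conv2d_valid(conv2d_valid(x, col), row)  # 3x1 pass, then 1x5... rather 1x3 pass

print(np.allclose(direct, factored))  # True: the two outputs match
# Multiplications per output pixel: 9 for the 3x3 kernel vs 3 + 3 = 6 factored.
```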

#1. Convolution layer -> pooling layer
W_conv1 = weight_variable([5,5,3,64])
b_conv1 = bias_variable([64])


h_conv1 = tf.nn.relu(conv2d(x_image,W_conv1) + b_conv1)    # output shape: [-1,24,24,64]
print_op_shape(h_conv1)
h_pool1 = max_pool_2x2(h_conv1)                            # output shape: [-1,12,12,64]
print_op_shape(h_pool1)


#2. Convolution layer -> pooling layer, with the kernel factorized
W_conv21 = weight_variable([5,1,64,64])
b_conv21 = bias_variable([64])


h_conv21 = tf.nn.relu(conv2d(h_pool1,W_conv21) + b_conv21)    # output shape: [-1,12,12,64]
print_op_shape(h_conv21)

W_conv2 = weight_variable([1,5,64,64])
b_conv2 = bias_variable([64])


h_conv2 = tf.nn.relu(conv2d(h_conv21,W_conv2) + b_conv2)     # output shape: [-1,12,12,64]
print_op_shape(h_conv2)

h_pool2 = max_pool_2x2(h_conv2)                              # output shape: [-1,6,6,64]
print_op_shape(h_pool2)

 

Replacing the original 5x5 convolution in the second layer with a 5x1 followed by a 1x5 convolution leaves the accuracy unchanged when the code is run, while training becomes somewhat faster.
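The speedup is easy to account for on paper. For the layer above (64 input channels, 64 output channels), a quick back-of-the-envelope count of the multiplications per output position:

```python
# Multiply count per output position for the second conv layer above
# (64 input channels, 64 output channels).
in_ch, out_ch = 64, 64

full = 5 * 5 * in_ch * out_ch                 # a single 5x5 kernel
factored = (5 * 1 + 1 * 5) * in_ch * out_ch   # a 5x1 kernel followed by a 1x5 kernel

print(full, factored, 1 - factored / full)
# 102400 vs 40960: the factorized version needs 60% fewer multiplies
```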

2. Multi-channel convolution: this can be understood as a new kind of CNN building block, an extension of the basic convolution layer in which kernels of several sizes are applied to the same input in parallel and their outputs are concatenated (the same idea as the Inception module in GoogLeNet).

#2. Convolution layer -> pooling layer, using multi-channel (parallel) convolution
W_conv2_1x1 = weight_variable([1,1,64,64])
b_conv2_1x1 = bias_variable([64])

W_conv2_3x3 = weight_variable([3,3,64,64])
b_conv2_3x3 = bias_variable([64])

W_conv2_5x5 = weight_variable([5,5,64,64])
b_conv2_5x5 = bias_variable([64])

W_conv2_7x7 = weight_variable([7,7,64,64])
b_conv2_7x7 = bias_variable([64])


h_conv2_1x1 = tf.nn.relu(conv2d(h_pool1,W_conv2_1x1) + b_conv2_1x1)    # output shape: [-1,12,12,64]
h_conv2_3x3 = tf.nn.relu(conv2d(h_pool1,W_conv2_3x3) + b_conv2_3x3)    # output shape: [-1,12,12,64]
h_conv2_5x5 = tf.nn.relu(conv2d(h_pool1,W_conv2_5x5) + b_conv2_5x5)    # output shape: [-1,12,12,64]
h_conv2_7x7 = tf.nn.relu(conv2d(h_pool1,W_conv2_7x7) + b_conv2_7x7)    # output shape: [-1,12,12,64]

#Merge the branches: axis=3 concatenates along the channel dimension
h_conv2  = tf.concat((h_conv2_1x1,h_conv2_3x3,h_conv2_5x5,h_conv2_7x7),axis=3)  # output shape: [-1,12,12,256]

h_pool2 = max_pool_2x2(h_conv2)                            # output shape: [-1,6,6,256]
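The only subtlety here is the concatenation axis. A NumPy stand-in (shapes only, assuming the same NHWC layout as the TensorFlow code above) shows how the four 64-channel branches combine into 256 channels:

```python
import numpy as np

# Stand-ins for the four parallel branch outputs above: each has shape
# (batch, 12, 12, 64) in NHWC layout.
branches = [np.zeros((1, 12, 12, 64)) for _ in range(4)]

# Concatenating along axis 3 (the channel axis) merges the branches into one
# feature map whose channel count is the sum of the branch channel counts.
merged = np.concatenate(branches, axis=3)
print(merged.shape)  # (1, 12, 12, 256)
```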

 

3. Batch Normalization

The idea is to keep each forward pass's outputs on the same distribution as far as possible, so that the data distribution the backward pass references matches the one seen during the forward pass.

In TensorFlow, batch normalization is provided by tf.nn.batch_normalization(x, mean, variance, offset, scale, variance_epsilon, name=None).

This function is meant to be used together with a second function, tf.nn.moments, which computes the mean and variance: tf.nn.moments(x, axes, name=None, keep_dims=False).
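The transform these two functions implement is simple enough to verify directly. A NumPy sketch of the standard BN formula, y = gamma * (x - mean) / sqrt(var + eps) + beta (the gamma/beta/eps values below are made up for illustration):

```python
import numpy as np

def batch_normalize(x, gamma, beta, eps=1e-3):
    """Normalize x over the batch axis, then scale and shift."""
    mean = x.mean(axis=0)   # per-feature batch mean (what tf.nn.moments returns)
    var = x.var(axis=0)     # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta   # what tf.nn.batch_normalization computes

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))  # a shifted, scaled batch

y = batch_normalize(x, gamma=1.0, beta=0.0)
print(y.mean(axis=0))  # ~0 per feature
print(y.std(axis=0))   # ~1 per feature (eps makes it very slightly below 1)
```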

#Required import
from tensorflow.contrib.layers.python.layers import batch_norm
#Placeholder that serves as the training-phase flag for the BN wrapper
train = tf.placeholder(tf.float32)
......

def batch_norm_layer(value,train=None,name='batch_norm'):
    if train is not None:
        return batch_norm(value,decay=0.9,updates_collections=None,is_training=True)
    else:
        return batch_norm(value,decay=0.9,updates_collections=None,is_training=False)

.......

#Insert the BN layer after the convolution and before the activation in the first (h_conv1) and second (h_conv2) conv layers
h_conv1 = tf.nn.relu(batch_norm_layer(conv2d(x_image,W_conv1) + b_conv1,train))
h_pool1 = max_pool_2x2(h_conv1)

h_conv2 = tf.nn.relu(batch_norm_layer(conv2d(h_pool1,W_conv2) + b_conv2,train))
h_pool2 = max_pool_2x2(h_conv2)
......

#Pass the training flag when running the session
for i in range(20000):
    image_batch,label_batch = sess.run([image_train,labels_train])
    label_b = np.eye(10,dtype=float)[label_batch] #one-hot encoding
    train_step.run(feed_dict={x:image_batch,y:label_b,train:1},session=sess)
.......
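The np.eye indexing trick used above for one-hot encoding is worth a standalone look: row i of the 10x10 identity matrix is exactly the one-hot vector for class i, so fancy-indexing it with a batch of integer labels one-hot encodes the whole batch at once.

```python
import numpy as np

label_batch = np.array([3, 0, 2])               # example integer class labels
label_b = np.eye(10, dtype=float)[label_batch]  # row i of eye(10) is the one-hot vector for class i

print(label_b.shape)        # (3, 10): one 10-way one-hot row per label
print(label_b[0].argmax())  # 3: each row has a single 1 at its label's index
```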

 

Source: https://www.cnblogs.com/wyx501/p/10560809.html