Sequence Models: RNN and LSTM Implementation
Author: Internet
A detailed introduction to RNN and LSTM can be found at this link: https://www.jianshu.com/p/9dc9f41f0b29
Figure 1. RNN structure diagram
Figure 2. RNN unrolled over time
Figure 3. Internal structure of a traditional RNN
The figures above illustrate the traditional RNN. Figure 1 shows the overall structure: compared with an ordinary feed-forward network, its recurrent layer is unrolled along the time dimension. As Figure 2 shows, the unrolled RNN can be viewed as multiple copies of the same network, each passing a message on to its successor. Figure 3 shows the internal structure of a traditional RNN cell, which is very simple: it contains only a single tanh layer.
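To make that tanh update concrete, here is a minimal NumPy sketch of a single vanilla RNN time step. The weight names (W_x, W_h, b) and shapes are illustrative only and are not taken from the article's code.

import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla RNN step: the new hidden state is a tanh of the
    current input and the previous hidden state (see Figure 3)."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

# Illustrative shapes: batch of 2, input size 4, hidden size 3.
x_t = np.random.randn(2, 4)
h_prev = np.zeros((2, 3))
W_x = np.random.randn(4, 3) * 0.1
W_h = np.random.randn(3, 3) * 0.1
b = np.zeros(3)
h_t = rnn_step(x_t, h_prev, W_x, W_h, b)   # shape (2, 3)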
Figure 4. LSTM structure diagram
Long Short-Term Memory networks, usually just called LSTMs, are a special kind of RNN that can learn long-term dependencies. On top of the traditional RNN, the LSTM introduces a cell state together with gates that select which input information to keep and which to discard; this is the key idea behind the LSTM.
For the details of the LSTM, refer to: https://www.jianshu.com/p/9dc9f41f0b29
This article only covers a hand-written implementation of the LSTM internals.
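For reference, these are the per-timestep updates that the code below implements. The notation is mine, chosen to match the variable names in the code (forget_gate = f_t, input_gate = i_t, output_gate = o_t, mid_state = \tilde{c}_t, state = c_t, h = h_t); \sigma is the sigmoid function and \odot is element-wise multiplication.

f_t = \sigma(x_t W_{fx} + h_{t-1} W_{fh} + b_f)
i_t = \sigma(x_t W_{ix} + h_{t-1} W_{ih} + b_i)
o_t = \sigma(x_t W_{ox} + h_{t-1} W_{oh} + b_o)
\tilde{c}_t = \tanh(x_t W_{cx} + h_{t-1} W_{ch} + b_c)
c_t = i_t \odot \tilde{c}_t + f_t \odot c_{t-1}
h_t = o_t \odot \tanh(c_t)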
# The code below uses the TensorFlow 1.x API (tf.get_variable / tf.variable_scope).
import tensorflow as tf

def _generate_params_for_lstm_cell(x_size, h_size, bias_size):
    """Generates the input weights, hidden-state weights and biases
    for one gate of the hand-written LSTM cell."""
    x_w = tf.get_variable('x_weights', x_size)
    h_w = tf.get_variable('h_weights', h_size)
    b = tf.get_variable('biases', bias_size,
                        initializer=tf.constant_initializer(0.0))
    return x_w, h_w, b
with tf.variable_scope('lstm_nn'):
    # Hand-written LSTM implementation.
    # hps.num_embedding_size: length of each input embedding vector
    # hps.num_lstm_nodes:     number of LSTM units per layer
    with tf.variable_scope('inputs'):
        # Input gate parameters.
        ix, ih, ib = _generate_params_for_lstm_cell(
            x_size=[hps.num_embedding_size, hps.num_lstm_nodes[0]],
            h_size=[hps.num_lstm_nodes[0], hps.num_lstm_nodes[0]],
            bias_size=[1, hps.num_lstm_nodes[0]])
    with tf.variable_scope('outputs'):
        # Output gate parameters.
        ox, oh, ob = _generate_params_for_lstm_cell(
            x_size=[hps.num_embedding_size, hps.num_lstm_nodes[0]],
            h_size=[hps.num_lstm_nodes[0], hps.num_lstm_nodes[0]],
            bias_size=[1, hps.num_lstm_nodes[0]])
    with tf.variable_scope('forget'):
        # Forget gate parameters.
        fx, fh, fb = _generate_params_for_lstm_cell(
            x_size=[hps.num_embedding_size, hps.num_lstm_nodes[0]],
            h_size=[hps.num_lstm_nodes[0], hps.num_lstm_nodes[0]],
            bias_size=[1, hps.num_lstm_nodes[0]])
    with tf.variable_scope('memory'):
        # Candidate cell state ("memory") parameters.
        cx, ch, cb = _generate_params_for_lstm_cell(
            x_size=[hps.num_embedding_size, hps.num_lstm_nodes[0]],
            h_size=[hps.num_lstm_nodes[0], hps.num_lstm_nodes[0]],
            bias_size=[1, hps.num_lstm_nodes[0]])
    # Cell state and hidden state, initialized to zeros; they are not
    # trained, only overwritten at every time step.
    state = tf.Variable(
        tf.zeros([batch_size, hps.num_lstm_nodes[0]]),
        trainable=False)
    h = tf.Variable(
        tf.zeros([batch_size, hps.num_lstm_nodes[0]]),
        trainable=False)
    # Unroll the LSTM over num_timesteps steps.
    for i in range(num_timesteps):
        # embed_inputs: [batch_size, num_timesteps, embed_size];
        # the i-th time step gives [batch_size, embed_size].
        embed_input = embed_inputs[:, i, :]
        embed_input = tf.reshape(embed_input,
                                 [batch_size, hps.num_embedding_size])
        # Gate activations for this time step.
        forget_gate = tf.sigmoid(
            tf.matmul(embed_input, fx) + tf.matmul(h, fh) + fb)
        input_gate = tf.sigmoid(
            tf.matmul(embed_input, ix) + tf.matmul(h, ih) + ib)
        output_gate = tf.sigmoid(
            tf.matmul(embed_input, ox) + tf.matmul(h, oh) + ob)
        # Candidate cell state.
        mid_state = tf.tanh(
            tf.matmul(embed_input, cx) + tf.matmul(h, ch) + cb)
        # New cell state: keep part of the old state (forget gate) and
        # add part of the candidate state (input gate).
        state = mid_state * input_gate + state * forget_gate
        # New hidden state: output gate applied to the squashed cell state.
        h = output_gate * tf.tanh(state)
    # Hidden state after the last time step.
    last = h
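The code above relies on several names that are defined elsewhere in the full script: hps (the hyperparameter object), batch_size, num_timesteps and embed_inputs (the embedded input sequence). The sketch below is a hypothetical illustration, not taken from the original article, of how those pieces might be wired up in the same TensorFlow 1.x style and how the final hidden state last is typically fed into a classification layer.

# Hypothetical surrounding setup -- these definitions are not from the
# original article; they only illustrate the shapes the LSTM code expects.
batch_size = 100
num_timesteps = 50        # sequence length (hypothetical)
vocab_size = 10000        # vocabulary size (hypothetical)
num_classes = 10          # number of output labels (hypothetical)

inputs = tf.placeholder(tf.int32, [batch_size, num_timesteps])
embeddings = tf.get_variable(
    'embedding', [vocab_size, hps.num_embedding_size])
# [batch_size, num_timesteps, num_embedding_size]
embed_inputs = tf.nn.embedding_lookup(embeddings, inputs)

# After the unrolled loop, last has shape [batch_size, num_lstm_nodes[0]]
# and is usually passed to a fully connected layer to produce logits.
with tf.variable_scope('fc'):
    logits = tf.layers.dense(last, num_classes, name='fc_logits')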
Source: https://blog.csdn.net/weixin_44402973/article/details/100554874