
Sequence Models: Implementing RNN and LSTM


A detailed introduction to RNNs and LSTMs can be found at the following link: https://www.jianshu.com/p/9dc9f41f0b29

Figure 1. RNN structure diagram

Figure 2. RNN unrolled over time

Figure 3. Internal structure of a traditional RNN

The figures above illustrate the traditional RNN. Figure 1 shows its overall structure: compared with a conventional feed-forward network, the recurrent layer is unrolled along the time dimension. As Figure 2 shows, the unrolled network can be viewed as many copies of the same network, each module passing a message on to the next. Figure 3 shows the internal structure of a traditional RNN cell, which is very simple: just a single tanh layer.
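As a minimal illustrative sketch (not from the original post), one step of such a vanilla RNN cell can be written in a few lines of NumPy; the names Wx, Wh and b are hypothetical weight matrices and a bias:

import numpy as np

def rnn_step(x, h_prev, Wx, Wh, b):
    """One vanilla RNN step: h_t = tanh(x_t @ Wx + h_{t-1} @ Wh + b)."""
    return np.tanh(x @ Wx + h_prev @ Wh + b)

# Toy usage: batch of 2, input size 4, hidden size 3.
x = np.random.randn(2, 4)
h = np.zeros((2, 3))
Wx = np.random.randn(4, 3) * 0.1
Wh = np.random.randn(3, 3) * 0.1
b = np.zeros(3)
h = rnn_step(x, h, Wx, Wh, b)   # new hidden state, shape (2, 3)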

Figure 4. LSTM structure diagram

Long Short Term Memory networks, usually just called LSTMs, are a special kind of RNN capable of learning long-term dependencies. On top of the traditional RNN, the LSTM introduces a cell state that selectively admits and retains information from the input; this gated cell state is the key to the LSTM.

For the details of the LSTM, please refer to: https://www.jianshu.com/p/9dc9f41f0b29

This post focuses only on implementing the internals of an LSTM cell by hand.
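For reference, the code below implements the standard LSTM update equations (subscripts match the variable names used in the code; \odot denotes element-wise multiplication):

i_t = \sigma(x_t W_{ix} + h_{t-1} W_{ih} + b_i)
o_t = \sigma(x_t W_{ox} + h_{t-1} W_{oh} + b_o)
f_t = \sigma(x_t W_{fx} + h_{t-1} W_{fh} + b_f)
\tilde{c}_t = \tanh(x_t W_{cx} + h_{t-1} W_{ch} + b_c)
c_t = i_t \odot \tilde{c}_t + f_t \odot c_{t-1}
h_t = o_t \odot \tanh(c_t)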

import tensorflow as tf  # TensorFlow 1.x API (tf.get_variable / tf.variable_scope)

def _generate_params_for_lstm_cell(x_size, h_size, bias_size):
    """Generates one gate's parameters for the pure LSTM implementation:
    input weights, hidden-state weights, and bias."""
    x_w = tf.get_variable('x_weights', x_size)
    h_w = tf.get_variable('h_weights', h_size)
    b = tf.get_variable('biases', bias_size,
                        initializer=tf.constant_initializer(0.0))
    return x_w, h_w, b
    
# Context assumed from the surrounding model code:
#   hps.num_embedding_size : length of each input embedding vector
#   hps.num_lstm_nodes[0]  : number of hidden units in the LSTM layer
#   embed_inputs           : [batch_size, num_timesteps, num_embedding_size]
with tf.variable_scope('lstm_nn'):
    # Manual LSTM: one (x-weights, h-weights, bias) triple per gate.
    with tf.variable_scope('inputs'):
        # Input gate parameters.
        ix, ih, ib = _generate_params_for_lstm_cell(
            x_size=[hps.num_embedding_size, hps.num_lstm_nodes[0]],
            h_size=[hps.num_lstm_nodes[0], hps.num_lstm_nodes[0]],
            bias_size=[1, hps.num_lstm_nodes[0]]
        )
    with tf.variable_scope('outputs'):
        # Output gate parameters.
        ox, oh, ob = _generate_params_for_lstm_cell(
            x_size=[hps.num_embedding_size, hps.num_lstm_nodes[0]],
            h_size=[hps.num_lstm_nodes[0], hps.num_lstm_nodes[0]],
            bias_size=[1, hps.num_lstm_nodes[0]]
        )
    with tf.variable_scope('forget'):
        # Forget gate parameters.
        fx, fh, fb = _generate_params_for_lstm_cell(
            x_size=[hps.num_embedding_size, hps.num_lstm_nodes[0]],
            h_size=[hps.num_lstm_nodes[0], hps.num_lstm_nodes[0]],
            bias_size=[1, hps.num_lstm_nodes[0]]
        )
    with tf.variable_scope('memory'):
        # Candidate cell state parameters.
        cx, ch, cb = _generate_params_for_lstm_cell(
            x_size=[hps.num_embedding_size, hps.num_lstm_nodes[0]],
            h_size=[hps.num_lstm_nodes[0], hps.num_lstm_nodes[0]],
            bias_size=[1, hps.num_lstm_nodes[0]]
        )

    # Cell state and hidden state, zero-initialized and not trained directly.
    state = tf.Variable(
        tf.zeros([batch_size, hps.num_lstm_nodes[0]]),
        trainable=False
    )
    h = tf.Variable(
        tf.zeros([batch_size, hps.num_lstm_nodes[0]]),
        trainable=False
    )

    for i in range(num_timesteps):
        # Slice out timestep i: [batch_size, num_embedding_size]
        embed_input = embed_inputs[:, i, :]
        embed_input = tf.reshape(embed_input,
                                 [batch_size, hps.num_embedding_size])
        forget_gate = tf.sigmoid(
            tf.matmul(embed_input, fx) + tf.matmul(h, fh) + fb)
        input_gate = tf.sigmoid(
            tf.matmul(embed_input, ix) + tf.matmul(h, ih) + ib)
        output_gate = tf.sigmoid(
            tf.matmul(embed_input, ox) + tf.matmul(h, oh) + ob)
        mid_state = tf.tanh(
            tf.matmul(embed_input, cx) + tf.matmul(h, ch) + cb)
        # c_t = i_t * candidate + f_t * c_{t-1}
        state = mid_state * input_gate + state * forget_gate
        # h_t = o_t * tanh(c_t)
        h = output_gate * tf.tanh(state)
    # Hidden state after the final timestep; passed on to the downstream layers.
    last = h
    
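For comparison only (a sketch, not part of the original post), the same layer could be built with TensorFlow 1.x's built-in cell, assuming the same hps, batch_size and embed_inputs as above:

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(hps.num_lstm_nodes[0])
initial_state = lstm_cell.zero_state(batch_size, tf.float32)
# outputs: [batch_size, num_timesteps, num_lstm_nodes[0]]
outputs, final_state = tf.nn.dynamic_rnn(
    lstm_cell, embed_inputs, initial_state=initial_state)
last = outputs[:, -1, :]  # hidden state at the last timestep

The built-in cell fuses the four gates' weights into a single kernel, whereas the manual version above keeps them separate, which makes the role of each gate easier to see.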


Source: https://blog.csdn.net/weixin_44402973/article/details/100554874