attention_textcnn

first

 This post is original. If you have any questions, feel free to comment and I will reply within two days. If it helps you, I'd appreciate a like.

 Also, the code in this post can be run as-is: just copy, paste, and run the blocks in order. If it helps, please leave a like 0.0

build_model

	Build an attention model on top of TextCNN. "Attention" here means: for a given sentence, the attention weight that each word in it contributes to that sentence.
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Embedding, MaxPooling2D, Conv2D, Flatten, Concatenate, Dropout, concatenate, TimeDistributed
from tensorflow.keras.constraints import max_norm

class attention_cnn(Model):  
    def __init__(self, vocab_size=10000, embedding_dim=100, max_len=64):
        super(attention_cnn, self).__init__()
        self.max_len = max_len
        self.embedding = Embedding(vocab_size+1, embedding_dim, input_length=max_len)
        self.W3 = TimeDistributed(Dense(max_len, activation='tanh'))
        self.f1 = Dense(1)  # project the embedding_dim axis down to a single attention score per word
        self.conv1 = Conv2D(filters=100, kernel_size=(3, embedding_dim), padding='valid', kernel_constraint=max_norm(3, [0, 1, 2]))  # kernel spans the whole embedding, so this acts as a 1-D convolution over words
        self.conv2 = Conv2D(filters=100, kernel_size=(4, embedding_dim), padding='valid', kernel_constraint=max_norm(3, [0, 1, 2]))
        self.conv3 = Conv2D(filters=100, kernel_size=(5, embedding_dim), padding='valid', kernel_constraint=max_norm(3, [0, 1, 2]))
        self.pool1 = MaxPooling2D(pool_size=(max_len - 3 + 1, 1))
        self.pool2 = MaxPooling2D(pool_size=(max_len - 4 + 1, 1))
        self.pool3 = MaxPooling2D(pool_size=(max_len - 5 + 1, 1))
        self.drop = Dropout(rate=0.5)
        self.flatten = Flatten()
        self.dense = Dense(1, activation='sigmoid')

    def call(self, inputs):
        x_input = inputs
        x = self.embedding(x_input)                                       # (batch, max_len, embedding_dim)

        # ======================== attention =========================
        sentence_embedding = tf.reduce_mean(x, 1, keepdims=True)          # mean over words -> (batch, 1, embedding_dim)
        con_vector1 = concatenate([x, sentence_embedding], axis=1)        # (batch, max_len + 1, embedding_dim)
        score = self.f1(self.W3(con_vector1))                             # per-word score, (batch, max_len + 1, 1)
        # ============================================================

        weights = tf.nn.softmax(score, axis=1)                                        # attention weights over positions
        context_vector = tf.reduce_sum(weights * con_vector1, axis=1, keepdims=True)  # weighted sum -> (batch, 1, embedding_dim)
        # ============================================================

        # ---------------- concatenate(or Concatenate) ----------------
        # con_vector1 = concatenate([x, context_vector], axis=1)
        con_vector2 = Concatenate(axis=1)([x, context_vector])
        # -------------------------------------------------------------

        con_vector2 = con_vector2[..., tf.newaxis]      # add a trailing channel axis (channels_last) for Conv2D
        x = self.conv1(con_vector2)
        p1 = self.pool1(x)
        x = self.conv2(con_vector2)
        p2 = self.pool2(x)
        x = self.conv3(con_vector2)
        p3 = self.pool3(x)

        # ---------------- concatenate(or Concatenate) ----------------
        con_vector3 = concatenate([p1, p2, p3], axis=2)
        # con_vector3 = Concatenate(axis=2)([p1, p2, p3])
        # -------------------------------------------------------------

        x = self.drop(con_vector3)
        x = self.flatten(x)
        output = self.dense(x)

        return output
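
	A quick sanity check (my addition, not from the original post): run a dummy batch of token ids through the subclassed model to confirm that the shapes line up; the batch size of 2 here is arbitrary.
import numpy as np

model = attention_cnn(vocab_size=10000, embedding_dim=100, max_len=64)
dummy_ids = np.random.randint(0, 10000, size=(2, 64))  # (batch, max_len) integer token ids
out = model(dummy_ids)
print(out.shape)  # expected: (2, 1), one sigmoid probability per sentence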

data_process

	The IMDB dataset is used for preprocessing. Since the model's output activation is sigmoid, this is a binary classification problem, so the y labels are 0 or 1.
from tensorflow import keras

def load_imdb(num_words):
    (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=num_words)
    return x_train, y_train, x_test, y_test

def pad_sentence(x_train, x_test, max_len=64):
    x_train = keras.preprocessing.sequence.pad_sequences(x_train, value=0, padding='post', maxlen=max_len)
    x_test = keras.preprocessing.sequence.pad_sequences(x_test, value=0, padding='post', maxlen=max_len)
    return x_train, x_test, max_len
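
	To make the post-padding behaviour concrete (a toy example I added, not from the post): pad_sequences right-pads short reviews with the value 0 and, by default, truncates long ones from the front.
toy = [[5, 12, 7], [3, 9, 4, 8, 2, 11, 6]]
print(keras.preprocessing.sequence.pad_sequences(toy, value=0, padding='post', maxlen=6))
# [[ 5 12  7  0  0  0]
#  [ 9  4  8  2 11  6]]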

run model

	Instantiate the model, set up the train/test data and the training parameters, and train.
MAX_LEN = 64
batchsz = 64

x_train, y_train, x_test, y_test = load_imdb(10000)
x_train, x_test, max_len = pad_sentence(x_train, x_test, max_len=MAX_LEN)

model = attention_cnn(max_len=MAX_LEN)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=10, batch_size=batchsz, validation_data=(x_test, y_test), verbose=2)
model.summary()
Setting epochs to 10 here makes the model overfit; in practice it already reaches good results by the second epoch.
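
	Since overfitting sets in after roughly the second epoch, one common remedy (my suggestion, not part of the original code) is an EarlyStopping callback that restores the best weights:
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=2, restore_best_weights=True)
history = model.fit(x_train, y_train, epochs=10, batch_size=batchsz,
                    validation_data=(x_test, y_test), verbose=2, callbacks=[early_stop])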

Notes

	When implementing this model, there is one improvement point and three points to note:
	Improvement: the attention output can be combined with the word embeddings in two ways, [x, x1] (concatenation) and [x+x1] (addition).
	Notes: 1) Every layer defined in self.__init__ must actually be used; otherwise calling summary() on the model raises an "is not built" error for the unused layer.
	   2) concatenate is the functional-style interface, while the Concatenate layer class is the one to use in Sequential-style code.
	   3) Use tf.expand_dims or x = x[..., tf.newaxis] to fix dimension-count (ndims) mismatches.

	For readability, this post uses the [x, x1] attention mode. The full code for the other improvement and note points is omitted, though a rough sketch of the [x+x1] variant is given below; if you need more or have other questions, leave a comment and I will reply within two days.
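
	As a rough illustration only (my sketch, assuming x and context_vector keep the shapes they have in call), the [x+x1] variant would broadcast-add the context vector onto the word embeddings instead of concatenating it as an extra "word":
        # inside call(), replacing the Concatenate step:
        con_vector2 = x + context_vector            # (batch, max_len, embedding_dim)
        con_vector2 = con_vector2[..., tf.newaxis]  # (batch, max_len, embedding_dim, 1)
        # with this variant the sequence length stays max_len, so the pooling
        # windows (max_len - k + 1) match the conv outputs exactly.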

Reprint notice

	This is an original blog post, located at https://editor.csdn.net/md?not_checkout=1&articleId=110628901; please credit the source if you reprint it. This blog is updated continuously, and comments and corrections are welcome.

Source: https://blog.csdn.net/qq_39667545/article/details/110628901