attention_textcnn
first
This post is original content; if you have any questions, feel free to comment and I will reply within two days. If it helps you, please consider giving it a like.
The code in this post also runs as-is: copy and paste the blocks in order and run them.
build_model
We build an attention model on top of TextCNN. Attention here means: for a given sentence, the attention weight that each word in the sentence carries with respect to that sentence.
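As a minimal standalone illustration of this idea (a rough sketch with made-up shapes and layer sizes, separate from the model code below), each word embedding is scored, the scores are normalized with softmax over the word axis, and the resulting weights pool the sentence:

import tensorflow as tf

# Toy batch: 1 sentence, 4 words, embedding_dim = 5 (shapes chosen only for illustration).
word_embeddings = tf.random.normal((1, 4, 5))

# Score each word with a small MLP, then softmax over the word axis.
hidden = tf.keras.layers.Dense(8, activation='tanh')(word_embeddings)   # (1, 4, 8)
score = tf.keras.layers.Dense(1)(hidden)                                # (1, 4, 1)
weights = tf.nn.softmax(score, axis=1)                                  # one weight per word
sentence_vector = tf.reduce_sum(weights * word_embeddings, axis=1)      # (1, 5)
print(weights.numpy().squeeze())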
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Embedding, MaxPooling2D, Conv2D, Flatten, Concatenate, Dropout, concatenate, TimeDistributed
from tensorflow.keras.constraints import max_norm


class attention_cnn(Model):
    def __init__(self, vocab_size=10000, embedding_dim=100, max_len=64):
        super(attention_cnn, self).__init__()
        self.max_len = max_len
        self.embedding = Embedding(vocab_size + 1, embedding_dim, input_length=max_len)
        self.W3 = TimeDistributed(Dense(max_len, activation='tanh'))
        self.f1 = Dense(1)  # project the embedding_dim axis down to 1 (one score per position)
        self.conv1 = Conv2D(filters=100, kernel_size=(3, embedding_dim), padding='valid', kernel_constraint=max_norm(3, [0, 1, 2]))  # effectively a 1-D convolution: the kernel spans the full embedding_dim
        self.conv2 = Conv2D(filters=100, kernel_size=(4, embedding_dim), padding='valid', kernel_constraint=max_norm(3, [0, 1, 2]))
        self.conv3 = Conv2D(filters=100, kernel_size=(5, embedding_dim), padding='valid', kernel_constraint=max_norm(3, [0, 1, 2]))
        self.pool1 = MaxPooling2D(pool_size=(max_len - 3 + 1, 1))
        self.pool2 = MaxPooling2D(pool_size=(max_len - 4 + 1, 1))
        self.pool3 = MaxPooling2D(pool_size=(max_len - 5 + 1, 1))
        self.drop = Dropout(rate=0.5)
        self.flatten = Flatten()
        self.dense = Dense(1, activation='sigmoid')

    def call(self, inputs):
        x_input = inputs
        x = self.embedding(x_input)                                   # (batch, max_len, embedding_dim)
        # ======================== attention =========================
        sentence_embedding = tf.reduce_mean(x, 1, keepdims=True)      # (batch, 1, embedding_dim)
        con_vector1 = concatenate([x, sentence_embedding], axis=1)    # (batch, max_len + 1, embedding_dim)
        score = self.f1(self.W3(con_vector1))                         # (batch, max_len + 1, 1)
        # ============================================================
        weights = tf.nn.softmax(score, axis=1)
        context_vector = tf.reduce_sum(weights * con_vector1, axis=1, keepdims=True)
        # ============================================================
        # ---------------- concatenate (or Concatenate) ----------------
        # con_vector1 = concatenate([x, context_vector], axis=1)
        con_vector2 = Concatenate(axis=1)([x, context_vector])
        # ---------------------------------------------------------------
        con_vector2 = con_vector2[..., tf.newaxis]  # add a trailing channel axis (channels_last)
        x = self.conv1(con_vector2)
        p1 = self.pool1(x)
        x = self.conv2(con_vector2)
        p2 = self.pool2(x)
        x = self.conv3(con_vector2)
        p3 = self.pool3(x)
        # ---------------- concatenate (or Concatenate) ----------------
        con_vector3 = concatenate([p1, p2, p3], axis=2)
        # con_vector3 = Concatenate(axis=2)([p1, p2, p3])
        # ---------------------------------------------------------------
        x = self.drop(con_vector3)
        x = self.flatten(x)
        output = self.dense(x)
        return output
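Before training, a quick way to sanity-check the model (a sketch with dummy data, not from the original post; the variable names are arbitrary) is to run a small random batch through it and look at the output shape:

import numpy as np

# Dummy batch: 2 "sentences" of length 64 with token ids in [0, 10000].
dummy_batch = np.random.randint(0, 10001, size=(2, 64))
test_model = attention_cnn(vocab_size=10000, embedding_dim=100, max_len=64)
out = test_model(dummy_batch)
print(out.shape)  # expected: (2, 1), one sigmoid probability per sentence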
data_process
We preprocess the IMDB dataset. Since the model's output activation is sigmoid, this is a binary classification task and the labels y are 0 or 1.
from tensorflow import keras

def load_imdb(num_words):
    (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=num_words)
    word_index = keras.datasets.imdb.get_word_index()
    reverse_word_index = {value: key for key, value in word_index.items()}  # index -> word, for decoding reviews
    return x_train, y_train, x_test, y_test, reverse_word_index

def pad_sentence(x_train, x_test, max_len=64):
    x_train = keras.preprocessing.sequence.pad_sequences(x_train, value=0, padding='post', maxlen=max_len)
    x_test = keras.preprocessing.sequence.pad_sequences(x_test, value=0, padding='post', maxlen=max_len)
    return x_train, x_test, max_len
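A small usage sketch (the variable names here are only for illustration): padding truncates or zero-pads every review to max_len tokens, and reverse_word_index can decode a review back to words (the IMDB loader offsets word indices by 3 by default):

x_tr, y_tr, x_te, y_te, rev_index = load_imdb(10000)
print(len(x_tr[0]), len(x_tr[1]))        # raw reviews have different lengths
x_tr, x_te, max_len = pad_sentence(x_tr, x_te, max_len=64)
print(x_tr.shape, x_te.shape)            # (25000, 64) (25000, 64)
# Decode the first padded review back into words (0 is the padding value).
print(' '.join(rev_index.get(i - 3, '?') for i in x_tr[0] if i > 0))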
run model
Instantiate the model, then set up the train/test data and the training hyperparameters, and train.
MAX_LEN = 64
batchsz = 64
x_train, y_train, x_test, y_test, reverse_word_index = load_imdb(10000)
x_train, x_test, max_len = pad_sentence(x_train, x_test, max_len=MAX_LEN)
model = attention_cnn(max_len=MAX_LEN)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=10, batch_size=batchsz, validation_data=(x_test, y_test), verbose=2)
model.summary()
Setting epochs to 10 makes the model overfit; in practice it already performs well by the second epoch.
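Since good results are reached after roughly two epochs, one option (a sketch using tf.keras.callbacks.EarlyStopping, not part of the original training setup) is to stop training once the validation accuracy stops improving:

early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=2,
                                              restore_best_weights=True)
history = model.fit(x_train, y_train, epochs=10, batch_size=batchsz,
                    validation_data=(x_test, y_test), callbacks=[early_stop], verbose=2)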
Notes
When implementing the model in this post, there is one possible improvement and three caveats:
Improvement: the attention output can be combined with the word embeddings in two ways, [x, x1] (concatenation) and [x + x1] (addition).
Caveats: 1) Every layer defined in __init__ must actually be used; otherwise calling summary() to inspect the model's layers raises an "is not built" error.
2) concatenate is the functional form, while Concatenate is the layer class (for Sequential-style use).
3) Use tf.expand_dims or x = x[..., tf.newaxis] to fix mismatched-rank (ndims) problems.
For readability, this post uses the [x, x1] attention pattern; the code for the other improvement and caveats is omitted (a rough sketch of the additive variant is given at the end of these notes). If you need more or have other questions, leave a comment and I will reply within two days.
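For reference, here is a rough sketch of how the [x + x1] (additive) combination could look inside call; this is my reading of the improvement point above, not the original code. Instead of appending the context vector as an extra row, it is broadcast-added to every word embedding:

# Sketch of the [x + x1] variant, replacing the Concatenate step in call():
context_vector = tf.reduce_sum(weights * con_vector1, axis=1, keepdims=True)  # (batch, 1, embedding_dim)
con_vector2 = x + context_vector              # broadcast add: (batch, max_len, embedding_dim)
con_vector2 = con_vector2[..., tf.newaxis]    # add the channel axis
# With this variant the sequence length stays max_len, so the pooling sizes
# (max_len - k + 1) defined in __init__ match the conv outputs exactly.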
Reposting notice
This is an original blog post, located at https://editor.csdn.net/md?not_checkout=1&articleId=110628901; please credit the source if you repost it. The blog is updated continuously, and criticism and corrections are welcome.