
6-2 Three Methods for Training a Model — eat_tensorflow2_in_30_days



There are three main ways to train a model: the built-in fit method, the built-in train_on_batch method, and a custom training loop.
Note: the fit_generator method is not recommended in tf.keras; its functionality is already covered by fit.

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import *  # brings in models, layers, optimizers, losses, metrics, datasets, preprocessing

# Print a time-stamped divider line
@tf.function
def printbar():
    ts = tf.timestamp()
    today_ts = ts % (24*60*60)  # seconds since midnight (UTC)

    hour = tf.cast(today_ts//3600+8, tf.int32) % tf.constant(24)  # +8: shift to Beijing time (UTC+8)
    minute = tf.cast((today_ts % 3600)//60, tf.int32)
    second = tf.cast(tf.floor(today_ts % 60), tf.int32)

    def timeformat(m):
        if tf.strings.length(tf.strings.format("{}", m)) == 1:
            return tf.strings.format("0{}", m)  # zero-pad single digits
        else:
            return tf.strings.format("{}", m)

    timestring = tf.strings.join([timeformat(hour), timeformat(minute),
                                  timeformat(second)], separator=":")
    tf.print("=========="*8, end="")
    tf.print(timestring)

MAX_LEN = 300
BATCH_SIZE = 32
(x_train, y_train), (x_test, y_test) = datasets.reuters.load_data()  # Reuters newswire dataset (multi-class classification)
x_train = preprocessing.sequence.pad_sequences(x_train, maxlen=MAX_LEN)  # pad/truncate sequences to a uniform length
x_test = preprocessing.sequence.pad_sequences(x_test, maxlen=MAX_LEN)

MAX_WORDS = x_train.max() + 1  # vocabulary size
CAT_NUM = y_train.max() + 1    # number of classes (46 topics)

ds_train = tf.data.Dataset.from_tensor_slices((x_train, y_train)) \
        .shuffle(buffer_size=1000).batch(BATCH_SIZE) \
        .prefetch(tf.data.experimental.AUTOTUNE).cache()

ds_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) \
        .shuffle(buffer_size=1000).batch(BATCH_SIZE) \
        .prefetch(tf.data.experimental.AUTOTUNE).cache()
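
As a quick sanity check (our addition, a minimal sketch, not part of the original pipeline), one batch can be pulled from the dataset to confirm the expected shapes:

x_batch, y_batch = next(iter(ds_train))  # peek at one batch (illustrative check only)
print(x_batch.shape)  # (32, 300): BATCH_SIZE padded sequences
print(y_batch.shape)  # (32,): integer topic labels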

Built-in fit method

This method is very powerful. It supports training on numpy arrays, tf.data.Dataset objects, and Python generators, and callbacks can be used to implement complex control logic over the training process.
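
For instance, here is a minimal sketch of plugging callbacks into fit. EarlyStopping and ReduceLROnPlateau are standard tf.keras callbacks, but the monitor and patience values below are illustrative assumptions, not from the original:

# Illustrative callback setup; pass to fit once the model below is compiled
callbacks_list = [
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3),                  # stop when val_loss stalls
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),  # halve the LR on plateau
]
# history = model.fit(ds_train, validation_data=ds_test, epochs=10, callbacks=callbacks_list)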

tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS, 7, input_length=MAX_LEN))
    model.add(layers.Conv1D(filters=64, kernel_size=5, activation="relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters=32, kernel_size=3, activation="relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM, activation="softmax"))
    return model
    
def compile_model(model):
    model.compile(optimizer=optimizers.Nadam(), loss=losses.SparseCategoricalCrossentropy(), 
                 metrics=[metrics.SparseCategoricalAccuracy(), metrics.SparseTopKCategoricalAccuracy(5)])
    return model

model = create_model()
model.summary()
model = compile_model(model)

"""
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 300, 7)            216874    
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304      
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0         
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0         
_________________________________________________________________
dense (Dense)                (None, 46)                107502    
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________
"""
history = model.fit(ds_train, validation_data=ds_test, epochs=10)

"""
Epoch 1/10
281/281 [==============================] - 5s 9ms/step - loss: 2.0197 - sparse_categorical_accuracy: 0.4672 - sparse_top_k_categorical_accuracy: 0.7467 - val_loss: 1.6577 - val_sparse_categorical_accuracy: 0.5565 - val_sparse_top_k_categorical_accuracy: 0.7605
Epoch 2/10
281/281 [==============================] - 3s 9ms/step - loss: 1.4731 - sparse_categorical_accuracy: 0.6158 - sparse_top_k_categorical_accuracy: 0.7964 - val_loss: 1.5053 - val_sparse_categorical_accuracy: 0.6233 - val_sparse_top_k_categorical_accuracy: 0.7943
Epoch 3/10
281/281 [==============================] - 3s 9ms/step - loss: 1.1874 - sparse_categorical_accuracy: 0.6900 - sparse_top_k_categorical_accuracy: 0.8554 - val_loss: 1.5012 - val_sparse_categorical_accuracy: 0.6420 - val_sparse_top_k_categorical_accuracy: 0.8103
Epoch 4/10
281/281 [==============================] - 3s 10ms/step - loss: 0.9202 - sparse_categorical_accuracy: 0.7582 - sparse_top_k_categorical_accuracy: 0.9096 - val_loss: 1.6661 - val_sparse_categorical_accuracy: 0.6349 - val_sparse_top_k_categorical_accuracy: 0.8085
Epoch 5/10
281/281 [==============================] - 3s 9ms/step - loss: 0.6896 - sparse_categorical_accuracy: 0.8196 - sparse_top_k_categorical_accuracy: 0.9482 - val_loss: 1.9133 - val_sparse_categorical_accuracy: 0.6198 - val_sparse_top_k_categorical_accuracy: 0.8090
Epoch 6/10
281/281 [==============================] - 3s 10ms/step - loss: 0.5202 - sparse_categorical_accuracy: 0.8705 - sparse_top_k_categorical_accuracy: 0.9693 - val_loss: 2.1718 - val_sparse_categorical_accuracy: 0.6109 - val_sparse_top_k_categorical_accuracy: 0.7970
Epoch 7/10
281/281 [==============================] - 3s 10ms/step - loss: 0.4118 - sparse_categorical_accuracy: 0.9014 - sparse_top_k_categorical_accuracy: 0.9798 - val_loss: 2.4270 - val_sparse_categorical_accuracy: 0.6033 - val_sparse_top_k_categorical_accuracy: 0.7974
Epoch 8/10
281/281 [==============================] - 3s 10ms/step - loss: 0.3455 - sparse_categorical_accuracy: 0.9167 - sparse_top_k_categorical_accuracy: 0.9861 - val_loss: 2.6544 - val_sparse_categorical_accuracy: 0.6042 - val_sparse_top_k_categorical_accuracy: 0.8023
Epoch 9/10
281/281 [==============================] - 3s 10ms/step - loss: 0.2986 - sparse_categorical_accuracy: 0.9281 - sparse_top_k_categorical_accuracy: 0.9896 - val_loss: 2.8394 - val_sparse_categorical_accuracy: 0.6042 - val_sparse_top_k_categorical_accuracy: 0.7983
Epoch 10/10
281/281 [==============================] - 3s 10ms/step - loss: 0.2660 - sparse_categorical_accuracy: 0.9338 - sparse_top_k_categorical_accuracy: 0.9928 - val_loss: 2.9911 - val_sparse_categorical_accuracy: 0.6077 - val_sparse_top_k_categorical_accuracy: 0.8001
"""

Built-in train_on_batch method

Compared with fit, this built-in method is more flexible: it allows fine-grained control of the training process at the level of individual batches, without going through callbacks.

tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS, 7, input_length=MAX_LEN))
    model.add(layers.Conv1D(filters=64, kernel_size=5, activation="relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters=32, kernel_size=3, activation="relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM, activation="softmax"))
    return model
    
def compile_model(model):
    model.compile(optimizer=optimizers.Nadam(), loss=losses.SparseCategoricalCrossentropy(), 
                 metrics=[metrics.SparseCategoricalAccuracy(), metrics.SparseTopKCategoricalAccuracy(5)])
    return model

model = create_model()
model.summary()
model = compile_model(model)

"""
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 300, 7)            216874    
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304      
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0         
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0         
_________________________________________________________________
dense (Dense)                (None, 46)                107502    
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________
"""
def train_model(model, ds_train, ds_valid, epochs):
    for epoch in tf.range(1, epochs+1):
        model.reset_metrics()
        
        # Lower the learning rate in the later epochs
        if epoch == 5:
            model.optimizer.lr.assign(model.optimizer.lr/2.0)
            tf.print("Lowering optimizer Learning Rate...\n\n")
            
        for x, y in ds_train:
            # reset_metrics defaults to True, so train_result reflects the last batch only
            train_result = model.train_on_batch(x, y)
            
        for x, y in ds_valid:
            # reset_metrics=False accumulates metrics over the whole validation set
            valid_result = model.test_on_batch(x, y, reset_metrics=False)
            
        if epoch % 1 == 0:  # log every epoch; raise the modulus to log less often
            printbar()
            tf.print("epoch=", epoch)
            tf.print("train:", dict(zip(model.metrics_names, train_result)))
            tf.print("valid:", dict(zip(model.metrics_names, valid_result)))
            tf.print()

train_model(model, ds_train, ds_test, 10)

"""
================================================================================14:17:29
epoch= 1
train: {'loss': 0.12633401155471802,
 'sparse_categorical_accuracy': 0.9090909361839294,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.175718307495117,
 'sparse_categorical_accuracy': 0.6286731958389282,
 'sparse_top_k_categorical_accuracy': 0.8018699884414673}

================================================================================14:17:32
epoch= 2
train: {'loss': 0.12677741050720215,
 'sparse_categorical_accuracy': 0.9090909361839294,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.2581818103790283,
 'sparse_categorical_accuracy': 0.6291184425354004,
 'sparse_top_k_categorical_accuracy': 0.8036509156227112}

================================================================================14:17:36
epoch= 3
train: {'loss': 0.12046194076538086,
 'sparse_categorical_accuracy': 0.9090909361839294,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.3311474323272705,
 'sparse_categorical_accuracy': 0.6277827024459839,
 'sparse_top_k_categorical_accuracy': 0.8036509156227112}

================================================================================14:17:40
epoch= 4
train: {'loss': 0.12313029915094376,
 'sparse_categorical_accuracy': 0.9090909361839294,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.409259796142578,
 'sparse_categorical_accuracy': 0.6273375153541565,
 'sparse_top_k_categorical_accuracy': 0.8032057285308838}

Lowering optimizer Learning Rate...


================================================================================14:17:43
epoch= 5
train: {'loss': 0.0890919640660286,
 'sparse_categorical_accuracy': 0.9545454382896423,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.4837234020233154,
 'sparse_categorical_accuracy': 0.6286731958389282,
 'sparse_top_k_categorical_accuracy': 0.8023152351379395}

================================================================================14:17:46
epoch= 6
train: {'loss': 0.09986814856529236,
 'sparse_categorical_accuracy': 0.9090909361839294,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.535616397857666,
 'sparse_categorical_accuracy': 0.6286731958389282,
 'sparse_top_k_categorical_accuracy': 0.8014247417449951}

================================================================================14:17:50
epoch= 7
train: {'loss': 0.10269368439912796,
 'sparse_categorical_accuracy': 0.9090909361839294,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.5861194133758545,
 'sparse_categorical_accuracy': 0.6277827024459839,
 'sparse_top_k_categorical_accuracy': 0.8005343079566956}

================================================================================14:17:53
epoch= 8
train: {'loss': 0.10358986258506775,
 'sparse_categorical_accuracy': 0.9090909361839294,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.6353237628936768,
 'sparse_categorical_accuracy': 0.628227949142456,
 'sparse_top_k_categorical_accuracy': 0.8005343079566956}

================================================================================14:17:57
epoch= 9
train: {'loss': 0.10312973707914352,
 'sparse_categorical_accuracy': 0.9545454382896423,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.683328151702881,
 'sparse_categorical_accuracy': 0.6286731958389282,
 'sparse_top_k_categorical_accuracy': 0.7996438145637512}

================================================================================14:18:00
epoch= 10
train: {'loss': 0.10231606662273407,
 'sparse_categorical_accuracy': 0.9545454382896423,
 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 3.729416608810425,
 'sparse_categorical_accuracy': 0.6300088763237,
 'sparse_top_k_categorical_accuracy': 0.8000890612602234}
"""

Custom training loop

A custom training loop does not require compiling the model at all. You compute the loss, take its gradients with respect to the trainable variables, and let the optimizer apply them directly, which gives the highest degree of flexibility. The train_step function below implements exactly this pattern.

tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS, 7, input_length=MAX_LEN))
    model.add(layers.Conv1D(filters=64, kernel_size=5, activation="relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters=32, kernel_size=3, activation="relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM, activation="softmax"))
    return model

model = create_model()
model.summary()

"""
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 300, 7)            216874    
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304      
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0         
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0         
_________________________________________________________________
dense (Dense)                (None, 46)                107502    
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________
"""
optimizer = optimizers.Nadam()
loss_func = losses.SparseCategoricalCrossentropy()

train_loss = metrics.Mean(name="train_loss")
train_metric = metrics.SparseCategoricalAccuracy(name="train_accuracy")
valid_loss = metrics.Mean(name="valid_loss")
valid_metric = metrics.SparseCategoricalAccuracy(name="valid_accuracy")

@tf.function
def train_step(model, features, labels):
    with tf.GradientTape() as tape:
        predictions = model(features, training=True)
        loss = loss_func(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    
    train_loss.update_state(loss)
    train_metric.update_state(labels, predictions)
    
@tf.function
def valid_step(model, features, labels):
    predictions = model(features, training=False)  # inference mode for validation
    batch_loss = loss_func(labels, predictions)
    valid_loss.update_state(batch_loss)
    valid_metric.update_state(labels, predictions)
    
def train_model(model, ds_train, ds_valid, epochs):
    for epoch in tf.range(1, epochs+1):
        for features, labels in ds_train:
            train_step(model, features, labels)
            
        for features, labels in ds_valid:
            valid_step(model, features, labels)
            
        logs = 'Epoch={},Loss={},Accuracy={},Valid Loss={},Valid Accuracy={}'
        
        if epoch % 1 == 0:
            printbar()
            tf.print(tf.strings.format(logs, (epoch, train_loss.result(), train_metric.result(), 
                                              valid_loss.result(), valid_metric.result())))
            tf.print()
            
        train_loss.reset_states()
        valid_loss.reset_states()
        train_metric.reset_states()
        valid_metric.reset_states()
        
train_model(model, ds_train, ds_test, 10)

"""
================================================================================14:36:31
Epoch=1,Loss=0.0903269127,Accuracy=0.961589873,Valid Loss=5.59807396,Valid Accuracy=0.60195905

================================================================================14:36:32
Epoch=2,Loss=0.0847987905,Accuracy=0.961701155,Valid Loss=5.84705496,Valid Accuracy=0.606411397

================================================================================14:36:33
Epoch=3,Loss=0.0838754,Accuracy=0.963037193,Valid Loss=6.18187094,Valid Accuracy=0.593944788

================================================================================14:36:34
Epoch=4,Loss=0.0804576576,Accuracy=0.96426183,Valid Loss=6.42245388,Valid Accuracy=0.589937687

================================================================================14:36:36
Epoch=5,Loss=0.0807235315,Accuracy=0.96426183,Valid Loss=6.82747221,Valid Accuracy=0.593944788

================================================================================14:36:37
Epoch=6,Loss=0.0784740224,Accuracy=0.965709209,Valid Loss=6.98454714,Valid Accuracy=0.597951889

================================================================================14:36:39
Epoch=7,Loss=0.0771239623,Accuracy=0.965931892,Valid Loss=7.47834,Valid Accuracy=0.595725715

================================================================================14:36:40
Epoch=8,Loss=0.0787713081,Accuracy=0.965375185,Valid Loss=7.18225527,Valid Accuracy=0.593499541

================================================================================14:36:41
Epoch=9,Loss=0.0747968331,Accuracy=0.96648854,Valid Loss=7.59963,Valid Accuracy=0.589937687

================================================================================14:36:42
Epoch=10,Loss=0.0723122805,Accuracy=0.966711223,Valid Loss=7.80304813,Valid Accuracy=0.584594846
"""

Source: https://www.cnblogs.com/lotuslaw/p/16437570.html