Software Engineering: Convolutional Neural Networks
Author: Internet
I. Classifying the MNIST Dataset
Deep convolutional neural networks rely on the following properties:
- Many layers: compositionality
- Convolution: locality + stationarity of images
- Pooling: invariance of object class to translations
Also worth noting: DataLoader is a fairly important class. Its commonly used options are batch_size (the size of each batch), shuffle (whether to randomly shuffle the sample order), and num_workers (how many subprocesses to use when loading data). A minimal sketch of these options follows.
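As a minimal, self-contained sketch of these DataLoader options (the toy TensorDataset below is our own illustration, not part of the lab code):

import torch
from torch.utils.data import TensorDataset, DataLoader

# a toy dataset: 100 samples with 8 features each, plus integer labels
toy_data = TensorDataset(torch.randn(100, 8), torch.randint(0, 2, (100,)))

# batch_size: samples per batch; shuffle: reshuffle every epoch;
# num_workers: subprocesses used for loading (0 = load in the main process)
loader = DataLoader(toy_data, batch_size=16, shuffle=True, num_workers=0)

for x, y in loader:
    print(x.shape, y.shape)   # torch.Size([16, 8]) torch.Size([16])
    break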
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import matplotlib.pyplot as plt
import numpy

# a helper function that counts how many parameters a model has
def get_n_params(model):
    np = 0
    for p in list(model.parameters()):
        np += p.nelement()
    return np

# to train on a GPU, set it under the menu "Runtime" -> "Change runtime type"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
1. Loading the data (MNIST)
PyTorch ships with common datasets such as MNIST and CIFAR10. Calling torchvision.datasets downloads them from the remote source to the local machine; the MNIST interface is used as follows:
torchvision.datasets.MNIST(root, train=True, transform=None, target_transform=None, download=False)
- root: the root directory the dataset is downloaded into, containing the files training.pt and test.pt
- train: if True, build the dataset from training.pt, otherwise from test.pt.
- download: if True, download the data from the internet and put it under the root directory.
- transform: a function/transform that takes a PIL image and returns the transformed data.
- target_transform: a function/transform applied to the target.
input_size = 28*28   # MNIST images are 28x28
output_size = 10     # the classes are the digits 0 to 9, so ten classes

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('./data', train=True, download=True,
                   transform=transforms.Compose(
                       [transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))])),
    batch_size=64, shuffle=True)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('./data', train=False,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))])),
    batch_size=1000, shuffle=True)
Output:

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:180.)
  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
Display some of the images in the dataset:
plt.figure(figsize=(8, 5))
for i in range(20):
    plt.subplot(4, 5, i + 1)
    image, _ = train_loader.dataset.__getitem__(i)
    plt.imshow(image.squeeze().numpy(), 'gray')
    plt.axis('off');
Output: the first 20 training digits are displayed in a 4x5 grid.
2. Creating the network
When defining a network, you subclass nn.Module and implement its forward method, placing the layers with learnable parameters in the constructor __init__.
As long as forward is defined in an nn.Module subclass, the backward function is derived automatically (via autograd); a tiny illustration follows.
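A toy example of our own (not part of the lab code): only forward is written by hand, and calling backward() on a scalar loss makes autograd fill in the gradients.

import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super(TinyNet, self).__init__()
        self.fc = nn.Linear(4, 2)
    def forward(self, x):              # only forward is defined by hand...
        return self.fc(x)

tiny_net = TinyNet()
loss = tiny_net(torch.randn(3, 4)).sum()
loss.backward()                        # ...backward comes from autograd
print(tiny_net.fc.weight.grad.shape)   # torch.Size([2, 4])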
class FC2Layer(nn.Module):
    def __init__(self, input_size, n_hidden, output_size):
        # an nn.Module subclass must call the parent constructor inside its own;
        # the line below is equivalent to nn.Module.__init__(self)
        super(FC2Layer, self).__init__()
        self.input_size = input_size
        # here the network is defined directly with Sequential;
        # note the contrast with the CNN code below
        self.network = nn.Sequential(
            nn.Linear(input_size, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, output_size),
            nn.LogSoftmax(dim=1)
        )

    def forward(self, x):
        # view usually appears in a model's forward function, to reshape the input or output
        # x.view(-1, self.input_size) flattens multi-dimensional data into 2D
        # the column count is fixed to input_size=784; the -1 for the row count means
        # "work it out for me", so PyTorch computes the matching number
        # since batch_size is 64 in the DataLoader section, x ends up with 64 rows
        # you can add a line: print(x.cpu().numpy().shape)
        # during training you will then see (64, 784), matching this expectation
        # forward specifies how the network runs; it may not look very meaningful for
        # this fully-connected network, but its role is clearer in the CNN below
        x = x.view(-1, self.input_size)
        return self.network(x)
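A quick shape check of FC2Layer (a sketch we added; it assumes the class above and the imports from the first code block): a fake batch shaped like MNIST data goes in, and 10 log-probabilities per sample come out.

fake_batch = torch.randn(4, 1, 28, 28)   # same shape as four MNIST samples
fc_net = FC2Layer(28*28, 8, 10)
out = fc_net(fake_batch)
print(out.shape)              # torch.Size([4, 10])
print(out.exp().sum(dim=1))   # ~1.0 per row: exponentiated log_softmax sums to one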
class CNN(nn.Module):
    def __init__(self, input_size, n_feature, output_size):
        # call the parent constructor; every network has to do this
        super(CNN, self).__init__()
        # below are definitions of the typical building blocks, generally
        # convolutions and fully-connected layers; pooling and ReLU do not
        # need to be defined here
        self.n_feature = n_feature
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=n_feature, kernel_size=5)
        self.conv2 = nn.Conv2d(n_feature, n_feature, kernel_size=5)
        self.fc1 = nn.Linear(n_feature*4*4, 50)
        self.fc2 = nn.Linear(50, 10)

    # the forward function below defines the structure of the network, wiring up
    # the pieces built above in a given order; note that conv1, conv2, and so on
    # can be reused many times
    def forward(self, x, verbose=False):
        x = self.conv1(x)
        x = F.relu(x)
        x = F.max_pool2d(x, kernel_size=2)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, kernel_size=2)
        x = x.view(-1, self.n_feature*4*4)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        x = F.log_softmax(x, dim=1)
        return x
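Why n_feature*4*4 in fc1? Each 5x5 convolution (no padding) shrinks the side length by 4 and each 2x2 max-pooling halves it: 28 -> 24 -> 12 -> 8 -> 4. A short trace we added to verify this (it assumes the class above and the earlier imports):

cnn = CNN(28*28, 6, 10)
x = torch.randn(4, 1, 28, 28)
x = F.max_pool2d(F.relu(cnn.conv1(x)), 2)
print(x.shape)   # torch.Size([4, 6, 12, 12])
x = F.max_pool2d(F.relu(cnn.conv2(x)), 2)
print(x.shape)   # torch.Size([4, 6, 4, 4]) -> flattens to 6*4*4 features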
Define the training and testing functions:
# training function
def train(model):
    model.train()
    # pull samples from train_loader one batch (64 samples) at a time
    for batch_idx, (data, target) in enumerate(train_loader):
        # move the data to the GPU
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print('Train: [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
def test(model):
    model.eval()
    test_loss = 0
    correct = 0
    for data, target in test_loader:
        # move the data to the GPU
        data, target = data.to(device), target.to(device)
        # feed the data to the model to get predictions
        output = model(data)
        # compute this batch's loss and add it to test_loss
        test_loss += F.nll_loss(output, target, reduction='sum').item()
        # get the index of the max log-probability: the last layer outputs
        # 10 values, and the index of the largest one is the predicted class,
        # which we store in pred
        pred = output.data.max(1, keepdim=True)[1]
        # compare pred with target to count correct predictions, added to correct;
        # note view_as here, which reshapes target to the same shape as pred
        correct += pred.eq(target.data.view_as(pred)).cpu().sum().item()

    test_loss /= len(test_loader.dataset)
    accuracy = 100. * correct / len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        accuracy))
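To make the view_as / eq bookkeeping concrete, here is a tiny standalone example of our own (not from the lab code):

output = torch.tensor([[0.1, 0.7, 0.2],
                       [0.8, 0.1, 0.1]])   # fake scores: 2 samples, 3 classes
pred = output.max(1, keepdim=True)[1]      # indices of the max: tensor([[1], [0]])
target = torch.tensor([1, 2])
correct = pred.eq(target.view_as(pred)).sum().item()
print(correct)                             # 1: only the first sample is right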
3. Training on a small fully-connected network
n_hidden = 8 # number of hidden units
model_fnn = FC2Layer(input_size, n_hidden, output_size)
model_fnn.to(device)
optimizer = optim.SGD(model_fnn.parameters(), lr=0.01, momentum=0.5)
print('Number of parameters: {}'.format(get_n_params(model_fnn)))
train(model_fnn)
test(model_fnn)
Output:

Number of parameters: 6442
Train: [0/60000 (0%)] Loss: 2.337591
Train: [6400/60000 (11%)] Loss: 1.948347
Train: [12800/60000 (21%)] Loss: 1.346948
Train: [19200/60000 (32%)] Loss: 0.865751
Train: [25600/60000 (43%)] Loss: 0.688250
Train: [32000/60000 (53%)] Loss: 0.756100
Train: [38400/60000 (64%)] Loss: 0.862340
Train: [44800/60000 (75%)] Loss: 0.509505
Train: [51200/60000 (85%)] Loss: 0.516737
Train: [57600/60000 (96%)] Loss: 0.541380

Test set: Average loss: 0.4560, Accuracy: 8693/10000 (87%)
4. Training on the convolutional neural network
# Training settings
n_features = 6 # number of feature maps
model_cnn = CNN(input_size, n_features, output_size)
model_cnn.to(device)
optimizer = optim.SGD(model_cnn.parameters(), lr=0.01, momentum=0.5)
print('Number of parameters: {}'.format(get_n_params(model_cnn)))
train(model_cnn)
test(model_cnn)
Output:

Number of parameters: 6422
Train: [0/60000 (0%)] Loss: 2.312314
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Train: [6400/60000 (11%)] Loss: 1.234897
Train: [12800/60000 (21%)] Loss: 0.575131
Train: [19200/60000 (32%)] Loss: 0.386014
Train: [25600/60000 (43%)] Loss: 0.386902
Train: [32000/60000 (53%)] Loss: 0.276137
Train: [38400/60000 (64%)] Loss: 0.332431
Train: [44800/60000 (75%)] Loss: 0.423124
Train: [51200/60000 (85%)] Loss: 0.083104
Train: [57600/60000 (96%)] Loss: 0.092805

Test set: Average loss: 0.1637, Accuracy: 9526/10000 (95%)
The test results above show that, with a comparable number of parameters, the CNN clearly outperforms the simple fully-connected network, because a CNN exploits the information in images far better, chiefly through two mechanisms:
- Convolution: locality and stationarity in images
- Pooling: builds in some translation invariance
5. Scrambling the pixel order, then training and testing both networks again
Given how much the CNN gains from convolution and pooling, if we scramble the order of the pixels in the images, convolution and pooling should struggle to help. To test this idea, we scramble the pixel order and try again.
First, the code below shows what the images look like after their pixels are randomly reordered:
# a note on torch.randperm: given an argument n, it returns a random
# permutation of the integers 0 to n-1
perm = torch.randperm(784)
plt.figure(figsize=(8, 4))
for i in range(10):
    image, _ = train_loader.dataset.__getitem__(i)

    # permute pixels
    image_perm = image.view(-1, 28*28).clone()
    image_perm = image_perm[:, perm]
    image_perm = image_perm.view(-1, 1, 28, 28)

    plt.subplot(4, 5, i + 1)
    plt.imshow(image.squeeze().numpy(), 'gray')
    plt.axis('off')
    plt.subplot(4, 5, i + 11)
    plt.imshow(image_perm.squeeze().numpy(), 'gray')
    plt.axis('off')
We redefine the training and testing functions as train_perm and test_perm, which add the pixel-scrambling step to training and testing respectively.
They are essentially identical to the earlier functions, except that the scrambling operation is applied to data.
# function that scrambles the pixel order of every image in a batch
def perm_pixel(data, perm):
    # flatten to a 2D matrix
    data_new = data.view(-1, 28*28)
    # scramble the pixel order
    data_new = data_new[:, perm]
    # restore the original 4D tensor shape
    data_new = data_new.view(-1, 1, 28, 28)
    return data_new
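A quick sanity check of perm_pixel (added by us): the same pixels are still present after scrambling, only reordered, so per-image pixel sums are unchanged.

perm = torch.randperm(784)
batch = torch.randn(4, 1, 28, 28)
scrambled = perm_pixel(batch, perm)
print(scrambled.shape)   # torch.Size([4, 1, 28, 28])
print(torch.allclose(batch.sum(dim=(1, 2, 3)), scrambled.sum(dim=(1, 2, 3))))   # True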
# training function
def train_perm(model, perm):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        # scramble the pixel order
        data = perm_pixel(data, perm)

        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print('Train: [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
# testing function
def test_perm(model, perm):
    model.eval()
    test_loss = 0
    correct = 0
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        # scramble the pixel order
        data = perm_pixel(data, perm)

        output = model(data)
        test_loss += F.nll_loss(output, target, reduction='sum').item()
        pred = output.data.max(1, keepdim=True)[1]
        correct += pred.eq(target.data.view_as(pred)).cpu().sum().item()

    test_loss /= len(test_loader.dataset)
    accuracy = 100. * correct / len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        accuracy))
Training and testing on the fully-connected network:
perm = torch.randperm(784)
n_hidden = 8 # number of hidden units
model_fnn = FC2Layer(input_size, n_hidden, output_size)
model_fnn.to(device)
optimizer = optim.SGD(model_fnn.parameters(), lr=0.01, momentum=0.5)
print('Number of parameters: {}'.format(get_n_params(model_fnn)))
train_perm(model_fnn, perm)
test_perm(model_fnn, perm)
Output:

Number of parameters: 6442
Train: [0/60000 (0%)] Loss: 2.319843
Train: [6400/60000 (11%)] Loss: 1.820002
Train: [12800/60000 (21%)] Loss: 1.077188
Train: [19200/60000 (32%)] Loss: 0.675928
Train: [25600/60000 (43%)] Loss: 0.658187
Train: [32000/60000 (53%)] Loss: 0.682825
Train: [38400/60000 (64%)] Loss: 0.629946
Train: [44800/60000 (75%)] Loss: 0.398080
Train: [51200/60000 (85%)] Loss: 0.268625
Train: [57600/60000 (96%)] Loss: 0.600681

Test set: Average loss: 0.4063, Accuracy: 8846/10000 (88%)
Training and testing on the convolutional neural network:
perm = torch.randperm(784)
n_features = 6 # number of feature maps
model_cnn = CNN(input_size, n_features, output_size)
model_cnn.to(device)
optimizer = optim.SGD(model_cnn.parameters(), lr=0.01, momentum=0.5)
print('Number of parameters: {}'.format(get_n_params(model_cnn)))
train_perm(model_cnn, perm)
test_perm(model_cnn, perm)
Output:

Number of parameters: 6422
Train: [0/60000 (0%)] Loss: 2.327404
Train: [6400/60000 (11%)] Loss: 2.251524
Train: [12800/60000 (21%)] Loss: 2.113517
Train: [19200/60000 (32%)] Loss: 1.622411
Train: [25600/60000 (43%)] Loss: 1.146309
Train: [32000/60000 (53%)] Loss: 0.975707
Train: [38400/60000 (64%)] Loss: 0.841636
Train: [44800/60000 (75%)] Loss: 0.623049
Train: [51200/60000 (85%)] Loss: 0.595479
Train: [57600/60000 (96%)] Loss: 0.610131

Test set: Average loss: 0.5359, Accuracy: 8294/10000 (83%)

Scrambling the pixels barely affects the fully-connected network (87% before, 88% after), but it drops the CNN from 95% to 83%: once locality is destroyed, convolution and pooling lose their advantage.
II. Classifying the CIFAR10 Dataset
For vision data, PyTorch provides a package called torchvision, which contains torchvision.datasets, a module for loading common datasets such as ImageNet, CIFAR10, and MNIST, and works together with the torch.utils.data.DataLoader class for iterating over image data.
Below we use the CIFAR10 dataset, which has ten classes: 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'. CIFAR-10 images are 3x32x32, i.e. three RGB color channels, each 32x32 pixels.
First, load and normalize CIFAR10 using torchvision. The torchvision datasets output PILImages with values in the range [0, 1]; we convert them to tensors normalized to the range [-1, 1].
You may well wonder how the 0.5 in the code below maps the data into [-1, 1]. The PyTorch source does the following:
input[channel] = (input[channel] - mean[channel]) / std[channel]
That is: ((0, 1) - 0.5) / 0.5 = (-1, 1).
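A quick numeric check of that formula (our own snippet): with mean 0.5 and std 0.5, Normalize maps 0 to -1, 0.5 to 0, and 1 to 1.

import torch
import torchvision.transforms as transforms

norm = transforms.Normalize((0.5,), (0.5,))
t = torch.tensor([[[0.0, 0.5, 1.0]]])   # a 1x1x3 "image" with pixel values 0, 0.5, 1
print(norm(t))                          # tensor([[[-1., 0., 1.]]])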
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
# to train on a GPU, set it under the menu "Runtime" -> "Change runtime type"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# note in the code below: shuffle is True for training and False for testing,
# since shuffling adds variety during training but is unnecessary for testing
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=8,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
Here are some of the images in CIFAR10:
def imshow(img):
    plt.figure(figsize=(8, 8))
    img = img / 2 + 0.5     # map back to [0, 1]
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

# get a batch of images
images, labels = next(iter(trainloader))
# show the images
imshow(torchvision.utils.make_grid(images))
# show the labels of the first row of images
for j in range(8):
    print(classes[labels[j]])
Output:
bird bird horse cat plane dog ship truck
Next, define the network, loss function, and optimizer:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# move the network to the GPU
net = Net().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)
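One detail worth noting: unlike the MNIST models above, Net returns raw scores (logits) without a log_softmax, because nn.CrossEntropyLoss combines log_softmax and nll_loss internally. A small equivalence check of our own:

logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
a = F.cross_entropy(logits, targets)
b = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(a, b))   # True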
Train the network:
for epoch in range(10):  # loop over the dataset multiple times
    for i, (inputs, labels) in enumerate(trainloader):
        inputs = inputs.to(device)
        labels = labels.to(device)
        # zero the optimizer's gradients
        optimizer.zero_grad()
        # forward pass + backward pass + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        if i % 100 == 0:
            print('Epoch: %d Minibatch: %5d loss: %.3f' % (epoch + 1, i + 1, loss.item()))

print('Finished Training')
Output:
Epoch: 1 Minibatch: 1 loss: 2.307
Epoch: 1 Minibatch: 101 loss: 1.804
Epoch: 1 Minibatch: 201 loss: 1.876
Epoch: 1 Minibatch: 301 loss: 1.775
Epoch: 1 Minibatch: 401 loss: 1.470
Epoch: 1 Minibatch: 501 loss: 1.459
Epoch: 1 Minibatch: 601 loss: 1.551
Epoch: 1 Minibatch: 701 loss: 1.424
Epoch: 2 Minibatch: 1 loss: 1.269
Epoch: 2 Minibatch: 101 loss: 1.443
Epoch: 2 Minibatch: 201 loss: 1.369
Epoch: 2 Minibatch: 301 loss: 1.482
Epoch: 2 Minibatch: 401 loss: 1.345
Epoch: 2 Minibatch: 501 loss: 1.452
Epoch: 2 Minibatch: 601 loss: 1.517
Epoch: 2 Minibatch: 701 loss: 1.415
Epoch: 3 Minibatch: 1 loss: 1.137
Epoch: 3 Minibatch: 101 loss: 1.478
Epoch: 3 Minibatch: 201 loss: 1.223
Epoch: 3 Minibatch: 301 loss: 1.140
Epoch: 3 Minibatch: 401 loss: 1.206
Epoch: 3 Minibatch: 501 loss: 1.295
Epoch: 3 Minibatch: 601 loss: 1.085
Epoch: 3 Minibatch: 701 loss: 1.080
Epoch: 4 Minibatch: 1 loss: 1.052
Epoch: 4 Minibatch: 101 loss: 1.137
Epoch: 4 Minibatch: 201 loss: 1.170
Epoch: 4 Minibatch: 301 loss: 1.413
Epoch: 4 Minibatch: 401 loss: 1.017
Epoch: 4 Minibatch: 501 loss: 1.138
Epoch: 4 Minibatch: 601 loss: 1.326
Epoch: 4 Minibatch: 701 loss: 1.070
Epoch: 5 Minibatch: 1 loss: 0.977
Epoch: 5 Minibatch: 101 loss: 1.108
Epoch: 5 Minibatch: 201 loss: 1.245
Epoch: 5 Minibatch: 301 loss: 0.882
Epoch: 5 Minibatch: 401 loss: 1.041
Epoch: 5 Minibatch: 501 loss: 1.032
Epoch: 5 Minibatch: 601 loss: 0.952
Epoch: 5 Minibatch: 701 loss: 0.944
Epoch: 6 Minibatch: 1 loss: 0.953
Epoch: 6 Minibatch: 101 loss: 1.337
Epoch: 6 Minibatch: 201 loss: 1.218
Epoch: 6 Minibatch: 301 loss: 1.098
Epoch: 6 Minibatch: 401 loss: 1.133
Epoch: 6 Minibatch: 501 loss: 1.069
Epoch: 6 Minibatch: 601 loss: 1.080
Epoch: 6 Minibatch: 701 loss: 0.819
Epoch: 7 Minibatch: 1 loss: 1.136
Epoch: 7 Minibatch: 101 loss: 0.975
Epoch: 7 Minibatch: 201 loss: 0.960
Epoch: 7 Minibatch: 301 loss: 0.874
Epoch: 7 Minibatch: 401 loss: 1.104
Epoch: 7 Minibatch: 501 loss: 0.838
Epoch: 7 Minibatch: 601 loss: 1.322
Epoch: 7 Minibatch: 701 loss: 0.850
Epoch: 8 Minibatch: 1 loss: 1.035
Epoch: 8 Minibatch: 101 loss: 1.030
Epoch: 8 Minibatch: 201 loss: 1.284
Epoch: 8 Minibatch: 401 loss: 0.905
Epoch: 8 Minibatch: 501 loss: 1.103
Epoch: 8 Minibatch: 601 loss: 1.027
Epoch: 8 Minibatch: 701 loss: 0.934
Epoch: 9 Minibatch: 1 loss: 0.813
Epoch: 9 Minibatch: 101 loss: 0.858
Epoch: 9 Minibatch: 201 loss: 1.065
Epoch: 9 Minibatch: 301 loss: 0.759
Epoch: 9 Minibatch: 401 loss: 1.160
Epoch: 9 Minibatch: 501 loss: 1.243
Epoch: 9 Minibatch: 601 loss: 1.130
Epoch: 9 Minibatch: 701 loss: 0.973
Epoch: 10 Minibatch: 1 loss: 1.055
Epoch: 10 Minibatch: 101 loss: 0.929
Epoch: 10 Minibatch: 201 loss: 0.896
Epoch: 10 Minibatch: 301 loss: 0.816
Epoch: 10 Minibatch: 401 loss: 0.805
Epoch: 10 Minibatch: 501 loss: 0.888
Epoch: 10 Minibatch: 601 loss: 0.852
Epoch: 10 Minibatch: 701 loss: 0.687
Finished Training
Now take eight images from the test set:
# get a batch of images
images, labels = next(iter(testloader))
# show the images
imshow(torchvision.utils.make_grid(images))
# show the labels of the images
for j in range(8):
    print(classes[labels[j]])
Output:
cat
ship
ship
plane
frog
frog
car
frog
Now feed the images to the model and see what the CNN recognizes them as:
outputs = net(images.to(device))
_, predicted = torch.max(outputs, 1)

# show the predictions
for j in range(8):
    print(classes[predicted[j]])
The predictions are:
cat
ship
ship
plane
deer
frog
car
bird
As you can see, several of them are misclassified. Let's look at how the network performs on the whole test set:
correct = 0
total = 0
for data in testloader:
    images, labels = data
    images, labels = images.to(device), labels.to(device)
    outputs = net(images)
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
Accuracy of the network on the 10000 test images: 63 %
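Overall accuracy can hide differences between classes. As an optional extension (not in the original lab), a per-class breakdown can be computed like this:

class_correct = [0] * 10
class_total = [0] * 10
with torch.no_grad():
    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)
        _, predicted = torch.max(net(images), 1)
        for label, pred in zip(labels, predicted):
            class_total[label.item()] += 1
            class_correct[label.item()] += int(pred.item() == label.item())

for i in range(10):
    print('Accuracy of %5s : %.1f %%' % (
        classes[i], 100.0 * class_correct[i] / max(class_total[i], 1)))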
III. Classifying CIFAR10 with VGG16
VGG is a convolutional neural network model proposed by Simonyan and Zisserman in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition"; its name comes from the authors' group, the Visual Geometry Group at the University of Oxford. The model competed in the 2014 ImageNet image classification and localization challenge with excellent results: second place in classification and first place in localization. The structure of VGG16 is shown in the figure below:
The structure of the 16 layers is as follows:
01: Convolution using 64 filters
02: Convolution using 64 filters + Max pooling
03: Convolution using 128 filters
04: Convolution using 128 filters + Max pooling
05: Convolution using 256 filters
06: Convolution using 256 filters
07: Convolution using 256 filters + Max pooling
08: Convolution using 512 filters
09: Convolution using 512 filters
10: Convolution using 512 filters + Max pooling
11: Convolution using 512 filters
12: Convolution using 512 filters
13: Convolution using 512 filters + Max pooling
14: Fully connected with 4096 nodes
15: Fully connected with 4096 nodes
16: Softmax
1. Defining the dataloader
Note that the transform and dataloader here differ from what we defined earlier.
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
# to train on a GPU, set it under the menu "Runtime" -> "Change runtime type"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform_train)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_test)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
2. Defining the VGG network
The structure used here is essentially:
64 conv, maxpooling,
128 conv, maxpooling,
256 conv, 256 conv, maxpooling,
512 conv, 512 conv, maxpooling,
512 conv, 512 conv, maxpooling,
softmax
The implementation of the model is:
class VGG(nn.Module):
    def __init__(self):
        super(VGG, self).__init__()
        self.cfg = [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M']
        self.features = self._make_layers(self.cfg)
        # after the five 2x2 poolings a 32x32 input is reduced to 1x1x512,
        # so the classifier takes 512 inputs (see the shape check below)
        self.classifier = nn.Linear(512, 10)

    def forward(self, x):
        out = self.features(x)
        out = out.view(out.size(0), -1)
        out = self.classifier(out)
        return out

    def _make_layers(self, cfg):
        layers = []
        in_channels = 3
        for x in cfg:
            if x == 'M':
                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            else:
                layers += [nn.Conv2d(in_channels, x, kernel_size=3, padding=1),
                           nn.BatchNorm2d(x),
                           nn.ReLU(inplace=True)]
                in_channels = x
        layers += [nn.AvgPool2d(kernel_size=1, stride=1)]
        return nn.Sequential(*layers)
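A shape check for the flattened feature size (a sketch we added, assuming the class above): with five 2x2 poolings, a 32x32 input collapses to 1x1x512, which is why the classifier takes 512 inputs.

vgg = VGG().eval()   # eval mode, so BatchNorm uses its running statistics for this probe
feat = vgg.features(torch.randn(2, 3, 32, 32))
print(feat.shape)    # torch.Size([2, 512, 1, 1]) -> 512 values per sample after flattening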
Initialize the network, modifying the classification layer as the task requires; for CIFAR10 the output stays at 10 classes. (For a dataset such as the 200-class tiny-imagenet, you would change the output to 200; a sketch of such a head swap follows the initialization code below.)
# move the network to the GPU
net = VGG().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)
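If you later repurpose this network for a different number of classes (say, the 200-class tiny-imagenet mentioned above), only the classification head needs replacing. A sketch under that assumption:

net_200 = VGG().to(device)
net_200.classifier = nn.Linear(512, 200).to(device)   # 200 outputs instead of 10
print(net_200.classifier)   # Linear(in_features=512, out_features=200, bias=True)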
3. Training the network
The training code is exactly the same as before:
for epoch in range(10):  # loop over the dataset multiple times
    for i, (inputs, labels) in enumerate(trainloader):
        inputs = inputs.to(device)
        labels = labels.to(device)
        # zero the optimizer's gradients
        optimizer.zero_grad()
        # forward pass + backward pass + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        if i % 100 == 0:
            print('Epoch: %d Minibatch: %5d loss: %.3f' % (epoch + 1, i + 1, loss.item()))

print('Finished Training')
Output:

Epoch: 1 Minibatch: 1 loss: 3.392
Epoch: 1 Minibatch: 101 loss: 1.496
Epoch: 1 Minibatch: 201 loss: 1.341
Epoch: 1 Minibatch: 301 loss: 1.314
Epoch: 2 Minibatch: 1 loss: 1.181
Epoch: 2 Minibatch: 101 loss: 1.242
Epoch: 2 Minibatch: 201 loss: 1.123
Epoch: 2 Minibatch: 301 loss: 1.196
Epoch: 3 Minibatch: 1 loss: 1.350
Epoch: 3 Minibatch: 101 loss: 1.237
Epoch: 3 Minibatch: 201 loss: 1.235
Epoch: 3 Minibatch: 301 loss: 1.099
Epoch: 4 Minibatch: 1 loss: 1.140
Epoch: 4 Minibatch: 101 loss: 0.997
Epoch: 4 Minibatch: 201 loss: 1.259
Epoch: 4 Minibatch: 301 loss: 1.258
Epoch: 5 Minibatch: 1 loss: 1.189
Epoch: 5 Minibatch: 101 loss: 1.042
Epoch: 5 Minibatch: 201 loss: 1.137
Epoch: 5 Minibatch: 301 loss: 1.142
Epoch: 6 Minibatch: 1 loss: 1.025
Epoch: 6 Minibatch: 101 loss: 1.031
Epoch: 6 Minibatch: 201 loss: 1.199
Epoch: 6 Minibatch: 301 loss: 1.168
Epoch: 7 Minibatch: 1 loss: 0.958
Epoch: 7 Minibatch: 101 loss: 1.106
Epoch: 7 Minibatch: 201 loss: 1.045
Epoch: 7 Minibatch: 301 loss: 1.169
Epoch: 8 Minibatch: 1 loss: 1.158
Epoch: 8 Minibatch: 101 loss: 1.066
Epoch: 8 Minibatch: 201 loss: 0.984
Epoch: 8 Minibatch: 301 loss: 1.113
Epoch: 9 Minibatch: 1 loss: 1.247
Epoch: 9 Minibatch: 101 loss: 1.102
Epoch: 9 Minibatch: 201 loss: 1.209
Epoch: 9 Minibatch: 301 loss: 1.235
Epoch: 10 Minibatch: 1 loss: 0.998
Epoch: 10 Minibatch: 101 loss: 1.159
Epoch: 10 Minibatch: 201 loss: 1.079
Epoch: 10 Minibatch: 301 loss: 1.106
Finished Training
4. Testing the accuracy
The testing code is also exactly the same as before.
correct = 0
total = 0
for data in testloader:
    images, labels = data
    images, labels = images.to(device), labels.to(device)
    outputs = net(images)
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %.2f %%' % (
    100 * correct / total))
Output:
Accuracy of the network on the 10000 test images: 84.92 %
As you can see, even a simplified version of the VGG network lifts the accuracy significantly, from the simple CNN's 63% to 84.92%.
Source: https://blog.csdn.net/qq_45779147/article/details/120811952