
Pytorch Cheatsheet


torch.no_grad()

class torch.autograd.no_grad
Context-manager that disables gradient calculation.

Disabling gradient calculation is useful for inference, when you are sure that you will not call Tensor.backward(). It will reduce memory consumption for computations that would otherwise have requires_grad=True. In this mode, the result of every computation will have requires_grad=False, even when the inputs have requires_grad=True.

Also functions as a decorator.

Example:

>>> x = torch.tensor([1], requires_grad=True)
>>> with torch.no_grad():
...   y = x * 2
>>> y.requires_grad
False
>>> @torch.no_grad()
... def doubler(x):
...     return x * 2
>>> z = doubler(x)
>>> z.requires_grad
False

DataLoader

https://pytorch.org/tutorials/beginner/data_loading_tutorial.html
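
A minimal custom Dataset sketch in the spirit of that tutorial (the data here is just random tensors, so everything below is illustrative only):

import torch
from torch.utils.data import Dataset, DataLoader

class RandomDataset(Dataset):
    def __init__(self, n=100):
        self.x = torch.randn(n, 3, 32, 32)
        self.y = torch.randint(0, 10, (n,))

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

loader = DataLoader(RandomDataset(), batch_size=16, shuffle=True)
images, labels = next(iter(loader))
print(images.size(), labels.size())  # torch.Size([16, 3, 32, 32]) torch.Size([16])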

model.train() / model.eval()

Call model.train() before training and model.eval() before testing, to switch batch normalization and dropout between their training and evaluation behaviors.
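
A minimal sketch of where these calls typically go (model, criterion, optimizer and the loaders are assumed to exist already):

model.train()   # dropout active, batch norm uses batch statistics
for data, target in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(data), target)
    loss.backward()
    optimizer.step()

model.eval()    # dropout disabled, batch norm uses running statistics
with torch.no_grad():
    for data, target in val_loader:
        output = model(data)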

torch.nn.functional vs. torch.nn

Both torch.nn.functional (F) and torch.nn (nn) provide many of the same operations, e.g. F.relu vs. nn.ReLU, and F also has loss functions such as F.nll_loss. In my earlier experiments, though, anything called through F. does not show up in print(model), since only nn modules are registered as submodules.
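
A toy sketch of the difference (the Net class here is made up): layers created as nn modules are registered as submodules and appear in print(model), while F.* calls inside forward leave no trace.

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(10, 10)
        self.act = nn.ReLU()    # appears in print(model)

    def forward(self, x):
        x = self.act(self.fc(x))
        return F.relu(x)        # functional call, not listed in print(model)

print(Net())    # shows fc and act, nothing for F.relu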

CrossEntropyLoss vs. nll_loss

CrossEntropyLoss computes the log-softmax internally, whereas nll_loss expects log-probabilities, so you have to apply log_softmax yourself:

import torch
import torch.nn.functional as F

input = torch.randn(3, 5, requires_grad=True)
target = torch.tensor([1, 0, 4])
output = F.nll_loss(F.log_softmax(input, dim=1), target)

print("nll_loss:")
print(output)

criterion = torch.nn.CrossEntropyLoss()
loss = criterion(input, target)
print("cross entropy:")
print(loss)

Output:

nll_loss:
tensor(2.3012, grad_fn=<NllLossBackward>)
cross entropy:
tensor(2.3012, grad_fn=<NllLossBackward>)

tensor.item()

Returns the value of a one-element tensor as a standard Python number.

>>> x = torch.tensor([1.0])
>>> x.item()
1.0

Random seed

torch.manual_seed(args.seed)
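
For fuller reproducibility it is common to seed the other random number generators as well; a minimal sketch (which calls you need depends on what your code actually uses):

import random
import numpy as np
import torch

seed = 42
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)   # also seed all GPUs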

torch.cat

Concatenates a sequence of tensors along a given dimension.

r = torch.cat([x, y], dim=1)  # x, y: tensors with matching sizes except along dim 1
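
A quick shape check (x and y here are just placeholders):

>>> x = torch.zeros(2, 3)
>>> y = torch.ones(2, 2)
>>> torch.cat([x, y], dim=1).size()
torch.Size([2, 5])
>>> torch.cat([x, x], dim=0).size()
torch.Size([4, 3])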

torch.nn.functional.pad

Pads a tensor with a constant value.

>>> import torch
>>> import torch.nn.functional as F
>>> sample = torch.rand((10, 3, 5, 5))
>>> result = F.pad(sample, (0, 0, 0, 0, 0, 3))  # pads with 0 by default
>>> result.size()
torch.Size([10, 6, 5, 5])

The second argument to pad specifies how much to pad at the beginning and end of each dimension, ordered from the last dimension back to the first.
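
For example, on a 2D tensor the first pair pads the last dimension (left, right) and the next pair pads the first dimension (top, bottom):

>>> t = torch.zeros(2, 3)
>>> F.pad(t, (1, 2)).size()        # last dim: 1 on the left, 2 on the right
torch.Size([2, 6])
>>> F.pad(t, (1, 2, 3, 4)).size()  # plus first dim: 3 on top, 4 at the bottom
torch.Size([9, 6])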

torch.mm or tensor.mm

Matrix multiplication.

r = x.mm(Weight)
r = torch.mm(x, Weight)
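
Shapes must be compatible in the usual matrix-product sense; for batched or broadcasting products, torch.matmul (or the @ operator) is the more general choice:

>>> x = torch.randn(2, 3)
>>> Weight = torch.randn(3, 4)
>>> torch.mm(x, Weight).size()
torch.Size([2, 4])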

Manually updating weights

fc = torch.nn.Linear(W_target.size(0), 1)
for param in fc.parameters():
    param.data.add_(-0.1 * param.grad.data)
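
A slightly fuller sketch of one manual update step (the input x, target y and the choice of F.smooth_l1_loss are illustrative only; F is torch.nn.functional, and zero_grad keeps gradients from accumulating across steps):

output = fc(x)                      # x: batch of inputs, shape (N, W_target.size(0))
loss = F.smooth_l1_loss(output, y)  # any loss would do here

fc.zero_grad()                      # clear gradients from the previous step
loss.backward()

for param in fc.parameters():
    param.data.add_(-0.1 * param.grad.data)   # SGD step with lr = 0.1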

Manually adjusting the learning rate

Adapted from the official example code:

def adjust_learning_rate(optimizer, epoch):
    """Sets the learning rate to the initial LR decayed by 10 every 30 epochs"""
    lr = args.lr * (0.1 ** (epoch // 30))
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

Use it in the training loop like this:

for epoch in range(args.start_epoch, args.epochs):
    adjust_learning_rate(optimizer, epoch)
    ...
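
The built-in scheduler torch.optim.lr_scheduler.StepLR implements the same schedule; a minimal sketch:

from torch.optim.lr_scheduler import StepLR

scheduler = StepLR(optimizer, step_size=30, gamma=0.1)   # lr *= 0.1 every 30 epochs
for epoch in range(args.start_epoch, args.epochs):
    # ... train for one epoch ...
    scheduler.step()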

Adaptive average pooling

nn.AdaptiveAvgPool2d((1, 1))

You only need to give the desired output size.
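
For example, it maps any input spatial size to the requested output size:

>>> pool = nn.AdaptiveAvgPool2d((1, 1))
>>> pool(torch.randn(10, 128, 7, 7)).size()
torch.Size([10, 128, 1, 1])
>>> pool(torch.randn(10, 128, 13, 9)).size()
torch.Size([10, 128, 1, 1])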

GPU training error on Windows 10

.\aten\src\THC\THCGeneral.cpp:87
Replacing version 1.0.1 with 1.0.0 fixes it; see
https://github.com/pytorch/pytorch/issues/18981

Saving and loading models

See the official documentation.
A state dict is simply a dictionary describing the model's state: it maps each layer to its learnable parameters. Optimizers have a state dict too; see the documentation for details.
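
A quick way to inspect a state dict (model and optimizer are assumed to exist):

for name, tensor in model.state_dict().items():
    print(name, tensor.size())          # e.g. conv1.weight torch.Size([8, 3, 3, 3])

print(optimizer.state_dict().keys())    # dict_keys(['state', 'param_groups'])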

For Inference

Saving the state_dict (recommended)

Save:

torch.save(model.state_dict(), PATH)

Load:

model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()

Model files conventionally end with .pt or .pth.
Calling model.eval() before inference switches dropout and batch normalization layers to their evaluation behavior.
Note that you cannot call model.load_state_dict(PATH) directly; you must pass the deserialized state dict, i.e. the result of torch.load(PATH).

Saving the entire model

Save:

torch.save(model, PATH)

Load:

model = torch.load(PATH)
model.eval()

This is intuitive to write, but the drawback is that the saved file is bound to the exact model class: the class definition must be available when loading, so all kinds of errors can appear in other modules or after refactoring. For example, a model saved directly with torch.save needs the model's class to be importable from __main__ at load time, otherwise you get:
AttributeError: Can't get attribute 'Flatten' on <module '__main__'>

Checkpoint

Save:

torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'loss': loss,
            ...
            }, PATH)

Note that the optimizer's state dict is saved here as well, so that the optimizer's internal state is preserved when resuming.
Load:

model = TheModelClass(*args, **kwargs)
optimizer = TheOptimizerClass(*args, **kwargs)

checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']

model.eval()
# - or -
model.train()

Checkpoints are conventionally saved with a .tar extension.

See the documentation for saving multiple models, warm-starting a model from another model's parameters, and saving/loading across GPU and CPU.

If you trained your model using Adam, you need to save the optimizer state dict as well and reload that. Also, if you used any learning rate decay, you need to reload the state of the scheduler because it gets reset if you don't, and you may end up with a higher learning rate that will make the solution state oscillate. Finally, if you have any dropout or batch norm in your model architecture, and you saved your model after a test loop (in which case model.eval() was called), make sure to call model.train() before the training loop.
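
A sketch of extending the checkpoint above with the scheduler state and switching back to training mode when resuming (the 'scheduler_state_dict' key is just a naming choice):

torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'scheduler_state_dict': scheduler.state_dict(),
    'loss': loss,
}, PATH)

checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
model.train()   # important if the checkpoint was saved right after model.eval()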

Computing accuracy

Use .item() when accumulating; otherwise acc accumulates as a tensor and the final division can silently come out as 0.

 acc += torch.sum(pred_label == target).item()
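
A minimal sketch of the full computation (model and test_loader are assumed; without .item() the division at the end is where things go wrong):

correct, total = 0, 0
with torch.no_grad():
    for data, target in test_loader:
        pred_label = model(data).argmax(dim=1)
        correct += torch.sum(pred_label == target).item()   # Python int, not a tensor
        total += target.size(0)
print('Acc: {:.4f}'.format(correct / total))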

Functions that are easy to forget

torch.numel()

Returns the total number of elements in a tensor.
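
For example:

>>> torch.numel(torch.randn(2, 3, 4))
24
>>> torch.randn(2, 3, 4).numel()
24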

Training

Wrap the training call so a checkpoint gets saved when a runtime error or keyboard interrupt occurs:

import traceback

try:
    train()
except (RuntimeError, KeyboardInterrupt):
    print('Save ckpt on exception ...')
    save_checkpoint(model, infos, optimizer)
    print('Save ckpt done.')
    stack_trace = traceback.format_exc()
    print(stack_trace)

ToTensor()

torchvision.transforms.ToTensor converts a uint8 PIL image or numpy array directly to a float tensor (scaled to [0, 1] and laid out as CxHxW), so the result can be fed straight into a network.
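
A quick check of the conversion (a random uint8 image stands in for a real one):

import numpy as np
from torchvision import transforms

img = np.random.randint(0, 256, size=(5, 5, 3), dtype=np.uint8)   # HWC, uint8
t = transforms.ToTensor()(img)
print(t.dtype, t.size())   # torch.float32 torch.Size([3, 5, 5]), values in [0, 1]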

Order of transforms

Normalize must come after ToTensor: Normalize operates on tensors, so ToTensor has to run on the PIL image first.
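
A typical composition (the mean/std values are the usual ImageNet statistics):

from torchvision import transforms

transform = transforms.Compose([
    transforms.ToTensor(),                        # PIL image -> float tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])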

Timing GPU code

Use torch.cuda.synchronize() when timing, because PyTorch launches CUDA kernels asynchronously with respect to the CPU; see
https://blog.csdn.net/u013548568/article/details/81368019
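
A minimal timing sketch (model and a CUDA input tensor are assumed):

import time

torch.cuda.synchronize()   # wait for pending GPU work before starting the clock
start = time.time()
output = model(input_cuda)
torch.cuda.synchronize()   # wait until the forward pass has actually finished
print('forward time: {:.4f}s'.format(time.time() - start))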

Normalize
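
Presumably this refers to torchvision.transforms.Normalize, which normalizes each channel of a tensor image as (x - mean) / std; a minimal sketch using the same ImageNet statistics as the imshow helper below:

import torch
from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
x = torch.rand(3, 224, 224)   # a tensor image in [0, 1]
print(normalize(x).mean())    # roughly centered around 0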

num_workers

The DataLoader num_workers argument sets the number of worker processes used to load batches. In my experiments, with a Dataset class that does a lot of preprocessing, a higher num_workers improved speed dramatically (setting it to 4 instead of 1 gave roughly a 15~20x speedup). A discussion on choosing num_workers is here.

pin_memory

The DataLoader pin_memory argument should generally be set to True when you train on the GPU (see here).
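
Putting the two DataLoader arguments together (dataset and batch size here are placeholders):

from torch.utils.data import DataLoader

loader = DataLoader(dataset,
                    batch_size=64,
                    shuffle=True,
                    num_workers=4,     # worker processes for loading/preprocessing
                    pin_memory=True)   # page-locked memory speeds up host-to-GPU copies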

Transfer learning

torchvision.utils.make_grid

import numpy as np
import matplotlib.pyplot as plt
import torchvision

def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))  # CHW -> HWC
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated

inputs, classes = next(iter(dataloaders['train']))

# Make a grid from batch
out = torchvision.utils.make_grid(inputs)

imshow(out, title=[class_names[x] for x in classes])

nn.Sequential

Writing AlexNet's forward like this is tedious:

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)
        self.conv4 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(128, 256)
        self.fc2 = nn.Linear(256, 256)
        self.fc3 = nn.Linear(256, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv3(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv4(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv5(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view((x.size(0), -1))
        x = F.dropout(self.fc1(x), 0.5, training=self.training)  # pass self.training so dropout turns off in eval mode
        x = F.relu(x)
        x = F.dropout(self.fc2(x), 0.5, training=self.training)
        x = F.relu(x)
        x = self.fc3(x)
        return x

It is much cleaner with nn.Sequential:

class AlexNet(nn.Module):

    def __init__(self, num_classes):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=5),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

Source: https://blog.csdn.net/luo3300612/article/details/88316177