Saving and Loading Models in PyTorch

Background

This article is based on https://pytorch.org/tutorials/beginner/saving_loading_models.html#

SAVING AND LOADING MODELS

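When it comes to saving and loading models, there are three core functions to be familiar with:
1. torch.save: Saves a serialized object to disk. This function uses Python's pickle utility for serialization. Models, tensors, and dictionaries of all kinds of objects can be saved with it.
2. torch.load: Uses pickle's unpickling facilities to deserialize pickled object files into memory.
3. torch.nn.Module.load_state_dict: Loads a model's parameter dictionary from a deserialized state_dict.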
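The three functions are normally used together. The following is a minimal round-trip sketch, not taken from the original tutorial: the tiny nn.Linear model and the file name linear_state.pt are placeholders chosen for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

# A tiny stand-in model (hypothetical; any nn.Module behaves the same way).
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "linear_state.pt")  # illustrative file name

    # 1. torch.save: serialize the state_dict to disk.
    torch.save(model.state_dict(), path)

    # 2. torch.load: deserialize the file back into an in-memory dict.
    state_dict = torch.load(path)

# 3. load_state_dict: copy the loaded tensors into a freshly built model.
restored = nn.Linear(4, 2)
restored.load_state_dict(state_dict)

# Every parameter of the restored model should match the original.
round_trip_ok = all(
    torch.equal(p, q) for p, q in zip(model.parameters(), restored.parameters())
)
print(round_trip_ok)
```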

1: WHAT IS A STATE_DICT

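In PyTorch, the learnable parameters (i.e. weights and biases) of a torch.nn.Module model are contained in the model's parameters (accessed with model.parameters()). A state_dict is simply a Python dictionary object that maps each layer to its parameter tensor.
Note that only layers with learnable parameters (convolutional layers, linear layers, etc.) and registered buffers (such as batch normalization's running_mean) have entries in the model's state_dict. Optimizer objects (torch.optim) also have a state_dict, which contains information about the optimizer's state as well as the hyperparameters used.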
Example:

import torch
import torch.nn as nn
import torch.nn.functional as F
# Define model
class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass,self).__init__()
        self.conv1=nn.Conv2d(3,6,5)
        self.pool=nn.MaxPool2d(2,2)
        self.conv2=nn.Conv2d(6,16,5)
        self.fc1=nn.Linear(16*5*5,120)
        self.fc2=nn.Linear(120,84)
        self.fc3=nn.Linear(84,10)
    def forward(self,x):
        x=self.pool(F.relu(self.conv1(x)))
        x=self.pool(F.relu(self.conv2(x)))
        x=x.view(-1,16*5*5)
        x=F.relu(self.fc1(x))
        x=F.relu(self.fc2(x))
        x=self.fc3(x)
        return x
# Initialize model
model=TheModelClass()
# Initialize optimizer
optimizer=torch.optim.SGD(model.parameters(),lr=1e-4,momentum=0.9)

print("Model's state_dict:")
# Print model's state_dict
for param_tensor in model.state_dict():
    print(param_tensor,"\t",model.state_dict()[param_tensor].size())
print("optimizer's state_dict:")
# Print optimizer's state_dict
for var_name in optimizer.state_dict():
    print(var_name,"\t",optimizer.state_dict()[var_name])

Output:

Model's state_dict:
conv1.weight     torch.Size([6, 3, 5, 5])
conv1.bias   torch.Size([6])
conv2.weight     torch.Size([16, 6, 5, 5])
conv2.bias   torch.Size([16])
fc1.weight   torch.Size([120, 400])
fc1.bias     torch.Size([120])
fc2.weight   torch.Size([84, 120])
fc2.bias     torch.Size([84])
fc3.weight   torch.Size([10, 84])
fc3.bias     torch.Size([10])
optimizer's state_dict:
state    {}
param_groups     [{'lr': 0.0001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [1310469552240, 1310469552384, 1310469552456, 1310469552528, 1310469552600, 1310469552672, 1310469552744, 1310469552816, 1310469552888, 1310469552960]}]
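Two details of the output above are easy to miss: layers without learnable parameters contribute no keys, and the optimizer's state entry stays empty until at least one update step has run. The sketch below illustrates both with a small nn.Sequential network that is an assumption of this example, not part of the original tutorial.

```python
import torch
import torch.nn as nn

# Hypothetical network: only the Conv2d (index 0) and Linear (index 4)
# layers own learnable parameters.
net = nn.Sequential(
    nn.Conv2d(3, 6, 5),          # weight and bias
    nn.ReLU(),                   # no parameters
    nn.MaxPool2d(2, 2),          # no parameters
    nn.Flatten(),                # no parameters
    nn.Linear(6 * 14 * 14, 10),  # weight and bias
)
keys = list(net.state_dict().keys())
print(keys)

optimizer = torch.optim.SGD(net.parameters(), lr=1e-4, momentum=0.9)
entries_before = len(optimizer.state_dict()["state"])

# One training step on random 32x32 inputs creates the momentum buffers.
loss = net(torch.randn(2, 3, 32, 32)).sum()
loss.backward()
optimizer.step()
entries_after = len(optimizer.state_dict()["state"])

print(entries_before, entries_after)
```

With momentum enabled, SGD keeps one buffer per parameter tensor, which is why the optimizer's state_dict is worth saving when training will be resumed.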

2: SAVING & LOADING MODEL FOR INFERENCE

Save/Load state_dict (Recommended)

  • Save:

       torch.save(model.state_dict(), PATH)
    

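When saving a model for inference, it is only necessary to save the trained model's learned parameters. A common PyTorch convention is to save models using either a .pt or .pth file extension.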

  • Load:

     model = TheModelClass(*args, **kwargs)
     model.load_state_dict(torch.load(PATH))
     model.eval()
    

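Remember that you must call model.eval() to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results.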
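To make the effect of model.eval() concrete, here is a small sketch with a dropout layer. The network and input are assumptions made for illustration. In training mode dropout zeroes random elements on every call, so repeated passes generally differ. In evaluation mode dropout is a no-op and repeated passes agree.

```python
import torch
import torch.nn as nn

# Hypothetical network with a dropout layer.
net = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))
x = torch.ones(1, 8)

# Training mode: dropout is active, so these two outputs usually differ.
net.train()
with torch.no_grad():
    train_a = net(x)
    train_b = net(x)
print(train_a)
print(train_b)

# Evaluation mode: dropout is disabled, so the outputs are identical.
net.eval()
with torch.no_grad():
    eval_a = net(x)
    eval_b = net(x)
eval_consistent = torch.equal(eval_a, eval_b)
print(eval_consistent)
```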

Note:

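 Notice that the load_state_dict() function takes a dictionary object, not a path to a saved object. This means you must deserialize the saved state_dict before passing it to load_state_dict(). For example, model.load_state_dict(PATH) does not work.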
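The difference can be checked directly. In the sketch below (model and file name are illustrative assumptions), passing the path fails and passing the deserialized dictionary succeeds. The exact exception type depends on the PyTorch version, so the example catches both TypeError and AttributeError.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(3, 1)    # hypothetical trained model
target = nn.Linear(3, 1)   # fresh model that should receive the weights

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.pt")  # illustrative file name
    torch.save(model.state_dict(), path)

    # Wrong: load_state_dict expects a dict, not a path string.
    try:
        target.load_state_dict(path)
        path_rejected = False
    except (TypeError, AttributeError):
        path_rejected = True

    # Right: deserialize first, then hand over the dictionary.
    target.load_state_dict(torch.load(path))

weights_match = torch.equal(model.weight, target.weight)
print(path_rejected, weights_match)
```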

Save/Load Entire Model

  • Save:

      torch.save(model, PATH)
    
  • Load:

      # Model class must be defined somewhere
      model = torch.load(PATH)
      model.eval()
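A runnable sketch of the whole-model approach follows. It makes two assumptions. First, it uses nn.Sequential built from stock layers, so the classes that pickle needs are always importable; with a custom class, that class definition must be importable at load time. Second, it passes weights_only=False to torch.load, which assumes a PyTorch version that accepts this argument. Recent releases, as far as I know, default to loading plain tensors and containers only, so unpickling a full module has to be requested explicitly and should only be done with trusted files.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical model built only from stock torch.nn classes.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "entire_model.pt")  # illustrative file name

    # Pickles the module object itself, not just its state_dict.
    torch.save(model, path)

    # weights_only=False permits arbitrary pickled objects: trusted files only.
    loaded = torch.load(path, weights_only=False)

loaded.eval()

# The loaded module should reproduce the original model's outputs.
x = torch.randn(2, 4)
with torch.no_grad():
    same_output = torch.equal(model(x), loaded(x))
print(same_output)
```

Because the pickle stores a reference to the class rather than its code, this approach can break when the project is refactored, which is one reason the state_dict route is the recommended one.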
    

3: SAVING & LOADING A GENERAL CHECKPOINT FOR INFERENCE AND/OR RESUMING TRAINING

Save:

      torch.save({
          'epoch': epoch,
          'model_state_dict': model.state_dict(),
          'optimizer_state_dict': optimizer.state_dict(),
          'loss': loss,
          ...
          }, PATH)

Load:

      model = TheModelClass(*args, **kwargs)
      optimizer = TheOptimizerClass(*args, **kwargs)

      checkpoint = torch.load(PATH)
      model.load_state_dict(checkpoint['model_state_dict'])
      optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
      epoch = checkpoint['epoch']
      loss = checkpoint['loss']

      model.eval()
      # - or -
      model.train()

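When saving a general checkpoint to be used for either inference or resuming training, you must save more than just the model's state_dict. It is also important to save the optimizer's state_dict, because it contains buffers and parameters that are updated as the model trains. To save multiple components, organize them in a dictionary and use torch.save() to serialize the dictionary. A common PyTorch convention is to save these checkpoints using the .tar file extension.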
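An end-to-end version of this checkpoint pattern might look like the sketch below. The model, the random data, the three-epoch loop, and the file name checkpoint.tar are all assumptions made for illustration. The loss is stored as a plain float via loss.item(), which is a choice of this example rather than something the tutorial prescribes.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical model, optimizer, and data.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
x, y = torch.randn(16, 4), torch.randn(16, 1)

for epoch in range(3):
    optimizer.zero_grad()
    loss = F.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.tar")  # illustrative file name
    torch.save({
        "epoch": epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "loss": loss.item(),
    }, path)
    checkpoint = torch.load(path)

# Resume: rebuild the objects first, then restore their state.
resumed_model = nn.Linear(4, 1)
resumed_optimizer = torch.optim.SGD(resumed_model.parameters(), lr=0.1, momentum=0.9)
resumed_model.load_state_dict(checkpoint["model_state_dict"])
resumed_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"] + 1

resumed_model.train()  # or resumed_model.eval() for inference
print(start_epoch, checkpoint["loss"])
```

Restoring the optimizer brings back its momentum buffers, so the next update continues from where training stopped instead of restarting with empty optimizer state.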

4: SAVING & LOADING MODEL ACROSS DEVICES

Save on GPU, Load on CPU

  • Save:

      torch.save(model.state_dict(), PATH)
    
  • Load:

      device = torch.device('cpu')
      model = TheModelClass(*args, **kwargs)
      model.load_state_dict(torch.load(PATH, map_location=device))
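The map_location argument tells torch.load where to place the tensors it deserializes. The sketch below shows the call on a placeholder model. Note the assumption: it also runs on a CPU-only machine, where the saved tensors are already on the CPU, so there it only demonstrates the shape of the call rather than a real GPU-to-CPU remap.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # hypothetical model; on a GPU box it could live on "cuda"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.pt")  # illustrative file name
    torch.save(model.state_dict(), path)

    # Remap every stored tensor to the CPU while loading.
    device = torch.device("cpu")
    state_dict = torch.load(path, map_location=device)

cpu_model = nn.Linear(4, 2)
cpu_model.load_state_dict(state_dict)

all_on_cpu = all(t.device.type == "cpu" for t in state_dict.values())
print(all_on_cpu)
```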
    

Save on GPU, Load on GPU

  • Save:

      torch.save(model.state_dict(), PATH)
    
  • Load:

      device = torch.device("cuda")
      model = TheModelClass(*args, **kwargs)
      model.load_state_dict(torch.load(PATH))
      model.to(device)
      # Make sure to call input = input.to(device) on any input tensors that you feed to the model
    

Save on CPU, Load on GPU

  • Save:

      torch.save(model.state_dict(), PATH)
    
  • Load:

      device = torch.device("cuda")
      model = TheModelClass(*args, **kwargs)
      model.load_state_dict(torch.load(PATH, map_location="cuda:0"))  # Choose whatever GPU device number you want
      model.to(device)
      # Make sure to call input = input.to(device) on any input tensors that you feed to the model
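The GPU loading recipes above can be combined into one script that still runs on machines without a GPU. The fallback to the CPU when CUDA is unavailable is an assumption added for this sketch, as are the model and file name.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Use the GPU when one is present; otherwise fall back to the CPU (assumption).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)  # hypothetical model, saved from the CPU

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.pt")  # illustrative file name
    torch.save(model.state_dict(), path)

    restored = nn.Linear(4, 2)
    # map_location moves the loaded tensors to the target device.
    restored.load_state_dict(torch.load(path, map_location=device))

# load_state_dict copies values into the existing parameters, so the module
# itself still has to be moved to the target device.
restored.to(device)
restored.eval()

# Inputs must live on the same device as the model.
batch = torch.randn(3, 4).to(device)
with torch.no_grad():
    output = restored(batch)
print(output.device)
```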
    

Saving torch.nn.DataParallel Models

  • Save:

      torch.save(model.module.state_dict(), PATH)
    
  • Load:

     # Load to whatever device you want
    

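torch.nn.DataParallel is a model wrapper that enables parallel GPU utilization. To save a DataParallel model generically, save model.module.state_dict(). This way, you have the flexibility to load the model any way you want, onto any device you want.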
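The reason for going through model.module becomes visible when the keys are compared: the wrapper prefixes every key with module., which an unwrapped model would not recognize. The sketch below uses a placeholder nn.Linear model and never calls forward, so it should also run on a machine without GPUs.

```python
import os
import tempfile

import torch
import torch.nn as nn

base = nn.Linear(4, 2)            # hypothetical underlying model
wrapped = nn.DataParallel(base)   # wrapper; no forward pass is run here

wrapped_keys = list(wrapped.state_dict().keys())
plain_keys = list(wrapped.module.state_dict().keys())
print(wrapped_keys)  # keys carry the "module." prefix
print(plain_keys)    # keys of the underlying model

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.pt")  # illustrative file name

    # Save the underlying model's state_dict, without the wrapper prefix.
    torch.save(wrapped.module.state_dict(), path)

    # Load into a plain, unwrapped model on whatever device is wanted.
    plain = nn.Linear(4, 2)
    plain.load_state_dict(torch.load(path, map_location="cpu"))

loaded_ok = torch.equal(base.weight.cpu(), plain.weight)
print(loaded_ok)
```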
