Author | PULKIT SHARMA
Compiled by | VK
Source | Analytics Vidhya
Overview

Deep learning is a vast field, but most of us face some common challenges when building models. Here, we will discuss four of those challenges and the tricks to improve your deep learning model's performance. This is a hands-on, code-focused article, so get your Python IDE ready and improve your deep learning model!

Introduction

I have spent most of the last two years working almost exclusively in the deep learning space. It has been a great experience, during which I worked on multiple projects involving image and video data. Before that, I was on the fringes: I shied away from deep learning concepts like object detection and face recognition, and only started digging into them in late 2017.

In that time I ran into all sorts of challenges. I want to talk about the four most common ones, which most deep learning practitioners and enthusiasts encounter at some point in their journey. If you have worked on a deep learning project before, you will recognize these roadblocks quickly. The good news is that overcoming them is not as hard as you might think!

In this article we will take a very hands-on approach. First, we will establish the four common challenges I mentioned above. Then we will dive straight into the Python code and learn the key tips and tricks to fight and overcome them. There is a lot to unpack here, so let's get started!

Table of Contents

Understanding each challenge, and how to overcome it, to improve your deep learning model's performance.

Common Challenges with Deep Learning Models

Deep learning models usually perform really well on most kinds of data. When it comes to image data, deep learning models, and convolutional neural networks (CNNs) in particular, outperform almost every other model. My usual approach is to reach for a CNN whenever I come across an image-related project, such as an image classification one.
This approach works well, but there are cases in which a CNN or another deep learning model fails to perform. I have run into this more than once: my data was good, the model architecture was correctly defined, the loss function and optimizer were set up properly, and yet my model did not live up to my expectations. This is a common challenge most of us face when working with deep learning models.

Before we dig deeper into these challenges, let's take a quick look at the case study we will work through in this article.

Overview of the Vehicle Classification Case Study

This article is part of the PyTorch-for-beginners series I have been writing. You can check out the previous three articles here (we will reference some content from them):
We will continue with the case study we saw in the previous article. The aim here is to classify vehicle images as emergency or non-emergency.
First, let's quickly build a CNN model and use it as a baseline. We will then try to improve its performance. The steps are quite straightforward, and we have gone through them a few times in the earlier articles, so I won't dive deep into each one here. Instead, we will focus on the code; you can always look at the previous articles linked above for more detail.
You can get the dataset here: https://drive.google.com/file/d/1EbVifjP0FQkyB1axb7KQ26yPtWmneApJ/view
Here is the complete code to build a CNN model for our vehicle classification project.
Importing the libraries

# importing the libraries
import pandas as pd
import numpy as np
from tqdm import tqdm

# for reading and displaying images
from skimage.io import imread
from skimage.transform import resize
import matplotlib.pyplot as plt
%matplotlib inline

# for creating the validation set
from sklearn.model_selection import train_test_split

# for evaluating the model
from sklearn.metrics import accuracy_score

# PyTorch libraries and modules
import torch
from torch.autograd import Variable
from torch.nn import Linear, ReLU, CrossEntropyLoss, Sequential, Conv2d, MaxPool2d, Module, Softmax, BatchNorm2d, Dropout
from torch.optim import Adam, SGD

# pre-trained models
from torchvision import models
# loading the dataset
train = pd.read_csv('emergency_train.csv')

# loading the training images
train_img = []
for img_name in tqdm(train['image_names']):
    # defining the image path
    image_path = '../Hack Session/images/' + img_name
    # reading the image
    img = imread(image_path)
    # normalizing the pixel values
    img = img/255
    # resizing the image to (224, 224, 3)
    img = resize(img, output_shape=(224,224,3), mode='constant', anti_aliasing=True)
    # converting the pixel type to float32
    img = img.astype('float32')
    # appending the image to the list
    train_img.append(img)

# converting the list to a numpy array
train_x = np.array(train_img)
train_x.shape

# defining the target
train_y = train['emergency_or_not'].values

# creating the validation set
train_x, val_x, train_y, val_y = train_test_split(train_x, train_y, test_size = 0.1, random_state = 13, stratify=train_y)
(train_x.shape, train_y.shape), (val_x.shape, val_y.shape)
# converting the training images into torch format
train_x = train_x.reshape(1481, 3, 224, 224)
train_x = torch.from_numpy(train_x)

# converting the target into torch format
train_y = train_y.astype(int)
train_y = torch.from_numpy(train_y)

# converting the validation images into torch format
val_x = val_x.reshape(165, 3, 224, 224)
val_x = torch.from_numpy(val_x)

# converting the target into torch format
val_y = val_y.astype(int)
val_y = torch.from_numpy(val_y)
torch.manual_seed(0)

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()

        self.cnn_layers = Sequential(
            # defining a 2D convolution layer
            Conv2d(3, 16, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            MaxPool2d(kernel_size=2, stride=2),
            # defining another 2D convolution layer
            Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            MaxPool2d(kernel_size=2, stride=2)
        )

        self.linear_layers = Sequential(
            # the two max-pooling layers reduce 224x224 to 56x56, with 32 channels
            Linear(32 * 56 * 56, 2)
        )

    # defining the forward pass
    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.view(x.size(0), -1)
        x = self.linear_layers(x)
        return x

# defining the model
model = Net()
# defining the optimizer
optimizer = Adam(model.parameters(), lr=0.0001)
# defining the loss function
criterion = CrossEntropyLoss()
# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()

print(model)
torch.manual_seed(0)

# batch size of the model
batch_size = 128

# number of epochs to train the model
n_epochs = 25

for epoch in range(1, n_epochs+1):

    # keep track of the training loss
    train_loss = 0.0

    permutation = torch.randperm(train_x.size()[0])

    training_loss = []
    for i in tqdm(range(0, train_x.size()[0], batch_size)):

        indices = permutation[i:i+batch_size]
        batch_x, batch_y = train_x[indices], train_y[indices]

        if torch.cuda.is_available():
            batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

        optimizer.zero_grad()
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)

        training_loss.append(loss.item())
        loss.backward()
        optimizer.step()

    training_loss = np.average(training_loss)
    print('epoch: \t', epoch, '\t training loss: \t', training_loss)
# prediction for the training set
prediction = []
target = []
permutation = torch.randperm(train_x.size()[0])
for i in tqdm(range(0, train_x.size()[0], batch_size)):
    indices = permutation[i:i+batch_size]
    batch_x, batch_y = train_x[indices], train_y[indices]

    if torch.cuda.is_available():
        batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

    with torch.no_grad():
        output = model(batch_x)

    # exponentiating the raw outputs; the argmax is the same as for a softmax
    softmax = torch.exp(output).cpu()
    prob = list(softmax.numpy())
    predictions = np.argmax(prob, axis=1)
    prediction.append(predictions)
    # moving the targets back to the CPU so accuracy_score can use them
    target.append(batch_y.cpu())

# training set accuracy
accuracy = []
for i in range(len(prediction)):
    accuracy.append(accuracy_score(target[i], prediction[i]))

print('training accuracy: \t', np.average(accuracy))

# prediction for the validation set
prediction_val = []
target_val = []
permutation = torch.randperm(val_x.size()[0])
for i in tqdm(range(0, val_x.size()[0], batch_size)):
    indices = permutation[i:i+batch_size]
    batch_x, batch_y = val_x[indices], val_y[indices]

    if torch.cuda.is_available():
        batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

    with torch.no_grad():
        output = model(batch_x)

    softmax = torch.exp(output).cpu()
    prob = list(softmax.numpy())
    predictions = np.argmax(prob, axis=1)
    prediction_val.append(predictions)
    target_val.append(batch_y.cpu())

# validation set accuracy
accuracy_val = []
for i in range(len(prediction_val)):
    accuracy_val.append(accuracy_score(target_val[i], prediction_val[i]))

print('validation accuracy: \t', np.average(accuracy_val))
Deep Learning Challenge #1: Not Enough Data to Train the Model

Data augmentation is the process of generating new data, or increasing the amount of training data, without actually collecting new data.

There are several data augmentation techniques for image data; the commonly used ones include rotation, shearing, and flipping.
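Just to give a flavor of what these techniques look like in code, here is a minimal sketch using torchvision.transforms. The transform classes are standard torchvision APIs, but the parameter values and the file name are illustrative assumptions only, not the pipeline used for this case study.

# an illustrative augmentation pipeline: rotation, shear and horizontal flip
from torchvision import transforms
from PIL import Image

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),         # random rotation
    transforms.RandomAffine(degrees=0, shear=10),  # random shear
    transforms.RandomHorizontalFlip(p=0.5),        # random horizontal flip
    transforms.ToTensor(),
])

# img = Image.open('some_vehicle_image.jpg')   # hypothetical file name
# augmented = augment(img)                     # a new, slightly different training example

Each time the pipeline is applied, it produces a slightly different version of the same image, which effectively enlarges the training set.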
This is such a great topic that I decided to write a complete article on it. My plan is to cover these techniques, along with their implementation in PyTorch, in my next article.

Deep Learning Challenge #2: Model Overfitting

I am sure you have heard of overfitting. It is one of the most common challenges (and mistakes) data scientists run into when they are new to machine learning. But this issue actually goes beyond that field; it applies to deep learning as well.

A model is considered to be overfitting when it performs really well on the training set but its performance drops on the validation set (or unseen data).

For example, suppose we have a training set and a validation set. We train the model on the training data and check its performance on both the training and validation sets (the evaluation metric is accuracy). The training accuracy comes out at 95%, while the validation accuracy is 62%. Sound familiar?

Since the validation accuracy is far lower than the training accuracy, we can infer that the model is overfitting. The example below will give you a better understanding of what overfitting is:

The part marked in blue in the figure above is the overfitting model, since its training error is very small and its test error is very high. The reason for overfitting is that the model learns even the unnecessary information from the training data, so it performs very well on the training set; but when new data comes in, it fails to perform.

We can introduce dropout into the model's architecture to deal with overfitting. With dropout, we randomly switch off some of the neurons of the network. Suppose we add a dropout layer with a probability of 0.5 to a layer that originally has 20 neurons; 10 of those 20 neurons will then be suppressed, and we end up with a less complex architecture. As a result, the model will not learn overly complex patterns, and overfitting can be avoided.

Let's now add a dropout layer to our architecture and check its performance.

Model Architecture

torch.manual_seed(0)
class Net(Module):
    def __init__(self):
        super(Net, self).__init__()

        self.cnn_layers = Sequential(
            # defining a 2D convolution layer
            Conv2d(3, 16, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            MaxPool2d(kernel_size=2, stride=2),
            # dropout layer (default probability p=0.5)
            Dropout(),
            # defining another 2D convolution layer
            Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            MaxPool2d(kernel_size=2, stride=2),
            # dropout layer
            Dropout(),
        )

        self.linear_layers = Sequential(
            Linear(32 * 56 * 56, 2)
        )

    # defining the forward pass
    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.view(x.size(0), -1)
        x = self.linear_layers(x)
        return x

# defining the model
model = Net()
# defining the optimizer
optimizer = Adam(model.parameters(), lr=0.0001)
# defining the loss function
criterion = CrossEntropyLoss()
# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()

print(model)
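Before training, here is a small, self-contained sketch (purely illustrative, not part of the case-study pipeline) of what the Dropout() layers above actually do: in training mode roughly half of the activations are zeroed out and the surviving ones are scaled by 1/(1-p), while in eval mode the layer leaves its input untouched.

# illustrating the behaviour of nn.Dropout with its default probability p=0.5
import torch
from torch.nn import Dropout

drop = Dropout()            # p defaults to 0.5
x = torch.ones(20)          # imagine a layer with 20 activations

drop.train()                # training mode: roughly 10 of the 20 values become 0,
print(drop(x))              # the survivors are scaled by 1/(1-p) = 2

drop.eval()                 # evaluation mode: dropout is a no-op
print(drop(x))              # all ones again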
torch.manual_seed(0)

# batch size of the model
batch_size = 128

# number of epochs to train the model
n_epochs = 25

for epoch in range(1, n_epochs+1):

    # keep track of the training loss
    train_loss = 0.0

    permutation = torch.randperm(train_x.size()[0])

    training_loss = []
    for i in tqdm(range(0, train_x.size()[0], batch_size)):

        indices = permutation[i:i+batch_size]
        batch_x, batch_y = train_x[indices], train_y[indices]

        if torch.cuda.is_available():
            batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

        optimizer.zero_grad()
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)

        training_loss.append(loss.item())
        loss.backward()
        optimizer.step()

    training_loss = np.average(training_loss)
    print('epoch: \t', epoch, '\t training loss: \t', training_loss)

# prediction for the training set
prediction = []
target = []
permutation = torch.randperm(train_x.size()[0])
for i in tqdm(range(0, train_x.size()[0], batch_size)):
    indices = permutation[i:i+batch_size]
    batch_x, batch_y = train_x[indices], train_y[indices]

    if torch.cuda.is_available():
        batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

    with torch.no_grad():
        output = model(batch_x)

    softmax = torch.exp(output).cpu()
    prob = list(softmax.numpy())
    predictions = np.argmax(prob, axis=1)
    prediction.append(predictions)
    target.append(batch_y.cpu())

# training set accuracy
accuracy = []
for i in range(len(prediction)):
    accuracy.append(accuracy_score(target[i], prediction[i]))

print('training accuracy: \t', np.average(accuracy))
Similarly, let's check the validation set accuracy:
# prediction for the validation set
prediction_val = []
target_val = []
permutation = torch.randperm(val_x.size()[0])
for i in tqdm(range(0, val_x.size()[0], batch_size)):
    indices = permutation[i:i+batch_size]
    batch_x, batch_y = val_x[indices], val_y[indices]

    if torch.cuda.is_available():
        batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

    with torch.no_grad():
        output = model(batch_x)

    softmax = torch.exp(output).cpu()
    prob = list(softmax.numpy())
    predictions = np.argmax(prob, axis=1)
    prediction_val.append(predictions)
    target_val.append(batch_y.cpu())

# validation set accuracy
accuracy_val = []
for i in range(len(prediction_val)):
    accuracy_val.append(accuracy_score(target_val[i], prediction_val[i]))

print('validation accuracy: \t', np.average(accuracy_val))
Let's compare this with the previous results.

Next, we add batch normalization layers to the architecture and see whether they help the model:
torch.manual_seed(0)

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()

        self.cnn_layers = Sequential(
            # defining a 2D convolution layer
            Conv2d(3, 16, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            # batch normalization layer
            BatchNorm2d(16),
            MaxPool2d(kernel_size=2, stride=2),
            # defining another 2D convolution layer
            Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            # batch normalization layer
            BatchNorm2d(32),
            MaxPool2d(kernel_size=2, stride=2),
        )

        self.linear_layers = Sequential(
            Linear(32 * 56 * 56, 2)
        )

    # defining the forward pass
    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.view(x.size(0), -1)
        x = self.linear_layers(x)
        return x

# defining the model
model = Net()
# defining the optimizer
optimizer = Adam(model.parameters(), lr=0.00005)
# defining the loss function
criterion = CrossEntropyLoss()
# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()

print(model)
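As a quick, hedged illustration (again not part of the case-study pipeline), here is what a BatchNorm2d layer does: during training it normalizes each channel of a batch to roughly zero mean and unit variance (before applying its learnable scale and shift), which typically stabilizes training and lets the network converge faster.

# illustrating what BatchNorm2d does to a batch of feature maps
import torch
from torch.nn import BatchNorm2d

bn = BatchNorm2d(16)                                # one learnable scale/shift pair per channel
feature_maps = torch.randn(8, 16, 56, 56) * 5 + 3   # a batch with mean ~3 and std ~5

bn.train()
normalized = bn(feature_maps)

# per-channel statistics come out close to mean 0 and std 1
print(normalized.mean(dim=(0, 2, 3)))
print(normalized.std(dim=(0, 2, 3)))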
Let's train the model:
torch.manual_seed(0)

# batch size of the model
batch_size = 128

# number of epochs to train the model
n_epochs = 5

for epoch in range(1, n_epochs+1):

    # keep track of the training loss
    train_loss = 0.0

    permutation = torch.randperm(train_x.size()[0])

    training_loss = []
    for i in tqdm(range(0, train_x.size()[0], batch_size)):

        indices = permutation[i:i+batch_size]
        batch_x, batch_y = train_x[indices], train_y[indices]

        if torch.cuda.is_available():
            batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

        optimizer.zero_grad()
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)

        training_loss.append(loss.item())
        loss.backward()
        optimizer.step()

    training_loss = np.average(training_loss)
    print('epoch: \t', epoch, '\t training loss: \t', training_loss)
# prediction for the training set
prediction = []
target = []
permutation = torch.randperm(train_x.size()[0])
for i in tqdm(range(0, train_x.size()[0], batch_size)):
    indices = permutation[i:i+batch_size]
    batch_x, batch_y = train_x[indices], train_y[indices]

    if torch.cuda.is_available():
        batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

    with torch.no_grad():
        output = model(batch_x)

    softmax = torch.exp(output).cpu()
    prob = list(softmax.numpy())
    predictions = np.argmax(prob, axis=1)
    prediction.append(predictions)
    target.append(batch_y.cpu())

# training set accuracy
accuracy = []
for i in range(len(prediction)):
    accuracy.append(accuracy_score(target[i], prediction[i]))

print('training accuracy: \t', np.average(accuracy))

# prediction for the validation set
prediction_val = []
target_val = []
permutation = torch.randperm(val_x.size()[0])
for i in tqdm(range(0, val_x.size()[0], batch_size)):
    indices = permutation[i:i+batch_size]
    batch_x, batch_y = val_x[indices], val_y[indices]

    if torch.cuda.is_available():
        batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

    with torch.no_grad():
        output = model(batch_x)

    softmax = torch.exp(output).cpu()
    prob = list(softmax.numpy())
    predictions = np.argmax(prob, axis=1)
    prediction_val.append(predictions)
    target_val.append(batch_y.cpu())

# validation set accuracy
accuracy_val = []
for i in range(len(prediction_val)):
    accuracy_val.append(accuracy_score(target_val[i], prediction_val[i]))

print('validation accuracy: \t', np.average(accuracy_val))
Finally, let's combine both techniques and use dropout together with batch normalization in the same architecture:

torch.manual_seed(0)

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()

        self.cnn_layers = Sequential(
            # defining a 2D convolution layer
            Conv2d(3, 16, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            # batch normalization layer
            BatchNorm2d(16),
            MaxPool2d(kernel_size=2, stride=2),
            # adding dropout
            Dropout(),
            # defining another 2D convolution layer
            Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            # batch normalization layer
            BatchNorm2d(32),
            MaxPool2d(kernel_size=2, stride=2),
            # adding dropout
            Dropout(),
        )

        self.linear_layers = Sequential(
            Linear(32 * 56 * 56, 2)
        )

    # defining the forward pass
    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.view(x.size(0), -1)
        x = self.linear_layers(x)
        return x
Now we will define the parameters of the model:
# defining the model
model = Net()
# defining the optimizer
optimizer = Adam(model.parameters(), lr=0.00025)
# defining the loss function
criterion = CrossEntropyLoss()
# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()

print(model)
Finally, let's train the model:
torch.manual_seed(0)

# batch size of the model
batch_size = 128

# number of epochs to train the model
n_epochs = 10

for epoch in range(1, n_epochs+1):

    # keep track of the training loss
    train_loss = 0.0

    permutation = torch.randperm(train_x.size()[0])

    training_loss = []
    for i in tqdm(range(0, train_x.size()[0], batch_size)):

        indices = permutation[i:i+batch_size]
        batch_x, batch_y = train_x[indices], train_y[indices]

        if torch.cuda.is_available():
            batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

        optimizer.zero_grad()
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)

        training_loss.append(loss.item())
        loss.backward()
        optimizer.step()

    training_loss = np.average(training_loss)
    print('epoch: \t', epoch, '\t training loss: \t', training_loss)
Next, let's check the model's performance:
# prediction for the training set
prediction = []
target = []
permutation = torch.randperm(train_x.size()[0])
for i in tqdm(range(0, train_x.size()[0], batch_size)):
    indices = permutation[i:i+batch_size]
    batch_x, batch_y = train_x[indices], train_y[indices]

    if torch.cuda.is_available():
        batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

    with torch.no_grad():
        output = model(batch_x)

    softmax = torch.exp(output).cpu()
    prob = list(softmax.numpy())
    predictions = np.argmax(prob, axis=1)
    prediction.append(predictions)
    target.append(batch_y.cpu())

# training set accuracy
accuracy = []
for i in range(len(prediction)):
    accuracy.append(accuracy_score(target[i], prediction[i]))

print('training accuracy: \t', np.average(accuracy))

# prediction for the validation set
prediction_val = []
target_val = []
permutation = torch.randperm(val_x.size()[0])
for i in tqdm(range(0, val_x.size()[0], batch_size)):
    indices = permutation[i:i+batch_size]
    batch_x, batch_y = val_x[indices], val_y[indices]

    if torch.cuda.is_available():
        batch_x, batch_y = batch_x.cuda(), batch_y.cuda()

    with torch.no_grad():
        output = model(batch_x)

    softmax = torch.exp(output).cpu()
    prob = list(softmax.numpy())
    predictions = np.argmax(prob, axis=1)
    prediction_val.append(predictions)
    target_val.append(batch_y.cpu())

# validation set accuracy
accuracy_val = []
for i in range(len(prediction_val)):
    accuracy_val.append(accuracy_score(target_val[i], prediction_val[i]))

print('validation accuracy: \t', np.average(accuracy_val))
The validation accuracy has clearly improved to 73%. Awesome!
End Notes

In this article, we looked at the different challenges we can face when working with deep learning models such as CNNs. We also learned the solutions to all of these challenges and, finally, built a model using those solutions. The model's accuracy on the validation set improved after we added these techniques. There is always room for improvement, and here are a few things you can try: