I. Overview
Let's start with a detailed walkthrough of convolutional neural networks:
1. A convolutional neural network (CNN) is a deep neural network with a convolutional structure, which reduces the memory a deep network needs. Its three key operations are local receptive fields, weight sharing, and pooling layers; together they greatly reduce the number of network parameters and alleviate overfitting (a concrete parameter comparison is sketched after this list).
2. A CNN consists of convolutional layers, subsampling (pooling) layers, and fully connected layers. Each layer has multiple feature maps; each feature map extracts one kind of feature from the input through a convolutional filter, and each feature map contains multiple neurons.
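To make the savings from weight sharing concrete, here is a small illustrative comparison (the layer sizes are hypothetical and chosen only for the arithmetic):

import torch.nn as nn

# A convolutional layer reuses one 5x5 kernel per output channel at every position of a 28x28 input:
conv = nn.Conv2d(1, 20, kernel_size=5)     # 20 * (5*5*1) weights + 20 biases = 520 parameters
# A fully connected layer mapping the same input to the same 20x24x24 output needs a weight per pair:
fc = nn.Linear(28 * 28, 20 * 24 * 24)      # 784 * 11520 weights + 11520 biases, about 9 million parameters

print(sum(p.numel() for p in conv.parameters()))   # 520
print(sum(p.numel() for p in fc.parameters()))     # 9043200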
Take the classic LeNet-5 as an example: counting the input layer, it has 8 layers in total, namely the input layer (INPUT), a convolutional layer (Convolutions, C1), a pooling layer (Subsampling, S2), a convolutional layer (C3), a pooling layer (Subsampling, S4), a convolutional layer (C5), a fully connected layer (F6), and the output layer (a radial basis function layer).
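A minimal PyTorch sketch of this layer layout, with max pooling standing in for the original subsampling layers and a plain linear layer for the radial basis output:

import torch.nn as nn
import torch.nn.functional as F

class LeNet5(nn.Module):
    def __init__(self):
        super(LeNet5, self).__init__()
        self.c1 = nn.Conv2d(1, 6, 5)      # C1: 6 feature maps, 5x5 kernels
        self.c3 = nn.Conv2d(6, 16, 5)     # C3: 16 feature maps
        self.c5 = nn.Conv2d(16, 120, 5)   # C5: 120 feature maps
        self.f6 = nn.Linear(120, 84)      # F6: fully connected
        self.out = nn.Linear(84, 10)      # output layer (radial basis units in the original)

    def forward(self, x):                            # x: 1x32x32
        x = F.max_pool2d(F.relu(self.c1(x)), 2)      # S2
        x = F.max_pool2d(F.relu(self.c3(x)), 2)      # S4
        x = F.relu(self.c5(x)).view(x.size(0), -1)
        x = F.relu(self.f6(x))
        return self.out(x)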
A few later architectures in brief. AlexNet introduced ReLU and dropout, added data augmentation, and used overlapping pooling; its structure stacks convolution and max-pooling stages followed by three fully connected layers. VGGNet uses 1×1 and 3×3 convolution kernels together with 2×2 max pooling to make the network deeper; VGGNet-16 and VGGNet-19 are the commonly used variants. Microsoft's ResNet (Residual Neural Network): (1) it introduces a highway-style shortcut structure that allows the network to become very deep; (2) the second version of ResNet replaces the ReLU activation after the shortcut with the identity function y = x. DenseNet pushes this idea further by connecting each layer to all subsequent layers. Next comes the code implementation.
Import the required libraries:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

We also need the torchvision library, which provides the datasets and transforms modules:
from torchvision import datasets, transforms
print("PyTorch Version: ", torch.__version__)

First we define a simple ConvNet-style neural network (when defining a network, we write an __init__ method for initialization and a forward method for the forward pass):
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
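A quick way to see where the 4*4*50 in fc1 comes from is to push a dummy MNIST-sized input through the two convolution/pooling stages (a small sketch using the class defined above):

net = Net()
x = torch.zeros(1, 1, 28, 28)                # a dummy 28x28 single-channel image
x = F.max_pool2d(F.relu(net.conv1(x)), 2)    # 28x28 -> 24x24 -> 12x12
print(x.shape)                               # torch.Size([1, 20, 12, 12])
x = F.max_pool2d(F.relu(net.conv2(x)), 2)    # 12x12 -> 8x8 -> 4x4
print(x.shape)                               # torch.Size([1, 50, 4, 4]), hence fc1's 4*4*50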
Next, use the datasets module to download the MNIST data automatically:

mnist_data = datasets.MNIST("./mnist_data", train=True, download=True,
                            transform=transforms.Compose([
                                transforms.ToTensor(),
                            ]))
mnist_data

Then convert the data to NumPy format to compute the dataset statistics:
data = [d[0].data.cpu().numpy() for d in mnist_data]
np.mean(data)   # mean
np.std(data)    # standard deviation
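The mean and standard deviation computed here (roughly 0.1307 and 0.3081 for MNIST) are exactly the constants passed to transforms.Normalize in the data loaders below, which rescales each pixel as (x - mean) / std:

normalize = transforms.Normalize((0.1307,), (0.3081,))   # x_normalized = (x - 0.1307) / 0.3081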
Next we define two functions, one for training (train) and one for evaluation (test):

def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)

        pred = model(data)
        loss = F.nll_loss(pred, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if idx % 100 == 0:
            print("Train Epoch: {}, iteration: {}, Loss: {}".format(
                epoch, idx, loss.item()))

def test(model, device, test_loader):
    model.eval()
    total_loss = 0.
    correct = 0.
    with torch.no_grad():
        for idx, (data, target) in enumerate(test_loader):
            data, target = data.to(device), target.to(device)

            output = model(data)
            total_loss += F.nll_loss(output, target, reduction="sum").item()
            pred = output.argmax(dim=1)
            correct += pred.eq(target.view_as(pred)).sum().item()
    total_loss /= len(test_loader.dataset)
    acc = correct / len(test_loader.dataset) * 100.
    print("Test loss: {}, Accuracy: {}".format(total_loss, acc))

Now we train the model and evaluate it on the test set:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch_size = 32
train_dataloader = torch.utils.data.DataLoader(
    datasets.MNIST("./mnist_data", train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=batch_size, shuffle=True,
    num_workers=1, pin_memory=True)
test_dataloader = torch.utils.data.DataLoader(
    datasets.MNIST("./mnist_data", train=False, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=batch_size, shuffle=True,
    num_workers=1, pin_memory=True)
lr = 0.01
momentum = 0.5
model = Net().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
num_epochs = 2
for epoch in range(num_epochs):
    train(model, device, train_dataloader, optimizer, epoch)
    test(model, device, test_dataloader)

torch.save(model.state_dict(), "mnist_cnn.pt")

Definition of the NLL loss:
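The model's forward pass returns log_softmax, so F.nll_loss receives log-probabilities and simply averages the negative log-probability assigned to the true class over the batch, i.e. loss = -(1/N) * sum_n log p(y_n | x_n). A quick numerical check (a small sketch with arbitrary values):

logp = F.log_softmax(torch.randn(3, 10), dim=1)           # log-probabilities for 3 samples, 10 classes
target = torch.tensor([2, 5, 0])
manual = -logp[torch.arange(3), target].mean()            # average negative log-probability of the true class
print(torch.allclose(manual, F.nll_loss(logp, target)))   # True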
Let's also try the same pipeline on the FashionMNIST dataset:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch_size = 32
train_dataloader = torch.utils.data.DataLoader(
    datasets.FashionMNIST("./fashion_mnist_data", train=True, download=True,
                          transform=transforms.Compose([
                              transforms.ToTensor(),
                              transforms.Normalize((0.1307,), (0.3081,))
                          ])),
    batch_size=batch_size, shuffle=True,
    num_workers=1, pin_memory=True)
test_dataloader = torch.utils.data.DataLoader(
    datasets.FashionMNIST("./fashion_mnist_data", train=False, download=True,
                          transform=transforms.Compose([
                              transforms.ToTensor(),
                              transforms.Normalize((0.1307,), (0.3081,))
                          ])),
    batch_size=batch_size, shuffle=True,
    num_workers=1, pin_memory=True)
lr = 0.01
momentum = 0.5
model = Net().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
num_epochs = 2
for epoch in range(num_epochs):
    train(model, device, train_dataloader, optimizer, epoch)
    test(model, device, test_dataloader)

torch.save(model.state_dict(), "fashion_mnist_cnn.pt")

Transfer learning with CNN models. The basic steps for building and training a transfer-learning model are:
Initialize the pretrained model
Replace the final output layer so that it outputs the number of classes we want
Define an optimizer to update the parameters
Train the model
import numpy as np
import torchvision
from torchvision import datasets, transforms, models
import matplotlib.pyplot as plt
import time
import os
import copy

print("Torchvision Version: ", torchvision.__version__)

Data: we will use the hymenoptera_data dataset, which can be downloaded from https://download.pytorch.org/tutorial/hymenoptera_data.zip.
This dataset contains two classes of images, bees and ants, arranged in a format that can be read with ImageFolder (https://pytorch.org/docs/stable/torchvision/datasets.html#torchvision.datasets.ImageFolder). We only need to set data_dir to the root directory of the data and set model_name to the pretrained model we want to use: [resnet, alexnet, vgg, squeezenet, densenet, inception].
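ImageFolder infers the class labels from the sub-folder names. A minimal check (assuming the zip has been extracted so that ./hymenoptera_data/train contains the ants and bees folders):

from torchvision import datasets

ds = datasets.ImageFolder("./hymenoptera_data/train")
print(ds.classes)        # ['ants', 'bees']
print(ds.class_to_idx)   # {'ants': 0, 'bees': 1}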
The other parameters are:
data_dir = "./hymenoptera_data"
model_name = "resnet"
num_classes = 2
batch_size = 32
num_epochs = 15
feature_extract = True   # True: only train the new final layer; False: fine-tune the whole model
input_size = 224

Reading the data: now that we know the input size the model expects, we can preprocess the data into the corresponding format.
all_imgs = datasets.ImageFolder(os.path.join(data_dir, "train"),
                                transforms.Compose([
                                    transforms.RandomResizedCrop(input_size),
                                    transforms.RandomHorizontalFlip(),
                                    transforms.ToTensor(),
                                ]))
loader = torch.utils.data.DataLoader(all_imgs, batch_size=batch_size, shuffle=True, num_workers=4)

data_transforms = {
    "train": transforms.Compose([
        transforms.RandomResizedCrop(input_size),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    "val": transforms.Compose([
        transforms.Resize(input_size),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
}
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
                  for x in ["train", "val"]}
dataloaders_dict = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size,
                                                   shuffle=True, num_workers=4)
                    for x in ["train", "val"]}
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
img = next(iter(dataloaders_dict["val"]))[0]

Below we display one of the images after normalization:
unloader = transforms.ToPILImage()
plt.ion()

def imshow(tensor, title=None):
    image = tensor.cpu().clone()
    image = image.squeeze(0)
    image = unloader(image)
    plt.imshow(image)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)

plt.figure()
imshow(img[1], title='Image')
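The tensor shown here is still normalized, so its colors look shifted. If you want to view the image in its original colors, you can undo the Normalize step with the same mean and std (a small sketch; torch is already imported from the first part):

mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
imshow((img[1] * std + mean).clamp(0, 1), title='Unnormalized Image')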
Define the model:

def set_parameter_requires_grad(model, feature_extract):
    if feature_extract:
        for param in model.parameters():
            param.requires_grad = False

def initialize_model(model_name, num_classes, feature_extract, use_pretrained=True):
    if model_name == "resnet":
        model_ft = models.resnet18(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs, num_classes)
        input_size = 224
    else:
        print("model not implemented")
        return None, None
    return model_ft, input_size
model_ft, input_size = initialize_model(model_name, num_classes, feature_extract, use_pretrained=True)
print(model_ft)

Check the layers other than the last one:
model_ft.layer1[0].conv1.weight.requires_grad   # False: all of the earlier layers are frozen

Check the last layer:
model_ft.fc.weight.requires_grad   # True only for the last layer; the earlier layers will not be updated
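You can also list exactly which parameters the optimizer will update; with feature_extract=True only the new fc layer should show up (a quick check):

params_to_update = [name for name, p in model_ft.named_parameters() if p.requires_grad]
print(params_to_update)   # ['fc.weight', 'fc.bias'] when feature_extract=True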
Define the full training function:

def train_model(model, dataloaders, loss_fn, optimizer, num_epochs=5):
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.
    val_acc_history = []
    for epoch in range(num_epochs):
        for phase in ["train", "val"]:
            running_loss = 0.
            running_corrects = 0.
            if phase == "train":
                model.train()
            else:
                model.eval()
            for inputs, labels in dataloaders[phase]:
                inputs, labels = inputs.to(device), labels.to(device)
                with torch.autograd.set_grad_enabled(phase == "train"):
                    outputs = model(inputs)
                    loss = loss_fn(outputs, labels)
                preds = outputs.argmax(dim=1)
                if phase == "train":
                    optimizer.zero_grad()
                    loss.backward()
                    optimizer.step()
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds.view(-1) == labels.view(-1)).item()
            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects / len(dataloaders[phase].dataset)
            print("Phase {} loss: {}, acc: {}".format(phase, epoch_loss, epoch_acc))
            if phase == "val" and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
            if phase == "val":
                val_acc_history.append(epoch_acc)
    model.load_state_dict(best_model_wts)
    return model, val_acc_history

Train the model:
model_ft = model_ft.to(device)
optimizer = torch.optim.SGD(filter(lambda p: p.requires_grad, model_ft.parameters()), lr=0.001, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()
_, ohist = train_model(model_ft, dataloaders_dict, loss_fn, optimizer, num_epochs=num_epochs)

Training the model above prints the training and validation loss and accuracy for each epoch. For comparison, we also train the same architecture from scratch (no pretrained weights, all layers trainable):
model_scratch, _ = initialize_model(model_name, num_classes, feature_extract=False, use_pretrained=False)
model_scratch = model_scratch.to(device)
optimizer = torch.optim.SGD(filter(lambda p: p.requires_grad, model_scratch.parameters()), lr=0.001, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()
_, scratch_hist = train_model(model_scratch, dataloaders_dict, loss_fn, optimizer, num_epochs=num_epochs)

Now let's plot how the validation accuracy of the two models changes over the training epochs:
plt.title("Validation Accuracy vs. Number of Training Epochs")
plt.xlabel("Training Epochs")
plt.ylabel("Validation Accuracy")
plt.plot(range(1, num_epochs+1), ohist, label="Pretrained")
plt.plot(range(1, num_epochs+1), scratch_hist, label="Scratch")
plt.ylim((0, 1.))
plt.xticks(np.arange(1, num_epochs+1, 1.0))
plt.legend()
plt.show()

The resulting plot compares the validation accuracy of the pretrained and from-scratch models.
In the course of this study, I also looked at the code of several other network architectures.
The repository is at:
https://github.com/pytorch/vision/tree/master/torchvision/models
It contains implementations of many networks.
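For example, the architectures discussed above can be instantiated directly from torchvision.models; a minimal sketch (pass pretrained=True to also download the ImageNet weights, as was done for ResNet-18 earlier):

from torchvision import models

resnet18 = models.resnet18(pretrained=True)   # with ImageNet weights
alexnet = models.alexnet()                    # randomly initialized
vgg16 = models.vgg16()
densenet = models.densenet121()
print(resnet18)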