Code Collector Technical Tutorials 2024-06-14

MNIST Classification in Python with a Convolutional Neural Network

Come take a look at my shitty code

Experiment Topic

Use a neural network to recognize the handwritten digits in MNIST.

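Problem Analysis

MNIST (Modified National Institute of Standards and Technology database) is a large handwritten digit dataset built from samples collected by the U.S. National Institute of Standards and Technology. It contains a training set of 60,000 samples and a test set of 10,000 samples. This experiment uses a CNN to recognize the MNIST digits.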

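Algorithm Design

Data Preparation

Download MNIST to the local ./data directory. Of its samples, 60,000 form the training set and 10,000 form the test set. Samples are shuffled randomly when batches are drawn.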
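The appendix scripts preprocess every image with ToTensor followed by Normalize((0.5,), (0.5,)). As a rough sketch of that arithmetic in plain Python (no torchvision required; the helper name `preprocess_pixel` is made up for illustration), an 8-bit pixel is first scaled to [0, 1] and then mapped to [-1, 1]:

```python
def preprocess_pixel(raw, mean=0.5, std=0.5):
    """Mimic ToTensor + Normalize for one 8-bit grayscale pixel."""
    scaled = raw / 255.0          # ToTensor: uint8 0..255 -> float 0.0..1.0
    return (scaled - mean) / std  # Normalize: (x - mean) / std


for raw in (0, 128, 255):
    print(raw, "->", round(preprocess_pixel(raw), 4))
```

Black pixels end up at -1.0 and white pixels at 1.0, so the network sees inputs centered around zero.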

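Model Construction

Define a convolutional neural network (CNN) model.

The data flows through convolution, pooling, convolution, pooling, fully connected layer 1, fully connected layer 2, and then the output layer.

Since there are 10 digit classes, the output layer has 10 neurons.

ReLU is the activation function for the hidden layers.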
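The first fully connected layer in the appendix takes 32 * 7 * 7 inputs. The sketch below traces where that number comes from, using the layer settings in the appendix (3x3 convolutions with stride 1 and padding 1, 2x2 max pooling with stride 2) and the standard 28x28 MNIST image size. The helper names `conv2d_out` and `feature_shape` are illustrative only and are not part of PyTorch.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Output height/width of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_shape(size=28):
    """Trace (channels, height, width) through the two conv/pool stages."""
    size = conv2d_out(size, kernel=3, stride=1, padding=1)  # conv1: 28 -> 28
    size = conv2d_out(size, kernel=2, stride=2)             # pool:  28 -> 14
    size = conv2d_out(size, kernel=3, stride=1, padding=1)  # conv2: 14 -> 14
    size = conv2d_out(size, kernel=2, stride=2)             # pool:  14 -> 7
    return 32, size, size  # conv2 has 32 output channels


channels, height, width = feature_shape()
print(channels * height * width)  # 1568, the input width of fc1
```

The padded 3x3 convolutions keep the spatial size unchanged, so only the two pooling steps shrink the image, halving it twice from 28 to 7.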

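Model Training

Use the Adam optimizer to update the parameters. (In my experiments, Adam converged faster than SGD and training took less time overall.)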
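To give a feel for what the optimizer does on each step, here is a minimal sketch of the Adam update rule for a single scalar parameter. It is plain Python, not the PyTorch implementation; the function name `adam_step` and the toy objective are assumptions for illustration, and the default hyperparameters mirror the lr=0.001 used in the appendix together with the usual beta and epsilon values.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter; t starts at 1."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Hypothetical toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 4):
    w, m, v = adam_step(w, 2 * (w - 3), m, v, t)
    print(t, w)
```

Because the step is scaled by the running gradient magnitude, each early update moves the parameter by roughly lr regardless of how large the raw gradient is.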

Use a batch size of 64 samples.
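As a quick sanity check on what this batch size means for one pass over the data, the sketch below counts the batches per epoch, assuming the final partial batch is kept (the DataLoader default). The helper names are hypothetical.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples, batch_size):
    """Size of the final batch in an epoch."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size


print(batches_per_epoch(60000, 64))  # 938
print(last_batch_size(60000, 64))    # 32
```

So each training epoch performs 938 optimizer steps, the last of them on a half-size batch.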

Train the model for 3 epochs, each a full pass over the training set.

Print the average loss once at the end of each epoch.
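The loss in the appendix comes from nn.CrossEntropyLoss, and the figure printed per epoch is the mean of the per-batch losses. The following is a minimal plain-Python sketch of that loss for a single sample, given the 10 raw output scores (logits). The function name `cross_entropy` and the example scores are made up for illustration.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, given raw class scores."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


# Hypothetical scores: an untrained-looking output and a confident correct one.
uniform = cross_entropy([0.0] * 10, target=3)
confident = cross_entropy([0.0] * 9 + [10.0], target=9)
print(round(uniform, 4))          # ln(10), about 2.3026
print((uniform + confident) / 2)  # averaging losses, as done per epoch
```

A model that guesses uniformly over 10 classes scores about 2.30, which is a useful baseline when reading the loss printed in the first epoch.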

Model Evaluation

Evaluate the trained model on the test set.

Compute the model's classification accuracy on the test set.
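The accuracy computation can be sketched without PyTorch: the predicted class is the index of the largest output score, and accuracy is the share of samples where that index matches the label. The helper names and the tiny 3-class data below are hypothetical.

```python
def argmax(scores):
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(batch_scores, labels):
    """Percentage of samples whose top-scoring class equals the label."""
    correct = sum(argmax(s) == y for s, y in zip(batch_scores, labels))
    return 100 * correct / len(labels)


# Hypothetical scores for 4 samples over 3 classes.
scores = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3], [0.0, 0.1, 0.9], [0.4, 0.3, 0.2]]
labels = [1, 0, 2, 1]
print(accuracy(scores, labels))  # 75.0
```

The appendix does the same thing per batch with torch.max(outputs, 1) and accumulates the correct and total counts across the whole test set.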

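Results

[Figure: output of the 3-epoch training run]

[Figure: a sample of the learned neuron parameters]

[Figure: test set output]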

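Experiment Summary

In this experiment we used a simple CNN to classify the MNIST handwritten digit dataset. We first defined a CNN made up of convolutional, pooling, and fully connected layers, then trained it with the Adam optimizer. After three epochs of training, the model reached an accuracy above 98% on the test set. Evaluation ran with GPU acceleration, and inspecting the model parameters showed the weights and biases of each layer. The experiment illustrates how well a CNN performs on an image classification task and how GPU acceleration helps training efficiency.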

Appendix

CNN_Train.py

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision.datasets import MNIST
from torchvision import transforms
import time

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


# Define the CNN model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        # conv -> pool -> conv -> pool -> fc1 -> fc2 -> output
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = x.view(-1, 32 * 7 * 7)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x


# Preprocessing: convert images to tensors, then normalize to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load the MNIST training set (downloaded to ./data on first run)
train_data = MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=64,
                                           shuffle=True)
# Instantiate the model, loss function, and optimizer
model = CNN().to(device)  # move the model to the GPU if available
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam did better than SGD in my experiments

# Train the model
start_t = time.perf_counter()  # start timing
num_epochs = 3
for epoch in range(num_epochs):
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print("Epoch {}/{}, Loss: {:.4f}".format(epoch + 1, num_epochs, running_loss / len(train_loader)))

end_t = time.perf_counter()
print("Training time: {:.8f}s".format(end_t - start_t))

# Save the trained weights
torch.save(model.state_dict(), "mnist_cnn_model.pth")

# for param in list(model.parameters()):
#     print(param)

CNN_Test.py

import torch
import torch.nn as nn
from torchvision.datasets import MNIST
from torchvision import transforms
import time

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


# Define the CNN model (must match the architecture used for training)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        # conv -> pool -> conv -> pool -> fc1 -> fc2 -> output
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = x.view(-1, 32 * 7 * 7)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x


# Preprocessing: convert images to tensors, then normalize to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

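# Load the MNIST test set
test_data = MNIST(root='./data', train=False, transform=transform)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=64,
                                          shuffle=False)  # order does not affect accuracy

# Load the trained weights; map_location lets a GPU-trained checkpoint load on a CPU-only machine
model = CNN().to(device)
model.load_state_dict(torch.load("mnist_cnn_model.pth", map_location=device))
model.eval()  # switch to inference mode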

# Evaluate the model on the test set
start_t = time.perf_counter()  # start timing
correct = 0
total = 0
with torch.no_grad():  # no gradients needed during evaluation
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)  # move inputs to the device
        outputs = model(images)  # forward pass
        _, predicted = torch.max(outputs, 1)  # index of the largest logit is the predicted class
        total += labels.size(0)  # running sample count
        correct += (predicted == labels).sum().item()  # running count of correct predictions

end_t = time.perf_counter()
print("Inference time: {:.8f}s".format(end_t - start_t))
print("Classification accuracy: {:.2f}%".format(100 * correct / total))

Author: Alplexchur