Deep Learning Lab 1
Anaconda Download
I recommend the Tsinghua mirror source for its download speed (link here). Just pick the latest installer; the exact version doesn't really matter. The steps that need attention are below; for everything else, just click Next.
You can choose the install location here; the default is the C drive.
Manually tick the first option here so the path is added for you, and you won't need to add it again later. The installation is a bit slow, so be patient.
(Let's listen to a song while we wait: 《蝶恋花·答李淑一》 (●'◡'●). I've really been enjoying it lately and recommend it to anyone who hasn't heard it.)
Once installed, you'll see the entries above in the Start menu. Click Anaconda Navigator to open it.
The interface looks like this:
To verify the environment variables, add the following paths to the system Path (right-click This PC -> Properties -> Advanced system settings -> Environment Variables -> System variables -> Path -> New), then click OK to finish the configuration.
From the Start menu, click Anaconda Powershell Prompt to open a console.
Type conda info in the console; output like the following means the configuration succeeded.
CUDA Download
Official site
I won't reproduce the full CUDA installation steps here, because I found a very detailed tutorial by another blogger; the link is here: click here.
I also ran into the problem below during installation; here is another blogger's detailed fix: click here.
Finally, here is how to create a virtual environment in the Anaconda command line and link it to Jupyter.
Create a virtual environment
Link it to Jupyter
Alternatively, you can simply install Jupyter inside the virtual environment.
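For reference, the steps above might look like the following in the Anaconda Powershell Prompt. This is only a sketch: the environment name `dl_env` and the Python version are placeholders, not values from the original post.

```shell
# create a virtual environment named dl_env (name and Python version are placeholders)
conda create -n dl_env python=3.9
# activate it
conda activate dl_env
# register the environment as a Jupyter kernel
conda install ipykernel
python -m ipykernel install --user --name dl_env --display-name "Python (dl_env)"
# or simply install Jupyter inside the environment instead
conda install jupyter
```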
PyTorch Download
Open the PyTorch official site.
Choose CPU or GPU according to your setup; for the install you can use either pip or conda.
Copy the "Run this Command" content into cmd and run it.
Enter the following commands to verify the installation; output like the screenshot means success.
python
import torch
import torchvision
x = torch.rand(2,3)
print(x)
(I mistyped a few times in between, just ignore that /(ㄒoㄒ)/~~)
Open Jupyter Notebook
OK! That's all the setup done. Now let's officially start the assignment! ^o^/
Basic PyTorch Operations
1. Use Tensor to initialize a 1×3 matrix M and a 2×1 matrix N, and perform subtraction on the two matrices (implement three different forms). Give the results and analyze how the three forms differ (if an error occurs, analyze why), and explain what happens during the computation.
Let's first look briefly at a few ways to create matrices. I use two here; feel free to explore more on your own.
import torch
M = torch.rand(1,3)
N = torch.rand(2,1)
print(M)
print(N)
M1 = torch.tensor([[1,2,3]])
N1 = torch.tensor([[4],
                   [5]])
print(M1)
print(N1)
Now let's look at the three subtraction forms the task asks for (the screenshots use addition as the example; subtraction works the same way).
print(M - N)
print(M1 - N1)
print(torch.subtract(M , N))
print(torch.subtract(M1 , N1))
M.subtract_(N)
print(M)
M1.subtract_(N1)
print(M1)
The last form raises an error and produces no output. M and N have different shapes: `M - N` can broadcast both operands to a new (2, 3) tensor, but the in-place `subtract_` must write its result back into M, whose (1, 3) shape cannot hold it. (Feel free to experiment with higher-dimensional tensors yourself; just remember that broadcasting requires the trailing dimensions of the two tensors to be compatible, i.e. equal or 1.)
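A minimal sketch of why the in-place form fails while the out-of-place forms succeed (shapes chosen to match the example above):

```python
import torch

M = torch.tensor([[1., 2., 3.]])   # shape (1, 3)
N = torch.tensor([[4.], [5.]])     # shape (2, 1)

# Out-of-place subtraction broadcasts both operands to (2, 3)
# and allocates a new tensor for the result:
out = M - N
print(out.shape)  # torch.Size([2, 3])

# The in-place version must write the (2, 3) result back into M,
# which only has room for (1, 3), so PyTorch raises a RuntimeError:
try:
    M.subtract_(N)
except RuntimeError as e:
    print("in-place subtract failed:", e)
```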
2. ① Use Tensor to create two random matrices P and Q of sizes 3×2 and 4×2, drawn from a normal distribution with mean 0 and standard deviation 0.01; ② reshape the matrix Q obtained in the previous step to get its transpose QT; ③ compute the matrix product of P and QT.
This one is fairly simple, so straight to the code.
import torch
P = torch.normal(0, 0.01, [3,2])
Q = torch.normal(0, 0.01, [4,2])
print(P)
print(Q)
QT = Q.t()
print(QT)
print(torch.mm(P, QT))
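One caveat worth noting here: `t()` is a true transpose, while `view`/`reshape` only re-read the same storage in row-major order, which is not the same thing. A small sketch:

```python
import torch

Q = torch.arange(8.).reshape(4, 2)   # [[0,1],[2,3],[4,5],[6,7]]

# t() swaps the two dimensions: element (i, j) moves to (j, i)
print(Q.t())

# reshape(2, 4) keeps row-major order and is NOT a transpose
print(Q.reshape(2, 4))
```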
3. Given the formula y3 = y1 + y2 = x^2 + x^3, with x = 1, use what you have learned about Tensors to compute the gradient of y3 with respect to x, i.e. dy3/dx. During the computation, interrupt gradient tracking while computing x^3, observe the result, and analyze the reason.
import torch
x = torch.tensor(1.0, requires_grad = True)
print(x)
print(x.grad)
y1 = x ** 2
with torch.no_grad(): # interrupt gradient tracking for x ** 3
    y2 = x ** 3
y3 = y1 + y2
y3.backward()
print(x.grad)
By the derivative rules, y3' = 2x + 3x^2, but tracking of x^3 was interrupted, so at x = 1 the gradient of y3 with respect to x is 2. Output:
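As a sanity check, here is a small sketch comparing the interrupted computation with the same computation done without `no_grad`:

```python
import torch

# with tracking interrupted for x ** 3, only d(x^2)/dx = 2x survives
x = torch.tensor(1.0, requires_grad=True)
y1 = x ** 2
with torch.no_grad():
    y2 = x ** 3          # treated as a constant by autograd
(y1 + y2).backward()
print(x.grad)            # tensor(2.)

# without no_grad, the full derivative 2x + 3x^2 = 5 at x = 1
x_full = torch.tensor(1.0, requires_grad=True)
(x_full ** 2 + x_full ** 3).backward()
print(x_full.grad)       # tensor(5.)
```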
Implementing Logistic Regression
1. Implement logistic regression from scratch (using only Tensor- and NumPy-related libraries), train and test it on an artificially constructed dataset, and analyze the results from multiple angles such as loss and training-set accuracy.
import torch
import numpy as np
import matplotlib.pyplot as plt
# build the dataset
num_inputs = 2 # number of features
n_data = torch.ones(1000, num_inputs)
x1 = torch.normal(2 * n_data, 1) # shape = (1000, 2)
y1 = torch.ones(1000) # shape = (1000,)
x2 = torch.normal(-2 * n_data, 1)
y2 = torch.zeros(1000)
# split into training and test sets
train_index = 800
# build the training set
trainfeatures = torch.cat((x1[:train_index], x2[:train_index]), 0).type(torch.FloatTensor)
trainlabels = torch.cat((y1[:train_index], y2[:train_index]), 0).type(torch.FloatTensor)
print(len(trainfeatures))
# build the test set
testfeatures = torch.cat((x1[train_index:], x2[train_index:]), 0).type(torch.FloatTensor)
testlabels = torch.cat((y1[train_index:], y2[train_index:]), 0).type(torch.FloatTensor)
# visualize the generated data
plt.scatter(trainfeatures.data.numpy()[:, 0],
            trainfeatures.data.numpy()[:, 1],
            c=trainlabels.data.numpy(),
            s=5, lw=0, cmap='RdYlGn')
plt.show()
# read the data in minibatches
def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    np.random.shuffle(indices) # read the samples in random order
    for i in range(0, num_examples, batch_size):
        j = torch.LongTensor(indices[i:min(i + batch_size, num_examples)])
        yield features.index_select(0, j), labels.index_select(0, j)
# initialize the parameters
w = torch.tensor(np.random.normal(0, 0.01, (num_inputs, 1)), dtype=torch.float32)
b = torch.zeros(1, dtype=torch.float32)
# track gradients for the parameters
w.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)
# the logistic regression model
def logits(X, w, b):
    y = torch.mm(X, w) + b
    return 1 / (1 + torch.pow(np.e, -y))
# binary cross-entropy loss
def logits_loss(y_hat, y):
    y = y.view(y_hat.size())
    return -y.mul(torch.log(y_hat)) - (1 - y).mul(torch.log(1 - y_hat))
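One thing to watch for with a hand-written loss like this: if the sigmoid saturates and `y_hat` reaches exactly 0 or 1, `torch.log` returns `-inf` and the loss becomes infinite. A common workaround (a sketch, not part of the original assignment; the `eps` value is a placeholder) is to clamp the prediction away from the endpoints:

```python
import torch

def logits_loss_stable(y_hat, y, eps=1e-7):
    # clamp the sigmoid output into [eps, 1 - eps] so log() stays finite
    y_hat = y_hat.clamp(eps, 1 - eps)
    y = y.view(y_hat.size())
    return -y.mul(torch.log(y_hat)) - (1 - y).mul(torch.log(1 - y_hat))

# even fully saturated predictions now give a finite loss
y_hat = torch.tensor([0.0, 1.0, 0.5])
y = torch.tensor([1.0, 0.0, 1.0])
print(logits_loss_stable(y_hat, y))
```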
# the optimizer: minibatch stochastic gradient descent
def sgd(params, lr, batch_size):
    for param in params:
        param.data -= lr * param.grad / batch_size
# accuracy and loss on the test set
def evaluate_accuracy():
    acc_sum, n, test_l_sum = 0.0, 0, 0.0
    for X, y in data_iter(batch_size, testfeatures, testlabels):
        y_hat = net(X, w, b)
        l = loss(y_hat, y).sum() # compute the loss before thresholding, or log(0) gives inf
        test_l_sum += l.item()
        y_hat = torch.squeeze(torch.where(y_hat > 0.5, torch.tensor(1.0), torch.tensor(0.0)))
        acc_sum += (y_hat == y).float().sum().item()
        n += y.shape[0]
    return acc_sum / n, test_l_sum / n
lr = 0.0005 # learning rate
num_epochs = 300 # number of epochs
net = logits # the model
loss = logits_loss # the loss function
batch_size = 50
test_acc, train_acc = [], []
train_loss, test_loss = [], []
for epoch in range(num_epochs): # train for num_epochs epochs in total
    train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
    for X, y in data_iter(batch_size, trainfeatures, trainlabels): # X and y are a minibatch of features and labels
        y_hat = net(X, w, b)
        l = loss(y_hat, y).sum() # loss on the minibatch X and y
        l.backward() # gradients of the minibatch loss w.r.t. the model parameters
        sgd([w, b], lr, batch_size) # update the parameters with minibatch SGD
        w.grad.data.zero_() # zero the gradients
        b.grad.data.zero_()
        # accumulate the loss for this epoch
        train_l_sum += l.item()
        # accumulate the training accuracy
        y_hat = torch.squeeze(torch.where(y_hat > 0.5, torch.tensor(1.0), torch.tensor(0.0)))
        train_acc_sum += (y_hat == y).sum().item()
        # total number of samples in this epoch
        n += y.shape[0]
    # accuracy and loss on the test set
    test_a, test_l = evaluate_accuracy()
    test_acc.append(test_a)
    test_loss.append(test_l)
    train_acc.append(train_acc_sum / n)
    train_loss.append(train_l_sum / n)
    print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
          % (epoch + 1, train_loss[epoch], train_acc[epoch], test_acc[epoch]))
2. Use torch.nn to implement logistic regression, train and test it on an artificially constructed dataset, and analyze the results from multiple angles such as loss and training-set accuracy.
import torch
import numpy as np
import torch.utils.data as Data
from torch.nn import init
from torch import nn
# build the dataset
num_inputs = 2 # number of features
n_data = torch.ones(1000, num_inputs)
x1 = torch.normal(2 * n_data, 1) # shape = (1000, 2)
y1 = torch.ones(1000) # shape = (1000,)
x2 = torch.normal(-2 * n_data, 1)
y2 = torch.zeros(1000)
# split into training and test sets
train_index = 800
# build the training set
trainfeatures = torch.cat((x1[:train_index], x2[:train_index]), 0).type(torch.FloatTensor)
trainlabels = torch.cat((y1[:train_index], y2[:train_index]), 0).type(torch.FloatTensor)
print(len(trainfeatures))
# build the test set
testfeatures = torch.cat((x1[train_index:], x2[train_index:]), 0).type(torch.FloatTensor)
testlabels = torch.cat((y1[train_index:], y2[train_index:]), 0).type(torch.FloatTensor)
# read the data
batch_size = 50
# combine the training features and labels
dataset = Data.TensorDataset(trainfeatures, trainlabels)
# put the dataset into a DataLoader
train_data_iter = Data.DataLoader(dataset=dataset, # torch TensorDataset format
                                  batch_size=batch_size, # minibatch size
                                  shuffle=True, # shuffle the data (usually desired for the training set)
                                  num_workers=0, # multi-process data loading; set to 0 on Windows
                                  )
# combine the test features and labels
dataset = Data.TensorDataset(testfeatures, testlabels)
# put the dataset into a DataLoader
test_data_iter = Data.DataLoader(dataset=dataset, # torch TensorDataset format
                                 batch_size=batch_size, # minibatch size
                                 shuffle=False, # no need to shuffle the test set
                                 num_workers=0, # multi-process data loading; set to 0 on Windows
                                 )
# model definition
class LogisticRegression(nn.Module):
    def __init__(self, n_features):
        super(LogisticRegression, self).__init__()
        self.lr = nn.Linear(n_features, 1)
        self.sm = nn.Sigmoid()
    def forward(self, x): # forward pass
        x = self.lr(x)
        x = self.sm(x)
        return x
# instantiate the model
logistic_model = LogisticRegression(num_inputs)
# loss function
criterion = nn.BCELoss()
# optimizer
optimizer = torch.optim.SGD(logistic_model.parameters(), lr=1e-3)
# initialize the parameters
init.normal_(logistic_model.lr.weight, mean=0, std=0.01)
init.constant_(logistic_model.lr.bias, val=0)
print(logistic_model.lr.weight)
print(logistic_model.lr.bias)
# accuracy and loss on the test set
def evaluate_accuracy():
    acc_sum, n, test_l_sum = 0.0, 0, 0.0
    for X, y in test_data_iter:
        y_hat = logistic_model(X)
        l = criterion(y_hat, y.view(-1, 1)) # compute the loss before thresholding
        test_l_sum += l.item() * y.shape[0] # BCELoss averages over the batch, so re-weight by batch size
        y_hat = torch.squeeze(torch.where(y_hat > 0.5, torch.tensor(1.0), torch.tensor(0.0)))
        acc_sum += (y_hat == y).float().sum().item()
        n += y.shape[0]
    return acc_sum / n, test_l_sum / n
# start training
num_epochs = 300 # number of epochs
test_acc, train_acc = [], []
train_loss, test_loss = [], []
for epoch in range(num_epochs):
    train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
    for X, y in train_data_iter:
        y_hat = logistic_model(X)
        l = criterion(y_hat, y.view(-1, 1))
        optimizer.zero_grad() # zero the gradients
        l.backward() # compute gradients
        optimizer.step() # update the parameters
        # accumulate the loss for this epoch
        train_l_sum += l.item()
        # accumulate the training accuracy
        y_hat = torch.squeeze(torch.where(y_hat > 0.5, torch.tensor(1.0), torch.tensor(0.0)))
        train_acc_sum += (y_hat == y).sum().item()
        # total number of samples in this epoch
        n += y.shape[0]
    # accuracy and loss on the test set
    test_a, test_l = evaluate_accuracy()
    test_acc.append(test_a)
    test_loss.append(test_l)
    train_acc.append(train_acc_sum / n)
    train_loss.append(train_l_sum / n)
    print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
          % (epoch + 1, train_loss[epoch], train_acc[epoch], test_acc[epoch]))
Implementing Softmax Regression
1. Implement softmax regression from scratch (using only Tensor- and NumPy-related libraries), train and test it on the Fashion-MNIST dataset, and analyze the results from multiple angles such as loss and accuracy on the training and test sets.
import torch
import torchvision
import torchvision.transforms as transforms
import numpy as np
import sys
print("torch.__version__:", torch.__version__)
print("torchvision.__version__:", torchvision.__version__)
# load the dataset
batch_size = 256
mnist_train = torchvision.datasets.FashionMNIST(root="...\\train_data", train=True, download=True, transform=transforms.ToTensor())
mnist_test = torchvision.datasets.FashionMNIST(root="...\\test_data", train=False, download=True, transform=transforms.ToTensor())
train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=0)
test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=0)
# cross-entropy loss
def cross_entropy(y_hat, y):
    return -torch.log(y_hat.gather(1, y.view(-1, 1)))
# the optimizer: minibatch stochastic gradient descent
def sgd(params, lr, batch_size):
    for param in params:
        param.data -= lr * param.grad / batch_size
# initialize the model parameters
num_inputs = 784 # the input is a 28x28-pixel image, so the input vector has length 28*28 = 784
num_outputs = 10 # the output covers 10 image classes
W = torch.tensor(np.random.normal(0, 0.01, (num_inputs, num_outputs)), dtype=torch.float) # weight matrix is 784x10
b = torch.zeros(num_outputs, dtype=torch.float) # bias is a length-10 vector
# track gradients for the model parameters
W.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)
# implement softmax
def softmax(X):
    X_exp = X.exp() # exponentiate every element
    partition = X_exp.sum(dim=1, keepdim=True) # sum the exponentials over each row
    return X_exp / partition # divide each row's elements by that row's sum
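Note that this naive implementation overflows for large logits, since `exp(1000)` is `inf`. The standard fix (a sketch; the training code in this post keeps the simple version) subtracts each row's maximum first, which leaves the result mathematically unchanged:

```python
import torch

def softmax_stable(X):
    # subtracting the per-row max before exp prevents overflow;
    # softmax(x) == softmax(x - c) for any per-row constant c
    X = X - X.max(dim=1, keepdim=True).values
    X_exp = X.exp()
    return X_exp / X_exp.sum(dim=1, keepdim=True)

X_big = torch.tensor([[1000., 1001., 1002.]])
print(softmax_stable(X_big))   # finite probabilities summing to 1
```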
# model definition
def net(X):
    return softmax(torch.mm(X.view((-1, num_inputs)), W) + b)
# classification accuracy and loss
def evaluate_accuracy(data_iter, net):
    acc_sum, n, test_l_sum = 0.0, 0, 0.0
    for X, y in data_iter:
        y_hat = net(X)
        acc_sum += (y_hat.argmax(dim=1) == y).float().sum().item()
        test_l_sum += loss(y_hat, y).sum().item()
        n += y.shape[0]
    return acc_sum / n, test_l_sum / n
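Since evaluation never backpropagates, it is also worth wrapping it in `torch.no_grad()` so autograd doesn't build graphs for the test batches. A sketch of the same bookkeeping (the `data_iter`, `net`, and `loss` arguments follow the conventions of the code above; the toy iterator below is made up purely for illustration):

```python
import torch

def evaluate_accuracy_nograd(data_iter, net, loss):
    # same accounting as above, but without building autograd graphs
    acc_sum, n, l_sum = 0.0, 0, 0.0
    with torch.no_grad():
        for X, y in data_iter:
            y_hat = net(X)
            acc_sum += (y_hat.argmax(dim=1) == y).float().sum().item()
            l_sum += loss(y_hat, y).sum().item()
            n += y.shape[0]
    return acc_sum / n, l_sum / n

# tiny synthetic check: an identity "net" whose argmax matches the labels
toy_iter = [(torch.tensor([[2., 0.], [0., 2.]]), torch.tensor([0, 1]))]
toy_loss = lambda y_hat, y: -torch.log(
    torch.softmax(y_hat, dim=1).gather(1, y.view(-1, 1)))
acc, l = evaluate_accuracy_nograd(toy_iter, lambda X: X, toy_loss)
print(acc)   # 1.0
```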
# model training
num_epochs, lr = 50, 0.1
test_acc, train_acc = [], []
train_loss, test_loss = [], []
loss = cross_entropy
params = [W, b]
for epoch in range(num_epochs):
    train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
    for X, y in train_iter:
        y_hat = net(X)
        l = loss(y_hat, y).sum()
        l.backward()
        sgd(params, lr, batch_size)
        W.grad.data.zero_() # gradients start out at zero,
        b.grad.data.zero_() # so zeroing can go at the end of the loop body
        train_l_sum += l.item()
        train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
        n += y.shape[0]
    test_a, test_l = evaluate_accuracy(test_iter, net)
    test_acc.append(test_a)
    test_loss.append(test_l)
    train_acc.append(train_acc_sum / n)
    train_loss.append(train_l_sum / n)
    print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
          % (epoch + 1, train_loss[epoch], train_acc[epoch], test_acc[epoch]))
2. Use torch.nn to implement softmax regression, train and test it on the Fashion-MNIST dataset, and analyze the results from multiple angles such as loss and accuracy on the training and test sets.
import torch
from torch import nn
import torchvision
import torchvision.transforms as transforms
import numpy as np
import sys
# load the dataset (note: the train/test folder names were swapped in the original)
batch_size = 256
mnist_train = torchvision.datasets.FashionMNIST(root="F:\\保研先修课\\深度学习\\作业\\实验一\\train_data", train=True, download=True, transform=transforms.ToTensor())
mnist_test = torchvision.datasets.FashionMNIST(root="F:\\保研先修课\\深度学习\\作业\\实验一\\test_data", train=False, download=True, transform=transforms.ToTensor())
train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=0)
test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=0)
# define the model
net = torch.nn.Sequential(nn.Flatten(),
                          nn.Linear(784, 10))
# initialize the model parameters
def init_weights(m):
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)
net.apply(init_weights)
# loss function and optimizer
loss = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
# classification accuracy and loss
def evaluate_accuracy(data_iter, net):
    acc_sum, n, test_l_sum = 0.0, 0, 0.0
    for X, y in data_iter:
        y_hat = net(X)
        acc_sum += (y_hat.argmax(dim=1) == y).float().sum().item()
        test_l_sum += loss(y_hat, y).item()
        n += y.shape[0]
    return acc_sum / n, test_l_sum / n
# model training
num_epochs = 50
lr = 0.1
test_acc, train_acc = [], []
train_loss, test_loss = [], []
for epoch in range(num_epochs):
    train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
    for X, y in train_iter:
        y_hat = net(X)
        l = loss(y_hat, y).sum()
        optimizer.zero_grad()
        l.backward()
        optimizer.step()
        train_l_sum += l.item()
        train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
        n += y.shape[0]
    test_a, test_l = evaluate_accuracy(test_iter, net)
    test_acc.append(test_a)
    test_loss.append(test_l)
    train_acc.append(train_acc_sum / n)
    train_loss.append(train_l_sum / n)
    print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
          % (epoch + 1, train_loss[epoch], train_acc[epoch], test_acc[epoch]))