代码收藏家 · Technical Tutorials · 2025-02-18

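NumPy: The Cornerstone of Data Science and Scientific Computing in Python

NumPy, Python's Scientific Computing Library

This article gives a hands-on tour of NumPy, organized into the sections listed below. Neural networks are at the heart of modern AI, neural networks are built on matrix operations, and in Python the foundation for matrix operations is NumPy. A solid command of NumPy therefore lays a strong foundation for AI development.
1 NumPy overview
2 The array structure
3 Numerical computation
4 Sorting
5 Array shape
6 Array creation
7 Operations
8 The random module
9 Reading and writing
10 Exercises

First, make sure the library we are going to use is installed.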

import numpy as np

array = [1,2,3,4,5]
array + 1

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

Cell In[2], line 2
      1 array = [1,2,3,4,5]
----> 2 array + 1


TypeError: can only concatenate list (not "int") to list

You cannot add 1 to a list directly like this. Once the list is converted to an ndarray, NumPy's core data structure, arithmetic is applied element by element.

array = np.array([1,2,3,4,5])
print (type(array))

<class 'numpy.ndarray'>

array2 = array + 1
array2

array([2, 3, 4, 5, 6])

array2 +array

array([ 3,  5,  7,  9, 11])

array2 * array

array([ 2,  6, 12, 20, 30])

array[0]

np.int64(1)

array[3]

np.int64(4)

array

array([1, 2, 3, 4, 5])

array.shape

(5,)

Only an ndarray has a shape attribute; a plain list does not.

list1 = [1,2,3,4,5]
list1.shape

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

Cell In[13], line 2
      1 list1 = [1,2,3,4,5]
----> 2 list1.shape


AttributeError: 'list' object has no attribute 'shape'

np.array([[1,2,3],[4,5,6]])

array([[1, 2, 3],
       [4, 5, 6]])

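The array structure

All elements of an ndarray must have the same type. If they do not, NumPy automatically converts them to a common type that can hold every value (for example, integers become floats, and numbers become strings). Converting a list to an ndarray:

values = [1,2,3,4,5]
ndarray = np.array(values)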
ndarray

array([1, 2, 3, 4, 5])
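The automatic conversion to a common type can be seen by inspecting dtype. This is a minimal sketch; the variable names are made up for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3])          # all integers
mixed = np.array([1, 2.5, 3])       # an int next to a float: everything becomes float
texty = np.array([1, 2.5, 'str'])   # a string forces every element to become a string

print(ints.dtype.kind)    # 'i' (signed integer)
print(mixed.dtype)        # float64
print(texty.dtype.kind)   # 'U' (Unicode string)
```

The exact integer width (int32 or int64) can depend on the platform, which is why the sketch checks dtype.kind rather than the full dtype for the integer case.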

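Basic ndarray attributes and operations: dtype gives the element type, shape the size along each dimension, ndim the number of dimensions, and fill overwrites every element with one value.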

type(ndarray)

numpy.ndarray

ndarray.dtype

dtype('int64')

ndarray.shape

(5,)

ndarray.ndim

1

ndarray1 = np.array([[1,2,3],[4,5,6]])
ndarray1.ndim

2

ndarray1.shape

(2, 3)

fill sets every element to the given value

ndarray.fill(0)
ndarray

array([0, 0, 0, 0, 0])

Indexing and slicing work the same way as in Python, and indices still start at 0.

values = [1,2,3,4,5]
array = np.array(values)
array[0]

np.int64(1)

array[1:3]

array([2, 3])

Matrix format: multidimensional arrays

array = np.array([[1,2,3],
                 [4,5,6],
                 [7,8,9]])
array

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

array.shape

(3, 3)

array.size

9

array.ndim

2

Modifying a value in place

array[1,1] = 10
array

array([[ 1,  2,  3],
       [ 4, 10,  6],
       [ 7,  8,  9]])

array[1]

array([ 4, 10,  6])

array[:,1]

array([ 2, 10,  8])

array[0,0:2]

array([1, 2])

array2 = array
array2

array([[ 1,  2,  3],
       [ 4, 10,  6],
       [ 7,  8,  9]])

array2[1,1] = 100
array2

array([[  1,   2,   3],
       [  4, 100,   6],
       [  7,   8,   9]])

array

array([[  1,   2,   3],
       [  4, 100,   6],
       [  7,   8,   9]])

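As the output shows, assignment with = does not copy the array at all: both names refer to the same object, so a change made through one name is visible through the other. To get an independent copy, use copy.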

array2 = array.copy()
array2

array([[  1,   2,   3],
       [  4, 100,   6],
       [  7,   8,   9]])

array2[1,1] = 1000
array2

array([[   1,    2,    3],
       [   4, 1000,    6],
       [   7,    8,    9]])

array

array([[  1,   2,   3],
       [  4, 100,   6],
       [  7,   8,   9]])
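Besides plain assignment and copy, there is a third case worth knowing: basic slicing returns a view that shares memory with the original. The sketch below contrasts the three; np.shares_memory is used only to make the sharing visible, and the variable names are illustrative.

```python
import numpy as np

base = np.arange(5)
alias = base                 # same object, nothing is copied
view = base[1:4]             # basic slicing returns a view onto the same memory
independent = base.copy()    # a separate array with its own memory

view[0] = 99                 # writes through to base
independent[0] = -1          # does not touch base

print(alias is base)                         # True
print(np.shares_memory(base, view))          # True
print(np.shares_memory(base, independent))   # False
print(base)                                  # [ 0 99  2  3  4]
```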

array = np.arange(0,100,10)

mask = np.array([0,0,0,1,1,1,0,0,1,1],dtype=bool)
mask

array([False, False, False,  True,  True,  True, False, False,  True,
        True])

array[mask]

array([30, 40, 50, 80, 90])

random_array=np.random.rand(10)
random_array

array([0.81861097, 0.09494575, 0.81619337, 0.14018824, 0.11683594,
       0.38988826, 0.81800008, 0.5675678 , 0.41470347, 0.55486108])

mask = random_array > 0.5
mask

array([ True, False,  True, False, False, False,  True,  True, False,
        True])
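A boolean mask like the one above is usually put to work in one of three ways: selecting, counting, or assigning. Here is a small sketch with fixed numbers instead of random ones, so the results are predictable; the names scores and capped are made up for the example.

```python
import numpy as np

scores = np.array([0.8, 0.1, 0.6, 0.3])
mask = scores > 0.5

selected = scores[mask]     # keep only the elements where mask is True
count = mask.sum()          # True counts as 1, so this counts the matches

capped = scores.copy()
capped[mask] = 0.5          # assigning through the mask overwrites the matches

print(selected)   # [0.8 0.6]
print(count)      # 2
print(capped)     # [0.5 0.1 0.5 0.3]
```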

array = np.array([10,20,30,40,50])
array > 30

array([False, False, False,  True,  True])

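np.where returns the indices that satisfy the condition; passing those indices back into the array retrieves the matching values.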

np.where(array > 30)

(array([3, 4]),)

array[np.where(array > 30)]

array([40, 50])
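np.where also has a three-argument form, np.where(condition, x, y), which picks from x where the condition holds and from y elsewhere. A short sketch on the same data:

```python
import numpy as np

array = np.array([10, 20, 30, 40, 50])

labels = np.where(array > 30, 1, 0)          # 1 where the condition holds, else 0
clipped = np.where(array > 30, 30, array)    # replace large values, keep the rest

print(labels)    # [0 0 0 1 1]
print(clipped)   # [10 20 30 30 30]
```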

Array dtype, and checking how many bytes the array occupies in total

array = np.array([1,2,3,4,5],dtype=np.float32)
array

array([1., 2., 3., 4., 5.], dtype=float32)

array.dtype

dtype('float32')

array.nbytes

20

asarray can return the data with a different dtype, but it does not modify the original array

array=np.array([1,10,3.5,'str'])
array

array(['1', '10', '3.5', 'str'], dtype='<U32')

array = np.array([1,2,3,4,5])
np.asarray(array,dtype=np.float32)

array([1., 2., 3., 4., 5.], dtype=float32)

array

array([1, 2, 3, 4, 5])

array.astype(np.float32)

array([1., 2., 3., 4., 5.], dtype=float32)
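One difference between the two calls is easy to miss: astype always builds a new array, while asarray returns its input untouched when no conversion is needed. A minimal sketch, with illustrative variable names:

```python
import numpy as np

ints = np.array([1, 2, 3, 4, 5])

floats = ints.astype(np.float32)                 # always a new array
same = np.asarray(ints)                          # no conversion needed, so no copy
converted = np.asarray(ints, dtype=np.float32)   # conversion needed, so a new array

print(same is ints)                         # True
print(np.shares_memory(ints, converted))    # False
print(ints.dtype.kind)                      # 'i', the original is unchanged
```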

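Numerical computation on arrays: sum adds all elements, prod multiplies all elements, min and max give the global minimum and maximum, argmin and argmax give their indices, and mean, std, and var give the mean, standard deviation, and variance.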

array = np.array([[1,2,3],[4,5,6]])
array

array([[1, 2, 3],
       [4, 5, 6]])

np.sum(array)

np.int64(21)

The axis argument specifies which axis (dimension) the operation runs along

np.sum(array,axis=0)

array([5, 7, 9])

array.ndim

2

np.sum(array,axis=1)

array([ 6, 15])

np.sum(array,axis=-1)

array([ 6, 15])
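A handy way to read the axis argument is "the axis that disappears": summing a (2, 3) array along axis 0 leaves shape (3,), and along axis 1 leaves shape (2,). The sketch below also uses keepdims=True, which keeps the collapsed axis with length 1 so the result still broadcasts against the original.

```python
import numpy as np

array = np.array([[1, 2, 3],
                  [4, 5, 6]])               # shape (2, 3)

col_sums = array.sum(axis=0)                # axis 0 (length 2) is collapsed
row_sums = array.sum(axis=1)                # axis 1 (length 3) is collapsed
kept = array.sum(axis=1, keepdims=True)     # collapsed axis kept with length 1

print(col_sums.shape, row_sums.shape, kept.shape)   # (3,) (2,) (2, 1)

share = array / kept    # each element as a fraction of its row total
```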

array.sum()

np.int64(21)

array.sum(axis=0)

array([5, 7, 9])

array.prod()

np.int64(720)

array.min()

np.int64(1)

Finding the index position

array.argmin()

np.int64(0)

array.argmin(axis=1)

array([0, 0])

array.argmax()

np.int64(5)

array.mean()

np.float64(3.5)

array.mean(axis=0)

array([2.5, 3.5, 4.5])

Standard deviation

array.std()

np.float64(1.707825127659933)

array.std(axis=1)

array([0.81649658, 0.81649658])

Variance

array.var()

np.float64(2.9166666666666665)

array

array([[1, 2, 3],
       [4, 5, 6]])

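clip replaces values below the lower bound with the lower bound and values above the upper bound with the upper bound. round rounds to the nearest value (exact halves go to the nearest even value), and decimals sets how many decimal places to keep.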

array.clip(2,4)

array([[2, 2, 3],
       [4, 4, 4]])

array1 = np.array([1.2,3.34,4.56])

array1.round()

array([1., 3., 5.])

array1.round(decimals=1)

array([1.2, 3.3, 4.6])
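Two details of these functions are worth a quick check. NumPy rounds exact halves to the nearest even value rather than always rounding up, and clip accepts None for a bound that should be left open. A small sketch:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
rounded = np.round(halves)           # halves go to the nearest even value

values = np.array([1, 5, 9])
floored = np.clip(values, 2, None)   # only a lower bound is applied

print(rounded)   # [0. 2. 2. 4.]
print(floored)   # [2 5 9]
```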

Sorting

array = np.array([[1.5,1.3,7.5],[5.6,7.8,1.2]])
array

array([[1.5, 1.3, 7.5],
       [5.6, 7.8, 1.2]])

np.sort(array)

array([[1.3, 1.5, 7.5],
       [1.2, 5.6, 7.8]])

np.sort(array,axis=0)

array([[1.5, 1.3, 1.2],
       [5.6, 7.8, 7.5]])

array

array([[1.5, 1.3, 7.5],
       [5.6, 7.8, 1.2]])

np.argsort(array)

array([[1, 0, 2],
       [2, 0, 1]])

array=np.linspace(0,10,10)
array

array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])

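values holds the values to be inserted; searchsorted returns the index at which each value would have to be inserted into the sorted array to keep it sorted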

values=np.array([2.5,6.5,9.5])
np.searchsorted(array,values)
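To see what the returned indices mean, they can be handed to np.insert, which places each value in front of the given position in the original array. This is only a sketch of one way to use the result; the name merged is made up for the example.

```python
import numpy as np

array = np.linspace(0, 10, 10)          # already sorted
values = np.array([2.5, 6.5, 9.5])

idx = np.searchsorted(array, values)    # where each value would have to go
merged = np.insert(array, idx, values)  # indices refer to the original array

print(idx)                              # [3 6 9]
print(np.all(np.diff(merged) >= 0))     # True, the result is still sorted
```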

array = np.array([[1,0,6],[1,7,0],[2,3,1],[2,4,0]])

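Array operations

shape, reshape, and arange; newaxis adds a dimension, squeeze removes dimensions of length 1, and transpose swaps the axes (T does the same).

concatenate joins arrays; vstack stacks them vertically and hstack horizontally; flatten and ravel both flatten an array to one dimension.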

import numpy as np
array = np.arange(10)
array

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

array.shape

(10,)

array.shape = 2,5

array

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

array.reshape(1,10)

array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

The total size must not change. With ten elements in all, 2,5 works but 2,4 does not.

array.shape = 2,4

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

Cell In[9], line 1
----> 1 array.shape = 2,4


ValueError: cannot reshape array of size 10 into shape (2,4)
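Because the total size is fixed, one dimension can always be inferred from the others. reshape accepts -1 for that dimension and still raises ValueError when the sizes cannot match. A short sketch:

```python
import numpy as np

array = np.arange(10)

two_rows = array.reshape(2, -1)     # -1 asks NumPy to work out this dimension
five_rows = array.reshape(-1, 2)

try:
    array.reshape(3, -1)            # 10 is not divisible by 3
    failed = False
except ValueError:
    failed = True

print(two_rows.shape, five_rows.shape, failed)   # (2, 5) (5, 2) True
```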

array = np.arange(10)
array.shape

(10,)

array = array[np.newaxis,:]
array.shape

(1, 10)

array = np.arange(10)
array.shape

(10,)

array = array[:,np.newaxis,np.newaxis]
array.shape

(10, 1, 1)

array = array.squeeze()
array.shape

(10,)

array.shape = 2,5

array

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

array.transpose()

array([[0, 5],
       [1, 6],
       [2, 7],
       [3, 8],
       [4, 9]])

array.T

array([[0, 5],
       [1, 6],
       [2, 7],
       [3, 8],
       [4, 9]])

a = np.array([[123,456,564],[345,678,754]])
a

array([[123, 456, 564],
       [345, 678, 754]])

b = np.array([[532,567,8970],[123,345,765]])
b

array([[ 532,  567, 8970],
       [ 123,  345,  765]])

c = np.concatenate((a,b))
c

array([[ 123,  456,  564],
       [ 345,  678,  754],
       [ 532,  567, 8970],
       [ 123,  345,  765]])

c.shape

(4, 3)

np.vstack((a,b))

array([[ 123,  456,  564],
       [ 345,  678,  754],
       [ 532,  567, 8970],
       [ 123,  345,  765]])

np.hstack((a,b))

array([[ 123,  456,  564,  532,  567, 8970],
       [ 345,  678,  754,  123,  345,  765]])

a

array([[123, 456, 564],
       [345, 678, 754]])

a.flatten()

array([123, 456, 564, 345, 678, 754])

a.ravel()

array([123, 456, 564, 345, 678, 754])
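The two calls print the same thing, but they differ in what they hand back: flatten always returns a copy, while ravel returns a view when the memory layout allows it (as it does for the contiguous array here). A sketch that makes the difference visible:

```python
import numpy as np

a = np.array([[123, 456, 564],
              [345, 678, 754]])

flat_copy = a.flatten()   # always a copy
flat_view = a.ravel()     # a view here, because a is contiguous in memory

flat_view[0] = -1         # also changes a
flat_copy[1] = -2         # leaves a alone

print(a[0, 0], a[0, 1])   # -1 456
```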

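## Array creation: arange builds a range, linspace takes a given number of evenly spaced values over an interval, logspace takes values evenly spaced on a log scale,
## meshgrid builds coordinate grids, np.r_ and np.c_ build row-wise and column-wise arrays and accept several kinds of input, including slices and arrays. zeros and ones fill with a single value, identity puts 1 on the diagonal and 0 elsewhere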

np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

np.arange(2,20,2)

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])

np.arange(2,20,2,dtype=np.float32)

array([ 2.,  4.,  6.,  8., 10., 12., 14., 16., 18.], dtype=float32)

np.linspace(0,10,10)

array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])

np.logspace(0,1,5)

array([ 1.        ,  1.77827941,  3.16227766,  5.62341325, 10.        ])

x = np.linspace(-10,10,5)
y = np.linspace(-10,10,5)
x

array([-10.,  -5.,   0.,   5.,  10.])

x,y=np.meshgrid(x,y)
x

array([[-10.,  -5.,   0.,   5.,  10.],
       [-10.,  -5.,   0.,   5.,  10.],
       [-10.,  -5.,   0.,   5.,  10.],
       [-10.,  -5.,   0.,   5.,  10.],
       [-10.,  -5.,   0.,   5.,  10.]])

y

array([[-10., -10., -10., -10., -10.],
       [ -5.,  -5.,  -5.,  -5.,  -5.],
       [  0.,   0.,   0.,   0.,   0.],
       [  5.,   5.,   5.,   5.,   5.],
       [ 10.,  10.,  10.,  10.,  10.]])
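The usual reason to build such a grid is to evaluate a function of two variables at every grid point in one expression. A minimal sketch; the function x² + y² and the names xx, yy, z are chosen only for illustration.

```python
import numpy as np

x = np.linspace(-10, 10, 5)
y = np.linspace(-10, 10, 5)
xx, yy = np.meshgrid(x, y)    # xx varies along columns, yy along rows

z = xx**2 + yy**2             # the function evaluated at all 25 grid points

print(z.shape)    # (5, 5)
print(z[2, 2])    # 0.0, the value at the origin
print(z[0, 0])    # 200.0, the value at (-10, -10)
```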

np.r_[0:10:1]

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

np.c_[0:10:1]

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])

np.zeros(3)

array([0., 0., 0.])

np.zeros((3,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

np.ones((3,3))*8

array([[8., 8., 8.],
       [8., 8., 8.],
       [8., 8., 8.]])

a=np.empty(6)
a.shape

(6,)

a.fill(1)

array = np.array([1,2,3,4])
array

array([1, 2, 3, 4])

np.zeros_like(array)

array([0, 0, 0, 0])

np.ones_like(array)

array([1, 1, 1, 1])

np.identity(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

Operations

x = np.array([5,5])
y = np.array([2,2])

np.multiply(x,y)

array([10, 10])

np.dot(x,y)

np.int64(20)

x.shape

(2,)

x.shape=2,1
x

array([[5],
       [5]])

np.dot(x,y)

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

Cell In[64], line 1
----> 1 np.dot(x,y)


ValueError: shapes (2,1) and (2,) not aligned: 1 (dim 1) != 2 (dim 0)

y.shape = 1,2
print(x.shape)
print(y.shape)

(2, 1)
(1, 2)

np.dot(x,y)

array([[10, 10],
       [10, 10]])

np.dot(y,x)

array([[20]])

x = np.array([1,1,1])
y = np.array([[4,5,6],[1,2,3]])
print(x*y)

[[4 5 6]
 [1 2 3]]
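The product above works because of broadcasting: shapes are compared from the last dimension backwards, and a dimension of length 1 (or a missing one) is stretched to match the other operand. A sketch with made-up arrays row and col, including a case where the shapes cannot be matched:

```python
import numpy as np

row = np.array([1, 2, 3])            # shape (3,)
col = np.array([[10], [20]])         # shape (2, 1)

table = row + col                    # both are broadcast to shape (2, 3)
print(table.tolist())                # [[11, 12, 13], [21, 22, 23]]

try:
    row + np.array([1, 2])           # lengths 3 and 2 cannot be matched
    mismatch = False
except ValueError:
    mismatch = True
```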

y = np.array([1,1,1,4])
x = np.array([1,1,1,2])
x == y

array([ True,  True,  True, False])

np.logical_and(x,y)

array([ True,  True,  True,  True])

np.logical_or(x,y)

array([ True,  True,  True,  True])

np.logical_not(x)

array([False, False, False, False])

The random module

import numpy as np
np.random.rand(3,2)

array([[0.17953902, 0.1301528 ],
       [0.0380382 , 0.15961125],
       [0.71432336, 0.20237312]])

### Returns random integers from a half-open interval (lower bound included, upper bound excluded)

np.random.randint(10,size=(5,4))

array([[0, 7, 9, 8],
       [0, 0, 2, 7],
       [6, 7, 4, 9],
       [8, 7, 1, 0],
       [2, 6, 6, 5]], dtype=int32)

np.random.rand()

0.19414180564410888

np.random.random_sample()

0.8492017836540788

np.random.randint(0,10,3)

array([3, 8, 1], dtype=int32)

mu,sigma = 0,0.1
np.random.normal(mu,sigma,10)

array([ 0.05971091,  0.11068208,  0.02777231,  0.06185503, -0.08028422,
       -0.22575989, -0.10443067,  0.01615294,  0.03461024,  0.08860057])

Shuffling and the random seed

array = np.arange(10)

np.random.shuffle(array)

array

array([4, 6, 5, 8, 1, 0, 9, 2, 3, 7])

np.random.seed(10)

mu,sigma = 0,0.1
np.random.normal(mu,sigma,10)

array([ 0.13315865,  0.0715279 , -0.15454003, -0.00083838,  0.0621336 ,
       -0.07200856,  0.02655116,  0.01085485,  0.00042914, -0.01746002])
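The calls above use NumPy's legacy global random state. Newer code often uses a Generator object created with np.random.default_rng, which keeps the seed local to the object. The sketch below is an alternative to the article's approach, not a replacement for it, and its numbers will differ from the legacy output shown above.

```python
import numpy as np

rng_a = np.random.default_rng(10)    # two generators with the same seed
rng_b = np.random.default_rng(10)

a = rng_a.normal(0, 0.1, 10)
b = rng_b.normal(0, 0.1, 10)
same = np.array_equal(a, b)          # same seed, same sequence

ints = rng_a.integers(0, 10, size=(5, 4))   # half-open range, like randint
deck = np.arange(10)
rng_a.shuffle(deck)                         # shuffles in place

print(same)   # True
```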

Reading and writing data with NumPy

%%writefile king.txt
1 2 3 4 5 6
3 4 5 6 7 8

Writing king.txt

data = []
with open('king.txt') as f:
    for line in f:
        fields = line.split()
        print(fields)
        cur_data = [float(x) for x in fields]
        data.append(cur_data)
data = np.array(data)
data

['1', '2', '3', '4', '5', '6']
['3', '4', '5', '6', '7', '8']

array([[1., 2., 3., 4., 5., 6.],
       [3., 4., 5., 6., 7., 8.]])

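split splits on whitespace by default, so each line is read as a list of strings. Each string is then converted to a float, and the resulting row is appended to a list.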

data = np.loadtxt('king.txt')
data

array([[1., 2., 3., 4., 5., 6.],
       [3., 4., 5., 6., 7., 8.]])

%%writefile king2.txt
1,2,3,4,5,6
4,5,6,7,8,7

Writing king2.txt

data = np.loadtxt('king2.txt',delimiter = ',')
data

array([[1., 2., 3., 4., 5., 6.],
       [4., 5., 6., 7., 8., 7.]])

%%writefile king3.txt
x,y,z,w,a,b
1,2,3,4,5,6
3,4,5,6,7,8

Writing king3.txt

data = np.loadtxt('king3.txt',delimiter = ',',skiprows = 1)
data

array([[1., 2., 3., 4., 5., 6.],
       [3., 4., 5., 6., 7., 8.]])

skiprows: the number of leading rows to skip; delimiter=',': the field separator; usecols=(0,1,4): which columns to read
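The usecols option is described but not run, so here is a small sketch of it. To stay self-contained it reads from an in-memory io.StringIO object with the same contents as king3.txt instead of a file on disk; that substitution is a choice made for this example.

```python
import io
import numpy as np

# Same contents as king3.txt: one header row, then two data rows.
text = io.StringIO("x,y,z,w,a,b\n1,2,3,4,5,6\n3,4,5,6,7,8\n")

data = np.loadtxt(text, delimiter=',', skiprows=1, usecols=(0, 1, 4))
print(data.tolist())   # [[1.0, 2.0, 5.0], [3.0, 4.0, 7.0]]
```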

Reading and writing array structures

array = np.array([[1,2,3],[4,5,6]])
np.save('king.npy',array)
array

array([[1, 2, 3],
       [4, 5, 6]])

array1 = np.load('king.npy')

array2 = np.arange(10)
array2

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

np.savez('queen.npz',a=array1,b=array2)

data=np.load('queen.npz')

data.keys()

KeysView(NpzFile 'queen.npz' with keys: a, b)

data['a']

array([[1, 2, 3],
       [4, 5, 6]])

data['b']

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
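An .npz archive is opened lazily, so it is tidy to close it once the arrays have been read. The object returned by np.load for an archive can be used in a with statement for that. The sketch below uses an in-memory io.BytesIO buffer in place of queen.npz so that it leaves no file behind; that is an assumption of the example, not something the article does.

```python
import io
import numpy as np

buffer = io.BytesIO()
np.savez(buffer, a=np.array([[1, 2, 3], [4, 5, 6]]), b=np.arange(10))
buffer.seek(0)                      # rewind before reading back

with np.load(buffer) as data:       # the archive is closed when the block ends
    names = sorted(data.files)      # names of the stored arrays
    a = data['a']
    b = data['b']

print(names)   # ['a', 'b']
```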

Exercises

Print the current NumPy version

print(np.__version__)

2.0.2

Create an all-zeros matrix and print how much memory it occupies

z=np.zeros((5,5))
print('%d bytes'%(z.size*z.itemsize))

200 bytes
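The same number is available directly as the nbytes attribute, which is size times itemsize. A quick sketch that also shows how the dtype changes the footprint:

```python
import numpy as np

z = np.zeros((5, 5))                         # float64: 8 bytes per element
small = np.zeros((5, 5), dtype=np.float32)   # float32: 4 bytes per element

print(z.nbytes, z.size * z.itemsize)   # 200 200
print(small.nbytes)                    # 100
```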

Print a function's documentation, for example numpy.add

np.info(np.add)

add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature])

Add arguments element-wise.

Parameters
----------
x1, x2 : array_like
    The arrays to be added.
    If ``x1.shape != x2.shape``, they must be broadcastable to a common
    shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or None,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    This condition is broadcast over the input. At locations where the
    condition is True, the `out` array will be set to the ufunc result.
    Elsewhere, the `out` array will retain its original value.
    Note that if an uninitialized `out` array is created via the default
    ``out=None``, locations within it where the condition is False will
    remain uninitialized.
**kwargs
    For other keyword-only arguments, see the
    :ref:`ufunc docs <ufuncs.kwargs>`.

Returns
-------
add : ndarray or scalar
    The sum of `x1` and `x2`, element-wise.
    This is a scalar if both `x1` and `x2` are scalars.

Notes
-----
Equivalent to `x1` + `x2` in terms of array broadcasting.

Examples
--------
>>> np.add(1.0, 4.0)
5.0
>>> x1 = np.arange(9.0).reshape((3, 3))
>>> x2 = np.arange(3.0)
>>> np.add(x1, x2)
array([[  0.,   2.,   4.],
       [  3.,   5.,   7.],
       [  6.,   8.,  10.]])

The ``+`` operator can be used as a shorthand for ``np.add`` on ndarrays.

>>> x1 = np.arange(9.0).reshape((3, 3))
>>> x2 = np.arange(3.0)
>>> x1 + x2
array([[ 0.,  2.,  4.],
       [ 3.,  5.,  7.],
       [ 6.,  8., 10.]])

Create an array of the values 10 to 49 and reverse its order

array=np.arange(10,50,1)
array=array[::-1]
array

array([49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33,
       32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
       15, 14, 13, 12, 11, 10])

Find the indices of the nonzero elements of an array

np.nonzero([1,2,4,0,7,6,0,87])

(array([0, 1, 2, 4, 5, 7]),)

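Randomly generate a 3*3 matrix and find its minimum and maximum values (a notebook cell shows only the value of its last expression, so only the maximum appears in the output below)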

array = np.random.random((3,3))
array.min()
array.max()

np.float64(0.8052231968327465)

Create a 5*5 matrix whose values are all 1, then add a ring of 0s around the outside

array=np.ones((5,5))
array=np.pad(array,pad_width=1,mode='constant',constant_values=0)
array

array([[0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])

For an array of shape (6,7,8), find the multi-dimensional index of the element at flat index 100

np.unravel_index(100,(6,7,8))

(np.int64(1), np.int64(5), np.int64(4))

Apply min-max normalization to a 5*5 matrix

array = np.random.random((5,5))
max_val = array.max()
min_val = array.min()
array = (array-min_val)/(max_val-min_val)
array

array([[4.82916777e-01, 9.11170266e-01, 2.45605198e-01, 6.46273968e-01,
        1.00000000e+00],
       [5.73535042e-01, 6.37970703e-01, 0.00000000e+00, 3.68131543e-01,
        4.67040973e-02],
       [3.08237106e-01, 3.37487751e-01, 8.50614944e-01, 7.84484361e-04,
        4.51867772e-01],
       [3.19199938e-01, 6.91574756e-01, 3.55584952e-01, 4.41849228e-03,
        9.73462352e-01],
       [8.38351949e-01, 9.71356473e-01, 4.37991291e-01, 6.55776506e-01,
        5.49111069e-01]])
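The exercise normalizes with the global minimum and maximum. When each column is a separate feature, a common variant is to normalize column by column using axis=0. The sketch below uses a small fixed array so the result can be checked by eye, and it assumes no column is constant (otherwise the denominator would be zero).

```python
import numpy as np

data = np.array([[1.0, 10.0],
                 [2.0, 30.0],
                 [4.0, 20.0]])

col_min = data.min(axis=0)      # one minimum per column
col_max = data.max(axis=0)      # one maximum per column
scaled = (data - col_min) / (col_max - col_min)   # broadcast across the rows

print(scaled[:, 1].tolist())    # [0.0, 1.0, 0.5]
```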

Find the values common to two arrays

z1 = np.random.randint(0,10,10)
z2 = np.random.randint(0,10,10)
print(z1)
print(z2)
print(np.intersect1d(z1,z2))

[5 9 0 4 6 6 0 2 3 3]
[2 6 0 5 1 3 6 5 5 1]
[0 2 3 5 6]

Author: 智模睿脑君
