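NumPy: The Cornerstone of Python Data Science and Scientific Computing

NumPy, Python's Scientific Computing Library

This article is a hands-on tour of NumPy that covers the topics listed below. Neural networks are at the core of modern artificial intelligence, neural networks come down to matrix operations, and in Python those operations rest on NumPy. A solid command of NumPy therefore lays a strong foundation for AI development.
1 NumPy overview
2 The array structure
3 Numerical computation
4 Sorting
5 Array shape
6 Array creation
7 Operations
8 The random module
9 Reading and writing
10 Exercises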

First, make sure the library we are going to use is installed.

import numpy as np
array = [1,2,3,4,5]
array + 1
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

Cell In[2], line 2
      1 array = [1,2,3,4,5]
----> 2 array + 1


TypeError: can only concatenate list (not "int") to list

You cannot add 1 to a list directly like this. Converting the list to an ndarray, NumPy's core data structure, makes element-wise arithmetic possible.

array = np.array([1,2,3,4,5])
print (type(array))
<class 'numpy.ndarray'>
array2 = array + 1
array2
array([2, 3, 4, 5, 6])
array2 +array
array([ 3,  5,  7,  9, 11])
array2 * array
array([ 2,  6, 12, 20, 30])
array[0]
np.int64(1)
array[3]
np.int64(4)
array
array([1, 2, 3, 4, 5])
array.shape
(5,)

Only an ndarray has the shape attribute; a plain Python list does not.

list1 = [1,2,3,4,5]
list1.shape
---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

Cell In[13], line 2
      1 list1 = [1,2,3,4,5]
----> 2 list1.shape


AttributeError: 'list' object has no attribute 'shape'
np.array([[1,2,3],[4,5,6]])
array([[1, 2, 3],
       [4, 5, 6]])

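Structure

All elements of an ndarray must share one type. If they do not, NumPy automatically converts them to a common type that can represent every element (for example int to float, or numbers to strings). Converting a list to an ndarray: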

values = [1,2,3,4,5]  # named values so the built-in list is not shadowed
ndarray = np.array(values)
ndarray
array([1, 2, 3, 4, 5])
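
To make the single-type rule concrete, here is a minimal sketch of the conversion NumPy applies when the inputs are mixed. The variable names are illustrative and not part of the original walkthrough.

```python
import numpy as np

# Mixed ints and floats are promoted to a common floating-point type.
mixed_numbers = np.array([1, 2, 3.5])

# Mixing numbers with a string turns every element into a string.
mixed_text = np.array([1, 2, "str"])

print(mixed_numbers.dtype)     # float64
print(mixed_text.dtype.kind)   # U (a Unicode string dtype)
```

The conversion always goes toward the type that can hold every element, which is why an int never silently truncates a float in the same array.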

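Basic ndarray attributes and operations: dtype gives the element type, shape the size along each dimension, ndim the number of dimensions, and fill sets every element to one value.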

type(ndarray)
numpy.ndarray
ndarray.dtype
dtype('int64')
ndarray.shape
(5,)
ndarray.ndim
1
ndarray1 = np.array([[1,2,3],[4,5,6]])
ndarray1.ndim
2
ndarray1.shape
(2, 3)

fill overwrites every element with the given value

ndarray.fill(0)
ndarray
array([0, 0, 0, 0, 0])

Indexing and slicing work the same way as in plain Python, and indices start at 0

values = [1,2,3,4,5]  # named values so the built-in list is not shadowed
array = np.array(values)
array[0]
np.int64(1)
array[1:3]
array([2, 3])

Matrix form: multi-dimensional arrays

array = np.array([[1,2,3],
                 [4,5,6],
                 [7,8,9]])
array
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
array.shape
(3, 3)
array.size
9
array.ndim
2

Modifying a value

array[1,1] = 10
array
array([[ 1,  2,  3],
       [ 4, 10,  6],
       [ 7,  8,  9]])
array[1]
array([ 4, 10,  6])
array[:,1]
array([ 2, 10,  8])
array[0,0:2]
array([1, 2])
array2 = array
array2
array([[ 1,  2,  3],
       [ 4, 10,  6],
       [ 7,  8,  9]])
array2[1,1] = 100
array2
array([[  1,   2,   3],
       [  4, 100,   6],
       [  7,   8,   9]])
array
array([[  1,   2,   3],
       [  4, 100,   6],
       [  7,   8,   9]])

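As the output shows, assigning with = does not copy anything: both names refer to the same array object, so a change made through one name is visible through the other. To get an independent copy, use copy.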

array2 = array.copy()
array2
array([[  1,   2,   3],
       [  4, 100,   6],
       [  7,   8,   9]])
array2[1,1] = 1000
array2
array([[   1,    2,    3],
       [   4, 1000,    6],
       [   7,    8,    9]])
array
array([[  1,   2,   3],
       [  4, 100,   6],
       [  7,   8,   9]])
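
The difference between sharing and copying can also be checked directly. The sketch below uses np.shares_memory to compare an alias, a slice, and a copy; the variable names are illustrative.

```python
import numpy as np

original = np.arange(6).reshape(2, 3)

alias = original               # same object, no data is copied
view = original[0]             # basic indexing returns a view on the same buffer
independent = original.copy()  # a new buffer with the same values

original[0, 0] = 99

print(alias is original)                        # True
print(np.shares_memory(view, original))         # True
print(np.shares_memory(independent, original))  # False
print(view[0], independent[0, 0])               # 99 0
```

Note that slices behave like the alias here: they see later changes to the original, so copy is the safe choice whenever the data must stay fixed.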
array = np.arange(0,100,10)
mask = np.array([0,0,0,1,1,1,0,0,1,1],dtype=bool)
mask
array([False, False, False,  True,  True,  True, False, False,  True,
        True])
array[mask]
array([30, 40, 50, 80, 90])
random_array=np.random.rand(10)
random_array
array([0.81861097, 0.09494575, 0.81619337, 0.14018824, 0.11683594,
       0.38988826, 0.81800008, 0.5675678 , 0.41470347, 0.55486108])
mask = random_array > 0.5
mask
array([ True, False,  True, False, False, False,  True,  True, False,
        True])
array = np.array([10,20,30,40,50])
array > 30
array([False, False, False,  True,  True])

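np.where returns the indices that satisfy a condition; passing those indices back into the array retrieves the matching values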

np.where(array > 30)
(array([3, 4]),)
array[np.where(array > 30)]
array([40, 50])
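
np.where also has a three-argument form that picks between two values element by element. A small sketch, assuming the same kind of data as above (the names scores, kept, and positions are illustrative):

```python
import numpy as np

scores = np.array([10, 20, 30, 40, 50])

# Keep values above 30 and replace everything else with 0.
kept = np.where(scores > 30, scores, 0)

# flatnonzero returns the matching positions as a plain 1-D array.
positions = np.flatnonzero(scores > 30)

print(kept)       # [ 0  0  0 40 50]
print(positions)  # [3 4]
```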

Array data types, and checking how many bytes the array occupies in total

array = np.array([1,2,3,4,5],dtype=np.float32)
array
array([1., 2., 3., 4., 5.], dtype=float32)
array.dtype
dtype('float32')
array.nbytes
20

asarray can convert an array to another dtype without modifying the original array; astype likewise returns a new array

array=np.array([1,10,3.5,'str'])
array
array(['1', '10', '3.5', 'str'], dtype='<U32')
array = np.array([1,2,3,4,5])
np.asarray(array,dtype=np.float32)
array([1., 2., 3., 4., 5.], dtype=float32)
array
array([1, 2, 3, 4, 5])
array.astype(np.float32)
array([1., 2., 3., 4., 5.], dtype=float32)
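
One detail worth knowing is when these calls actually allocate a new array. The following sketch checks object identity and memory sharing; the variable names are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3, 4, 5])

same = np.asarray(ints)                         # no conversion needed: the input itself is returned
converted = np.asarray(ints, dtype=np.float32)  # conversion needed: a new array is created
copied = ints.astype(ints.dtype)                # astype copies by default, even for the same dtype

print(same is ints)                     # True
print(converted is ints)                # False
print(np.shares_memory(copied, ints))   # False
print(ints.dtype.kind)                  # i (the original is still an integer array)
```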

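Numerical computation on arrays: sum adds all elements and prod multiplies them; there are also the global maximum and minimum, the indices of the maximum and minimum, the mean, the standard deviation, and the variance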

array = np.array([[1,2,3],[4,5,6]])
array
array([[1, 2, 3],
       [4, 5, 6]])
np.sum(array)
np.int64(21)

Specify the axis (dimension) along which to operate

np.sum(array,axis=0)
array([5, 7, 9])
array.ndim
2
np.sum(array,axis=1)
array([ 6, 15])
np.sum(array,axis=-1)
array([ 6, 15])
array.sum()
np.int64(21)
array.sum(axis=0)
array([5, 7, 9])
array.prod()
np.int64(720)
array.min()
np.int64(1)

Finding index positions

array.argmin()
np.int64(0)
array.argmin(axis=1)
array([0, 0])
array.argmax()
np.int64(5)
array.mean()
np.float64(3.5)
array.mean(axis=0)
array([2.5, 3.5, 4.5])
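
A way to remember the axis argument: the axis you name is the one that disappears from the result. The sketch below also shows keepdims, which keeps the reduced axis with length 1 so the result can be subtracted from the original; the variable names are illustrative.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

col_sums = table.sum(axis=0)   # axis 0 (rows) is collapsed, shape (3,)
row_sums = table.sum(axis=1)   # axis 1 (columns) is collapsed, shape (2,)

# keepdims=True keeps the reduced axis as length 1, giving shape (2, 1).
row_means = table.mean(axis=1, keepdims=True)
centered = table - row_means

print(col_sums)         # [5 7 9]
print(row_sums)         # [ 6 15]
print(row_means.shape)  # (2, 1)
print(centered)
```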

Standard deviation

array.std()
np.float64(1.707825127659933)
array.std(axis=1)
array([0.81649658, 0.81649658])

Variance

array.var()
np.float64(2.9166666666666665)
array
array([[1, 2, 3],
       [4, 5, 6]])

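clip replaces values below the lower bound with the lower bound and values above the upper bound with the upper bound. round rounds to the nearest value (an exact half goes to the nearest even value), and decimals sets how many decimal places to keep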

array.clip(2,4)
array([[2, 2, 3],
       [4, 4, 4]])
array1 = np.array([1.2,3.34,4.56])
array1.round()
array([1., 3., 5.])
array1.round(decimals=1)
array([1.2, 3.3, 4.6])
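
NumPy's round differs from schoolbook rounding in one case: a value exactly halfway between two candidates is rounded to the even one. A minimal sketch (the array name halves is illustrative):

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

# Exact halves go to the nearest even value, not always upward.
rounded = halves.round()

print(rounded)  # [0. 2. 2. 4.]
```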

Sorting

array = np.array([[1.5,1.3,7.5],[5.6,7.8,1.2]])
array
array([[1.5, 1.3, 7.5],
       [5.6, 7.8, 1.2]])
np.sort(array)
array([[1.3, 1.5, 7.5],
       [1.2, 5.6, 7.8]])
np.sort(array,axis=0)
array([[1.5, 1.3, 1.2],
       [5.6, 7.8, 7.5]])
array
array([[1.5, 1.3, 7.5],
       [5.6, 7.8, 1.2]])
np.argsort(array)
array([[1, 0, 2],
       [2, 0, 1]])
array=np.linspace(0,10,10)
array
array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])

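values holds the values to be inserted; searchsorted returns the index at which each one would have to be inserted so that the sorted array stays sorted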

values=np.array([2.5,6.5,9.5])
np.searchsorted(array,values)
array([3, 6, 9])

array = np.array([[1,0,6],[1,7,0],[2,3,1],[2,4,0]])
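
The 4x3 array defined just above is not used again in the text. One plausible use, offered here as an assumption rather than as the author's intent, is a multi-key sort with np.lexsort: order the rows by the third column ascending and break ties by the first column descending.

```python
import numpy as np

table = np.array([[1, 0, 6],
                  [1, 7, 0],
                  [2, 3, 1],
                  [2, 4, 0]])

# np.lexsort treats the LAST key as the primary key.
# Primary: third column ascending. Tie-break: first column descending (negated).
order = np.lexsort((-table[:, 0], table[:, 2]))
sorted_rows = table[order]

print(order)        # [3 1 2 0]
print(sorted_rows)
```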

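Array operations

shape, reshape, and arange; newaxis adds a dimension; squeeze removes dimensions of length 1; transpose swaps the axes, and T does the same.

concatenate joins arrays, vstack stacks them vertically, hstack stacks them horizontally, and flatten and ravel both flatten an array to one dimension.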

import numpy as np
array = np.arange(10)
array
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
array.shape
(10,)
array.shape = 2,5
array
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
array.reshape(1,10)
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

The total number of elements must not change: with ten elements, a shape of 2,5 works but 2,4 does not

array.shape = 2,4
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

Cell In[9], line 1
----> 1 array.shape = 2,4


ValueError: cannot reshape array of size 10 into shape (2,4)
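
When only one dimension is known, reshape accepts -1 and works out the other from the element count. The sketch below shows that, and catches the size-mismatch error seen above instead of letting it stop the program; the variable names are illustrative.

```python
import numpy as np

values = np.arange(10)

# -1 asks NumPy to infer that dimension: 10 elements / 2 rows = 5 columns.
two_rows = values.reshape(2, -1)
print(two_rows.shape)  # (2, 5)

# 2 * 4 = 8 does not match 10 elements, so reshape raises ValueError.
try:
    values.reshape(2, 4)
    message = ""
except ValueError as err:
    message = str(err)
print(message)
```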
array = np.arange(10)
array.shape
(10,)

array = array[np.newaxis,:]
array.shape
(1, 10)

array = np.arange(10)
array.shape
(10,)
array = array[:,np.newaxis,np.newaxis]
array.shape
(10, 1, 1)
array = array.squeeze()
array.shape
(10,)
array.shape = 2,5
array
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
array.transpose()
array([[0, 5],
       [1, 6],
       [2, 7],
       [3, 8],
       [4, 9]])
array.T
array([[0, 5],
       [1, 6],
       [2, 7],
       [3, 8],
       [4, 9]])
a = np.array([[123,456,564],[345,678,754]])
a
array([[123, 456, 564],
       [345, 678, 754]])
b = np.array([[532,567,8970],[123,345,765]])
b
array([[ 532,  567, 8970],
       [ 123,  345,  765]])
c = np.concatenate((a,b))
c
array([[ 123,  456,  564],
       [ 345,  678,  754],
       [ 532,  567, 8970],
       [ 123,  345,  765]])
c.shape
(4, 3)
np.vstack((a,b))
array([[ 123,  456,  564],
       [ 345,  678,  754],
       [ 532,  567, 8970],
       [ 123,  345,  765]])
np.hstack((a,b))
array([[ 123,  456,  564,  532,  567, 8970],
       [ 345,  678,  754,  123,  345,  765]])
a
array([[123, 456, 564],
       [345, 678, 754]])
a.flatten()
array([123, 456, 564, 345, 678, 754])
a.ravel()
array([123, 456, 564, 345, 678, 754])
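
flatten and ravel print the same values, but they differ in whether the result shares memory with the source. A small check, assuming a freshly created (contiguous) array as in the example above:

```python
import numpy as np

a = np.array([[123, 456, 564],
              [345, 678, 754]])

flat_copy = a.flatten()  # always a copy
flat_view = a.ravel()    # a view when the memory layout allows it, as it does here

print(np.shares_memory(a, flat_copy))  # False
print(np.shares_memory(a, flat_view))  # True

# Writing through the view changes the original; the copy keeps the old value.
flat_view[0] = -1
print(a[0, 0], flat_copy[0])  # -1 123
```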
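
Array creation

arange builds a range of values; linspace takes a given number of evenly spaced values over an interval; logspace takes values evenly spaced on a log scale over a given exponent range; meshgrid builds coordinate grids; np.r_ and np.c_ build a row and a column respectively and accept several kinds of input, including slices and arrays; zeros and ones fill an array with a single value; identity puts 1 on the diagonal and 0 everywhere else.
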
np.arange(10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
np.arange(2,20,2)
array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])
np.arange(2,20,2,dtype=np.float32)
array([ 2.,  4.,  6.,  8., 10., 12., 14., 16., 18.], dtype=float32)
np.linspace(0,10,10)
array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])
np.logspace(0,1,5)
array([ 1.        ,  1.77827941,  3.16227766,  5.62341325, 10.        ])
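
The arguments of logspace are exponents, not endpoints: np.logspace(0,1,5) runs from 10**0 to 10**1. The sketch below checks that relationship against linspace and shows the base parameter; the variable names are illustrative.

```python
import numpy as np

points = np.logspace(0, 1, 5)

# Same values: evenly spaced exponents, then 10 raised to each of them.
by_hand = 10.0 ** np.linspace(0, 1, 5)
print(np.allclose(points, by_hand))  # True

# base changes what the exponents are applied to: 2**0, 2**1, 2**2, 2**3.
powers_of_two = np.logspace(0, 3, 4, base=2)
print(powers_of_two)  # [1. 2. 4. 8.]
```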
x = np.linspace(-10,10,5)
y = np.linspace(-10,10,5)
x
array([-10.,  -5.,   0.,   5.,  10.])
x,y=np.meshgrid(x,y)
x
array([[-10.,  -5.,   0.,   5.,  10.],
       [-10.,  -5.,   0.,   5.,  10.],
       [-10.,  -5.,   0.,   5.,  10.],
       [-10.,  -5.,   0.,   5.,  10.],
       [-10.,  -5.,   0.,   5.,  10.]])
y
array([[-10., -10., -10., -10., -10.],
       [ -5.,  -5.,  -5.,  -5.,  -5.],
       [  0.,   0.,   0.,   0.,   0.],
       [  5.,   5.,   5.,   5.,   5.],
       [ 10.,  10.,  10.,  10.,  10.]])
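
The usual reason to build such grids is to evaluate a function of two variables at every grid point without writing loops. A minimal sketch, using distance from the origin as an illustrative function:

```python
import numpy as np

axis = np.linspace(-10, 10, 5)
xx, yy = np.meshgrid(axis, axis)

# Element-wise arithmetic evaluates the function at all 25 grid points at once.
radius = np.sqrt(xx**2 + yy**2)

print(radius.shape)   # (5, 5)
print(radius[2, 2])   # 0.0, the grid point at the origin
```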
np.r_[0:10:1]
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
np.c_[0:10:1]
array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])
np.zeros(3)
array([0., 0., 0.])
np.zeros((3,3))
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
np.ones((3,3))
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
np.ones((3,3))*8
array([[8., 8., 8.],
       [8., 8., 8.],
       [8., 8., 8.]])
a=np.empty(6)
a.shape
(6,)
a.fill(1)
array = np.array([1,2,3,4])
array
array([1, 2, 3, 4])
np.zeros_like(array)
array([0, 0, 0, 0])
np.ones_like(array)
array([1, 1, 1, 1])
np.identity(5)
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])
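
Two related constructors are worth a look alongside these: np.full fills an array with any constant directly, and np.eye is a more general form of identity that can shift the diagonal. A short sketch with illustrative variable names:

```python
import numpy as np

# np.full avoids the multiply in np.ones((3,3))*8.
eights = np.full((3, 3), 8.0)
print(np.array_equal(eights, np.ones((3, 3)) * 8))  # True

# np.eye(n) matches np.identity(n); k moves the diagonal up (k > 0) or down (k < 0).
print(np.array_equal(np.eye(5), np.identity(5)))    # True
shifted = np.eye(3, k=1)
print(shifted)
```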

Operations

x = np.array([5,5])
y = np.array([2,2])
np.multiply(x,y)
array([10, 10])
np.dot(x,y)
np.int64(20)
x.shape
(2,)
x.shape=2,1
x
array([[5],
       [5]])
np.dot(x,y)
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

Cell In[64], line 1
----> 1 np.dot(x,y)


ValueError: shapes (2,1) and (2,) not aligned: 1 (dim 1) != 2 (dim 0)
y.shape = 1,2
print(x.shape)
print(y.shape)
(2, 1)
(1, 2)
np.dot(x,y)
array([[10, 10],
       [10, 10]])
np.dot(y,x)
array([[20]])
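
For 2-D arrays the @ operator gives the same result as np.dot and makes the shape rule easy to read: (2,1) @ (1,2) yields (2,2), while (1,2) @ (2,1) yields (1,1). A sketch with illustrative names:

```python
import numpy as np

col = np.array([[5], [5]])  # shape (2, 1)
row = np.array([[2, 2]])    # shape (1, 2)

outer = col @ row  # inner dimensions 1 and 1 match, result shape (2, 2)
inner = row @ col  # inner dimensions 2 and 2 match, result shape (1, 1)

print(outer)  # [[10 10]
              #  [10 10]]
print(inner)  # [[20]]
```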
x = np.array([1,1,1])
y = np.array([[4,5,6],[1,2,3]])
print(x*y)
[[4 5 6]
 [1 2 3]]
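
The multiplication above works because of broadcasting: shapes are compared starting from the last dimension, and a dimension matches if the sizes are equal or one of them is 1. The sketch below repeats the working case and adds a failing one; the variable names are illustrative.

```python
import numpy as np

row = np.array([1, 1, 1])                  # shape (3,)
matrix = np.array([[4, 5, 6], [1, 2, 3]])  # shape (2, 3)

# (3,) matches the trailing 3 of (2, 3), so the row is reused for each matrix row.
product = row * matrix
print(product.shape)  # (2, 3)

# A length-2 array does not match the trailing dimension 3, so this raises ValueError.
try:
    np.array([1, 1]) * matrix
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)  # True
```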
y = np.array([1,1,1,4])
x = np.array([1,1,1,2])
x == y
array([ True,  True,  True, False])
np.logical_and(x,y)
array([ True,  True,  True,  True])
np.logical_or(x,y)
array([ True,  True,  True,  True])
np.logical_not(x)
array([False, False, False, False])

The random module

import numpy as np
np.random.rand(3,2)
array([[0.17953902, 0.1301528 ],
       [0.0380382 , 0.15961125],
       [0.71432336, 0.20237312]])
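randint returns random integers from a half-open interval: the lower bound is included and the upper bound is excluded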
np.random.randint(10,size=(5,4))
array([[0, 7, 9, 8],
       [0, 0, 2, 7],
       [6, 7, 4, 9],
       [8, 7, 1, 0],
       [2, 6, 6, 5]], dtype=int32)
np.random.rand()
0.19414180564410888
np.random.random_sample()
0.8492017836540788
np.random.randint(0,10,3)
array([3, 8, 1], dtype=int32)
mu,sigma = 0,0.1
np.random.normal(mu,sigma,10)
array([ 0.05971091,  0.11068208,  0.02777231,  0.06185503, -0.08028422,
       -0.22575989, -0.10443067,  0.01615294,  0.03461024,  0.08860057])

Shuffling and the random seed

array = np.arange(10)
np.random.shuffle(array)
array
array([4, 6, 5, 8, 1, 0, 9, 2, 3, 7])
np.random.seed(10)
mu,sigma = 0,0.1
np.random.normal(mu,sigma,10)
array([ 0.13315865,  0.0715279 , -0.15454003, -0.00083838,  0.0621336 ,
       -0.07200856,  0.02655116,  0.01085485,  0.00042914, -0.01746002])
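
The examples in this section use the global functions under np.random. NumPy also provides a Generator interface through np.random.default_rng, where each generator carries its own seed. The sketch below is an alternative to the calls above, not a replacement the article itself uses; the variable names are illustrative.

```python
import numpy as np

# Two generators with the same seed produce the same stream of numbers.
rng_a = np.random.default_rng(10)
rng_b = np.random.default_rng(10)

sample_a = rng_a.normal(0, 0.1, 10)
sample_b = rng_b.normal(0, 0.1, 10)
print(np.array_equal(sample_a, sample_b))  # True

# integers is half-open like randint: values come from 0..9 here.
ints = rng_a.integers(0, 10, size=3)

# permutation returns a shuffled copy instead of shuffling in place.
perm = rng_a.permutation(10)
print(ints, perm)
```

Because the seed lives in the generator object, seeding one part of a program this way does not affect random numbers drawn elsewhere.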

Reading and writing data with NumPy

%%writefile king.txt
1 2 3 4 5 6
3 4 5 6 7 8
Writing king.txt
data = []
with open('king.txt') as f:
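    for line in f.readlines():
        fields = line.split()
        print(fields)
        cur_data = [float(x) for x in fields]
        data.append(cur_data)
data = np.array(data)
data
['1', '2', '3', '4', '5', '6']
['3', '4', '5', '6', '7', '8']
array([[1., 2., 3., 4., 5., 6.],
       [3., 4., 5., 6., 7., 8.]])

split breaks on whitespace by default, so each line comes back as a list of strings; every string is then converted to a float and the row is appended to a list. np.loadtxt does the same job in one call:
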
data = np.loadtxt('king.txt')
data
array([[1., 2., 3., 4., 5., 6.],
       [3., 4., 5., 6., 7., 8.]])
%%writefile king2.txt
1,2,3,4,5,6
4,5,6,7,8,7
Writing king2.txt
data = np.loadtxt('king2.txt',delimiter = ',')
data
array([[1., 2., 3., 4., 5., 6.],
       [4., 5., 6., 7., 8., 7.]])
%%writefile king3.txt
x,y,z,w,a,b
1,2,3,4,5,6
3,4,5,6,7,8
Writing king3.txt
data = np.loadtxt('king3.txt',delimiter = ',',skiprows = 1)
data
array([[1., 2., 3., 4., 5., 6.],
       [3., 4., 5., 6., 7., 8.]])

skiprows: how many leading rows to skip; delimiter=',': the field separator; usecols=(0,1,4): which columns to read
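
usecols is described above but not demonstrated. Here is a minimal sketch that reads the same kind of data from an in-memory buffer (io.StringIO stands in for king3.txt, so no file is needed) and keeps only columns 0, 1, and 4:

```python
import io

import numpy as np

# Same layout as king3.txt: one header row, then comma-separated numbers.
text = io.StringIO("x,y,z,w,a,b\n1,2,3,4,5,6\n3,4,5,6,7,8\n")

subset = np.loadtxt(text, delimiter=",", skiprows=1, usecols=(0, 1, 4))

print(subset)  # [[1. 2. 5.]
               #  [3. 4. 7.]]
```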

Reading and writing arrays

array = np.array([[1,2,3],[4,5,6]])
np.save('king.npy',array)
array
array([[1, 2, 3],
       [4, 5, 6]])
array1 = np.load('king.npy')
array2 = np.arange(10)
array2
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
np.savez('queen.npz',a=array1,b=array2)
data=np.load('queen.npz')
data.keys()
KeysView(NpzFile 'queen.npz' with keys: a, b)
data['a']
array([[1, 2, 3],
       [4, 5, 6]])
data['b']
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Exercises

Print the current NumPy version

print(np.__version__)
2.0.2

Create an all-zeros matrix and print how much memory it occupies

z=np.zeros((5,5))
print('%d bytes'%(z.size*z.itemsize))
200 bytes

Print the documentation of a function, for example numpy.add

np.info(np.add)
add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature])

Add arguments element-wise.

Parameters
----------
x1, x2 : array_like
    The arrays to be added.
    If ``x1.shape != x2.shape``, they must be broadcastable to a common
    shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or None,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    This condition is broadcast over the input. At locations where the
    condition is True, the `out` array will be set to the ufunc result.
    Elsewhere, the `out` array will retain its original value.
    Note that if an uninitialized `out` array is created via the default
    ``out=None``, locations within it where the condition is False will
    remain uninitialized.
**kwargs
    For other keyword-only arguments, see the
    :ref:`ufunc docs <ufuncs.kwargs>`.

Returns
-------
add : ndarray or scalar
    The sum of `x1` and `x2`, element-wise.
    This is a scalar if both `x1` and `x2` are scalars.

Notes
-----
Equivalent to `x1` + `x2` in terms of array broadcasting.

Examples
--------
>>> np.add(1.0, 4.0)
5.0
>>> x1 = np.arange(9.0).reshape((3, 3))
>>> x2 = np.arange(3.0)
>>> np.add(x1, x2)
array([[  0.,   2.,   4.],
       [  3.,   5.,   7.],
       [  6.,   8.,  10.]])

The ``+`` operator can be used as a shorthand for ``np.add`` on ndarrays.

>>> x1 = np.arange(9.0).reshape((3, 3))
>>> x2 = np.arange(3.0)
>>> x1 + x2
array([[ 0.,  2.,  4.],
       [ 3.,  5.,  7.],
       [ 6.,  8., 10.]])

Create an array of the values 10 to 49 and reverse its order

array=np.arange(10,50,1)
array=array[::-1]
array
array([49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33,
       32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
       15, 14, 13, 12, 11, 10])

Find the indices of the non-zero elements of an array

np.nonzero([1,2,4,0,7,6,0,87])
(array([0, 1, 2, 4, 5, 7]),)

Build a random 3*3 matrix and print its maximum and minimum

array = np.random.random((3,3))
array.min()
array.max()
np.float64(0.8052231968327465)

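Build a 5*5 matrix whose values are all 1 and surround it with a border of 0s (pad_width=2 in the code below makes the border two elements wide)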

array=np.ones((5,5))
array=np.pad(array,pad_width=2,mode='constant',constant_values=0)
array
array([[0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 1., 1., 1., 1., 0., 0.],
       [0., 0., 1., 1., 1., 1., 1., 0., 0.],
       [0., 0., 1., 1., 1., 1., 1., 0., 0.],
       [0., 0., 1., 1., 1., 1., 1., 0., 0.],
       [0., 0., 1., 1., 1., 1., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.]])
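
If a single ring of zeros is wanted instead, pad_width=1 does it. A short sketch with illustrative variable names; the resulting shape grows by 2 in each dimension, from (5, 5) to (7, 7).

```python
import numpy as np

ones = np.ones((5, 5))

# pad_width=1 adds one element on every side of every axis.
bordered = np.pad(ones, pad_width=1, mode="constant", constant_values=0)

print(bordered.shape)  # (7, 7)
print(bordered[0])     # the first row is all zeros
print(bordered.sum())  # 25.0, the original ones are untouched
```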

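For a matrix of shape (6,7,8), find the multi-dimensional index of the element at flat index 100 (counting from 0, so this is the 101st element)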

np.unravel_index(100,(6,7,8))
(np.int64(1), np.int64(5), np.int64(4))
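
The result can be verified by hand, since 1*56 + 5*8 + 4 = 100, or with np.ravel_multi_index, which performs the reverse conversion. The sketch also shows the call for the 100th element in the everyday, 1-based sense, which sits at flat index 99; the variable names are illustrative.

```python
import numpy as np

shape = (6, 7, 8)

index = np.unravel_index(100, shape)
flat = np.ravel_multi_index(index, shape)  # back to the flat index

# The 100th element when counting from 1 has flat index 99.
hundredth = np.unravel_index(99, shape)

print(tuple(int(i) for i in index))      # (1, 5, 4)
print(int(flat))                         # 100
print(tuple(int(i) for i in hundredth))  # (1, 5, 3)
```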

Normalize a 5*5 matrix (min-max scaling to the range 0 to 1)

array = np.random.random((5,5))
max_val = array.max()  # named max_val and min_val so the built-ins max and min are not shadowed
min_val = array.min()
array = (array-min_val)/(max_val-min_val)
array
array([[4.82916777e-01, 9.11170266e-01, 2.45605198e-01, 6.46273968e-01,
        1.00000000e+00],
       [5.73535042e-01, 6.37970703e-01, 0.00000000e+00, 3.68131543e-01,
        4.67040973e-02],
       [3.08237106e-01, 3.37487751e-01, 8.50614944e-01, 7.84484361e-04,
        4.51867772e-01],
       [3.19199938e-01, 6.91574756e-01, 3.55584952e-01, 4.41849228e-03,
        9.73462352e-01],
       [8.38351949e-01, 9.71356473e-01, 4.37991291e-01, 6.55776506e-01,
        5.49111069e-01]])

Find the values that two arrays have in common

z1 = np.random.randint(0,10,10)
z2 = np.random.randint(0,10,10)
print(z1)
print(z2)
print(np.intersect1d(z1,z2))
[5 9 0 4 6 6 0 2 3 3]
[2 6 0 5 1 3 6 5 5 1]
[0 2 3 5 6]

Author: 智模睿脑君
