代码收藏家技术教程 2024-12-23

python进行Theil-Sen Median斜率估计和Mann-Kendall趋势分析

Theil-Sen Median斜率估计和Mann-Kendall趋势分析

Theil-Sen Median斜率估计和Mann-Kendall趋势分析都是非参数统计方法，用于分析时间序列数据中的趋势。

应用案例

Liu等（2024）利用Theil-Sen Median斜率估计和Mann-Kendall趋势分析对沉积物连续性指标（IC）1990-2020年间30年的变化趋势进行了分析

一般来说，可以根据Theil-Sen Median计算出的β值和Mann-Kendall计算出的Z值进行趋势分类
在Theil-Sen Median斜率估计和Mann-Kendall趋势分析：以多年NPP数据为例这篇博客中提到了具体的分类表如下：

代码思路

我这里有2000-2022年的NDVI的tif文件，可以参考我这篇博客获取NDVI数据GEE平台获取指定区域DEM。我将这些文件命名为2000_NDVI.tif、2021_NDVI.tif等等。

以下是需要导入的包：

import glob
import re
import numpy as np
import rasterio
from pymannkendall import original_test
from scipy.stats import theilslopes
from rasterio.transform import from_bounds

1、我将代码文件与tif文件放在同一文件夹中，首先就是要匹配出所有的tif文件

# 查找当前目录下所有的.tif文件，这些tif文件都只有一个波段
tif_files = glob.glob('*.tif')

虽然匹配到了所有的tif文件，但不是按照年份顺序排列的，所有接下来需要对文件列表重新排序

# 定义一个辅助函数，用于从文件名中提取年份
def extract_year(filename):
    # 使用正则表达式查找文件名中的第一个四位数作为年份
    match = re.search(r'\d{4}', filename)
    if match:
        return int(match.group())  # 返回找到的年份
    else:
        return 0  # 如果没有找到年份，则返回0或任何其他默认值


# 根据年份对文件列表进行排序
sorted_tif_files = sorted(tif_files, key=extract_year)

这样就把所有的tif文件按照年份从小到大排列了

2、获取tif文件所有像元的值，并将这些tif文件像元的值组合起来形成像元矩阵

# 创建像元矩阵
pixels_matrix = []
for i in sorted_tif_files:
    pixels_list = []
    with rasterio.open(i) as src:
        data = src.read(1)  # 读取第一个波段
        for row in data:
            for value in row:
                pixels_list.append(value)
    pixels_matrix.append(pixels_list)

这样就得到了像元矩阵，矩阵中的行代表某一年的所有像元值。固定列，从第一行到最后一行，即是某一个像元的时间序列。这样获取某一列数据就代表某个像元的时间序列数据，或者可以对矩阵进行转置，这样每一行就代表某个像元的时间序列，这里直接采用列数据。

# 获取像元矩阵中每一列的元素，即NDVI时间序列
# 使用zip函数解压缩得到列
columns = list(zip(*pixels_matrix))
# 将每列转换为列表
columns = [list(column) for column in columns]

3、根据β和Z值的分类表创建趋势分类函数

# 分类规则
def classify_trends(beta, z):
    if np.isnan(beta) or np.isnan(z):
        return np.nan

    if beta == 0:
        return 0  # 无变化

    if beta > 0:
        if z > 2.58:
            return 4  # 极显著增加
        elif 1.96 < z <= 2.58:
            return 3  # 显著增加
        elif 1.65 < z <= 1.96:
            return 2  # 微显著增加
        else:
            return 1  # 不显著增加

    elif beta < 0:
        if z > 2.58:
            return -4  # 极显著减少
        elif 1.96 < z <= 2.58:
            return -3  # 显著减少
        elif 1.65 < z <= 1.96:
            return -2  # 微显著减少
        else:
            return -1  # 不显著减少

4、利用Theil-Sen Median斜率估计和Mann-Kendall趋势分析计算每个像元序列的β和Z值，并根据分类函数进行分类

classify_result = []
# 获取每个序列的β(slope)和Z值，并根据分类函数赋予指定的分类标量
for j in columns:
    j_array = np.array(j)
    has_nan = np.isnan(j_array).any()
    if has_nan:
        classify_result.append(np.nan)
    else:
        # 1. Theil-Sen Median 斜率估计
        slope, intercept, lower, upper = theilslopes(j, range(len(j)))

        # 2. Mann-Kendall 检验
        mk_result = original_test(j)
        z_value = mk_result.z
        classify_result.append(classify_trends(slope, z_value))

这样就获取到了每个像元的分类结果

5、将分类结果写入到新的tif文件中

# 将分类好的结果写入到新的tif文件中
# 打开原始TIFF文件
with rasterio.open(sorted_tif_files[0]) as src:
    # 获取源文件的高度和宽度
    height, width = src.shape
    # 确保新值列表长度与原图大小一致
    if len(classify_result) != height * width:
        raise ValueError("The length of the new values list does not match the size of the TIFF file.")

    # 将一维列表转换为二维数组
    new_data = np.array(classify_result).reshape(height, width)

    # 读取源文件的元数据
    meta = src.meta
    # 获取图像的边界
    bounds = src.bounds
    # 如果需要，可以调整元数据
    # 比如改变数据类型
    # meta.update(dtype=new_data.dtype)  # 根据你的new_data类型调整
# 写入新的像元值到TIFF文件
with rasterio.open('final.tif', 'w', **meta) as dst:
    # 将new_data写入到dst
    dst.write(new_data, 1)  # 假设只有一个波段，所以是1

    # 如果需要设置地理参考信息
    if not dst.crs:
        dst.crs = src.crs  # 使用源文件的坐标系统
    if not dst.transform:
        dst.transform, _ = from_bounds(*bounds, width, height)

这样就获取到了最终分类结果的final.tif文件

完整代码如下：

import glob
import re

import numpy as np
import rasterio
from pymannkendall import original_test
from scipy.stats import theilslopes
from rasterio.transform import from_bounds

# 查找当前目录下所有的.tif文件，这些tif文件都只有一个波段
tif_files = glob.glob('*.tif')


# 定义一个辅助函数，用于从文件名中提取年份
def extract_year(filename):
    # 使用正则表达式查找文件名中的第一个四位数作为年份
    match = re.search(r'\d{4}', filename)
    if match:
        return int(match.group())  # 返回找到的年份
    else:
        return 0  # 如果没有找到年份，则返回0或任何其他默认值


# 根据年份对文件列表进行排序
sorted_tif_files = sorted(tif_files, key=extract_year)

# 创建像元矩阵
pixels_matrix = []
for i in sorted_tif_files:
    pixels_list = []
    with rasterio.open(i) as src:
        data = src.read(1)  # 读取第一个波段
        for row in data:
            for value in row:
                pixels_list.append(value)
    pixels_matrix.append(pixels_list)

# 获取像元矩阵中每一列的元素，即NDVI时间序列
# 使用zip函数解压缩得到列
columns = list(zip(*pixels_matrix))
# 将每列转换为列表
columns = [list(column) for column in columns]


# 分类规则
def classify_trends(beta, z):
    if np.isnan(beta) or np.isnan(z):
        return np.nan

    if beta == 0:
        return 0  # 无变化

    if beta > 0:
        if z > 2.58:
            return 4  # 极显著增加
        elif 1.96 < z <= 2.58:
            return 3  # 显著增加
        elif 1.65 < z <= 1.96:
            return 2  # 微显著增加
        else:
            return 1  # 不显著增加

    elif beta < 0:
        if z > 2.58:
            return -4  # 极显著减少
        elif 1.96 < z <= 2.58:
            return -3  # 显著减少
        elif 1.65 < z <= 1.96:
            return -2  # 微显著减少
        else:
            return -1  # 不显著减少


classify_result = []
# 获取每个序列的β(slope)和Z值，并根据分类函数赋予指定的分类标量
for j in columns:
    j_array = np.array(j)
    has_nan = np.isnan(j_array).any()
    if has_nan:
        classify_result.append(np.nan)
    else:
        # 1. Theil-Sen Median 斜率估计
        slope, intercept, lower, upper = theilslopes(j, range(len(j)))

        # 2. Mann-Kendall 检验
        mk_result = original_test(j)
        z_value = mk_result.z
        classify_result.append(classify_trends(slope, z_value))

# 将分类好的结果写入到新的tif文件中
# 打开原始TIFF文件
with rasterio.open(sorted_tif_files[0]) as src:
    # 获取源文件的高度和宽度
    height, width = src.shape
    # 确保新值列表长度与原图大小一致
    if len(classify_result) != height * width:
        raise ValueError("The length of the new values list does not match the size of the TIFF file.")

    # 将一维列表转换为二维数组
    new_data = np.array(classify_result).reshape(height, width)

    # 读取源文件的元数据
    meta = src.meta
    # 获取图像的边界
    bounds = src.bounds
    # 如果需要，可以调整元数据
    # 比如改变数据类型
    # meta.update(dtype=new_data.dtype)  # 根据你的new_data类型调整
# 写入新的像元值到TIFF文件
with rasterio.open('final.tif', 'w', **meta) as dst:
    # 将new_data写入到dst
    dst.write(new_data, 1)  # 假设只有一个波段，所以是1

    # 如果需要设置地理参考信息
    if not dst.crs:
        dst.crs = src.crs  # 使用源文件的坐标系统
    if not dst.transform:
        dst.transform, _ = from_bounds(*bounds, width, height)

结果展示

由于我每个tif文件都包含四百多万个像元，所以运行上述代码比较耗时。这里我以2000-2004年的变化趋势为例，大概运行了几分钟。下面是QGIS展示的结果，可以看到大部分区域都是不显著减少或者增加，有少量地区出现了微显著增加。

参考文献

Liu, Chuanxiu, et al. “Impact of climatic and geomorphologic drivers on sediment connectivity in the Tarim River Basin, China.” Journal of Hydrology 643 (2024): 132027.

作者：彭博锐