A Detailed Guide to Troubleshooting Python Memory Leaks

  • 1. Diagnostic tools
  • 1.1 gc
  • 1.2 tracemalloc
  • 1.3 mem_top
  • 1.4 guppy
  • 1.5 objgraph
  • 1.6 pympler
  • 1.7 pyrasite
  • 2. Case study
  • 3. References

    This post records how I tracked down a memory leak in a Python program.

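    1. Diagnostic tools

    Tool         Description
    gc           Built-in module of the Python standard library
    tracemalloc  Recommended. Part of the standard library since Python 3.4
    mem_top      Recommended. A wrapper around gc that prints the top N entries in sorted order; fast to run
    guppy        Collects statistics on the objects in the heap; quite practical, but slow to compute
    objgraph     Draws object reference graphs; suited to programs with few object types and a simple structure
    pympler      Summarizes memory usage by type and measures object sizes
    pyrasite     A very powerful third-party library that can inject into a running Python process and modify its data and code on the fly

    The official documentation of each tool has detailed explanations and basic examples; this post only covers the most common usage of each.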

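    1.1 gc

    gc is a built-in module available in both Python 2 and Python 3, which makes it very convenient to use.

    Commonly used functions:

  • gc.collect(generation=2) runs a full collection when called without arguments. When hunting for a memory leak, call it manually before taking measurements so that garbage which has not yet been collected does not distort the numbers;
  • gc.get_objects() returns a list of all objects tracked by the collector;
  • gc.get_referrers(*objs) returns the list of objects that directly refer to any of objs. It only finds containers that support garbage collection; extension types that refer to other objects but do not support garbage collection will not be found;
  • gc.get_referents(*objs) returns the list of objects directly referred to by any of the arguments. When investigating a leak, it is typically used to inspect what a suspicious object holds;
  • sys.getsizeof(obj) (from the sys module, not gc) returns the size of the object in bytes. Only the memory directly attributed to the object is counted, not the memory of the objects it refers to.

    Example usage: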

    import gc
    import sys

    def top_memory(limit=3):
        """Print the `limit` largest GC-tracked objects and what they refer to."""
        gc.collect()
        objs_by_size = []
        for obj in gc.get_objects():
            size = sys.getsizeof(obj)
            objs_by_size.append((obj, size))
        # Sort by shallow size, largest first
        sorted_objs = sorted(objs_by_size, key=lambda x: x[1], reverse=True)
        for obj, size in sorted_objs[:limit]:
            print(f"size: {size/1024/1024:.2f}MB, type: {type(obj)}, obj: {id(obj)} ")
            # Print the objects this object directly refers to
            for item in gc.get_referents(obj):
                print(f"{item}\n")
    

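    1.2 tracemalloc

    Part of the standard library since Python 3.4.

    The tracemalloc module is a debugging tool that traces the memory blocks allocated by Python. It provides the following information:

  • The traceback of where an object was allocated
  • Statistics on allocated memory blocks per file and per line: total size, number of blocks, and average block size
  • The difference between two snapshots, to help detect memory leaks

    Commonly used functions:

  • tracemalloc.start() starts tracing Python memory allocations; it can be called at runtime
  • tracemalloc.take_snapshot() takes a snapshot of the traces of memory blocks allocated by Python and returns a new Snapshot instance
  • Snapshot.compare_to() computes the differences with an older snapshot

    Code example: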

    import tracemalloc
    tracemalloc.start()
    # ... start your application ...
    
    snapshot1 = tracemalloc.take_snapshot()
    # ... call the function leaking memory ...
    snapshot2 = tracemalloc.take_snapshot()
    
    top_stats = snapshot2.compare_to(snapshot1, 'lineno')
    
    print("[ Top 10 differences ]")
    for stat in top_stats[:10]:
        print(stat)
    

    The official tracemalloc documentation has a very detailed description and usage examples.

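    1.3 mem_top

    mem_top is essentially a wrapper around functions of the gc module. Calling mem_top.mem_top() returns a ready-to-print report with the top N entries under three orderings: by number of references, by memory size, and by object count per type.

    Install: pip install mem-top

    Function signature: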

    mem_top(
        limit=10,                           # limit of top lines per section
        width=100,                          # width of each line in chars
        sep='\n',                           # char to separate lines with
        refs_format='{num}\t{type} {obj}',  # format of line in "refs" section
        bytes_format='{num}\t {obj}',       # format of line in "bytes" section
        types_format='{num}\t {obj}',       # format of line in "types" section
        verbose_types=None,                 # list of types to sort values by `repr` length
        verbose_file_name='/tmp/mem_top',   # name of file to store verbose values in
    )
    

    Example output of mem_top.mem_top(limit=3, width=200):

    refs:
    1638	<type 'dict'> {'IPython.core.error': <module 'IPython.core.error' from '/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/IPython/core/error.pyc'>, 'ipython_genutils.py3compat': <module 'ipython_ge
    765		<type 'list'> [u'd = {\n    "@babel/core": "^7.24.4",\n    "@babel/plugin-proposal-class-properties": "^7.18.6",\n    "@babel/preset-env": "^7.9.5",\n    "@jest/globals": "^29.7.0",\n    "babel-eslint": "^10.1.0",\
    765		<type 'list'> [u'd = {\n    "@babel/core": "^7.24.4",\n    "@babel/plugin-proposal-class-properties": "^7.18.6",\n    "@babel/preset-env": "^7.9.5",\n    "@jest/globals": "^29.7.0",\n    "babel-eslint": "^10.1.0",\
    
    bytes:
    49432	 {'IPython.core.error': <module 'IPython.core.error' from '/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/IPython/core/error.pyc'>, 'ipython_genutils.py3compat': <module 'ipython_ge
    33000	 set(['disp', 'union1d', 'all', 'issubsctype', 'atleast_2d', 'setmember1d', 'restoredot', 'ptp', 'blackman', 'pkgload', 'tostring', 'tri', 'arrayrange', 'array_equal', 'item', 'indices', 'loads', 'roun
    12584	 {u'': 0, u'pmem_top.mem_top(limit=3, width=200) ': 37, u'primem_top.mem_top(limit=3, width=200) ': 39, u'printmem_top.mem_top() ': 23, u'print mem_top.mem_top(limit) ': 29, u'print mem_top.mem_top(lim
    
    types:
    8581	 <type 'function'>
    7527	 <type 'tuple'>
    6102	 <type 'dict'>
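
    To make the three sections of the report concrete, the sketch below computes similar rankings with gc and sys alone. It is only an approximation written for this post, not mem_top's actual implementation. rough_mem_top() is a hypothetical name, and the sketch assumes that "refs" means the number of objects a container directly refers to.

```python
import gc
import heapq
import sys
from collections import Counter


def rough_mem_top(limit=3):
    """Return (refs, sizes, types) top-N lists for all GC-tracked objects."""
    gc.collect()
    objs = gc.get_objects()
    # "refs": objects that directly refer to the most other objects
    refs = heapq.nlargest(
        limit, ((len(gc.get_referents(o)), type(o).__name__) for o in objs)
    )
    # "bytes": largest shallow sizes (0 if an object cannot report its size)
    sizes = heapq.nlargest(
        limit, ((sys.getsizeof(o, 0), type(o).__name__) for o in objs)
    )
    # "types": most numerous types
    types = Counter(type(o).__name__ for o in objs).most_common(limit)
    return refs, sizes, types


big = list(range(200_000))  # stands in for a container that grew too large
refs, sizes, types = rough_mem_top(limit=3)
print("refs:", refs)
print("bytes:", sizes)
print("types:", types)
```

    A container that keeps growing shows up at the top of both the refs and the bytes rankings, which is the pattern to look for in the real report.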
    

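    1.4 guppy

    guppy is a very powerful tool, but it has an obvious drawback: it is slow, which makes it unsuitable for debugging in production.

    Install: pip install guppy (Python 2; the Python 3 fork is published as guppy3)

    Note: the library inspects objects through their dir()-related attributes, so a faulty custom __dir__ implementation will cause an exception when the library initializes.

    Common usage: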

    import datetime
    import guppy
    
    # hpy() creates the session context that gives access to heap information
    analyzer = guppy.hpy()
    
    def do_something():
    
        # run your app ...
    
        print("==={} heap total===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
        # Get the details of the current heap
        heap = analyzer.heap()
        print(heap)
        # byvia partitions the objects by how they are referred to;
        # heap[0] is the row that uses the most memory
        print("==={} references===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
        references = heap[0].byvia
        print(references)
        print("==={} references detail===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
        print(references[0].kind)  # kind
        print(references[0].shpaths)  # shortest reference paths
        print(references[0].rp)  # reference pattern
    

    Output:

    ===2024-07-21 16:27:12 heap total===
    Partition of a set of 785315 objects. Total size = 104732120 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0 396372  50 35974232  34  35974232  34 unicode
         1  23029   3 23814136  23  59788368  57 dict (no owner)
         2 143799  18 13556704  13  73345072  70 str
         3  75473  10  7372992   7  80718064  77 tuple
         4   1085   0  2634680   3  83352744  80 dict of module
         5   2764   0  2500384   2  85853128  82 type
         6  19206   2  2458368   2  88311496  84 types.CodeType
         7  15857   2  2409224   2  90720720  87 list
         8  19402   2  2328240   2  93048960  89 function
         9   2764   0  2215840   2  95264800  91 dict of type
    <931 more rows. Type e.g. '_.more' to view.>
    ===2024-07-21 16:27:14 references===
    Partition of a set of 396372 objects. Total size = 35974232 bytes.
     Index  Count   %     Size   % Cumulative  % Referred Via:
         0  18748   5  1371888   4   1371888   4 '.keys()[0]'
         1  13046   3   974352   3   2346240   7 '.keys()[1]'
         2   9958   3   724328   2   3070568   9 '.keys()[2]'
         3   9027   2   658576   2   3729144  10 '.keys()[3]'
         4   8636   2   632264   2   4361408  12 '.keys()[4]'
         5   8175   2   607032   2   4968440  14 '.keys()[5]'
         6    715   0   515688   1   5484128  15 '.func_doc', '[0]'
         7   6557   2   502880   1   5987008  17 '.keys()[6]'
         8   5785   1   428904   1   6415912  18 '.keys()[7]'
         9   5168   1   392432   1   6808344  19 '.keys()[8]'
    <3213 more rows. Type e.g. '_.more' to view.>
    ===2024-07-21 16:27:16 references detail===
    <via '.keys()[0]'>
     0: hpy().Root.i0_modules['kombu'].__dict__.keys()[0]
    Reference Pattern by <[dict of] class>.
     0: _ --- [-] 18748 <via '.keys()[0]'>: 0x7ff3f82dec30, 0x7ff3f82decc0...
     1: a      [-] 18753 dict (no owner): 0x7ff3f82f7050*24, 0x7ff3f82f73b0*3...
     2: aa ---- [-] 317 dict (no owner): 0x7ff3f88e43b0*1, 0x7ff3f88e44d0*1...
     3: a3       [-] 77 dict of aliyunsdkcore.endpoint.endpoint_resolver_rules.En...
     4: a4 ------ [-] 77 aliyunsdkcore.endpoint.endpoint_resolver_rules.EndpointR...
     5: a5         [-] 77 list: 0x7ff3f88f65f0*6, 0x7ff3f897e7d0*6...
     6: a6 -------- [-] 77 dict of aliyunsdkcore.endpoint.chained_endpoint_resolv...
     7: a7           [+] 77 aliyunsdkcore.endpoint.chained_endpoint_resolver.Chai...
     8: aab ---- [-] 80 dict (no owner): 0x7ff3f88e44d0*1, 0x7ff3f88e8b90*1...
     9: aaba      [-] 78 dict of aliyunsdkcore.retry.retry_condition.DefaultConfi...
    <Type e.g. '_.more' for more.>
    

    Besides the official documentation, the doc attributes of the objects themselves describe what is available:

    analyzer = guppy.hpy()
    heap = analyzer.heap()
    print("============== Heap Documents ====================")
    print(analyzer.doc)
    print("============= Heap Status Documents ================")
    print(heap.doc)
    

    Output:

    ============== Heap Documents ====================
    Top level interface to Heapy. Available attributes:
    Anything            Nothing             Via                 iso
    Class               Rcs                 doc                 load
    Clodo               Root                findex              monitor
    Id                  Size                heap                pb
    Idset               Type                heapu               setref
    Module              Unity               idset               test
    Use eg: hpy().doc.<attribute> for info on <attribute>.
    ============= Heap Status Documents ================
    biper               byvia               get_examples        parts
    brief               count               get_render          pathsin
    by                  dictof              get_rp              pathsout
    byclass             diff                get_shpaths         referents
    byclodo             disjoint            imdom               referrers
    byid                doc                 indisize            rp
    byidset             dominos             kind                shpaths
    bymodule            domisize            maprox              size
    byrcs               dump                more                sp
    bysize              er                  nodes               stat
    bytype              fam                 owners              test_contains
    byunity             get_ckc             partition           theone
    

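    As the Heap Status documentation shows, there are other grouping attributes besides byvia. A few of them:

  • byvia groups the heap status entries by how the objects are referred to;
  • bysize groups the entries by the individual size of each object;
  • bytype groups the entries by object type; all dict entries are merged into a single entry;
  • byrcs groups the entries by the kinds of their referrers;
  • bymodule groups the entries by module;
  • byunity groups the entries by total size;
  • byidset groups the entries by idset;
  • byid groups the entries by memory address.

    In most cases byvia and bysize are enough to solve the problem.

    For more usage examples, see "guppy/heapy – Profile Memory Usage in Python".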

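    1.5 objgraph

    Install: pip install objgraph

    For a quick overview of the objects in memory, call show_most_common_types().
    objgraph takes a snapshot of all live objects; call show_growth() to see what has changed since the previous call.

    Common usage: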

    import objgraph
    import datetime
    
    def do_something():
    
        # run your app ...
    
        print("==={} show_most_common_types===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
        objgraph.show_most_common_types(limit=5)
        print("==={} show_growth===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
        objgraph.show_growth(limit=5)
    

    Output:

    ===2024-07-21 16:41:14 show_most_common_types===
    function 18495
    list     16072
    dict     10912
    tuple    6515
    weakref  3773
    ===2024-07-21 16:41:14 show_growth===
    function    18495    +18495
    list        16072    +16072
    dict        10903    +10903
    tuple        6503     +6503
    weakref      3773     +3773
    

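    objgraph can also draw object reference graphs, which makes the relationships easy to see; this requires Graphviz, and xdot is handy for viewing the graphs interactively.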
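
    The idea behind show_growth() can be sketched with the standard library: count live objects per type, remember the counts, and report the types whose count increased since the previous call. This is a simplified illustration, not objgraph's implementation; type_growth() and Node are hypothetical names.

```python
import gc
from collections import Counter

_previous = Counter()  # per-type counts recorded by the previous call


def type_growth(limit=5):
    """Return [(type_name, count, delta)] for types that grew since the last call."""
    global _previous
    gc.collect()
    current = Counter(type(obj).__name__ for obj in gc.get_objects())
    grown = [
        (name, count, count - _previous[name])
        for name, count in current.items()
        if count > _previous[name]
    ]
    _previous = current
    # Largest increase first
    return sorted(grown, key=lambda row: row[2], reverse=True)[:limit]


class Node:
    """Hypothetical application object."""


type_growth()                          # first call only records the baseline
nodes = [Node() for _ in range(1000)]  # simulate objects piling up
for name, count, delta in type_growth():
    print(f"{name:<12}{count:>8}{delta:>+8}")
```

    As with objgraph, the first call reports everything as new, so the interesting output starts with the second call.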

    1.6 pympler

    Install: pip install pympler

    Common usage:

    import datetime
    from pympler import tracker, muppy, summary
    
    tr = tracker.SummaryTracker()
    
    def do_something():
    
        # run your app ...
    
        print("==={} mem total===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
        all_objects = muppy.get_objects()
        sum1 = summary.summarize(all_objects)
        summary.print_(sum1)
        print("==={} mem diff===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
        tr.print_diff()
    

    Output:

    ===2024-07-21 16:17:47 mem total===
                        types |   # objects |   total size
    ========================= | =========== | ============
                         dict |       35489 |     32.33 MB
                          str |       57287 |      5.50 MB
                      unicode |       41150 |      3.55 MB
                         type |        2748 |      2.37 MB
                         code |       17055 |      2.08 MB
                         list |       16024 |      1.80 MB
                        tuple |       12969 |      1.74 MB
                          set |        1704 |    539.06 KB
                      weakref |        3741 |    321.49 KB
          function (__init__) |        1426 |    167.11 KB
            getset_descriptor |        2294 |    161.30 KB
             _sre.SRE_Pattern |         241 |    116.76 KB
                  abc.ABCMeta |         124 |    109.70 KB
           wrapper_descriptor |        1371 |    107.11 KB
      collections.OrderedDict |         341 |    103.82 KB
    
    ===2024-07-21 16:17:47 mem diff===
                    types |   # objects |   total size
    ===================== | =========== | ============
                     list |       19695 |      3.77 MB
                      str |       23061 |      1.44 MB
                     dict |         505 |    344.71 KB
                  unicode |         285 |     97.78 KB
                     type |          91 |     80.27 KB
                     code |         560 |     70.00 KB
                      int |        2421 |     56.74 KB
              _io.BytesIO |           1 |     24.25 KB
                    tuple |         296 |     20.49 KB
         _sre.SRE_Pattern |          25 |      9.86 KB
                  weakref |          97 |      8.34 KB
        collections.deque |           7 |      4.77 KB
        getset_descriptor |          54 |      3.80 KB
      function (__repr__) |          32 |      3.75 KB
      function (__init__) |          31 |      3.63 KB
    

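    Drawback: collecting the statistics is slow. Run inside the program, it can easily block the process, so it is not suitable for debugging in production.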
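
    When reading size figures, it helps to keep in mind the difference between a shallow size (what sys.getsizeof() reports) and a deep size that includes the referenced objects. The sketch below computes a rough deep size with the standard library only. It is an illustration under simplifying assumptions (it does not follow classes, modules, or functions), not pympler's implementation, and deep_getsizeof() is a hypothetical helper. pympler itself offers pympler.asizeof.asizeof() for this purpose.

```python
import gc
import sys
import types

# Do not follow these: they lead into interpreter state shared by everything.
_SKIP = (type, types.ModuleType, types.FunctionType)


def deep_getsizeof(root):
    """Sum sys.getsizeof() over root and every object reachable from it."""
    seen = set()
    stack = [root]
    total = 0
    while stack:
        obj = stack.pop()
        if id(obj) in seen or isinstance(obj, _SKIP):
            continue
        seen.add(id(obj))  # count shared objects only once
        total += sys.getsizeof(obj)
        stack.extend(gc.get_referents(obj))
    return total


data = {"a": [1, 2, 3], "b": "x" * 1000}
shallow = sys.getsizeof(data)  # the dict structure only
deep = deep_getsizeof(data)    # dict + keys + list + ints + the 1000-char string
print(shallow, deep)
```

    The gap between the two numbers is why a small-looking container can still be responsible for most of a process's memory.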

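    1.7 pyrasite

    Install: pip install pyrasite pyrasite-gui urwid meliae

    It also depends on the system gdb (version 7.3+).

    The tool is very powerful: given a process ID, it can inspect the state of a running Python process, which is very convenient for looking at a live program.
    Unfortunately, I could not get it to work on either macOS or CentOS.

    The original task was to debug a Python 2 program, so the attempts below were made in a Python 2.7 environment.

    Error 1: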

        Complete output from command python setup.py egg_info: 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x25c5150>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/cython/
          Could not find a version that satisfies the requirement Cython (from versions: )
        No matching distribution found for Cython
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/tmp/pip-build-RqQ7F6/meliae/setup.py", line 96, in <module>
            config()
          File "/tmp/pip-build-RqQ7F6/meliae/setup.py", line 93, in config
            setup(**kwargs)
          File "/usr/lib/python2.7/site-packages/setuptools/__init__.py", line 161, in setup
            _install_setup_requires(attrs)
          File "/usr/lib/python2.7/site-packages/setuptools/__init__.py", line 156, in _install_setup_requires
            dist.fetch_build_eggs(dist.setup_requires)
          File "/usr/lib/python2.7/site-packages/setuptools/dist.py", line 721, in fetch_build_eggs
            replace_conflicting=True,
          File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 782, in resolve
            replace_conflicting=replace_conflicting
          File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1065, in best_match
            return self.obtain(req, installer)
          File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1077, in obtain
            return installer(requirement)
          File "/usr/lib/python2.7/site-packages/setuptools/dist.py", line 777, in fetch_build_egg
            return fetch_build_egg(self, req)
          File "/usr/lib/python2.7/site-packages/setuptools/installer.py", line 130, in fetch_build_egg
            raise DistutilsError(str(e))
        distutils.errors.DistutilsError: Command '['/usr/bin/python2', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmpryTZj0', '--quiet', 'Cython']' returned non-zero exit status 1
    
        ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-RqQ7F6/meliae/
    

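    This was a dependency installation error; upgrading pip with pip install -U pip fixed it.

    After a successful install, I looked up the ID of the Python process: 75055.

    Running pyrasite-memory-viewer 75055 produced error 2: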

    Traceback (most recent call last):
      File "/Users/skyler/Documents/py-env/venv2.7/bin/pyrasite-memory-viewer", line 8, in <module>
        sys.exit(main())
      File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/pyrasite/tools/memory_viewer.py", line 150, in main
        objects = loader.load(filename)
      File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/loader.py", line 541, in load
        source, cleanup = files.open_file(source)
      File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/files.py", line 32, in open_file
        source = open(filename, 'rb')
    IOError: [Errno 2] No such file or directory: '/tmp/pyrasite-75055-objects.json'
    

    Simply creating the file with touch /tmp/pyrasite-75055-objects.json and running the command again gave:

    Traceback (most recent call last):
      File "/Users/skyler/Documents/py-env/venv2.7/bin/pyrasite-memory-viewer", line 8, in <module>
        sys.exit(main())
      File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/pyrasite/tools/memory_viewer.py", line 150, in main
        objects = loader.load(filename)
      File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/loader.py", line 556, in load
        max_parents=max_parents)
      File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/loader.py", line 635, in _load
        factory=objs.add):
      File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/loader.py", line 629, in iter_objs
        % (line_num, len(objs), mb_read, input_mb, tdelta))
    UnboundLocalError: local variable 'line_num' referenced before assignment
    

    In the end, pyrasite could not be made to work on either macOS or CentOS for this post.

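    2. Case study

    Environment:

  • CentOS 7
  • Python 2.7
  • mem-top==0.2.1

    mem_top is used here because it runs quickly and does not interfere with the service process handling requests.

    A global counter count is defined, and the current memory usage of the process is logged once every 100 calls: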

    import logging
    import mem_top
    
    logger = logging.getLogger("mem-debug")  # configure the logger as needed
    count = 0  # global call counter; must be initialized before first use
    
    def do_something():
        # run your app ...
    
        global count
        if count % 100 == 0:
            msg = mem_top.mem_top(limit=3, width=400)
            logger.info("{} {}".format(count, msg))
        else:
            logger.debug(count)
        count += 1
    

    Partial output:

    refs:
    157613189	<type 'list'> [<function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x
    5742	<type 'list'> ['# module pyparsing.py\n', '#\n', '# Copyright (c) 2003-2018  Paul T. McGuire\n', '#\n', '# Permission is hereby granted, free of charge, to any person obtaining\n', '# a copy of this software and associated documentation files (the\n', '# "Software"), to deal in the Software without restriction, including\n', '# without limitation the rights to use, copy, modify, merge, publish,\n', '# distribut
    4240	<type 'dict'> {'oss2.task_queue': <module 'oss2.task_queue' from '/usr/lib/python2.7/site-packages/oss2/task_queue.pyc'>, 'requests.Cookie': None, 'aliyunsdkcdn.request.v20180510': <module 'aliyunsdkcdn.request.v20180510' from '/usr/lib/python2.7/site-packages/aliyunsdkcdn/request/v20180510/__init__.pyc'>, 'elasticsearch.client.cat': <module 'elasticsearch.client.cat' from '/usr/lib/python2.7/site-packages/elas
    
    bytes:
    1377112608	 [<function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x
    196888	 {'oss2.task_queue': <module 'oss2.task_queue' from '/usr/lib/python2.7/site-packages/oss2/task_queue.pyc'>, 'requests.Cookie': None, 'aliyunsdkcdn.request.v20180510': <module 'aliyunsdkcdn.request.v20180510' from '/usr/lib/python2.7/site-packages/aliyunsdkcdn/request/v20180510/__init__.pyc'>, 'elasticsearch.client.cat': <module 'elasticsearch.client.cat' from '/usr/lib/python2.7/site-packages/elas
    49432	 {'FOLLOWLOCATION': 52, 'NETRC_IGNORED': 0, 'E_WRITE_ERROR': 23, 'CONTENT_LENGTH_UPLOAD': 3145744, 'SSLVERSION_TLSv1_0': 4, 'SSLVERSION_TLSv1_1': 5, 'SSLVERSION_TLSv1_2': 6, 'E_COULDNT_CONNECT': 7, 'NETRC_OPTIONAL': 1, 'IOCTLFUNCTION': 20130, 'MAX_SEND_SPEED_LARGE': 30145, 'QUOTE': 10028, 'E_ABORTED_BY_CALLBACK': 42, 'INFOTYPE_TEXT': 0, 'READDATA': 10009, 'POLL_NONE': 0, 'E_CONV_REQD': 76, 'MAXCONN
    
    types:
    19638	 <type 'function'>
    11322	 <type 'dict'>
    7124	 <type 'tuple'>
    

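    The log shows that the leak is caused by <function search_function at 0x7f777e945398>: a single list holds more than 157 million references to it and occupies about 1.3 GB.

    Searching our own code base for search_function found no usage. At this point other tools could be used to locate the usage through the reference path, but I simply brute-forced it with a global search in the directory where the dependencies are installed.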

    > cd ./py-env/venv2.7/lib/python2.7/site-packages/
    > find . -type f -name "*.py" | xargs grep search_function
    ./gnupg/_util.py:    codecs.register(encodings.search_function)
    

    This shows that the problem lies in the third-party library gnupg (see its source code).

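    gnupg is an encryption/decryption module. To cope with non-UTF-8 encodings, the library registers a codec search function while handling encodings but never unregisters it, so the list seen in the output keeps growing (Python 2.7 has no unregister function; codecs.unregister() was added in Python 3.10).
    Because all of our service code uses UTF-8, that logic is not needed. After commenting out the register line, tests showed that memory no longer leaked.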
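
    The mechanism can be reproduced in isolation. The sketch below registers the same codec search function repeatedly and uses gc.get_referrers() to count how many references to it end up in a list. noop_search() and count_registrations() are hypothetical helpers, and the sketch assumes that CPython keeps registered search functions in a GC-tracked list, which is what the mem_top output above suggests. The cleanup at the end only runs where codecs.unregister() exists (Python 3.10+).

```python
import codecs
import gc


def noop_search(name):
    """Codec search function that never finds a codec (demo only)."""
    return None


def count_registrations(func):
    """Count occurrences of func in every GC-tracked list that refers to it."""
    total = 0
    for ref in gc.get_referrers(func):
        if isinstance(ref, list):
            total += sum(1 for item in ref if item is func)
    return total


before = count_registrations(noop_search)
for _ in range(1000):
    # Simulates a library calling codecs.register() on every request.
    codecs.register(noop_search)
after = count_registrations(noop_search)
print("references added:", after - before)

# Clean up where possible. Each unregister() call is assumed to remove at
# most one registration, so it is called once per register() call.
unregister = getattr(codecs, "unregister", None)
if unregister is not None:
    for _ in range(1000):
        unregister(noop_search)
```

    If the assumption holds, the count grows with every register() call, which matches the ever-growing list of search_function entries in the case above. Registering once at import time avoids the problem.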

    Because time was tight and the library's author had not maintained it for a long time, the problem was solved by replacing gnupg==2.2.0 with python-gnupg==0.4.6.

  • python-gnupg source repository: vsajip/python-gnupg
  • gnupg source repository: isislovecruft/python-gnupg
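
    3. References

  • Debugging Python memory leaks (python内存泄漏调试)
  • Investigating a Python memory leak (Python内存泄露调查)
  • Locating a memory leak in a Python application (记一次Python应用内存泄漏问题定位)
  • How to quickly troubleshoot Python memory leaks (OOM)? (Python内存泄漏(OOM)如何快速排查?)

    Author: SkylerHu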
