Advanced Python: Classic Coroutines

Nested Generators

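The yield from expression syntax, added in Python 3.3, lets a generator delegate part of its work to a subgenerator.

Suppose that in the Sentence example from the previous article we need to output the words in several different formats, one format after another. One way is to iterate over each subgenerator with a for loop: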

# Nested generators, re-yielding with explicit for loops
import re
import reprlib
from collections.abc import Iterable

RE_WORD = re.compile(r'\w+')


class Sentence:

    def __init__(self, text):
        self.text = text
    def __repr__(self):
        return f'Sentence({reprlib.repr(self.text)})'

    def __split_words__(self) -> Iterable[str]:
        for match in RE_WORD.finditer(self.text):
            yield match.group()
  
    def __format1__(self) -> Iterable[str]:
        for item in self.__split_words__():
            yield f"[{item}]"
  
    def __format2__(self) -> Iterable[str]:
        for item in self.__split_words__():
            yield f"--{item}--"

    def __iter__(self) -> Iterable[str]:
        for item in self.__format1__():
            yield item
        for item in self.__format2__():
            yield item


s = Sentence('"The time has come," the Walrus said')
for word in s:
    print(word)

Output:

[The]
[time]
[has]
[come]
[the]
[Walrus]
[said]
--The--
--time--
--has--
--come--
--the--
--Walrus--
--said--

With yield from, the same logic becomes shorter and clearer:

# Nested generators, delegating with yield from
import re
import reprlib
from collections.abc import Iterable

RE_WORD = re.compile(r'\w+')


class Sentence:

    def __init__(self, text):
        self.text = text
    def __repr__(self):
        return f'Sentence({reprlib.repr(self.text)})'

    def __split_words__(self) -> Iterable[str]:
        for match in RE_WORD.finditer(self.text):
            yield match.group()
  
    def __format1__(self) -> Iterable[str]:
        for item in self.__split_words__():
            yield f"[{item}]"
  
    def __format2__(self) -> Iterable[str]:
        for item in self.__split_words__():
            yield f"--{item}--"

    def __iter__(self) -> Iterable[str]:
        yield from self.__format1__()
        yield from self.__format2__()

s = Sentence('"The time has come," the Walrus said')
for word in s:
    print(word)
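
As a small aside, yield from accepts any iterable, not only generators. The sketch below is not part of the Sentence example: chain_with_loop and chain_with_yield_from are hypothetical helper names invented here, used only to show that the two spellings produce the same sequence.

```python
from collections.abc import Iterable, Iterator


def chain_with_loop(*iterables: Iterable[str]) -> Iterator[str]:
    # Explicit nested loops: every item is re-yielded by hand.
    for iterable in iterables:
        for item in iterable:
            yield item


def chain_with_yield_from(*iterables: Iterable[str]) -> Iterator[str]:
    # yield from delegates to each iterable in turn.
    for iterable in iterables:
        yield from iterable


loop_result = list(chain_with_loop("ab", ["c", "d"]))
delegated_result = list(chain_with_yield_from("ab", ["c", "d"]))
print(loop_result)
print(delegated_result)
```

Both calls should print the same list, since a string and a list are each just iterables from the point of view of yield from.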

Classic Coroutines

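Before Python 3.5 was released, the word "coroutine" meant the generator-based "classic coroutine". Although classic coroutines are built on generators, the two differ considerably:

  • A generator produces data for iteration
  • A coroutine is more powerful: while running, it can not only produce data for iteration but also receive values from outside and return a final result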

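The generic generator type takes three type parameters: Generator[YieldType, SendType, ReturnType]. SendType and ReturnType only matter when the generator is used as a coroutine (in most cases generators are used as iterators).

  • SendType: the type of the values the caller passes into the generator with gen.send(x)
  • ReturnType: the type of the result the generator returns with return when it finishes (the caller reads it from the value attribute of the StopIteration exception)
  • YieldType: the type of the data produced by yield (the caller triggers each value with next() or a for loop)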
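
To make the three type parameters concrete, here is a minimal sketch. The function echo_until_done and the "done" marker are hypothetical, chosen only for illustration, and the built-in generic list[str] assumes Python 3.9 or later.

```python
from typing import Generator


def echo_until_done() -> Generator[int, str, list[str]]:
    # YieldType = int: the number of messages received so far.
    # SendType = str: each message passed in through send().
    # ReturnType = list[str]: every message collected before "done".
    received: list[str] = []
    while True:
        message = yield len(received)
        if message == "done":
            return received
        received.append(message)


gen = echo_until_done()
first = next(gen)           # prime the generator: run up to the first yield
second = gen.send("hello")  # the sent value becomes the result of the yield
third = gen.send("world")
try:
    gen.send("done")
except StopIteration as exc:
    final = exc.value       # the return value travels on StopIteration.value
print(first, second, third, final)
```

The try/except around the last send() is needed because a generator that returns always raises StopIteration in its caller.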

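The following example uses a coroutine to compute a running average:

  • Start a looping coroutine, averager(), that keeps accepting data items from outside and computing their average
  • Send data items to the coroutine with send(); after each item the coroutine recomputes the current average and hands it back (through yield)
  • Send a STOP signal (also called a sentinel) to make the coroutine leave the loop and return the final result, a tuple of item count and average (through return)

from typing import Generator, Union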


class STOPSignal:
    def __repr__(self):
        return '<STOPSignal>'


# YieldType is float because every yield hands back the current average.
def averager() -> Generator[float, Union[float, STOPSignal], tuple[int, float]]:
    total = 0.0
    count = 0
    average = 0.0
    while True:
        term = yield average
        print('received:', term)
        if isinstance(term, STOPSignal):
            break
        total += term
        count += 1
        average = total / count
    return count, average

def compute():
    result = yield from averager()
    print("compute result:", result)
    return result


coro_compute = compute()
# The first send(None) primes the coroutine: it runs up to the first yield.
for item in [None, 10, 30, 5, STOPSignal()]:
    try:
        print(coro_compute.send(item))
    except StopIteration as e:
        print("final result:", e.value)

The output is:

0.0
received: 10
10.0
received: 30
20.0
received: 5
15.0
received: <STOPSignal>
compute result: (3, 15.0)
final result: (3, 15.0)
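
In this example compute() never calls send() on averager() itself; yield from forwards every sent value to the subgenerator and turns its return value into the value of the yield from expression. The sketch below imitates that behaviour by hand. It is a deliberate simplification: the full semantics defined in PEP 380 also forward throw() and close(), which this version ignores. All names here (running_total, delegate_manually, delegate_with_yield_from, drive) are hypothetical and exist only for illustration.

```python
from typing import Any, Generator


def running_total() -> Generator[int, Any, int]:
    # Hypothetical subgenerator: adds up sent integers until it receives None.
    total = 0
    while True:
        term = yield total
        if term is None:
            return total
        total += term


def delegate_manually(subgen: Generator[int, Any, int]) -> Generator[int, Any, int]:
    # Rough stand-in for `return (yield from subgen)`.
    # Simplification: throw() and close() are NOT forwarded to subgen.
    try:
        yielded = next(subgen)               # prime the subgenerator
    except StopIteration as exc:
        return exc.value
    while True:
        sent = yield yielded                 # pass the value up, wait for the caller
        try:
            yielded = subgen.send(sent)      # forward the caller's value down
        except StopIteration as exc:
            return exc.value                 # the subgenerator's return value


def delegate_with_yield_from(subgen: Generator[int, Any, int]) -> Generator[int, Any, int]:
    return (yield from subgen)


def drive(coro, items):
    """Prime coro, send each item, and collect (yielded values, return value)."""
    yielded = [next(coro)]
    for item in items:
        try:
            yielded.append(coro.send(item))
        except StopIteration as exc:
            return yielded, exc.value
    return yielded, None


manual = drive(delegate_manually(running_total()), [10, 30, 5, None])
builtin = drive(delegate_with_yield_from(running_total()), [10, 30, 5, None])
print(manual)
print(builtin)
```

For this input the hand-written delegator and yield from should behave identically, which is what makes the one-line compute() above possible.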

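Coroutines have the following properties:

  • They can pause in the middle of execution and hand a value back immediately
  • They can accept values from outside, and their execution is driven by the caller

For these reasons the concept is widely used in asynchronous programming frameworks (asyncio, for example) to implement asynchronous I/O.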
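
To illustrate what "driven by the caller" can look like, here is a toy round-robin scheduler. It is only a sketch of the general idea and does not show how asyncio is actually implemented; countdown and run_round_robin are hypothetical names invented for this example.

```python
from collections import deque
from typing import Generator


def countdown(name: str, start: int, log: list[str]) -> Generator[None, None, str]:
    # Hypothetical task: each bare yield hands control back to the scheduler.
    for n in range(start, 0, -1):
        log.append(f"{name}:{n}")
        yield
    return f"{name} done"


def run_round_robin(tasks: list[Generator[None, None, str]]) -> list[str]:
    """Resume each task in turn until all have finished; collect return values."""
    queue = deque(tasks)
    results = []
    while queue:
        task = queue.popleft()
        try:
            next(task)                 # resume the task until its next yield
        except StopIteration as exc:
            results.append(exc.value)  # the task finished; keep its result
        else:
            queue.append(task)         # not finished yet, schedule it again
    return results


log: list[str] = []
results = run_round_robin([countdown("A", 2, log), countdown("B", 3, log)])
print(log)
print(results)
```

The log interleaves entries from both tasks, because each one voluntarily pauses at yield and the scheduler decides which task runs next.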