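Advanced Python Syntax Guide: Learning from a LangGraph Project

In this article, we analyze a LangGraph-based project to explore several advanced Python syntax features. These features are common in modern Python application development, especially with frameworks such as Pydantic, FastAPI, and LangGraph.

1. Pydantic Models

1.1 BaseModel Basics

Pydantic is a Python library for data validation and settings management, widely used in modern frameworks such as FastAPI and LangGraph. BaseModel is Pydantic's core class for defining models with built-in data validation.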

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str
    is_active: bool = True  # field with a default value

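When you define a class that inherits from BaseModel, Pydantic automatically:

Validates that incoming data matches the declared types
Provides serialization methods (model_dump() and model_dump_json() in Pydantic v2; .dict() and .json() in v1)
Converts compatible input to the declared type where possible (for example, the string "1" to the integer 1)

Usage example: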

# Create an instance
user = User(id=1, name="张三", email="zhangsan@example.com")

# Serialize to a dictionary (Pydantic v2; on v1 use user.dict())
user_dict = user.model_dump()  # {"id": 1, "name": "张三", "email": "zhangsan@example.com", "is_active": True}

# Serialize to JSON (Pydantic v2; on v1 use user.json())
user_json = user.model_dump_json()  # '{"id":1,"name":"张三","email":"zhangsan@example.com","is_active":true}'

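# Data validation (failing example)
from pydantic import ValidationError

try:
    # id cannot be parsed as an integer, so validation fails.
    # email is declared as a plain str, so its format is not checked.
    invalid_user = User(id="not_an_integer", name=123, email="invalid_email")
except ValidationError as e:
    print(f"Validation error: {e}")  # prints a detailed message for each invalid field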
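To handle failures programmatically instead of printing them, ValidationError exposes an errors() method that returns one dictionary per failed field. The following is a minimal sketch, assuming Pydantic v2; the Account model and the failed_fields helper are hypothetical names used only for illustration.

```python
from pydantic import BaseModel, ValidationError


class Account(BaseModel):  # hypothetical model, for illustration only
    id: int
    name: str


def failed_fields(data: dict) -> list[str]:
    """Return the names of the fields that failed validation."""
    try:
        Account(**data)
    except ValidationError as exc:
        # Each entry in errors() carries a "loc" tuple locating the bad field
        return [str(err["loc"][0]) for err in exc.errors()]
    return []


print(failed_fields({"id": "not_an_integer", "name": "ok"}))
print(failed_fields({"id": "42", "name": "ok"}))  # "42" is coerced to the int 42
```

The second call returns an empty list because Pydantic's default (lax) mode converts a numeric string to an integer rather than rejecting it.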

1.2 The Field Function

Field is the Pydantic function for attaching extra metadata and validation rules to a model field.

from pydantic import BaseModel, Field

class Product(BaseModel):
    id: int = Field(..., gt=0)  # ... marks the field as required; gt=0 requires a value greater than 0
    name: str = Field(min_length=3, max_length=50)  # string length limits
    price: float = Field(gt=0, description="Product price")  # attach a description
    tags: list[str] = Field(default_factory=list)  # default_factory builds the default value

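Commonly used Field parameters:

default: the default value
default_factory: a zero-argument callable that produces the default value
alias: an alternative name used during serialization/deserialization
title: a title
description: a description
gt, ge, lt, le: greater than, greater than or equal to, less than, less than or equal to
min_length, max_length: length limits for strings or lists
pattern: a regular expression the string must match (named regex in Pydantic v1)
metadata: a dictionary of extra metadata. Note that Pydantic v2's Field has no dedicated metadata parameter; unrecognized keyword arguments are deprecated in favor of json_schema_extra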
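The constraints listed above are enforced when an instance is created. The sketch below, which assumes Pydantic v2, shows a few of them rejecting bad input; the Item model and the is_valid helper are hypothetical and exist only for this example.

```python
from pydantic import BaseModel, Field, ValidationError


class Item(BaseModel):  # hypothetical model, for illustration only
    id: int = Field(gt=0)
    name: str = Field(min_length=3, max_length=50)
    tags: list[str] = Field(default_factory=list)


def is_valid(**data) -> bool:
    """Return True if the data satisfies every Field constraint."""
    try:
        Item(**data)
        return True
    except ValidationError:
        return False


print(is_valid(id=1, name="pen"))  # all constraints satisfied
print(is_valid(id=0, name="pen"))  # violates gt=0
print(is_valid(id=1, name="ab"))   # violates min_length=3

# default_factory runs once per instance, so default lists are never shared
a, b = Item(id=1, name="pen"), Item(id=2, name="ink")
a.tags.append("blue")
print(b.tags)
```

Using default_factory=list rather than a literal [] makes the intent explicit: every instance gets its own fresh list.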

1.3 Two Ways of Using Field

In the project we analyzed, Field is used in two different ways:

1.3.1 Passing description directly

In tools_and_schemas.py:

from typing import List

from pydantic import BaseModel, Field

class SearchQueryList(BaseModel):
    query: List[str] = Field(
        description="A list of search queries to be used for web research."
    )
    rationale: str = Field(
        description="A brief explanation of why these queries are relevant to the research topic."
    )

1.3.2 Placing description inside a metadata dictionary

In configuration.py:

class Configuration(BaseModel):
    query_generator_model: str = Field(
        default="gemini-2.0-flash",
        metadata={
            "description": "The name of the language model to use for the agent's query generation."
        },
    )

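How they differ:

Passed directly: description becomes part of the field definition. Pydantic includes it in the generated JSON schema, so tools built on that schema (such as FastAPI's API documentation) can read it.
Placed inside metadata: the text is just extra data attached to the field. Pydantic does not treat it as the field's description, so it is only useful to custom code that looks for it, such as a custom documentation or UI generator. In Pydantic v2, Field has no dedicated metadata parameter; unrecognized keyword arguments are deprecated and folded into json_schema_extra.

Both forms are accepted; which one to choose depends on who needs to read the description.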
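The difference is easy to observe by inspecting the model's fields. This is a sketch assuming Pydantic v2, using json_schema_extra (the documented v2 route for extra data) instead of the project's metadata keyword; the Settings model, its field names, and the ui_hint key are all hypothetical.

```python
from pydantic import BaseModel, Field


class Settings(BaseModel):  # hypothetical model, for illustration only
    # description passed directly: stored on the field itself
    llm_name: str = Field(default="demo-model", description="Model to use.")
    # extra data passed through json_schema_extra
    max_loops: int = Field(default=2, json_schema_extra={"ui_hint": "slider"})


direct = Settings.model_fields["llm_name"]
extra = Settings.model_fields["max_loops"]

print(direct.description)       # the text given above
print(extra.description)        # None: nothing was set as a description
print(extra.json_schema_extra)  # the extra dictionary, unchanged

# Both end up in the generated JSON schema, under different keys
schema = Settings.model_json_schema()
print(schema["properties"]["llm_name"]["description"])
print(schema["properties"]["max_loops"]["ui_hint"])
```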

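2. Python Type Annotations

2.1 Basic Type Annotations

Python 3.5 introduced type hints for function parameters and return values (PEP 484), and Python 3.6 added annotations for variables (PEP 526). Built-in generics such as list[str] require Python 3.9 or later.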

def greet(name: str) -> str:
    return f"Hello, {name}!"

age: int = 30
pi: float = 3.14
active: bool = True
names: list[str] = ["Alice", "Bob", "Charlie"]

2.2 Optional and Union

Optional and Union describe values that may be one of several types.

from typing import Optional, Union

# Optional[T] is equivalent to Union[T, None]
def get_user(user_id: Optional[int] = None) -> Optional[dict]:
    if user_id is None:
        return None
    return {"id": user_id, "name": "User"}

# Union lists several possible types
def process_input(data: Union[str, bytes, list]) -> str:
    if isinstance(data, bytes):
        return data.decode('utf-8')
    elif isinstance(data, list):
        return ", ".join(str(item) for item in data)
    return data
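The equivalence between Optional[T] and Union[T, None] can be checked at runtime with the standard typing module. The sketch below also illustrates that annotations are not enforced by the interpreter, so None still has to be handled explicitly; the describe function is a hypothetical example.

```python
from typing import Optional, Union, get_args

# Optional[int] and Union[int, None] compare equal
print(Optional[int] == Union[int, None])

# get_args exposes the member types of the union
print(get_args(Optional[int]))


def describe(value: Optional[int] = None) -> str:
    """Annotations are not enforced at runtime, so check for None explicitly."""
    return "missing" if value is None else f"got {value}"


print(describe())
print(describe(3))
```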

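2.3 Generic Types

Generic types let you write containers and classes that work with many element types while keeping the element type visible to type checkers.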

from typing import List, Dict, Tuple, Generic, TypeVar

# Basic container types (on Python 3.9+, the built-in list, dict, and tuple work as well)
names: List[str] = ["Alice", "Bob"]
ages: Dict[str, int] = {"Alice": 30, "Bob": 25}
point: Tuple[float, float] = (2.5, 3.7)

# Generics
T = TypeVar('T')

class Stack(Generic[T]):
    def __init__(self) -> None:
        self.items: List[T] = []
    
    def push(self, item: T) -> None:
        self.items.append(item)
    
    def pop(self) -> T:
        return self.items.pop()
    
    def empty(self) -> bool:
        return len(self.items) == 0

# Usage
int_stack = Stack[int]()
int_stack.push(1)
int_stack.push(2)
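Two points are worth seeing in running code: a TypeVar also works on plain functions, and the type parameter is a hint for type checkers rather than a runtime check. The sketch below repeats a shortened Stack so it runs on its own; the first function is a hypothetical helper.

```python
from typing import Generic, List, Sequence, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """Shortened copy of the Stack above, so this sketch is self-contained."""

    def __init__(self) -> None:
        self.items: List[T] = []

    def push(self, item: T) -> None:
        self.items.append(item)

    def pop(self) -> T:
        return self.items.pop()


def first(items: Sequence[T]) -> T:
    """A generic function: the return type follows the element type."""
    return items[0]


int_stack = Stack[int]()
int_stack.push(1)
int_stack.push("two")  # a type checker flags this line; Python itself does not
print(int_stack.pop())
print(first(["a", "b"]))
```

To catch the mismatched push before the program runs, the code has to be passed through a static type checker such as mypy.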

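3. Python Decorators

3.1 The classmethod Decorator

The @classmethod decorator turns a method into a class method. A class method has these characteristics:

Its first parameter is the class itself (conventionally named cls) rather than an instance (conventionally named self)
It can be called directly on the class, without creating an instance
It can read and modify class attributes, but it has no instance (no self) to work with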

import datetime
from typing import Optional

class Person:
    count = 0  # class variable

    def __init__(self, name: str, age: Optional[int] = None) -> None:
        self.name = name  # instance variables
        self.age = age
        Person.count += 1

    @classmethod
    def from_birth_year(cls, name: str, birth_year: int) -> 'Person':
        """Alternative constructor that builds a Person from a birth year."""
        age = datetime.datetime.now().year - birth_year
        return cls(name, age)

    @classmethod
    def get_count(cls) -> int:
        """Return the number of Person instances created so far."""
        return cls.count

# Using the class methods
person1 = Person("张三")
person2 = Person.from_birth_year("李四", 1990)
print(Person.get_count())  # prints: 2
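One reason alternative constructors call cls(...) instead of naming the class directly is inheritance: cls is whichever class the method was called on. A minimal sketch with hypothetical Shape and Tile classes:

```python
class Shape:
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    @classmethod
    def square(cls, side: float) -> "Shape":
        """Alternative constructor; cls is whichever class it was called on."""
        return cls(side, side)


class Tile(Shape):
    """Inherits square() without overriding it."""


shape = Shape.square(2)
tile = Tile.square(3)
print(type(shape).__name__)  # Shape
print(type(tile).__name__)   # Tile, because cls was Tile
```

Had square() returned Shape(side, side), calling Tile.square(3) would have produced a Shape and silently lost the subclass.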

In the project we analyzed, the Configuration class uses @classmethod to define a class method named from_runnable_config:

@classmethod
def from_runnable_config(cls, config: Optional[RunnableConfig] = None) -> "Configuration":
    """Create a Configuration instance from a RunnableConfig."""
    configurable = (config["configurable"] if config and "configurable" in config else {})
    
    # Get raw values from environment or config
    raw_values: dict[str, Any] = {
        name: os.environ.get(name.upper(), configurable.get(name))
        for name in cls.model_fields.keys()
    }
    
    # Filter out None values
    values = {k: v for k, v in raw_values.items() if v is not None}
    
    return cls(**values)

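This class method builds a Configuration instance from a RunnableConfig object. For each field, it first looks for an environment variable named after the field in upper case, and falls back to the value in the config's "configurable" dictionary.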

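4. Python Dictionaries and Comprehensions

4.1 Dictionary Comprehensions

A dictionary comprehension is a concise way to build a dictionary, analogous to a list comprehension.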

# Basic dictionary comprehension
squares = {x: x*x for x in range(6)}  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Dictionary comprehension with a condition
even_squares = {x: x*x for x in range(6) if x % 2 == 0}  # {0: 0, 2: 4, 4: 16}

# Build a dictionary from two lists (dict(zip(keys, values)) is equivalent)
keys = ['a', 'b', 'c']
values = [1, 2, 3]
my_dict = {k: v for k, v in zip(keys, values)}  # {'a': 1, 'b': 2, 'c': 3}

The from_runnable_config method contains two dictionary comprehensions:

# First comprehension: read each raw value from the environment, falling back to the config
raw_values: dict[str, Any] = {
    name: os.environ.get(name.upper(), configurable.get(name))
    for name in cls.model_fields.keys()
}

# Second comprehension: drop the entries whose value is None
values = {k: v for k, v in raw_values.items() if v is not None}

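4.2 Conditional Expressions

A conditional expression (often called the ternary operator) is a compact, single-expression form of an if-else statement.

# Basic syntax
# result = value_if_true if condition else value_if_false

# Example
x = 10
result = "even" if x % 2 == 0 else "odd"  # result: "even"

# Chained conditional expressions
category = "large" if x > 10 else "medium" if x > 5 else "small"  # result: "medium"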

The from_runnable_config method contains one conditional expression:

configurable = (config["configurable"] if config and "configurable" in config else {})

The expression means:

If config is truthy (neither None nor an empty dictionary) and "configurable" is one of its keys, configurable is set to config["configurable"]
Otherwise, configurable is set to an empty dictionary {}
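The same logic can be written with `or` and dict.get, which some readers find easier to scan. The sketch below compares the two spellings on a plain dict standing in for RunnableConfig; both function names are hypothetical.

```python
from typing import Any, Dict, Optional


def get_configurable(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Same conditional expression as above, applied to a plain dict."""
    return config["configurable"] if config and "configurable" in config else {}


def get_configurable_alt(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Equivalent spelling: `config or {}` handles None, .get handles a missing key."""
    return (config or {}).get("configurable", {})


cases = [None, {}, {"tags": ["x"]}, {"configurable": {"max_loops": 3}}]
for case in cases:
    print(get_configurable(case), get_configurable_alt(case))
```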

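4.3 Unpacking Operators

The unpacking operators expand a collection in place: * expands an iterable such as a list or tuple, and ** expands a dictionary into key-value pairs.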

# List unpacking
numbers = [1, 2, 3]
print(*numbers)  # prints: 1 2 3

# Dictionary unpacking (on duplicate keys, the later dictionary wins)
defaults = {"color": "red", "size": "medium"}
options = {"size": "large", "material": "cotton"}
product = {**defaults, **options}  # {"color": "red", "size": "large", "material": "cotton"}

# Unpacking in a function call
def create_user(name, email, role="user"):
    return {"name": name, "email": email, "role": role}

user_data = {"name": "张三", "email": "zhangsan@example.com", "role": "admin"}
user = create_user(**user_data)  # same as create_user(name="张三", email="zhangsan@example.com", role="admin")

The from_runnable_config method uses the dictionary unpacking operator:

return cls(**values)

This line unpacks the values dictionary into keyword arguments for cls, that is, the Configuration class constructor. For example, if values = {"query_generator_model": "gemini-2.0-pro", "max_research_loops": 3}, then cls(**values) is equivalent to cls(query_generator_model="gemini-2.0-pro", max_research_loops=3).
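Because ** turns dictionary keys into parameter names, the keys must match the callee's signature. The sketch below shows what happens with an ordinary function when they do not, and how * and ** combine in one call; create_user mirrors the earlier example, and the sample data is made up.

```python
def create_user(name: str, email: str, role: str = "user") -> dict:
    return {"name": name, "email": email, "role": role}


# Keys that match parameter names are passed through; role keeps its default
print(create_user(**{"name": "Ann", "email": "ann@example.com"}))

# A key that matches no parameter raises TypeError at call time
try:
    create_user(**{"name": "Ann", "emial": "ann@example.com"})  # note the typo
except TypeError as exc:
    print(f"TypeError: {exc}")

# * and ** can be combined: positional values first, then keyword values
args = ("Bob",)
kwargs = {"email": "bob@example.com", "role": "admin"}
print(create_user(*args, **kwargs))
```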

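5. Practical Examples

5.1 A Configuration Management Pattern

In the project we analyzed, the Configuration class demonstrates a flexible configuration management pattern: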

class Configuration(BaseModel):
    """The configuration for the agent."""

    query_generator_model: str = Field(
        default="gemini-2.0-flash",
        metadata={
            "description": "The name of the language model to use for the agent's query generation."
        },
    )
    
    # other configuration fields...
    
    @classmethod
    def from_runnable_config(cls, config: Optional[RunnableConfig] = None) -> "Configuration":
        """Create a Configuration instance from a RunnableConfig."""
        configurable = (config["configurable"] if config and "configurable" in config else {})
        
        # Get raw values from environment or config
        raw_values: dict[str, Any] = {
            name: os.environ.get(name.upper(), configurable.get(name))
            for name in cls.model_fields.keys()
        }
        
        # Filter out None values
        values = {k: v for k, v in raw_values.items() if v is not None}
        
        return cls(**values)

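Advantages of this pattern:

Flexible sources: values can come from several places (environment variables, the config object, or field defaults)
Clear precedence: environment variable > config object > default value
Type safety: the result is a typed Configuration object rather than a plain dictionary
Default handling: None values are filtered out before construction, so missing settings fall back to the field defaults and callers need fewer None checks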
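The precedence order can be verified with a small stand-in that needs no LangGraph installation. This is a sketch assuming Pydantic v2; DemoConfig, its two fields, and from_config are hypothetical, and a plain dict takes the place of RunnableConfig.

```python
import os
from typing import Any, Optional

from pydantic import BaseModel, Field


class DemoConfig(BaseModel):  # hypothetical stand-in for Configuration
    query_model: str = Field(default="demo-model")
    max_loops: int = Field(default=2)

    @classmethod
    def from_config(cls, config: Optional[dict] = None) -> "DemoConfig":
        configurable = config["configurable"] if config and "configurable" in config else {}
        raw_values: dict[str, Any] = {
            name: os.environ.get(name.upper(), configurable.get(name))
            for name in cls.model_fields
        }
        return cls(**{k: v for k, v in raw_values.items() if v is not None})


# Start from a clean environment for the two variables this sketch reads
os.environ.pop("QUERY_MODEL", None)
os.environ.pop("MAX_LOOPS", None)

print(DemoConfig.from_config())                                    # defaults only
print(DemoConfig.from_config({"configurable": {"max_loops": 5}}))  # config beats default

os.environ["MAX_LOOPS"] = "7"  # environment values are always strings
cfg = DemoConfig.from_config({"configurable": {"max_loops": 5}})
print(cfg.max_loops, type(cfg.max_loops).__name__)  # env beats config; "7" becomes the int 7
del os.environ["MAX_LOOPS"]
```

The last case also shows why the typed model matters: environment variables arrive as strings, and Pydantic converts them to the declared field type during validation.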

5.2 Usage in LangGraph

In the LangGraph project, the Configuration class supplies settings to each node of the agent:

def generate_query(state: OverallState, config: RunnableConfig) -> QueryGenerationState:
    """LangGraph node that generates a search queries based on the User's question."""
    configurable = Configuration.from_runnable_config(config)
    
    # check for custom initial search query count
    if state.get("initial_search_query_count") is None:
        state["initial_search_query_count"] = configurable.number_of_initial_queries

    # init Gemini 2.0 Flash
    llm = ChatGoogleGenerativeAI(
        model=configurable.query_generator_model,
        temperature=1.0,
        max_retries=2,
        api_key=os.getenv("GEMINI_API_KEY"),
    )
    # ...

This lets LangGraph node functions read settings through a single call, without handling configuration loading or validation themselves.

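Summary

By analyzing this LangGraph-based project, we covered several advanced Python syntax features:

Pydantic models: data validation and settings management
Python type annotations: better readability and maintainability
Decorators: modify the behavior of functions or methods
Dictionary comprehensions: build and transform dictionaries concisely
Conditional expressions: simplify conditional logic
Unpacking operators: expand iterables and dictionaries in place

These features are useful in modern Python application development, particularly with frameworks such as FastAPI and LangGraph, and mastering them helps you write more concise and maintainable code.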

