Agentic RAG

In this tutorial we will build a retrieval agent. Retrieval agents are useful when you want an LLM to make a decision about whether to retrieve context from a vectorstore or respond to the user directly.

By the end of the tutorial we will have done the following:

  1. Fetch and preprocess documents that will be used for retrieval.
  2. Index those documents for semantic search and create a retriever tool for the agent.
  3. Build an agentic RAG system that can decide when to use the retriever tool.

Setup

Let's download the required packages and set our API keys:

%%capture --no-stderr
%pip install -U --quiet langgraph "langchain[openai]" langchain-community langchain-text-splitters
import getpass
import os


def _set_env(key: str):
    if key not in os.environ:
        os.environ[key] = getpass.getpass(f"{key}:")


_set_env("OPENAI_API_KEY")

Tip

Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph.

1. Preprocess documents

  1. Fetch documents to use in our RAG system. We will use three of the most recent pages from Lilian Weng's excellent blog. We'll start by fetching the content of the pages using the WebBaseLoader utility:

    from langchain_community.document_loaders import WebBaseLoader
    
    urls = [
        "https://lilianweng.github.io/posts/2024-11-28-reward-hacking/",
        "https://lilianweng.github.io/posts/2024-07-07-hallucination/",
        "https://lilianweng.github.io/posts/2024-04-12-diffusion-video/",
    ]
    
    docs = [WebBaseLoader(url).load() for url in urls]
    
    docs[0][0].page_content.strip()[:1000]
    
  2. Split the fetched documents into smaller chunks for indexing into our vectorstore:

    from langchain_text_splitters import RecursiveCharacterTextSplitter
    
    docs_list = [item for sublist in docs for item in sublist]
    
    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=100, chunk_overlap=50
    )
    doc_splits = text_splitter.split_documents(docs_list)
    
    doc_splits[0].page_content.strip()
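
To see how `chunk_size=100` and `chunk_overlap=50` interact, here is a minimal sketch of overlapping windows over a token list. It is a simplification and not the algorithm RecursiveCharacterTextSplitter uses (which also splits on separators); the helper name `sliding_chunks` and the fake tokens are assumptions made only for illustration.

```python
def sliding_chunks(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


# Hypothetical input: 250 placeholder tokens standing in for tokenizer output.
tokens = [f"t{i}" for i in range(250)]
chunks = sliding_chunks(tokens, chunk_size=100, chunk_overlap=50)
print(len(chunks))               # prints 4
print(chunks[1][0], chunks[1][-1])  # prints t50 t149
```

With an overlap of half the chunk size, every token away from the document edges appears in two chunks, which roughly doubles the number of embeddings to compute but reduces the chance that a sentence is cut off at a chunk boundary.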
    

2. Create a retriever tool

Now that we have our split documents, we can index them into a vector store that we'll use for semantic search.

  1. Use an in-memory vector store and OpenAI embeddings:

    from langchain_core.vectorstores import InMemoryVectorStore
    from langchain_openai import OpenAIEmbeddings
    
    vectorstore = InMemoryVectorStore.from_documents(
        documents=doc_splits, embedding=OpenAIEmbeddings()
    )
    retriever = vectorstore.as_retriever()
    
  2. Create a retriever tool using LangChain's prebuilt create_retriever_tool:

    from langchain.tools.retriever import create_retriever_tool
    
    retriever_tool = create_retriever_tool(
        retriever,
        "retrieve_blog_posts",
        "Search and return information about Lilian Weng blog posts.",
    )
    
  3. Test the tool:

    retriever_tool.invoke({"query": "types of reward hacking"})
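
Conceptually, the retriever embeds the query and returns the chunks whose vectors are closest to it. The sketch below illustrates that idea with a toy bag-of-words "embedding" and cosine similarity. The functions `embed`, `cosine`, and `top_k` are hypothetical stand-ins, not the OpenAIEmbeddings or InMemoryVectorStore API, and a real embedding model captures meaning well beyond shared words.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical chunks, loosely themed on the three blog posts.
chunks = [
    "reward hacking exploits flaws in the reward function",
    "diffusion models generate video frame by frame",
    "hallucination means the model fabricates unsupported content",
]
results = top_k("types of reward hacking", chunks, k=1)
print(results)
```

Only the first chunk shares any words with the query, so it is the one returned.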
    

3. Generate query

Now we will start building components (nodes and edges) for our agentic RAG graph. Note that the components will operate on MessagesState, a graph state that contains a messages key with a list of chat messages.

  1. Build a generate_query_or_respond node. It will call an LLM to generate a response based on the current graph state (list of messages). Given the input messages, it will decide whether to retrieve using the retriever tool or respond directly to the user. Note that we're giving the chat model access to the retriever_tool we created earlier via .bind_tools:

    from langgraph.graph import MessagesState
    from langchain.chat_models import init_chat_model
    
    response_model = init_chat_model("openai:gpt-4.1", temperature=0)
    
    
    def generate_query_or_respond(state: MessagesState):
        """Call the model to generate a response based on the current state. Given
        the question, it will decide to retrieve using the retriever tool, or simply respond to the user.
        """
        response = (
            response_model
            .bind_tools([retriever_tool]).invoke(state["messages"])
        )
        return {"messages": [response]}
    
  2. Try it on a random input:

    input = {"messages": [{"role": "user", "content": "hello!"}]}
    generate_query_or_respond(input)["messages"][-1].pretty_print()
    

    Output:

    ================================== Ai Message ==================================
    
    Hello! How can I help you today?
    

  3. Ask a question that requires semantic search:

    input = {
        "messages": [
            {
                "role": "user",
                "content": "What does Lilian Weng say about types of reward hacking?",
            }
        ]
    }
    generate_query_or_respond(input)["messages"][-1].pretty_print()
    

    Output:

    ================================== Ai Message ==================================
    Tool Calls:
    retrieve_blog_posts (call_tYQxgfIlnQUDMdtAhdbXNwIM)
    Call ID: call_tYQxgfIlnQUDMdtAhdbXNwIM
    Args:
        query: types of reward hacking
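
The node above returns only `{"messages": [response]}` rather than the full history. This works because the messages key in MessagesState is merged through a reducer that appends new messages to the existing list. The sketch below shows that append behavior with plain dictionaries. `append_messages` and `apply_update` are hypothetical helpers, and the real reducer does more (for example, it can update an existing message that has the same ID).

```python
def append_messages(existing, update):
    """Simplified reducer: return a new list with the update appended."""
    return existing + update


def apply_update(state, node_output):
    """Merge a node's partial output into the state using the reducer."""
    return {"messages": append_messages(state["messages"], node_output["messages"])}


state = {"messages": [{"role": "user", "content": "hello!"}]}
node_output = {
    "messages": [{"role": "assistant", "content": "Hello! How can I help you today?"}]
}
state = apply_update(state, node_output)

print(len(state["messages"]))  # prints 2
print(state["messages"][0]["role"], state["messages"][-1]["role"])  # prints user assistant
```

This is also why later nodes can read the original question from `state["messages"][0]` and the most recent output from `state["messages"][-1]`.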
    

4. Grade documents

  1. Add a conditional edge, grade_documents, to determine whether the retrieved documents are relevant to the question. We will use a model with a structured output schema, GradeDocuments, for document grading. The grade_documents function returns the name of the node to go to next (generate_answer or rewrite_question) based on the grading decision:

    from pydantic import BaseModel, Field
    from typing import Literal
    
    GRADE_PROMPT = (
        "You are a grader assessing relevance of a retrieved document to a user question. \n "
        "Here is the retrieved document: \n\n {context} \n\n"
        "Here is the user question: {question} \n"
        "If the document contains keyword(s) or semantic meaning related to the user question, grade it as relevant. \n"
        "Give a binary score, 'yes' or 'no', to indicate whether the document is relevant to the question."
    )
    
    
    class GradeDocuments(BaseModel):
        """Grade documents using a binary score for relevance check."""
    
        binary_score: str = Field(
            description="Relevance score: 'yes' if relevant, or 'no' if not relevant"
        )
    
    
    grader_model = init_chat_model("openai:gpt-4.1", temperature=0)
    
    
    def grade_documents(
        state: MessagesState,
    ) -> Literal["generate_answer", "rewrite_question"]:
        """Determine whether the retrieved documents are relevant to the question."""
        question = state["messages"][0].content
        context = state["messages"][-1].content
    
        prompt = GRADE_PROMPT.format(question=question, context=context)
        response = (
            grader_model
            .with_structured_output(GradeDocuments).invoke(
                [{"role": "user", "content": prompt}]
            )
        )
        score = response.binary_score
    
        if score == "yes":
            return "generate_answer"
        else:
            return "rewrite_question"
    
  2. Run this with irrelevant documents in the tool response:

    from langchain_core.messages import convert_to_messages
    
    input = {
        "messages": convert_to_messages(
            [
                {
                    "role": "user",
                    "content": "What does Lilian Weng say about types of reward hacking?",
                },
                {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "id": "1",
                            "name": "retrieve_blog_posts",
                            "args": {"query": "types of reward hacking"},
                        }
                    ],
                },
                {"role": "tool", "content": "meow", "tool_call_id": "1"},
            ]
        )
    }
    grade_documents(input)
    
  3. Confirm that the relevant documents are classified as such:

    input = {
        "messages": convert_to_messages(
            [
                {
                    "role": "user",
                    "content": "What does Lilian Weng say about types of reward hacking?",
                },
                {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "id": "1",
                            "name": "retrieve_blog_posts",
                            "args": {"query": "types of reward hacking"},
                        }
                    ],
                },
                {
                    "role": "tool",
                    "content": "reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering",
                    "tool_call_id": "1",
                },
            ]
        )
    }
    grade_documents(input)
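
Because binary_score is declared as a plain str, a model could in principle return a variant such as "Yes" or " yes", which the exact comparison `score == "yes"` would treat as not relevant. One possible hardening, shown here as a sketch and not as part of the tutorial's code, is to normalize the score before routing. `route_on_score` is a hypothetical helper.

```python
def route_on_score(raw_score: str) -> str:
    """Map a grader score to the next node name, tolerating case and whitespace."""
    normalized = raw_score.strip().lower()
    # Anything other than a clear "yes" falls back to rewriting the question.
    return "generate_answer" if normalized == "yes" else "rewrite_question"


for raw in ["yes", "Yes ", "no", "maybe"]:
    print(repr(raw), "->", route_on_score(raw))
```

Keeping the routing in a small pure function like this also makes it testable without calling the grader model.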
    

5. Rewrite question

  1. Build the rewrite_question node. The retriever tool can return irrelevant documents, which suggests that the original user question needs to be improved. To do so, we will call the rewrite_question node:

    REWRITE_PROMPT = (
        "Look at the input and try to reason about the underlying semantic intent / meaning.\n"
        "Here is the initial question:"
        "\n ------- \n"
        "{question}"
        "\n ------- \n"
        "Formulate an improved question:"
    )
    
    
    def rewrite_question(state: MessagesState):
        """Rewrite the original user question."""
        messages = state["messages"]
        question = messages[0].content
        prompt = REWRITE_PROMPT.format(question=question)
        response = response_model.invoke([{"role": "user", "content": prompt}])
        return {"messages": [{"role": "user", "content": response.content}]}
    
  2. Try it out:

    input = {
        "messages": convert_to_messages(
            [
                {
                    "role": "user",
                    "content": "What does Lilian Weng say about types of reward hacking?",
                },
                {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "id": "1",
                            "name": "retrieve_blog_posts",
                            "args": {"query": "types of reward hacking"},
                        }
                    ],
                },
                {"role": "tool", "content": "meow", "tool_call_id": "1"},
            ]
        )
    }
    
    response = rewrite_question(input)
    print(response["messages"][-1]["content"])
    

    Output:

    What are the different types of reward hacking described by Lilian Weng, and how does she explain them?
    

6. Generate an answer

  1. Build the generate_answer node: if we pass the grader checks, we can generate the final answer based on the original question and the retrieved context:

    GENERATE_PROMPT = (
        "You are an assistant for question-answering tasks. "
        "Use the following pieces of retrieved context to answer the question. "
        "If you don't know the answer, just say that you don't know. "
        "Use three sentences maximum and keep the answer concise.\n"
        "Question: {question} \n"
        "Context: {context}"
    )
    
    
    def generate_answer(state: MessagesState):
        """Generate an answer."""
        question = state["messages"][0].content
        context = state["messages"][-1].content
        prompt = GENERATE_PROMPT.format(question=question, context=context)
        response = response_model.invoke([{"role": "user", "content": prompt}])
        return {"messages": [response]}
    
  2. Try it:

    input = {
        "messages": convert_to_messages(
            [
                {
                    "role": "user",
                    "content": "What does Lilian Weng say about types of reward hacking?",
                },
                {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "id": "1",
                            "name": "retrieve_blog_posts",
                            "args": {"query": "types of reward hacking"},
                        }
                    ],
                },
                {
                    "role": "tool",
                    "content": "reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering",
                    "tool_call_id": "1",
                },
            ]
        )
    }
    
    response = generate_answer(input)
    response["messages"][-1].pretty_print()
    

    Output:

    ================================== Ai Message ==================================
    
    Lilian Weng categorizes reward hacking into two types: environment or goal misspecification, and reward tampering. She considers reward hacking as a broad concept that includes both of these categories. Reward hacking occurs when an agent exploits flaws or ambiguities in the reward function to achieve high rewards without performing the intended behaviors.
    

7. Assemble the graph

  • Start with generate_query_or_respond and determine whether we need to call retriever_tool
  • Route to the next step using tools_condition:
    • If generate_query_or_respond returned tool_calls, call retriever_tool to retrieve context
    • Otherwise, respond directly to the user
  • Grade the retrieved document content for relevance to the question (grade_documents) and route to the next step:
    • If not relevant, rewrite the question using rewrite_question and then call generate_query_or_respond again
    • If relevant, proceed to generate_answer and generate the final response using the ToolMessage with the retrieved document context

API Reference: StateGraph | START | END | ToolNode | tools_condition

from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode
from langgraph.prebuilt import tools_condition

workflow = StateGraph(MessagesState)

# Define the nodes we will cycle between
workflow.add_node(generate_query_or_respond)
workflow.add_node("retrieve", ToolNode([retriever_tool]))
workflow.add_node(rewrite_question)
workflow.add_node(generate_answer)

workflow.add_edge(START, "generate_query_or_respond")

# Decide whether to retrieve
workflow.add_conditional_edges(
    "generate_query_or_respond",
    # Assess LLM decision (call `retriever_tool` tool or respond to the user)
    tools_condition,
    {
        # Translate the condition outputs to nodes in our graph
        "tools": "retrieve",
        END: END,
    },
)

# Edges taken after the `retrieve` node is called.
workflow.add_conditional_edges(
    "retrieve",
    # Assess agent decision
    grade_documents,
)
workflow.add_edge("generate_answer", END)
workflow.add_edge("rewrite_question", "generate_query_or_respond")

# Compile
graph = workflow.compile()
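
To make the routing concrete without calling any model, the following sketch walks the same path as the graph using plain callables as stand-in nodes. Everything in it is hypothetical: `run_agentic_loop`, the stub lambdas, and in particular the `max_rewrites` cap, which is an assumption added here so the loop always terminates and is not part of the graph's own logic above.

```python
def run_agentic_loop(question, decide, retrieve, grade, rewrite, answer, max_rewrites=2):
    """Follow the graph's routing with stub nodes; return (visited nodes, result)."""
    trace = []
    current = question
    rewrites = 0
    while True:
        trace.append("generate_query_or_respond")
        action, payload = decide(current)
        if action == "respond":           # no tool call: finish with a direct reply
            return trace, payload
        trace.append("retrieve")
        context = retrieve(payload)
        if grade(question, context):      # relevant: answer from the original question
            trace.append("generate_answer")
            return trace, answer(question, context)
        if rewrites >= max_rewrites:      # assumed cap, not in the tutorial's graph
            return trace, None
        trace.append("rewrite_question")
        current = rewrite(question)       # the rewrite starts from the original question
        rewrites += 1


def decide(text):
    """Stub for the LLM decision: retrieve only for reward-hacking questions."""
    if "reward hacking" in text:
        return ("retrieve", "types of reward hacking")
    return ("respond", "Hello! How can I help you today?")


docs = {"types of reward hacking": "reward hacking has two types"}
trace, result = run_agentic_loop(
    "What are the types of reward hacking?",
    decide=decide,
    retrieve=lambda query: docs.get(query, "meow"),
    grade=lambda question, context: "reward hacking" in context,
    rewrite=lambda question: question + " (rewritten)",
    answer=lambda question, context: f"Answer based on: {context}",
)
print(trace)   # prints ['generate_query_or_respond', 'retrieve', 'generate_answer']
print(result)  # prints Answer based on: reward hacking has two types
```

If the retriever only ever returned irrelevant text, this sketch would visit rewrite_question twice and then give up, which illustrates why some bound on the rewrite cycle is worth considering.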

Visualize the graph:

from IPython.display import Image, display

display(Image(graph.get_graph().draw_mermaid_png()))

8. Run the agentic RAG

for chunk in graph.stream(
    {
        "messages": [
            {
                "role": "user",
                "content": "What does Lilian Weng say about types of reward hacking?",
            }
        ]
    }
):
    for node, update in chunk.items():
        print("Update from node", node)
        # rewrite_question returns plain dicts, so convert to message objects first
        convert_to_messages(update["messages"])[-1].pretty_print()
        print("\n\n")

Output:

Update from node generate_query_or_respond
================================== Ai Message ==================================
Tool Calls:
  retrieve_blog_posts (call_NYu2vq4km9nNNEFqJwefWKu1)
 Call ID: call_NYu2vq4km9nNNEFqJwefWKu1
  Args:
    query: types of reward hacking



Update from node retrieve
================================= Tool Message ==================================
Name: retrieve_blog_posts

(Note: Some work defines reward tampering as a distinct category of misalignment behavior from reward hacking. But I consider reward hacking as a broader concept here.)
At a high level, reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering.

Why does Reward Hacking Exist?#

Pan et al. (2022) investigated reward hacking as a function of agent capabilities, including (1) model size, (2) action space resolution, (3) observation space noise, and (4) training time. They also proposed a taxonomy of three types of misspecified proxy rewards:

Let's Define Reward Hacking#
Reward shaping in RL is challenging. Reward hacking occurs when an RL agent exploits flaws or ambiguities in the reward function to obtain high rewards without genuinely learning the intended behaviors or completing the task as designed. In recent years, several related concepts have been proposed, all referring to some form of reward hacking:



Update from node generate_answer
================================== Ai Message ==================================

Lilian Weng categorizes reward hacking into two types: environment or goal misspecification, and reward tampering. She considers reward hacking as a broad concept that includes both of these categories. Reward hacking occurs when an agent exploits flaws or ambiguities in the reward function to achieve high rewards without performing the intended behaviors.