RAG

AgentScope provides built-in support for Retrieval-Augmented Generation (RAG). This page demonstrates how to use the RAG module, how to build multimodal knowledge bases, and how to integrate RAG with ReActAgent in both agentic and generic manners.

AgentScope does not require you to use the built-in RAG module. Integrating third-party RAG implementations, frameworks, or services is fully supported and encouraged.

RAG Module Architecture

The RAG module is composed of two core components:

Reader — reads and chunks input documents into Document objects.
Knowledge — stores documents in a vector database and implements retrieval algorithms.

Integration Approaches

When integrating RAG with ReActAgent, you can choose between two approaches:

Integration Manner	Description	Advantages	Disadvantages
Agentic	RAG is registered as a tool; the agent decides when to retrieve and what query to use.	Flexible query rewriting using full conversation context; retrieves only when needed.	Requires strong reasoning and tool-use capabilities from the LLM.
Generic	Knowledge is retrieved at the start of every reply and prepended to the user message.	Simple to implement; works with less capable LLMs.	Always retrieves even when unnecessary; may increase latency.

Built-in Readers

Readers are responsible for loading data and chunking it into Document objects. Each Document contains:

metadata — document content, doc_id, chunk_id, and total_chunks.
embedding — embedding vector, filled when the document is added to or retrieved from the knowledge base.
score — relevance score, filled during retrieval.

TextReader

TextReader reads and chunks plain text into paragraph-level (or character-level) Document objects.

import asyncio
import json
from agentscope.rag import TextReader, Document

async def example_text_reader() -> list[Document]:
    reader = TextReader(chunk_size=512, split_by="paragraph")

    documents = await reader(
        text=(
            "I'm John Doe, 28 years old.\n"
            "I live in San Francisco. I work at OpenAI as a "
            "software engineer. I love hiking and photography.\n"
            "My father is Michael Doe, a doctor. I'm very proud of him. "
            "My mother is Sarah Doe, a teacher. She is very kind and "
            "always helps me with my studies.\n"
            "I'm now a PhD student at Stanford University, majoring in "
            "Computer Science. My advisor is Prof. Jane Williams, who is "
            "a leading expert in artificial intelligence.\n"
            "My best friend is James Smith.\n"
        ),
    )

    print("Number of chunks:", len(documents))
    for idx, doc in enumerate(documents):
        print(f"Document #{idx}")
        print("  Score:", doc.score)
        print("  Metadata:", json.dumps(doc.metadata, indent=2))

    return documents

docs = asyncio.run(example_text_reader())

There is no universally optimal chunk size or splitting strategy. For PDF files and domain-specific content, implementing a custom reader tailored to your scenario is strongly recommended. To create one, inherit from ReaderBase and implement the __call__ method.

Building a Knowledge Base

After chunking documents, create a knowledge base by providing an embedding model and an embedding store (vector database). AgentScope provides built-in support for Qdrant as the embedding store and SimpleKnowledge as the knowledge base implementation.

import asyncio
import os
from agentscope.rag import TextReader, SimpleKnowledge, QdrantStore
from agentscope.embedding import DashScopeTextEmbedding

async def build_knowledge_base() -> SimpleKnowledge:
    # Read and chunk documents
    reader = TextReader(chunk_size=512, split_by="paragraph")
    documents = await reader(
        text=(
            "I'm John Doe, 28 years old.\n"
            "I live in San Francisco. I work at OpenAI as a "
            "software engineer. I love hiking and photography.\n"
            "My father is Michael Doe, a doctor. I'm very proud of him. "
            "My mother is Sarah Doe, a teacher. She is very kind and "
            "always helps me with my studies.\n"
            "I'm now a PhD student at Stanford University, majoring in "
            "Computer Science. My advisor is Prof. Jane Williams, who is "
            "a leading expert in artificial intelligence.\n"
            "My best friend is James Smith.\n"
        ),
    )

    knowledge = SimpleKnowledge(
        embedding_model=DashScopeTextEmbedding(
            api_key=os.environ["DASHSCOPE_API_KEY"],
            model_name="text-embedding-v4",
            dimensions=1024,
        ),
        embedding_store=QdrantStore(
            location=":memory:",  # Use in-memory storage; supports local files and remote servers
            collection_name="my_collection",
            dimensions=1024,
        ),
    )

    # Add documents to the knowledge base
    await knowledge.add_documents(documents)

    # Retrieve relevant documents for a query
    results = await knowledge.retrieve(
        query="Who is John Doe's father?",
        limit=3,
        score_threshold=0.5,
    )

    for doc in results:
        print(doc)

    return knowledge

knowledge = asyncio.run(build_knowledge_base())

The QdrantStore location parameter supports in-memory storage (:memory:), local file paths, and remote Qdrant server URLs. Refer to the Qdrant documentation for details.

retrieve_knowledge as a Tool Function

SimpleKnowledge exposes a retrieve_knowledge method that wraps retrieve into a tool-compatible function. You can register it directly in an agent’s Toolkit:

toolkit.register_tool_function(
    knowledge.retrieve_knowledge,
    func_description=(
        "Retrieve documents relevant to the given query. "
        "Use this tool when you need to find information about John Doe."
    ),
)

Customizing RAG Components

AgentScope provides base classes for building custom readers, knowledge bases, and embedding stores:

Base Class	Description	Abstract Methods
`ReaderBase`	Base class for all readers.	`__call__`
`VDBStoreBase`	Base class for vector database stores.	`add`, `search`, `get_client` (optional), `delete` (optional)
`KnowledgeBase`	Base class for knowledge bases.	`retrieve`, `add_documents`

The get_client method in VDBStoreBase exposes the underlying vector database client directly, enabling advanced features such as index management and custom search configurations.

Integrating with ReActAgent

Agentic Manner

In agentic manner, retrieve_knowledge is registered as a tool in the agent’s Toolkit. The agent autonomously decides when to retrieve and rewrites the query using full conversation context.

import asyncio
import os
from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.model import DashScopeChatModel
from agentscope.message import Msg
from agentscope.tool import Toolkit

async def example_agentic_manner() -> None:
    toolkit = Toolkit()

    agent = ReActAgent(
        name="Friday",
        sys_prompt="You are a helpful assistant named Friday.",
        model=DashScopeChatModel(
            api_key=os.environ["DASHSCOPE_API_KEY"],
            model_name="qwen-max",
        ),
        formatter=DashScopeChatFormatter(),
        toolkit=toolkit,
    )

    # First turn — introduce context without RAG
    await agent(Msg("user", "John Doe is my best friend.", "user"))

    # Register retrieve_knowledge after the agent knows who John Doe is
    toolkit.register_tool_function(
        knowledge.retrieve_knowledge,
        func_description=(
            "Retrieve documents relevant to the given query. "
            "Use this tool when you need to find information about John Doe."
        ),
    )

    # Second turn — agent should rewrite the vague query using context
    await agent(Msg("user", "Do you know who his father is?", "user"))

asyncio.run(example_agentic_manner())

In the second turn, the agent rewrites “his father” into a specific query such as “John Doe’s father” using the conversation history, then retrieves the relevant document.

Generic Manner

In generic manner, pass the knowledge object directly to ReActAgent. The agent automatically retrieves relevant documents at the start of each reply and prepends them to the user message.

import asyncio
import os
from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.model import DashScopeChatModel
from agentscope.message import Msg

async def example_generic_manner() -> None:
    agent = ReActAgent(
        name="Friday",
        sys_prompt="You are a helpful assistant named Friday.",
        model=DashScopeChatModel(
            api_key=os.environ["DASHSCOPE_API_KEY"],
            model_name="qwen-max",
        ),
        formatter=DashScopeChatFormatter(),
        knowledge=knowledge,  # Pass the knowledge base here
    )

    await agent(Msg("user", "Do you know who John Doe's father is?", "user"))

asyncio.run(example_generic_manner())

Multimodal RAG

AgentScope supports multimodal RAG natively:

DashScopeMultiModalEmbedding embeds text, images, and other modalities into the same vector space.
ImageReader reads image files into Document objects with image metadata.

import asyncio
import os
from agentscope.agent import ReActAgent
from agentscope.embedding import DashScopeMultiModalEmbedding
from agentscope.formatter import DashScopeChatFormatter
from agentscope.message import Msg
from agentscope.model import DashScopeChatModel
from agentscope.rag import ImageReader, SimpleKnowledge, QdrantStore

async def example_multimodal_rag() -> None:
    # Read an image file into a Document
    reader = ImageReader()
    docs = await reader(image_url="./example.jpg")

    # Build a multimodal knowledge base
    knowledge = SimpleKnowledge(
        embedding_model=DashScopeMultiModalEmbedding(
            api_key=os.environ["DASHSCOPE_API_KEY"],
            model_name="multimodal-embedding-v1",
            dimensions=1024,
        ),
        embedding_store=QdrantStore(
            location=":memory:",
            collection_name="multimodal_collection",
            dimensions=1024,
        ),
    )
    await knowledge.add_documents(docs)

    # Use a vision-capable model for the agent
    agent = ReActAgent(
        name="Friday",
        sys_prompt="You are a helpful assistant named Friday.",
        model=DashScopeChatModel(
            api_key=os.environ["DASHSCOPE_API_KEY"],
            model_name="qwen-vl-max",
        ),
        formatter=DashScopeChatFormatter(),
        knowledge=knowledge,
    )

    await agent(Msg("user", "What's my name?", "user"))

asyncio.run(example_multimodal_rag())

Context and Memory

Memory backends for storing and managing session messages.

Agent

ReActAgent internals, tool registration, and reply lifecycle.

Get Started

Tutorial

Out-of-box Agents

Basic Concepts

Building Blocks

Observe & Evaluate

Tune Agent

Deploy & Serve

Others

RAG Module Architecture

Integration Approaches

Built-in Readers

TextReader

Building a Knowledge Base

retrieve_knowledge as a Tool Function

Customizing RAG Components

Integrating with ReActAgent

Agentic Manner

Generic Manner

Multimodal RAG

Further Reading

Context and Memory

Agent

Get Started

Tutorial

Out-of-box Agents

Basic Concepts

Building Blocks

Observe & Evaluate

Tune Agent

Deploy & Serve

Others

​RAG Module Architecture

​Integration Approaches

​Built-in Readers

​TextReader

​Building a Knowledge Base

​retrieve_knowledge as a Tool Function

​Customizing RAG Components

​Integrating with ReActAgent

​Agentic Manner

​Generic Manner

​Multimodal RAG

​Further Reading

Context and Memory

Agent

RAG Module Architecture

Integration Approaches

Built-in Readers

TextReader

Building a Knowledge Base

retrieve_knowledge as a Tool Function

Customizing RAG Components

Integrating with ReActAgent

Agentic Manner

Generic Manner

Multimodal RAG

Further Reading