
Auto-Retrieval: The Intelligent Evolution of RAG

Published on 2024-10-23 10:21

Auto-Retrieval is an advanced RAG technique. Instead of sending the user query straight to the vector database's retrieval interface (e.g., dense vector search), as naive RAG does, it uses an agent LLM to dynamically infer metadata filter parameters and a semantic query before kicking off vector database retrieval. You can think of it as a form of query expansion/rewriting, or as a specific form of function calling; the implementation logic and code are given below. The effect looks like this:

User input

Give me a summary of the SWE-bench paper

Inference result

Rewritten query: summary of the SWE-bench paper
Filter parameters: {"filters": [{"key": "file_name", "value": "swebench.pdf", "operator": "=="}], "condition": "and"}
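Mechanically, applying an inferred filter spec like the one above is simple. The dependency-free sketch below (plain Python; the `matches` helper and sample data are invented for illustration) shows how such a spec narrows the candidate chunks before semantic search:

```python
import json
import operator

# Subset of filter operators, for illustration only.
OPS = {"==": operator.eq, "!=": operator.ne}

def matches(metadata: dict, spec: dict) -> bool:
    """Check whether a chunk's metadata satisfies the inferred filter spec."""
    checks = [
        OPS[f["operator"]](metadata.get(f["key"]), f["value"])
        for f in spec["filters"]
    ]
    return all(checks) if spec.get("condition", "and") == "and" else any(checks)

spec = json.loads(
    '{"filters": [{"key": "file_name", "value": "swebench.pdf",'
    ' "operator": "=="}], "condition": "and"}'
)
chunks = [
    {"file_name": "swebench.pdf", "page_label": "1"},
    {"file_name": "other_paper.pdf", "page_label": "3"},
]
kept = [c for c in chunks if matches(c, spec)]
print(kept)  # only the swebench.pdf chunk survives
```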

Implementation steps

We implement this with LlamaCloud, primarily by setting up an Auto-Retrieval function on top of a LlamaCloud retriever. At a high level, our auto-retrieval function uses a function-calling LLM to infer metadata filters from the user query, which yields more precise and relevant retrieval results than using the raw semantic query alone.

  • Define a custom prompt for generating metadata filters.
  • Given a user query, first perform chunk-level retrieval to dynamically pull metadata from the retrieved chunks.
  • Inject that metadata into the auto-retrieval prompt as few-shot examples. The goal is to show the LLM concrete examples of existing, relevant metadata values so it can infer the correct metadata filters.

A document-level retriever returns entire documents at the file level, while a chunk-level retriever returns specific chunks. Setting both up is this simple:

from llama_index.indices.managed.llama_cloud import LlamaCloudIndex
import os


index = LlamaCloudIndex(
    name="research_papers_page",
    project_name="llamacloud_demo",
    api_key=os.environ["LLAMA_CLOUD_API_KEY"]
)


doc_retriever = index.as_retriever(
    retrieval_mode="files_via_content",
    # retrieval_mode="files_via_metadata",
    files_top_k=1
)


chunk_retriever = index.as_retriever(
    retrieval_mode="chunks",
    rerank_top_n=5
)
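To make the file-level vs. chunk-level distinction concrete, here is a toy, dependency-free sketch; naive keyword overlap stands in for vector search, and all data and function names are invented:

```python
CHUNKS = [
    {"file_name": "swebench.pdf", "text": "SWE-bench benchmark construction pipeline"},
    {"file_name": "swebench.pdf", "text": "SWE-bench evaluation of code agents"},
    {"file_name": "other.pdf", "text": "an unrelated survey of retrieval methods"},
]

def score(chunk: dict, query: str) -> int:
    # Naive keyword overlap stands in for dense vector similarity.
    return sum(word in chunk["text"].lower() for word in query.lower().split())

def chunk_retrieve(query: str, top_k: int = 2) -> list:
    """Chunk-level: return the top-k individual chunks."""
    return sorted(CHUNKS, key=lambda c: score(c, query), reverse=True)[:top_k]

def file_retrieve(query: str) -> list:
    """File-level: return every chunk of the single best-matching file."""
    best_file = chunk_retrieve(query, top_k=1)[0]["file_name"]
    return [c for c in CHUNKS if c["file_name"] == best_file]

print(len(chunk_retrieve("swe-bench construction", top_k=1)))  # 1 chunk
print(len(file_retrieve("swe-bench construction")))            # whole file: 2 chunks
```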

Code implementation

Next, we implement the flow described above:

from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.vector_stores.types import (
    VectorStoreInfo,
    VectorStoreQuerySpec,
    MetadataInfo,
    MetadataFilters,
)
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core import Response
from llama_index.llms.openai import OpenAI

import json

# A function-calling LLM is used below for structured prediction;
# any capable model works here.
llm = OpenAI(model="gpt-4o")


SYS_PROMPT = """\
Your goal is to structure the user's query to match the request schema provided below.
You MUST call the tool in order to generate the query spec.


<< Structured Request Schema >>
When responding use a markdown code snippet with a JSON object formatted in the \
following schema:


{schema_str}


The query string should contain only text that is expected to match the contents of \
documents. Any conditions in the filter should not be mentioned in the query as well.


Make sure that filters only refer to attributes that exist in the data source.
Make sure that filters take into account the descriptions of attributes.
Make sure that filters are only used as needed. If there are no filters that should be \
applied return [] for the filter value.\


If the user's query explicitly mentions number of documents to retrieve, set top_k to \
that number, otherwise do not set top_k.


The schema of the metadata filters in the vector db table is listed below, along with some example metadata dictionaries from relevant rows.
The user will send the input query string.


Data Source:
```json
{info_str}
```


Example metadata from relevant chunks:
{example_rows}


"""


example_rows_retriever = index.as_retriever(
    retrieval_mode="chunks",
    rerank_top_n=4
)


def get_example_rows_fn(**kwargs):
    """Retrieve relevant few-shot examples."""
    query_str = kwargs["query_str"]
    nodes = example_rows_retriever.retrieve(query_str)
    # get the metadata, join them
    metadata_list = [n.metadata for n in nodes]


    return "\n".join([json.dumps(m) for m in metadata_list])
        
    


# Map the `example_rows` template variable to the retrieval function above.
chat_prompt_tmpl = ChatPromptTemplate.from_messages(
    [
        ("system", SYS_PROMPT),
        ("user", "{query_str}"),
    ],
    function_mappings={
        "example_rows": get_example_rows_fn
    }
)
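The `function_mappings` argument means `example_rows` is not supplied directly: it is computed from the other prompt inputs at format time. The plain-Python sketch below illustrates the idea; `format_prompt` is a hypothetical stand-in for what `ChatPromptTemplate` does internally, and the lambda stands in for the real retrieval call:

```python
def format_prompt(template: str, function_mappings: dict, **kwargs) -> str:
    """Fill a template, computing mapped variables from the other inputs."""
    values = dict(kwargs)
    for var, fn in function_mappings.items():
        values[var] = fn(**kwargs)  # e.g. retrieve example rows for this query
    return template.format(**values)

template = "Example metadata:\n{example_rows}\n\nQuery: {query_str}"
mappings = {
    # Stand-in for get_example_rows_fn: returns canned metadata.
    "example_rows": lambda **kw: '{"file_name": "swebench.pdf"}',
}
prompt = format_prompt(template, mappings, query_str="summarize the paper")
print(prompt)
```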




## NOTE: this is a dataclass that contains information about the metadata
vector_store_info = VectorStoreInfo(
    content_info="contains content from various research papers",
    metadata_info=[
        MetadataInfo(
            name="file_name",
            type="str",
            description="Name of the source paper",
        ),
    ],
)


def auto_doc_rag(query: str) -> Response:
    """Synthesizes an answer to the question by feeding an entire relevant document in as context."""
    print(f"> User query string: {query}")
    # Use structured predict to infer the metadata filters and query string.
    query_spec = llm.structured_predict(
        VectorStoreQuerySpec,
        chat_prompt_tmpl,
        info_str=vector_store_info.json(indent=4),
        schema_str=VectorStoreQuerySpec.schema_json(indent=4),
        query_str=query,
    )
    filters = MetadataFilters(filters=query_spec.filters) if len(query_spec.filters) > 0 else None
    print(f"> Inferred query string: {query_spec.query}")
    if filters:
        print(f"> Inferred filters: {filters.json()}")
    # Build a document-level retriever with the inferred filters applied,
    # then a query engine on top of it.
    retriever = index.as_retriever(
        retrieval_mode="files_via_content",
        files_top_k=1,
        filters=filters,
    )
    query_engine = RetrieverQueryEngine.from_args(
        retriever,
        llm=llm,
        response_mode="tree_summarize",
    )
    # run the rewritten query
    return query_engine.query(query_spec.query)

Results

response = auto_doc_rag("Give me a summary of the SWE-bench paper")
print(str(response))

> User query string: Give me a summary of the SWE-bench paper
> Inferred query string: summary of the SWE-bench paper
> Inferred filters: {"filters": [{"key": "file_name", "value": "swebench.pdf", "operator": "=="}], "condition": "and"}
The construction of SWE-Bench involves a three-stage pipeline:


1. **Repo Selection and Data Scraping**: Pull requests (PRs) are collected from 12 popular open-source Python repositories on GitHub, resulting in approximately 90,000 PRs. These repositories are chosen for their popularity, better maintenance, clear contributor guidelines, and extensive test coverage.


2. **Attribute-Based Filtering**: Candidate tasks are created by selecting merged PRs that resolve a GitHub issue and make changes to the test files of the repository. This indicates that the user likely contributed tests to check whether the issue has been resolved.


3. **Execution-Based Filtering**: For each candidate task, the PR’s test content is applied, and the associated test results are logged before and after the PR’s other content is applied. Tasks are filtered out if they do not have at least one test where its status changes from fail to pass or if they result in installation or runtime errors.


Through these stages, the original 90,000 PRs are filtered down to 2,294 task instances that comprise SWE-Bench.


This article is reproduced from the WeChat public account 哎呀AIYA.

Original link: https://mp.weixin.qq.com/s/wcmJ3OQzDxx_ILo_m7zA2Q

Copyright belongs to the author. Please credit the source when reprinting; unauthorized use will be pursued legally.
標簽
已于2024-10-23 10:22:56修改
收藏
回復
舉報
回復
相關推薦