GraphRAG-SDK

GraphRAG-SDK is a comprehensive solution for building Graph Retrieval-Augmented Generation (GraphRAG) applications, leveraging FalkorDB for optimal performance.

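Features

  • Ontology management: manage ontologies manually, or detect them automatically from unstructured data.
  • Knowledge graph (KG): construct and query knowledge graphs for efficient data retrieval.
  • LLM integration: supports OpenAI and Google Gemini models.
  • Multi-agent system: a multi-agent orchestrator built on KG agents.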

Getting started

Installation

pip install graphrag_sdk

Prerequisites

Graph database

GraphRAG-SDK relies on FalkorDB as its graph engine and works with OpenAI and Gemini models.

Get credentials from FalkorDB Cloud, or start FalkorDB locally:

docker run -p 6379:6379 -p 3000:3000 -it --rm  -v ./data:/data falkordb/falkordb:latest
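
Before running the SDK against a local container, it can help to confirm that the published port is reachable. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is hypothetical, and the host and port simply mirror the docker command above (127.0.0.1 and 6379). It checks TCP reachability only, not that the service behind the port is FalkorDB.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 6379 is the database port published by the docker command above.
print("FalkorDB port reachable:", is_port_open("127.0.0.1", 6379))
```

If this prints False, the container is probably not running or the port mapping differs from the one shown.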

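LLM models

The SDK currently supports the following LLM APIs:

  • OpenAI. Recommended model: gpt-4o
  • Google. Recommended model: gemini-1.5-flash-001

Make sure a .env file exists that contains all the required credentials.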

.env
OPENAI_API_KEY="OPENAI_API_KEY"
GOOGLE_API_KEY="GOOGLE_API_KEY"
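
A missing or empty key usually surfaces only when the first model call fails, so a quick check after load_dotenv() can save a debugging round trip. This is a minimal sketch, not part of the SDK: the helper name missing_credentials is hypothetical, and it only tests that a variable is set and non-empty, not that the key is valid.

```python
import os
from typing import Iterable, List, Mapping


def missing_credentials(
    required: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


# Check whichever variables your chosen model needs (names taken from the .env above).
missing = missing_credentials(["OPENAI_API_KEY", "GOOGLE_API_KEY"])
if missing:
    print("Missing credentials:", ", ".join(missing))
```

Passing a plain dictionary as env makes the helper easy to exercise without touching the real environment.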

Basic usage

The following example shows the basic usage: creating a GraphRAG with an automatically detected ontology.

from dotenv import load_dotenv

from graphrag_sdk.source import URL
from graphrag_sdk import KnowledgeGraph, Ontology
from graphrag_sdk.models.openai import OpenAiGenerativeModel
from graphrag_sdk.model_config import KnowledgeGraphModelConfig
load_dotenv()

# Import Data
urls = ["https://www.rottentomatoes.com/m/side_by_side_2012",
"https://www.rottentomatoes.com/m/matrix",
"https://www.rottentomatoes.com/m/matrix_revolutions",
"https://www.rottentomatoes.com/m/matrix_reloaded",
"https://www.rottentomatoes.com/m/speed_1994",
"https://www.rottentomatoes.com/m/john_wick_chapter_4"]

sources = [URL(url) for url in urls]

# Model
model = OpenAiGenerativeModel(model_name="gpt-4o")

# Ontology Auto-Detection
ontology = Ontology.from_sources(
    sources=sources,
    model=model,
)

# Knowledge Graph
kg = KnowledgeGraph(
    name="movies",
    model_config=KnowledgeGraphModelConfig.with_model(model),
    ontology=ontology,
)

# GraphRAG System and Questioning
kg.process_sources(sources)

chat = kg.chat_session()

print(chat.send_message("Who is the director of the movie The Matrix?"))
print(chat.send_message("How is this director connected to Keanu Reeves?"))

Tools

Importing source data

The SDK supports the following file formats:

  • PDF
  • TEXT
  • JSONL
  • URL
  • HTML
  • CSV

import os
from graphrag_sdk.source import Source

src_files = "data_folder"
sources = []

# Create a Source object.
for file in os.listdir(src_files):
    sources.append(Source(os.path.join(src_files, file)))
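
The loop above hands every directory entry to Source, including subdirectories and files in formats the SDK does not list. A minimal sketch of filtering the folder first is shown below. The helper name supported_files and the suffix-to-format mapping are assumptions made for illustration (for example, that TEXT corresponds to .txt); adjust them to whatever your SDK version actually recognizes.

```python
from pathlib import Path
from typing import List

# Assumed suffixes for the file-based formats listed above (URL sources are not files).
SUPPORTED_SUFFIXES = {".pdf", ".txt", ".jsonl", ".html", ".csv"}


def supported_files(folder: str) -> List[str]:
    """Return sorted paths of regular files in `folder` that have a supported suffix."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

The returned paths can then be passed to Source one by one, in place of the raw os.listdir results.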

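Ontology

You can detect the ontology automatically from your data or define it manually. You can also set boundaries for the automatic detection.

Once the ontology is created, you can review, modify, and update it as needed before using it to build the knowledge graph (KG).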

import json
import random
from falkordb import FalkorDB
from graphrag_sdk import KnowledgeGraph, Ontology
from graphrag_sdk.models.openai import OpenAiGenerativeModel

# Define the percentage of files that will be used to auto-create the ontology.
percent = 0.1  # This represents 10%. You can adjust this value (e.g., 0.2 for 20%).

boundaries = """
    Extract only the most relevant information about UFC fighters, fights, and events.
    Avoid creating entities for details that can be expressed as attributes.
"""

# Define the model to be used for the ontology
model = OpenAiGenerativeModel(model_name="gpt-4o")

# Randomly select a percentage of files from sources.
sampled_sources = random.sample(sources, round(len(sources) * percent))

ontology = Ontology.from_sources(
    sources=sampled_sources,
    boundaries=boundaries,
    model=model,
)

# Save the ontology to the disk as a json file.
with open("ontology.json", "w", encoding="utf-8") as file:
    file.write(json.dumps(ontology.to_json(), indent=2))
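
One edge case in the sampling step above: with a small corpus, round(len(sources) * percent) can be 0 (for example, 4 sources at 10% gives round(0.4) == 0), and random.sample then returns an empty list. The sketch below is one way to guard against that by always keeping at least one item; the helper name sample_fraction and the optional seed parameter are hypothetical additions, not part of the SDK.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_fraction(items: Sequence[T], fraction: float, seed: Optional[int] = None) -> List[T]:
    """Randomly pick about `fraction` of `items`, but never fewer than one item."""
    if not items:
        return []
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    # max(1, ...) prevents an empty sample when the corpus is small.
    count = max(1, round(len(items) * fraction))
    # A dedicated Random instance keeps the sample reproducible when a seed is given.
    return random.Random(seed).sample(list(items), count)
```

For instance, sample_fraction(sources, 0.1, seed=42) could stand in for the random.sample call when reproducible ontology runs are wanted.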

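After generating the initial ontology, you can review it and make any changes your data and requirements call for, such as refining entity types or adjusting relationships.

Once you are satisfied with the ontology, you can use it to create and manage your knowledge graph (KG).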
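
When reviewing a saved ontology file, a quick size summary can show whether the auto-detection produced far more (or fewer) items than expected. The sketch below makes no assumption about the ontology schema: it only counts the items in each top-level list of the JSON document. The helper names are hypothetical.

```python
import json
from typing import Any, Dict


def summarize_top_level(data: Dict[str, Any]) -> Dict[str, int]:
    """Map each top-level key that holds a list to the number of items in that list."""
    return {key: len(value) for key, value in data.items() if isinstance(value, list)}


def summarize_file(path: str) -> Dict[str, int]:
    """Load a JSON file and summarize its top-level lists."""
    with open(path, "r", encoding="utf-8") as file:
        return summarize_top_level(json.load(file))
```

Calling summarize_file("ontology.json") on the file written earlier gives a compact overview to check before opening the file for manual edits.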

Knowledge graph

You can now use the SDK to create a knowledge graph (KG) from your sources and ontology.

import json
from graphrag_sdk.model_config import KnowledgeGraphModelConfig

# After approving the ontology, load it from disk.
ontology_file = "ontology.json"
with open(ontology_file, "r", encoding="utf-8") as file:
    ontology = Ontology.from_json(json.loads(file.read()))

kg = KnowledgeGraph(
    name="kg_name",
    model_config=KnowledgeGraphModelConfig.with_model(model),
    ontology=ontology,
)

kg.process_sources(sources)

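You can update the KG at any time by processing more sources with the process_sources method.

GraphRAG

At this point, you have a knowledge graph that can be queried with the SDK. Use the ask method for a single question, or chat_session for a conversation.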

# Single question.
response = kg.ask("What were the last five fights? When were they? How many rounds did they have?")
print(response)

# Conversation.
chat = kg.chat_session()
response = chat.send_message("Who is Salsa Boy?")
print(response)
response = chat.send_message("Tell me about one of his fights?")
print(response)
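
For follow-up questions such as "one of his fights" to make sense, they have to go through the same session as the question that introduced the subject. A minimal sketch of sending a batch of questions through one session is shown below. It relies only on the send_message method used above; ask_all and EchoChat are hypothetical names, and EchoChat is a stand-in that lets the sketch run without a database or API key.

```python
from typing import Iterable, List, Protocol, Tuple


class ChatLike(Protocol):
    """Anything with a send_message method, such as the chat session above."""

    def send_message(self, message: str): ...


def ask_all(chat: ChatLike, questions: Iterable[str]) -> List[Tuple[str, object]]:
    """Send questions in order through one session and collect (question, answer) pairs."""
    return [(question, chat.send_message(question)) for question in questions]


class EchoChat:
    """Hypothetical stand-in for a chat session, used only to run this sketch."""

    def __init__(self) -> None:
        self.history: List[str] = []

    def send_message(self, message: str) -> str:
        self.history.append(message)
        return f"answer {len(self.history)}: {message}"


for question, answer in ask_all(EchoChat(), ["Who is Salsa Boy?", "Tell me about one of his fights."]):
    print(question, "->", answer)
```

With a real KG, passing kg.chat_session() in place of EchoChat() would keep all questions in one conversation.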

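Multi-agent - Orchestrator

GraphRAG-SDK supports KG agents. Each agent is an expert on the data it has learned, and the orchestrator coordinates the agents.

Agents

See the Basic usage section for how to create the KG objects used by the agents.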

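# Assumed import paths for the agent and orchestrator classes;
# verify them against your installed SDK version.
from graphrag_sdk.agents.kg_agent import KGAgent
from graphrag_sdk.orchestrator import Orchestrator

# Define the model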
model = OpenAiGenerativeModel("gpt-4o")

# Create the KG from the predefined ontology.
# In this example, we will use the restaurants agent and the attractions agent.
restaurants_kg = KnowledgeGraph(
    name="restaurants",
    ontology=restaurants_ontology,
    model_config=KnowledgeGraphModelConfig.with_model(model),
)
attractions_kg = KnowledgeGraph(
    name="attractions",
    ontology=attractions_ontology,
    model_config=KnowledgeGraphModelConfig.with_model(model),
)


# The following agent is specialized in finding restaurants.
restaurants_agent = KGAgent(
    agent_id="restaurants_agent",
    kg=restaurants_kg,
    introduction="I'm a restaurant agent, specialized in finding the best restaurants for you.",
)

# The following agent is specialized in finding tourist attractions.
attractions_agent = KGAgent(
    agent_id="attractions_agent",
    kg=attractions_kg,
    introduction="I'm an attractions agent, specialized in finding the best tourist attractions for you.",
)

Orchestrator - Multi-agent system

The orchestrator manages the use of the agents and handles the questioning.

# Initialize the orchestrator while giving it the backstory.
orchestrator = Orchestrator(
    model,
    backstory="You are a trip planner, and you want to provide the best possible itinerary for your clients.",
)

# Register the agents that we created above.
orchestrator.register_agent(restaurants_agent)
orchestrator.register_agent(attractions_agent)

# Query the orchestrator.
runner = orchestrator.ask("Create a two-day itinerary for a trip to Rome. Please don't ask me any questions; just provide the best itinerary you can.")
print(runner.output)

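Support

Connect with our community for support and discussions. If you have any questions, please reach out through one of the project's community channels.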

Source distribution: graphrag_sdk-0.2.1.tar.gz (41.0 kB)
