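Translator Knowledge Graph Validation - TRAPI, Biolink Model and One Hop Navigation
Project description
Graph Validation Tests
This repository implements Translator knowledge graph validation test runners within the new 2023 Testing Infrastructure. The package currently contains two such test runners:
- StandardsValidationTest: a wrapper around the Translator reasoner-validator package, which certifies that knowledge graph data access conforms to the TRAPI specification and that the graph's semantic content conforms to the Biolink Model.
- OneHopTest: a slimmed-down version of the "One Hop" knowledge graph navigation unit test code extracted from the legacy SRI_Testing test harness. It validates that single-hop TRAPI queries against a Translator knowledge graph meet the basic expectation that the input test edge data is recovered in the output, using several kinds of templated TRAPI queries. Unlike SRI_Testing, the graph-validation-test-runners test runners use test data from the new NCATS Translator Tests repository.
Programmatically, the command line or programmatic parameters to each kind of test are the same, but the underlying test cases (derived from the source test assets) are different.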
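The "recovery" expectation behind the One Hop tests can be illustrated with a minimal sketch. It is not the package's actual implementation: the function name `edge_recovered` is hypothetical, the response fragment is hand-written for illustration, and the check uses exact identifier matching only (the real test runners may apply further logic).

```python
from typing import Any, Dict


def edge_recovered(
    trapi_response: Dict[str, Any],
    subject_id: str,
    predicate_id: str,
    object_id: str,
) -> bool:
    """Return True if the test edge appears in the response knowledge graph.

    Hypothetical helper: exact string matching on subject, predicate and object.
    """
    message = trapi_response.get("message") or {}
    knowledge_graph = message.get("knowledge_graph") or {}
    edges = knowledge_graph.get("edges") or {}
    return any(
        edge.get("subject") == subject_id
        and edge.get("predicate") == predicate_id
        and edge.get("object") == object_id
        for edge in edges.values()
    )


# Hand-written TRAPI response fragment, used only for illustration.
sample_response = {
    "message": {
        "knowledge_graph": {
            "nodes": {"DRUGBANK:DB01592": {}, "MONDO:0011426": {}},
            "edges": {
                "e0": {
                    "subject": "DRUGBANK:DB01592",
                    "predicate": "biolink:has_side_effect",
                    "object": "MONDO:0011426",
                }
            },
        }
    }
}

found = edge_recovered(
    sample_response,
    "DRUGBANK:DB01592",
    "biolink:has_side_effect",
    "MONDO:0011426",
)
print(found)
```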
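Usage
The standards_validation_test_runner and one_hop_test_runner modules can be run either from the command line or directly from a Python script.
Installation
The graph-validation-test-runners module can be installed from PyPI and used as part of the Translator-wide automated testing.
Note: requires 3.9 <= Python version < 3.12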
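As a convenience, the interpreter version constraint stated above can be checked before installation. This is only a sketch; the helper name `python_version_supported` is hypothetical and not part of the package.

```python
import sys
from typing import Tuple


def python_version_supported(version: Tuple[int, int]) -> bool:
    """Check the documented constraint: 3.9 <= Python version < 3.12."""
    return (3, 9) <= version < (3, 12)


current = (sys.version_info[0], sys.version_info[1])
if not python_version_supported(current):
    print(f"Python {current[0]}.{current[1]} is outside the supported range")
```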
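From PyPI
From within your target working directory:
- Create a Python virtual environment:
python -m venv venv
- Activate the environment:
source venv/bin/activate
- Install the package and its dependencies:
pip install graph-validation-test-runners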
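From GitHub
You can also check out the project from GitHub. In that case the installation process is slightly different, since the project itself uses Poetry for dependency management. The following instructions assume that Poetry is already installed on your system.
- Clone the repository:
git clone https://github.com/TranslatorSRI/graph-validation-test-runners.git
cd graph-validation-test-runners
- Create a Poetry shell:
poetry shell
- Install dependencies:
poetry install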
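Command Line Interface
Typing either of the following in a command line terminal
$ standards_validation_test --help
or
$ one_hop_test --help
should print the following usage instructions (where <tool name> is either 'standards_validation_test_runner' or 'one_hop_test_runner'):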
usage: <tool name> [-h] [--components COMPONENTS] [--environment {dev,ci,test,prod}] --subject_id SUBJECT_ID --predicate_id PREDICATE_ID
                   --object_id OBJECT_ID [--trapi_version TRAPI_VERSION] [--biolink_version BIOLINK_VERSION]
                   [--log_level {ERROR,WARNING,INFO,DEBUG}]

Translator TRAPI and Biolink Model Validation of Knowledge Graphs

options:
  -h, --help            show this help message and exit
  --components COMPONENTS
                        Names Translator components to be tested taken from the Translator Testing Model 'ComponentEnum'
                        (may be a comma separated string of such names; default: run the test against the 'ars')
  --environment {dev,ci,test,prod}
                        Translator execution environment of the Translator Component targeted for testing.
  --subject_id SUBJECT_ID
                        Statement subject concept CURIE
  --predicate_id PREDICATE_ID
                        Statement Biolink Predicate identifier
  --object_id OBJECT_ID
                        Statement object concept CURIE
  --trapi_version TRAPI_VERSION
                        TRAPI version expected for knowledge graph access (default: use current default release)
  --biolink_version BIOLINK_VERSION
                        Biolink Model version expected for knowledge graph access (default: use current default release)
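To make the option semantics concrete, the following sketch mirrors the documented usage text with `argparse`. It is an illustration, not the package's own parser: the functions `build_parser` and `split_components` are hypothetical, and every default other than the documented 'ars' component is an assumption.

```python
import argparse
from typing import List


def build_parser(tool_name: str) -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog=tool_name,
        description="Translator TRAPI and Biolink Model Validation of Knowledge Graphs",
    )
    # Documented default: run the test against the 'ars'
    parser.add_argument("--components", default="ars")
    # Defaults of None below are assumptions, not taken from the package
    parser.add_argument("--environment", choices=["dev", "ci", "test", "prod"], default=None)
    parser.add_argument("--subject_id", required=True)
    parser.add_argument("--predicate_id", required=True)
    parser.add_argument("--object_id", required=True)
    parser.add_argument("--trapi_version", default=None)
    parser.add_argument("--biolink_version", default=None)
    parser.add_argument("--log_level", choices=["ERROR", "WARNING", "INFO", "DEBUG"], default=None)
    return parser


def split_components(value: str) -> List[str]:
    """Split a comma separated string of component names into a list."""
    return [name.strip() for name in value.split(",") if name.strip()]


args = build_parser("one_hop_test").parse_args([
    "--components", "arax,molepro",
    "--environment", "test",
    "--subject_id", "DRUGBANK:DB01592",
    "--predicate_id", "biolink:has_side_effect",
    "--object_id", "MONDO:0011426",
])
components = split_components(args.components)
print(components)
```

The three statement identifiers are the only required arguments, matching the usage line in which they appear without square brackets.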
Programmatic Execution
Standards Validation Test
To run the TRAPI and Biolink Model validation tests against the query output of a knowledge graph TRAPI component:
from typing import Dict
import asyncio

from standards_validation_test_runner import run_standards_validation_tests

test_data = {
    # One test edge (asset)
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
    "components": ["arax", "molepro"],
    # "environment": environment,  # Optional[TestEnvEnum] = None; default: 'TestEnvEnum.ci' if not given
    # "trapi_version": trapi_version,  # Optional[str] = None; latest community release if not given
    # "biolink_version": biolink_version,  # Optional[str] = None; current Biolink Toolkit default if not given
    # "runner_settings": asset.test_runner_settings,  # Optional[List[str]] = None
}
results: Dict = asyncio.run(run_standards_validation_tests(**test_data))
print(results)
One Hop Test
To run the "One Hop" knowledge graph navigation tests against the query output of a knowledge graph TRAPI component:
from typing import Dict
import asyncio

from one_hop_test_runner import run_one_hop_tests

test_data = {
    # One test edge (asset)
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
    "components": ["arax", "molepro"],
    # "environment": environment,  # Optional[TestEnvEnum] = None; default: 'TestEnvEnum.ci' if not given
    # "trapi_version": trapi_version,  # Optional[str] = None; latest community release if not given
    # "biolink_version": biolink_version,  # Optional[str] = None; current Biolink Toolkit default if not given
    # "runner_settings": asset.test_runner_settings,  # Optional[List[str]] = None
}
results: Dict = asyncio.run(run_one_hop_tests(**test_data))
print(results)
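The wrapper methods above run all related test cases derived from the specified test asset (i.e. subject_id, etc.) without any special test parameters. If more fine-grained testing is desired, a subset of the underlying TRAPI queries can be run directly, as follows (here we skip the 'by_subject', 'inverse_by_new_subject' and 'by_object' test cases and set the 'strict_validation' parameter to True, which is understood by the reasoner-validator code running behind the scenes):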
from typing import Dict
import asyncio

from standards_validation_test_runner import StandardsValidationTest
from graph_validation_tests.utils.unit_test_templates import (
    # by_subject,
    # inverse_by_new_subject,
    # by_object,
    raise_subject_entity,
    raise_object_entity,
    raise_object_by_subject,
    raise_predicate_by_subject
)

test_data = {
    # One test edge (asset)
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
    "components": ["arax", "molepro"],
    "environment": "test",
    "trapi_version": "1.5.0-beta",
    "biolink_version": "4.1.6",
    # Given as a list, matching the Optional[List[str]] type noted for this parameter
    "runner_settings": ["Inferred"]
}

# Only the TRAPI query templates listed here are run
trapi_generators = [
    # by_subject,
    # inverse_by_new_subject,
    # by_object,
    raise_subject_entity,
    raise_object_entity,
    raise_object_by_subject,
    raise_predicate_by_subject
]

# A test runner specific parameter passed through
kwargs = {
    "strict_validation": True
}

results: Dict = asyncio.run(
    StandardsValidationTest.run_tests(
        **test_data, trapi_generators=trapi_generators, **kwargs
    )
)
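Note that the trapi_generators entries, defined in the graph_validation_tests.utils.unit_test_templates module, are all simple Python functions that return the TRAPI JSON message sent to the target component. In principle, once you understand what these functions do, you can write your own to issue other kinds of TRAPI queries, whose output can then be validated against the specified TRAPI and Biolink Model versions.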
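As a rough idea of what such a function produces, here is a minimal sketch that builds a one-hop TRAPI query graph from a test edge. The function name `by_subject_and_predicate`, its single-argument signature, and the node keys "a" and "b" are assumptions made for illustration; the generators shipped in the package may take different arguments or return additional values.

```python
import json
from typing import Any, Dict


def by_subject_and_predicate(test_edge: Dict[str, str]) -> Dict[str, Any]:
    """Build a one-hop TRAPI query: known subject and predicate, unknown object.

    Hypothetical generator, not one of the package's own templates.
    """
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    # Bound node: the subject of the test edge
                    "a": {
                        "ids": [test_edge["subject_id"]],
                        "categories": [test_edge["subject_category"]],
                    },
                    # Unbound node: any entity of the object category
                    "b": {"categories": [test_edge["object_category"]]},
                },
                "edges": {
                    "ab": {
                        "subject": "a",
                        "object": "b",
                        "predicates": [test_edge["predicate_id"]],
                    }
                },
            }
        }
    }


test_edge = {
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
}
query = by_subject_and_predicate(test_edge)
print(json.dumps(query, indent=4))
```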
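Running tests directly on TRAPI response output
The new Translator testing framework has the concept of a "QueryRunner", which prepares and runs TRAPI queries, then passes the TRAPI response (along with the original test asset) to a test runner for validation.
For this use case, an alternative scripting pattern can be used, roughly as follows: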
from typing import Dict
from sys import stderr
import json

from standards_validation_test_runner import StandardsValidationTest
from translator_testing_model.datamodel.pydanticmodel import TestAsset

test_data = {
    # One test edge (asset)
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
    "components": ["molepro"],
    "environment": "test",
    "trapi_version": "1.5.0-beta",
    "biolink_version": "4.1.6",
    # Given as a list, matching the Optional[List[str]] type noted for this parameter
    "runner_settings": ["Inferred"]
}

# Load a previously captured TRAPI Response (JSON) from a file
with open("TRAPI-Response-filename", mode="r") as trapi_json_file:
    trapi_response: Dict = json.load(trapi_json_file)

test_asset: TestAsset = TestAsset(**test_data)
svt = StandardsValidationTest(
    test_asset=test_asset,
    environment=test_data["environment"],
    component=test_data["components"][0]
)
results: Dict = svt.test_case_processor(trapi_response=trapi_response)
assert results
json.dump(results, stderr, indent=4)
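Note that even though no TRAPI query is run from within the TestRunner, the source component (an "infores" CURIE reference identifier, e.g. 'molepro') and the target environment (e.g. 'test') still need to be passed as strings to the StandardsValidationTest() constructor, so that the 'results' dictionary is indexed properly.
Sample Output
Below is a sample of the JSON output of a test run (this sample is from a OneHopTest run).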
{
    "pks": {
        "molepro": "molepro"
    },
    "results": {
        "TestAsset_1-by_subject": {
            "molepro": {
                "status": "FAILED",
                "messages": {
                    "error": {
                        "error.trapi.response.knowledge_graph.missing_expected_edge": {
                            "global": {
                                "TestAsset_1|(CHEBI:58579#biolink:SmallMolecule)-[biolink:is_active_metabolite_of]->(UniProtKB:Q9NQ88#biolink:Protein)": null
                            }
                        }
                    }
                }
            }
        },
        "TestAsset_1-inverse_by_new_subject": {
            "molepro": {
                "status": "FAILED",
                "messages": {
                    "critical": {
                        "critical.trapi.request.invalid": {
                            "global": {
                                "predicate 'biolink:is_active_metabolite_of'": [
                                    {
                                        "context": "inverse_by_new_subject",
                                        "reason": "is an unknown or has no inverse?"
                                    }
                                ]
                            }
                        }
                    }
                }
            }
        }
    }
}
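Given the nesting shown in the sample (test case, then component, then status and messages grouped by severity and validation code), a report can be flattened into one row per test case and component. The sketch below assumes only the structure visible in the sample above; the `summarize` helper is hypothetical and the embedded report is an abridged copy of that sample.

```python
import json
from typing import Any, Dict, List, Tuple

# Abridged copy of the sample output (message details trimmed)
report_json = """
{
    "pks": {"molepro": "molepro"},
    "results": {
        "TestAsset_1-by_subject": {
            "molepro": {
                "status": "FAILED",
                "messages": {
                    "error": {
                        "error.trapi.response.knowledge_graph.missing_expected_edge": {"global": {}}
                    }
                }
            }
        },
        "TestAsset_1-inverse_by_new_subject": {
            "molepro": {
                "status": "FAILED",
                "messages": {
                    "critical": {
                        "critical.trapi.request.invalid": {"global": {}}
                    }
                }
            }
        }
    }
}
"""


def summarize(report: Dict[str, Any]) -> List[Tuple[str, str, str, List[str]]]:
    """Flatten a report into (test case, component, status, validation codes) rows."""
    rows = []
    for test_case, by_component in report.get("results", {}).items():
        for component, outcome in by_component.items():
            codes: List[str] = []
            # Messages are grouped by severity level, then by validation code
            for by_code in outcome.get("messages", {}).values():
                codes.extend(by_code.keys())
            rows.append((test_case, component, outcome.get("status", ""), codes))
    return rows


rows = summarize(json.loads(report_json))
for test_case, component, status, codes in rows:
    print(f"{test_case} [{component}]: {status} {codes}")
```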
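Releases
A full change log is provided documenting each release, but the key limitations of releases are summarized here.
v0.0.* releases
- This initial code release only supports testing of components in the Translator SmartAPI Registry, namely the TRAPI implementations of Translator Autonomous Relay Agents (ARA) and Knowledge Providers (KP). It does not directly test the Translator Autonomous Relay System (ARS) or the Translator user interface (UI).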
Project details
Hashes for graph_validation_test_runners-0.1.5.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 301f0030efc2af95b57700db4fab8b4559c34d68c251d14ac16c202efe44f02a
MD5 | b73f10d02465bc68beda72b4c7b99b49
BLAKE2b-256 | 8f02745fdb8874d0c2e8681368c3075bd023eaa2f62ae9efe68d251ea329e6fa
Hashes for graph_validation_test_runners-0.1.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 921b2800d69d8a824ba1139960746ae9012fe9a05b4689578d57ed360e9a60cd
MD5 | 07d0213d8c92c828537f10e38c1cd81c
BLAKE2b-256 | b95389b487ebed3ff00d17436d5b8f6ab1738282cde25e57aef81ecaf6c423bb
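A downloaded release file can be compared against its published SHA256 digest with the standard library. This is a generic sketch: the helper name `sha256_of` is hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Published SHA256 digest of the 0.1.5 source distribution
expected = "301f0030efc2af95b57700db4fab8b4559c34d68c251d14ac16c202efe44f02a"
# Assumed local path of the downloaded archive
archive = Path("graph_validation_test_runners-0.1.5.tar.gz")

if archive.exists():
    print("match" if sha256_of(archive) == expected else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```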