Translator Knowledge Graph Validation - TRAPI, Biolink Model and One Hop Navigation
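Graph Validation Tests
This repository provides Translator knowledge graph validation test runners implemented within the new 2023 Testing Infrastructure. The package currently contains two such test runners:
- StandardsValidationTest: a wrapper around the Translator reasoner-validator package, which certifies that knowledge graph data access is TRAPI compliant and that the graph's semantic content is Biolink Model compliant.
- OneHopTest: a slimmed-down excerpt of the "One Hop" knowledge graph navigation unit test code from the legacy SRI_Testing test harness. This code validates that single-hop TRAPI queries on a Translator knowledge graph meet the basic expectation that the input test edge data is recovered in the output, using several different kinds of templated TRAPI queries. Unlike SRI_Testing, the graph-validation-test-runners test runners use test data from the new NCATS Translator Tests repository.
Programmatically, the command line or programmatic parameters for each kind of test are identical, but the underlying test cases (derived from the source test assets) are different.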
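Usage
Both standards_validation_test_runner and one_hop_test_runner can be run from the command line or directly from a Python script.
Installation
The graph-validation-test-runners module can be installed from PyPI and used as part of Translator-wide automated testing.
Note: requires 3.9 <= Python version < 3.12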
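The version constraint above can be checked before installing. The following is a minimal sketch; `is_supported_python` is a hypothetical helper written for illustration and is not part of the graph-validation-test-runners package.

```python
import sys


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the interpreter satisfies 3.9 <= version < 3.12."""
    # Compare only (major, minor); the patch level does not matter here.
    major_minor = (version_info[0], version_info[1])
    return (3, 9) <= major_minor < (3, 12)


if not is_supported_python():
    print(
        f"Unsupported Python {sys.version_info[0]}.{sys.version_info[1]}: "
        "need >= 3.9 and < 3.12",
        file=sys.stderr,
    )
```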
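From PyPI
From within your target working directory:
- Create a Python virtual environment: python -m venv venv
- Activate the environment: source venv/bin/activate
- Install the package: pip install graph-validation-test-runners
From GitHub
You can also check out the project from GitHub. In that case the installation process differs slightly, since the project itself uses Poetry for dependency management; the instructions below assume that Poetry is already installed on your system.
- Clone the code: git clone https://github.com/TranslatorSRI/graph-validation-test-runners.git
- Create a Poetry shell: poetry shell
- Install the dependencies: poetry install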
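Command Line Interface
Within a command line terminal, type
$ standards_validation_test --help
or
$ one_hop_test --help
which should print the following usage message (where <tool name> is either 'standards_validation_test_runner' or 'one_hop_test_runner'):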
usage: <tool name> [-h] [--components COMPONENTS] [--environment {dev,ci,test,prod}] --subject_id SUBJECT_ID --predicate_id PREDICATE_ID
                                 --object_id OBJECT_ID [--trapi_version TRAPI_VERSION] [--biolink_version BIOLINK_VERSION]
                                 [--log_level {ERROR,WARNING,INFO,DEBUG}]
Translator TRAPI and Biolink Model Validation of Knowledge Graphs
options:
  -h, --help            show this help message and exit
  --components COMPONENTS
                        Names Translator components to be tested taken from the Translator Testing Model 'ComponentEnum' 
                        (may be a comma separated string of such names; default: run the test against the 'ars')
  --environment {dev,ci,test,prod}
                        Translator execution environment of the Translator Component targeted for testing.
  --subject_id SUBJECT_ID
                        Statement subject concept CURIE
  --predicate_id PREDICATE_ID
                        Statement Biolink Predicate identifier
  --object_id OBJECT_ID
                        Statement object concept CURIE
  --trapi_version TRAPI_VERSION
                        TRAPI version expected for knowledge graph access (default: use current default release)
  --biolink_version BIOLINK_VERSION
                        Biolink Model version expected for knowledge graph access (default: use current default release)
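When the command line tools are driven from another script, the argument list can be assembled from a test edge. The sketch below uses only the flags shown in the usage message above; `build_cli_args` is a hypothetical helper, and the executable name `one_hop_test` is taken from the `--help` invocation shown earlier.

```python
import shlex
from typing import List, Optional

# Environment names accepted by --environment, per the usage message.
ENVIRONMENTS = {"dev", "ci", "test", "prod"}


def build_cli_args(
    tool: str,
    subject_id: str,
    predicate_id: str,
    object_id: str,
    components: Optional[List[str]] = None,
    environment: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for one of the test runner command line tools."""
    args = [
        tool,
        "--subject_id", subject_id,
        "--predicate_id", predicate_id,
        "--object_id", object_id,
    ]
    if components:
        # --components takes a single comma separated string of names.
        args += ["--components", ",".join(components)]
    if environment:
        if environment not in ENVIRONMENTS:
            raise ValueError(f"Unknown environment: {environment!r}")
        args += ["--environment", environment]
    return args


command = build_cli_args(
    "one_hop_test",
    "DRUGBANK:DB01592",
    "biolink:has_side_effect",
    "MONDO:0011426",
    components=["arax", "molepro"],
    environment="test",
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command)` once the package is installed in the active environment.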
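Programmatic Execution
Standards Validation Test
To run TRAPI and Biolink Model validation tests that validate the query output of a knowledge graph TRAPI component: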
from typing import Dict
import asyncio
from standards_validation_test_runner import run_standards_validation_tests
test_data = {
    # One test edge (asset)
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
    "components": ["arax", "molepro"]
    # "environment": environment, # Optional[TestEnvEnum] = None; default: 'TestEnvEnum.ci' if not given
    # "trapi_version": trapi_version,  # Optional[str] = None; latest community release if not given
    # "biolink_version": biolink_version,  # Optional[str] = None; current Biolink Toolkit default if not given
    # "runner_settings": asset.test_runner_settings,  # Optional[List[str]] = None
}
results: Dict = asyncio.run(run_standards_validation_tests(**test_data))
print(results)
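Before handing a test edge to a runner, it can be useful to check that the dictionary carries the identifiers used in the example above. This is a minimal sketch: `REQUIRED_KEYS`, `CURIE_PATTERN` and `check_test_edge` are hypothetical names, the CURIE check is a loose "prefix:reference" pattern rather than a full validation, and treating all five fields as required is an assumption (the command line interface only marks the three identifiers as mandatory).

```python
import re
from typing import Dict, List

# Loose CURIE shape: a prefix, a colon, then a non-empty reference.
CURIE_PATTERN = re.compile(r"^[A-Za-z][\w.\-]*:\S+$")

# Fields present in the example test edge (assumed required here).
REQUIRED_KEYS = (
    "subject_id",
    "subject_category",
    "predicate_id",
    "object_id",
    "object_category",
)


def check_test_edge(test_data: Dict) -> List[str]:
    """Return a list of human-readable problems; empty when the edge looks sane."""
    problems = []
    for key in REQUIRED_KEYS:
        value = test_data.get(key)
        if not value:
            problems.append(f"missing '{key}'")
        elif not CURIE_PATTERN.match(str(value)):
            problems.append(f"'{key}' is not CURIE-like: {value!r}")
    return problems


edge = {
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
}
print(check_test_edge(edge))
```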
One Hop Test
To run "One Hop" knowledge graph navigation tests that validate the query output of a knowledge graph TRAPI component:
from typing import Dict
import asyncio
from one_hop_test_runner import run_one_hop_tests
test_data = {
    # One test edge (asset)
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
    "components": ["arax", "molepro"]
    #
    #     "environment": environment, # Optional[TestEnvEnum] = None; default: 'TestEnvEnum.ci' if not given
    #     "trapi_version": trapi_version,  # Optional[str] = None; latest community release if not given
    #     "biolink_version": biolink_version,  # Optional[str] = None; current Biolink Toolkit default if not given
    #     "runner_settings": asset.test_runner_settings,  # Optional[List[str]] = None
}
results: Dict = asyncio.run(run_one_hop_tests(**test_data))
print(results)
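The wrapper methods above run all related test cases derived from the specified test asset (i.e. the subject_id and its related fields) without any special test parameters. If more fine-grained testing is desired, a subset of the underlying TRAPI queries can be run directly, as follows (here we omit the 'by_subject', 'inverse_by_new_subject' and 'by_object' test cases, and set the 'strict_validation' parameter to True, a parameter understood by the reasoner-validator code running behind the scenes):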
from typing import Dict
import asyncio
from standards_validation_test_runner import StandardsValidationTest
from graph_validation_test.utils.unit_test_templates import (
    # by_subject,
    # inverse_by_new_subject,
    # by_object,
    raise_subject_entity,
    raise_object_entity,
    raise_object_by_subject,
    raise_predicate_by_subject
)
test_data = {
    # One test edge (asset)
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
    "components": ["arax", "molepro"],
    "environment": "test",
    "trapi_version": "1.5.0-beta",
    "biolink_version": "4.1.6",
    "runner_settings": ["Inferred"]  # Optional[List[str]]
}
trapi_generators = [
    # by_subject,
    # inverse_by_new_subject,
    # by_object,
    raise_subject_entity,
    raise_object_entity,
    raise_object_by_subject,
    raise_predicate_by_subject
]
# A test runner specific parameter passed through
kwargs = {
    "strict_validation": True
}
results: Dict = asyncio.run(StandardsValidationTest.run_tests(
    **test_data, trapi_generators=trapi_generators, **kwargs)
)
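Note that the entries of the trapi_generators list, defined in the graph_validation_test.utils.unit_test_templates module, are all simple Python functions that return the TRAPI JSON messages sent to the target components. In principle, once you understand what these functions do, you can write your own to run other kinds of TRAPI queries, whose output can then be validated against the specified TRAPI and Biolink Model versions.
Running tests directly on TRAPI Response output
The new Translator testing framework has the concept of a "QueryRunner", which prepares and runs the TRAPI queries, then hands the TRAPI responses (along with the original test asset) to the test runner for validation.
For this use case, an alternative scripting pattern can be used, roughly as follows: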
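To illustrate the kind of message such a function produces, here is a minimal sketch of a one-hop TRAPI query that looks up objects given a subject. The function name `lookup_object_by_subject`, its signature and its return value are assumptions made for illustration; the actual templates in unit_test_templates may take different arguments and return additional values, so check the module before plugging a custom function into `trapi_generators`.

```python
import json
from typing import Dict


def lookup_object_by_subject(
    subject_id: str,
    subject_category: str,
    predicate_id: str,
    object_category: str,
) -> Dict:
    """Build a single-hop TRAPI query graph: known subject, unknown object."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    # Node "a" is pinned to the test edge subject.
                    "a": {"ids": [subject_id], "categories": [subject_category]},
                    # Node "b" is only constrained by category.
                    "b": {"categories": [object_category]},
                },
                "edges": {
                    "ab": {
                        "subject": "a",
                        "object": "b",
                        "predicates": [predicate_id],
                    }
                },
            }
        }
    }


query = lookup_object_by_subject(
    "DRUGBANK:DB01592",
    "biolink:SmallMolecule",
    "biolink:has_side_effect",
    "biolink:Disease",
)
print(json.dumps(query, indent=2))
```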
from typing import Dict
from sys import stderr
import json
from standards_validation_test_runner import StandardsValidationTest
from translator_testing_model.datamodel.pydanticmodel import TestAsset
test_data = {
    # One test edge (asset)
    "subject_id": "DRUGBANK:DB01592",
    "subject_category": "biolink:SmallMolecule",
    "predicate_id": "biolink:has_side_effect",
    "object_id": "MONDO:0011426",
    "object_category": "biolink:Disease",
    "components": ["molepro"],
    "environment": "test",
    "trapi_version": "1.5.0-beta",
    "biolink_version": "4.1.6",
    "runner_settings": ["Inferred"]  # Optional[List[str]]
}
# Plain 'with' form: parenthesized context managers are only official syntax from Python 3.10
with open("TRAPI-Response-filename", mode="r") as trapi_json_file:
    test_asset: TestAsset = TestAsset(**test_data)
    trapi_response: Dict = json.load(trapi_json_file)
    svt = StandardsValidationTest(
        test_asset=test_asset,
        environment=test_data["environment"],
        component=test_data["components"][0]
    )
    results: Dict = svt.test_case_processor(trapi_response=trapi_response)
    assert results
    json.dump(results, stderr, indent=4)
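Note that even though the TRAPI query is not run inside the TestRunner, the source component (an "infores" CURIE reference identifier, e.g. 'molepro') and the target environment (e.g. 'test') still need to be passed to the system as strings, for use in the StandardsValidationTest() constructor, so that the 'results' dictionary is properly indexed.
Sample Output
Below is a sample of the JSON output from a test run (this sample comes from a OneHopTest run).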
{    
    "pks": {
        "molepro": "molepro"
    },
    "results": {
      "TestAsset_1-by_subject": {
        "molepro": {
          "status": "FAILED",
          "messages": {
            "error": {
              "error.trapi.response.knowledge_graph.missing_expected_edge": {
                "global": {
                  "TestAsset_1|(CHEBI:58579#biolink:SmallMolecule)-[biolink:is_active_metabolite_of]->(UniProtKB:Q9NQ88#biolink:Protein)": null
                }
              }
            }
          }
        }
      },
      "TestAsset_1-inverse_by_new_subject": {
        "molepro": {
          "status": "FAILED",
          "messages": {
            "critical": {
              "critical.trapi.request.invalid": {
                "global": {
                  "predicate 'biolink:is_active_metabolite_of'": [
                    {
                      "context": "inverse_by_new_subject",
                      "reason": "is an unknown or has no inverse?"
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }
}
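A results dictionary shaped like the sample above can be flattened into one row per test case and component. This is a minimal sketch that assumes only the structure visible in the sample ("results", then test case, then component, then "status" and "messages"); `summarize_results` is a hypothetical helper, not part of the package.

```python
from typing import Dict, List, Tuple


def summarize_results(output: Dict) -> List[Tuple[str, str, str, List[str]]]:
    """Return (test_case, component, status, message_codes) rows."""
    rows = []
    for test_case, by_component in output.get("results", {}).items():
        for component, outcome in by_component.items():
            # "messages" maps a severity level to a dict keyed by message code.
            codes = [
                code
                for level in outcome.get("messages", {}).values()
                for code in level
            ]
            rows.append((test_case, component, outcome.get("status", "UNKNOWN"), codes))
    return rows


sample = {
    "pks": {"molepro": "molepro"},
    "results": {
        "TestAsset_1-by_subject": {
            "molepro": {
                "status": "FAILED",
                "messages": {
                    "error": {
                        "error.trapi.response.knowledge_graph.missing_expected_edge": {
                            "global": {}
                        }
                    }
                },
            }
        }
    },
}

for row in summarize_results(sample):
    print(row)
```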
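Releases
A full change log documents each release; here we summarize the key limitations of each release.
v0.0.* releases
- This initial code release only supports testing of components catalogued in the Translator SmartAPI Registry that are TRAPI implementations of Translator Autonomous Relay Agents (ARA) and Knowledge Providers (KP). It does not directly test the Translator Autonomous Relay System (ARS) or the Translator user interface (UI).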