pyobo · PyPI · Python Package Index

Handling and writing OBO

License: MIT

Tools for biological identifiers, names, synonyms, xrefs, hierarchies, relations, and properties through the perspective of OBO.

Example Usage

Note! PyOBO is no-nonsense. This means that there are no repetitive prefixes in identifiers. It also means all identifiers are strings, no exceptions.
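
To make the identifier convention concrete, here is a minimal sketch of what "no repetitive prefixes, always strings" looks like in practice. The helper strip_redundant_prefix is hypothetical and written only for illustration; it is not part of PyOBO.

```python
def strip_redundant_prefix(prefix: str, identifier) -> str:
    """Return the local identifier as a string, without a repeated prefix.

    Hypothetical helper for illustration only; not a PyOBO function.
    """
    identifier = str(identifier).strip()
    redundant = f"{prefix}:"
    # Compare case-insensitively so "CHEBI:132964" matches the prefix "chebi".
    if identifier.lower().startswith(redundant.lower()):
        identifier = identifier[len(redundant):]
    return identifier


print(strip_redundant_prefix("chebi", "CHEBI:132964"))  # 132964
print(strip_redundant_prefix("chebi", 132964))          # 132964
```

Both calls yield the string '132964', which is the form the lookups in the examples below expect.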

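Note! The first time you run these, they have to download and cache all resources. We're not in the business of redistributing data, so all scripts should be completely reproducible. If you don't have time for that, there are some AWS tools for hosting/downloading pre-compiled versions in pyobo.aws.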
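
The download-once-then-reuse behavior can be pictured with the generic sketch below. It assumes a JSON file on disk and a hypothetical ensure_cached helper; PyOBO's real cache location and file formats are not described here and may well differ.

```python
import json
import tempfile
from pathlib import Path


def ensure_cached(path: Path, build) -> dict:
    """Load a mapping from ``path``; call ``build()`` and save it only if missing."""
    if path.exists():
        return json.loads(path.read_text())
    data = build()  # the slow step, e.g. downloading and parsing a resource
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data))
    return data


calls = []


def slow_build() -> dict:
    calls.append(1)  # count how often the expensive step actually runs
    return {"132964": "fluazifop-P-butyl"}


cache_path = Path(tempfile.mkdtemp()) / "chebi" / "names.json"
first = ensure_cached(cache_path, slow_build)
second = ensure_cached(cache_path, slow_build)
print(first == second, len(calls))  # True 1
```

The second call reads the file written by the first, so the expensive step runs only once.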

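Note! PyOBO can perform grounding in a limited number of cases, but it is not a general solution for named entity recognition (NER) or grounding. It's suggested to check Gilda for a no-nonsense solution.

Mapping Identifiers and CURIEs

Get a mapping of ChEBI identifiers to names: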

import pyobo

chebi_id_to_name = pyobo.get_id_name_mapping('chebi')

name = chebi_id_to_name['132964']
assert name == 'fluazifop-P-butyl'

Or, if you don't have time for two lines:

import pyobo

name = pyobo.get_name('chebi', '132964')
assert name == 'fluazifop-P-butyl'

Get the reverse mapping of ChEBI names to identifiers:

import pyobo

chebi_name_to_id = pyobo.get_name_id_mapping('chebi')

identifier = chebi_name_to_id['fluazifop-P-butyl']
assert identifier == '132964'

Maybe you live in CURIE world and just want to normalize something like CHEBI:132964:

import pyobo

name = pyobo.get_name_by_curie('CHEBI:132964')
assert name == 'fluazifop-P-butyl'

Sometimes you accidentally get an old CURIE. It can be mapped to the more recent one using the alternative identifiers listed in the underlying OBO:

import pyobo

# Look up DNA-binding transcription factor activity (go:0003700)
# based on an old id
primary_curie = pyobo.get_primary_curie('go:0001071')
assert primary_curie == 'go:0003700'

# If it's already the primary, it just gets returned
assert 'go:0003700' == pyobo.get_primary_curie('go:0003700')

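Mapping Species

Some resources provide species information for their terms. Get a mapping of WikiPathways identifiers to species (as NCBI Taxonomy identifiers):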

import pyobo

wikipathways_id_to_species = pyobo.get_id_species_mapping('wikipathways')

# Apoptosis (Homo sapiens)
taxonomy_id = wikipathways_id_to_species['WP254']
assert taxonomy_id == '9606'

Or, if you don't have time for two lines:

import pyobo

# Apoptosis (Homo sapiens)
taxonomy_id = pyobo.get_species('wikipathways', 'WP254')
assert taxonomy_id == '9606'

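Grounding

Maybe you've got names/synonyms you want to try and map back to ChEBI synonyms. Given the brand name Fusilade II of CHEBI:132964, it should be able to look it up and find its preferred label.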

import pyobo

prefix, identifier, name = pyobo.ground('chebi', 'Fusilade II')
assert prefix == 'chebi'
assert identifier == '132964'
assert name == 'fluazifop-P-butyl'

# When failure happens...
prefix, identifier, name = pyobo.ground('chebi', 'Definitely not a real name')
assert prefix is None
assert identifier is None
assert name is None

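If you're not really sure which namespace a name might belong to, you can try a few in a row (prioritize the ones that cover the appropriate entity type to avoid false positives in case of conflicts):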

import pyobo

# looking for phenotypes/pathways
prefix, identifier, name = pyobo.ground(['efo', 'go'], 'ERAD')
assert prefix == 'go'
assert identifier == '0030433'
assert name == 'ubiquitin-dependent ERAD pathway'
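
The "try a few in a row" idea amounts to returning the first namespace that produces a hit. The sketch below shows that control flow with toy in-memory indexes; ground_in_order, TOY_INDEX, and TOY_NAMES are hypothetical names, and this is not how pyobo.ground is implemented.

```python
from typing import Optional, Sequence, Tuple

# Toy name -> identifier indexes; real ones would be built from each resource.
TOY_INDEX = {
    "efo": {},
    "go": {"erad": "0030433"},
}
TOY_NAMES = {("go", "0030433"): "ubiquitin-dependent ERAD pathway"}

Match = Tuple[Optional[str], Optional[str], Optional[str]]


def ground_in_order(prefixes: Sequence[str], text: str) -> Match:
    """Return the first (prefix, identifier, name) hit, trying prefixes in order."""
    key = text.casefold()
    for prefix in prefixes:
        identifier = TOY_INDEX.get(prefix, {}).get(key)
        if identifier is not None:
            return prefix, identifier, TOY_NAMES[prefix, identifier]
    # Mirror the all-None result used above when nothing matches.
    return None, None, None


print(ground_in_order(["efo", "go"], "ERAD"))
```

Because the loop stops at the first match, the order of the prefix list decides which namespace wins when a name exists in more than one.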

Cross-referencing

Get xrefs from ChEBI to PubChem:

import pyobo

chebi_id_to_pubchem_compound_id = pyobo.get_filtered_xrefs('chebi', 'pubchem.compound')

pubchem_compound_id = chebi_id_to_pubchem_compound_id['132964']
assert pubchem_compound_id == '3033674'

If you don't have time for two lines:

import pyobo

pubchem_compound_id = pyobo.get_xref('chebi', '132964', 'pubchem.compound')
assert pubchem_compound_id == '3033674'

Get xrefs from Entrez to HGNC. They're only available through HGNC, so you need to flip them:

import pyobo

hgnc_id_to_ncbigene_id = pyobo.get_filtered_xrefs('hgnc', 'ncbigene')
ncbigene_id_to_hgnc_id = {
    ncbigene_id: hgnc_id
    for hgnc_id, ncbigene_id in hgnc_id_to_ncbigene_id.items()
}
mapt_hgnc = ncbigene_id_to_hgnc_id['4137']
assert mapt_hgnc == '6893'

Since this is a common pattern, there's a keyword argument flip that does this for you:

import pyobo

ncbigene_id_to_hgnc_id = pyobo.get_filtered_xrefs('hgnc', 'ncbigene', flip=True)
mapt_hgnc_id = ncbigene_id_to_hgnc_id['4137']
assert mapt_hgnc_id == '6893'
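
A plain dictionary inversion silently keeps only one key when two keys share a value. How flip=True treats such collisions is not stated here, so the hypothetical flip_mapping helper below only illustrates one way to make them visible when flipping by hand.

```python
def flip_mapping(mapping: dict) -> dict:
    """Invert a mapping, refusing to silently drop colliding values."""
    flipped = {}
    for key, value in mapping.items():
        if value in flipped:
            raise ValueError(
                f"{value!r} maps back to both {flipped[value]!r} and {key!r}"
            )
        flipped[value] = key
    return flipped


hgnc_id_to_ncbigene_id = {"6893": "4137"}  # MAPT, from the example above
ncbigene_id_to_hgnc_id = flip_mapping(hgnc_id_to_ncbigene_id)
print(ncbigene_id_to_hgnc_id["4137"])  # 6893
```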

If you don't have time for two lines (I admit this one is a bit confusing) and need to flip it:

import pyobo

hgnc_id = pyobo.get_xref('hgnc', '4137', 'ncbigene', flip=True)
assert hgnc_id == '6893'

Remap a CURIE based on a pre-defined priority list and Inspector Javert's Xref Database:

import pyobo

# Map to the best source possible
mapt_ncbigene = pyobo.get_priority_curie('hgnc:6893')
assert mapt_ncbigene == 'ncbigene:4137'

# Sometimes you know you're the best. Own it.
assert 'ncbigene:4137' == pyobo.get_priority_curie('ncbigene:4137')
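
Conceptually, priority remapping picks, among a CURIE and its equivalents, the one whose prefix ranks highest. The sketch below assumes a hypothetical priority order and a tiny equivalence table built from the MAPT examples in this README; the actual priority list and database contents are not shown here.

```python
# Hypothetical priority order, highest first; not PyOBO's actual list.
PRIORITY = ["ncbigene", "hgnc", "ensembl"]

# Toy equivalence data taken from the MAPT examples in this README.
EQUIVALENTS = {
    "hgnc:6893": {"ncbigene:4137", "ensembl:ENSG00000186868"},
}


def pick_priority_curie(curie: str) -> str:
    """Return the equivalent CURIE with the highest-ranked prefix."""
    candidates = {curie} | EQUIVALENTS.get(curie, set())
    # The toy data has at most one candidate per prefix.
    by_prefix = {candidate.split(":", 1)[0]: candidate for candidate in candidates}
    for prefix in PRIORITY:
        if prefix in by_prefix:
            return by_prefix[prefix]
    return curie  # no ranked prefix found, so leave the CURIE unchanged


print(pick_priority_curie("hgnc:6893"))      # ncbigene:4137
print(pick_priority_curie("ncbigene:4137"))  # ncbigene:4137
```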

Find all CURIEs mapped to a given one using Inspector Javert's Xref Database:

import pyobo

# Get a set of all CURIEs mapped to MAPT
mapt_curies = pyobo.get_equivalent('hgnc:6893')
assert 'ncbigene:4137' in mapt_curies
assert 'ensembl:ENSG00000186868' in mapt_curies

If you don't want to wait to build the database locally for pyobo.get_priority_curie and pyobo.get_equivalent, you can use the following code to download a release from Zenodo:

import pyobo.resource_utils

pyobo.resource_utils.ensure_inspector_javert()

Properties

Get properties, like SMILES. The semantics of these are defined on an OBO-by-OBO basis.

import pyobo

# I don't make the rules. I wouldn't have chosen this as the key for this property. It could be any string
chebi_smiles_property = 'http://purl.obolibrary.org/obo/chebi/smiles'
chebi_id_to_smiles = pyobo.get_filtered_properties_mapping('chebi', chebi_smiles_property)

smiles = chebi_id_to_smiles['132964']
assert smiles == 'C1(=CC=C(N=C1)OC2=CC=C(C=C2)O[C@@H](C(OCCCC)=O)C)C(F)(F)F'

If you don't have time for two lines:

import pyobo

smiles = pyobo.get_property('chebi', '132964', 'http://purl.obolibrary.org/obo/chebi/smiles')
assert smiles == 'C1(=CC=C(N=C1)OC2=CC=C(C=C2)O[C@@H](C(OCCCC)=O)C)C(F)(F)F'

Hierarchy

Check if an entity is in the hierarchy:

import pyobo

# check that go:0008219 ! cell death is an ancestor of go:0006915 ! apoptotic process
assert 'go:0008219' in pyobo.get_ancestors('go', '0006915')

# check that go:0070246 ! natural killer cell apoptotic process is a
# descendant of go:0006915 ! apoptotic process
apoptotic_process_descendants = pyobo.get_descendants('go', '0006915')
assert 'go:0070246' in apoptotic_process_descendants
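
For intuition, ancestor and descendant checks are reachability queries over is_a edges. The sketch below uses a toy child-to-parents table that contains only the three GO terms named above and deliberately skips the intermediate terms of the real ontology; IS_A, ancestors, and descendants are hypothetical names, not PyOBO internals.

```python
# Toy child -> parents edges; intermediate GO terms are deliberately skipped.
IS_A = {
    "go:0070246": ["go:0006915"],  # natural killer cell apoptotic process
    "go:0006915": ["go:0008219"],  # apoptotic process
    "go:0008219": [],              # cell death
}


def ancestors(curie: str) -> set:
    """Collect every term reachable by following parent edges upward."""
    seen = set()
    stack = list(IS_A.get(curie, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(IS_A.get(parent, []))
    return seen


def descendants(curie: str) -> set:
    """Collect every term that has ``curie`` among its ancestors."""
    return {child for child in IS_A if curie in ancestors(child)}


print("go:0008219" in ancestors("go:0006915"))    # True
print("go:0070246" in descendants("go:0006915"))  # True
```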

Get the subhierarchy below a given node:

import pyobo

# get the descendant graph of go:0006915 ! apoptotic process
apoptotic_process_subhierarchy = pyobo.get_subhierarchy('go', '0006915')

# check that go:0070246 ! natural killer cell apoptotic process is a
# descendant of go:0006915 ! apoptotic process through the subhierarchy
assert 'go:0070246' in apoptotic_process_subhierarchy

Get a custom hierarchy with properties preloaded in the node data dictionaries:

import pyobo

prop = 'http://purl.obolibrary.org/obo/chebi/smiles'
chebi_hierarchy = pyobo.get_hierarchy('chebi', properties=[prop])

assert 'chebi:132964' in chebi_hierarchy
assert prop in chebi_hierarchy.nodes['chebi:132964']
assert chebi_hierarchy.nodes['chebi:132964'][prop] == 'C1(=CC=C(N=C1)OC2=CC=C(C=C2)O[C@@H](C(OCCCC)=O)C)C(F)(F)F'

Relations

Get all orthologies between HGNC and MGI (note: this is one way):

>>> import pyobo
>>> human_mapt_hgnc_id = '6893'
>>> mouse_mapt_mgi_id = '97180'
>>> hgnc_mgi_orthology_mapping = pyobo.get_relation_mapping('hgnc', 'ro:HOM0000017', 'mgi')
>>> assert mouse_mapt_mgi_id == hgnc_mgi_orthology_mapping[human_mapt_hgnc_id]

If you want to do it in one line:

>>> import pyobo
>>> human_mapt_hgnc_id = '6893'
>>> mouse_mapt_mgi_id = '97180'
>>> assert mouse_mapt_mgi_id == pyobo.get_relation('hgnc', 'ro:HOM0000017', 'mgi', human_mapt_hgnc_id)

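Writing Tests that Use PyOBO

If you're writing your own code that relies on PyOBO, and unit testing it (as you should) in a continuous integration setting, you've probably realized that loading all of the resources on each build is not so fast. In those scenarios, you can use some of the pre-built patches, as in the following: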

import unittest
import pyobo
from pyobo.mocks import get_mock_id_name_mapping

mock_id_name_mapping = get_mock_id_name_mapping({
    'chebi': {
        '132964': 'fluazifop-P-butyl',
    },
})

class MyTestCase(unittest.TestCase):
    # unittest only collects methods whose names start with "test"
    def test_get_name(self):
        with mock_id_name_mapping:
            # use functions directly, or use your functions that wrap them
            pyobo.get_name('chebi', '1234')

Installation

PyOBO can be installed from PyPI with:

$ pip install pyobo

It can be installed in development mode from GitHub with:

$ git clone https://github.com/pyobo/pyobo.git
$ cd pyobo
$ pip install -e .

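Curation of the Bioregistry

In order to normalize references and identify resources, PyOBO uses the Bioregistry. It used to be a part of PyOBO, but has since been externalized for more general reuse.

At src/pyobo/registries/metaregistry.json is the curated "metaregistry". This is a source of information that contains all sorts of fixes for missing/wrong information in MIRIAM, OLS, and OBO Foundry; entries that don't appear in any of them; additional synonym information for each namespace/prefix; rules for normalizing xrefs and CURIEs; etc.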

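Other entries in the metaregistry:

The "remappings" -> "full" entry is a dictionary from strings that might follow "xref:" in a given OBO file to the strings that should completely replace them.
The "remappings" -> "prefix" entry contains a dictionary of prefixes for xrefs that need to be remapped. Several rules, for example, remove superfluous spaces that occur inside CURIEs, and others address instances of the GOGO issue.
The "blacklists" entry contains rules for throwing out malformed xrefs based on the full string, just the prefix, or just the suffix.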
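
The three kinds of rules above could be applied to a raw xref string roughly as sketched below. Everything here is an assumption made for illustration: the rule values, the "full"/"prefix"/"suffix" keys under "blacklists", the reading of the GOGO issue as a doubled prefix, and the normalize_xref helper itself. The real metaregistry.json layout may differ.

```python
from typing import Optional

# Hypothetical rule values, shaped like the entries described above.
RULES = {
    "remappings": {
        "full": {"SOME_MALFORMED_STRING": "example:0000001"},
        "prefix": {"GO:GO:": "GO:", "CHEBI: ": "CHEBI:"},
    },
    "blacklists": {
        "full": ["NOT_AN_XREF"],
        "prefix": ["http"],
        "suffix": [".jpg"],
    },
}


def normalize_xref(xref: str, rules: dict = RULES) -> Optional[str]:
    """Return a cleaned xref, or None if a blacklist rule rejects it."""
    xref = xref.strip()
    blacklists = rules["blacklists"]
    if (
        xref in blacklists["full"]
        or any(xref.startswith(prefix) for prefix in blacklists["prefix"])
        or any(xref.endswith(suffix) for suffix in blacklists["suffix"])
    ):
        return None
    # A "full" remapping replaces the entire string.
    full = rules["remappings"]["full"]
    if xref in full:
        return full[xref]
    # A "prefix" remapping rewrites only the start of the string.
    for old, new in rules["remappings"]["prefix"].items():
        if xref.startswith(old):
            return new + xref[len(old):]
    return xref


print(normalize_xref("GO:GO:0030433"))  # GO:0030433
print(normalize_xref("CHEBI: 132964"))  # CHEBI:132964
```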

Troubleshooting

The OBO Foundry seems to be pretty unstable with respect to the URLs to OBO resources. If you get an error like:

pyobo.getters.MissingOboBuild: OBO Foundry is missing a build for: mondo

then you should check the corresponding page on the OBO Foundry (in this case, http://www.obofoundry.org/ontology/mondo.html) and update the url entry for that namespace in the Bioregistry.

Release history

0.10.12 (this version)  Sep 23, 2024
0.10.11  Apr 18, 2024
0.10.10  Apr 17, 2024
0.10.9  Apr 9, 2024
0.10.8  Mar 13, 2024
0.10.7  Jan 16, 2024
0.10.6  Dec 1, 2023
0.10.5  Oct 30, 2023
0.10.4  Oct 19, 2023
0.10.3  Sep 26, 2023
0.10.2  Sep 1, 2023
0.10.1  Aug 18, 2023
0.10.0  Aug 12, 2023
0.9.2  Jul 26, 2023
0.9.1  Jul 2, 2023
0.9.0  Jun 29, 2023
0.8.14  Apr 5, 2023
0.8.13  Mar 18, 2023
0.8.12  Feb 25, 2023
0.8.11  Feb 25, 2023
0.8.10  Feb 24, 2023
0.8.9  Feb 24, 2023
0.8.8  Feb 23, 2023
0.8.7  Feb 21, 2023
0.8.6  Feb 21, 2023
0.8.5  Feb 13, 2023
0.8.4  Nov 9, 2022
0.8.3  Sep 27, 2022
0.8.2  Jul 26, 2022
0.8.1  Jul 26, 2022
0.8.0  Jul 19, 2022
0.7.0  Dec 16, 2021
0.6.5  Nov 5, 2021
0.6.4  Nov 4, 2021
0.6.3  Aug 30, 2021
0.6.2  Aug 25, 2021
0.6.1  Aug 25, 2021
0.6.0  Aug 2, 2021
0.5.0  Jul 6, 2021
0.4.0  Apr 19, 2021
0.3.3  Feb 7, 2021
0.3.2  Feb 1, 2021
0.3.1  Jan 7, 2021
0.3.0  Dec 30, 2020
0.2.15  Nov 26, 2020
0.2.14  Nov 26, 2020
0.2.13  Oct 8, 2020
0.2.12  Oct 5, 2020
0.2.11  Sep 29, 2020
0.2.10  Sep 23, 2020
0.2.9  Sep 18, 2020
0.2.8  Sep 12, 2020
0.2.7  Sep 9, 2020
0.2.6  Aug 3, 2020
0.2.5  Jul 30, 2020
0.2.4  May 30, 2020
0.2.3  May 13, 2020
0.2.2  May 7, 2020
0.2.1  Apr 22, 2020
0.2.0 (yanked)  Apr 22, 2020
0.1.3  Apr 20, 2020
0.1.2  Apr 20, 2020
0.1.1  Apr 1, 2020
0.1.0  Mar 28, 2020
0.0.7  Mar 18, 2020
0.0.6  Mar 15, 2020
0.0.5  Mar 12, 2020
0.0.4  Mar 12, 2020
0.0.3  Mar 11, 2020
0.0.2  Mar 6, 2020
0.0.1  Aug 30, 2019

Download files

Download the file for your platform.

Source Distribution

pyobo-0.10.12.tar.gz (23.1 MB)

Uploaded Sep 23, 2024 Source

Built Distribution

pyobo-0.10.12-py3-none-any.whl (23.1 MB)

Uploaded Sep 23, 2024 Python 3

Hashes for pyobo-0.10.12.tar.gz

Algorithm	Hash digest
SHA256	`beab1f616852c486e512e6c4f4ab0c88261ead90fc1dae06c950c56ee2a2ebc2`
MD5	`29468ade456eac88b667146fa63fe3d6`
BLAKE2b-256	`f94b3e3c65f5096053c953bf87707b665e6e295ed93b68e403ddcee65fa64db6`

Hashes for pyobo-0.10.12-py3-none-any.whl

Algorithm	Hash digest
SHA256	`488c6e45633295dfb302c457beaf02b2edb93fc8351f2138dfd7bdb6efed361f`
MD5	`8228c8c865a89851e23f60c6e566f80a`
BLAKE2b-256	`402900114d62a95f2dc0a9296d81774663b422117174aefdb211b2a828f4fda4`
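
If you download one of these files by hand, you may want to compare it against the published SHA256 digest. The sketch below uses only Python's standard hashlib module; sha256_of and matches_published are hypothetical helper names, and the demo runs on a throwaway file rather than a real download.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as file:
        for chunk in iter(lambda: file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a throwaway file, since no archive is downloaded here.
demo = Path(tempfile.mkdtemp()) / "demo.bin"
demo.write_bytes(b"pyobo")
print(matches_published(demo, hashlib.sha256(b"pyobo").hexdigest()))  # True
```

For a real check, pass the path of the downloaded archive together with the SHA256 value listed for that file.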
