encodeproject · PyPI · Python Package Index

Python package wrapping some of the ENCODE Project APIs.

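Project description

Python package wrapping some of the ENCODE Project APIs.

A short notebook with a tutorial is also available.

How do I install this package?

As usual, just install it with pip: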

pip install encodeproject
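To confirm that the installation succeeded, one option is to ask Python for the installed version. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8 or later); the helper name `installed_version` is made up for this illustration and is not part of encodeproject.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("encodeproject")
    print("encodeproject version:", found or "not installed")
```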

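Usage examples

The package contains methods for querying the ENCODE Project API and methods for filtering the responses. Every available method comes with a complete docstring, so you are welcome to read the source code.

Queries

The library currently offers query methods that already integrate some filtering attributes: one for experiments and one for biosamples.

To query experiments, you can run the following: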

from encodeproject import experiment

experiments = experiment()

Let's look at an in-depth example that shows all the available parameters:

from encodeproject import experiment

experiments = experiment(
    # The cell line we are interested in.
    # For example values could be K562 or GM12878.
    # We use None to specify that we are not
    # interested in any particular cell line.
    cell_line = None,
    # The reference genomic assembly we want.
    # For example values could be hg19 or GRCh38
    # We use None to specify that we are not
    # interested in any particular genomic assembly.
    assembly = None,
    # The target (the genes coding for proteins in this context) we want.
    # For example values could be CTCF or H3K27ac
    # We use None to specify that we are not
    # interested in any particular target.
    target = None,
    # The status of the data we want.
    # We only want released data, meaning data that are
    # neither old (archived) nor with errors (revoked).
    status = 'released',
    # The organism we are considering.
    # Since we only want Homo sapiens data,
    # we specify that organism name.
    organism = 'Homo sapiens',
    # The format of the files we are interested in
    file_type = 'bigWig',
    # We ask to consider only experiments with replicates
    replicated = True,
    # We only want the signals
    # expressed as "fold change over control"
    searchTerm = "fold change over control",
    # We do not need to specify any other specific
    # additional parameters
    parameters = None,
    # We want to download all the
    # available experiments
    limit = 'all',
    # We want to drop all the experiments
    # which have been flagged with significant issues
    drop_errors = (
        'extremely low read depth',
        'missing control alignments',
        'control extremely low read depth',
        'extremely low spot score',
        'extremely low coverage',
        'extremely low read length',
        'inconsistent control',
        'inconsistent read count'
    )
)

All parameters are optional; they only act as additional filters.
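One way to picture optional filters is as query-string parameters that are simply left out when set to None. The sketch below builds a URL for the ENCODE search endpoint with the standard library. It is not the library's implementation: the helper name `build_search_url` is hypothetical, and passing keyword names straight through as query fields is an assumption (the real library may map a name such as `cell_line` to a different API field).

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.encodeproject.org/search/"


def build_search_url(**filters):
    """Build a search URL, skipping every filter whose value is None."""
    query = {"type": "Experiment", "format": "json"}
    # Assumption: each keyword is used verbatim as a query field name.
    query.update({key: value for key, value in filters.items() if value is not None})
    return SEARCH_ENDPOINT + "?" + urlencode(query)


# target=None is dropped, so it does not restrict the search.
url = build_search_url(assembly="hg19", status="released", target=None)
print(url)
```

No request is sent here; the sketch only shows how a None value translates into "no filter".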

To query a biosample, you can run the following:

from encodeproject import biosample

my_biosample_query_response = biosample(
    accession="ENCSR000EDP", # The accession code for the desired biosample
)

As with experiments, a number of filters are also available for biosamples:

from encodeproject import biosamples

# Example accession codes; replace them with your own.
accession_codes = ["ENCSR000EDP", "ENCSR030EDP", "ENCSR067EDP"]

hg19_samples = biosamples(
    # The list of accessions to retrieve
    accessions=accession_codes,
    # Whether to convert the results into a dataframe.
    # The following filters only apply if dataframes are used
    to_dataframe = True,
    # The status of the data we want.
    # We only want released data, meaning data that are
    # neither old (archived) nor with errors (revoked).
    status = "released",
    # The organism we want.
    organism = "human",
    # The genomic assembly we want to use
    assembly = "hg19",
    # The output type we want.
    output_type = "fold change over control",
    # And finally the bare minimum amount
    # of biological replicates
    min_biological_replicates = 2
)
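The filters above can be read as row-level conditions applied to the tabular result. The following is a minimal sketch of that idea on plain dictionaries, so that it runs without pandas. The function `filter_rows`, the field names, and the sample rows (accessions "A", "B", "C") are all made up for illustration and may not match the columns the library actually produces.

```python
def filter_rows(rows, status=None, assembly=None, output_type=None,
                min_biological_replicates=0):
    """Keep rows matching every given filter; a filter set to None is ignored."""
    kept = []
    for row in rows:
        if status is not None and row.get("status") != status:
            continue
        if assembly is not None and row.get("assembly") != assembly:
            continue
        if output_type is not None and row.get("output_type") != output_type:
            continue
        if len(row.get("biological_replicates", [])) < min_biological_replicates:
            continue
        kept.append(row)
    return kept


# Made-up rows, only to exercise the filters.
rows = [
    {"accession": "A", "status": "released", "assembly": "hg19",
     "output_type": "fold change over control", "biological_replicates": [1, 2]},
    {"accession": "B", "status": "archived", "assembly": "hg19",
     "output_type": "fold change over control", "biological_replicates": [1, 2]},
    {"accession": "C", "status": "released", "assembly": "GRCh38",
     "output_type": "signal p-value", "biological_replicates": [1]},
]
selected = filter_rows(rows, status="released", assembly="hg19",
                       output_type="fold change over control",
                       min_biological_replicates=2)
print([row["accession"] for row in selected])
```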

To run several biosample queries at once, you can run the following:

from encodeproject import biosamples

responses = biosamples(
    accessions=["ENCSR000EDP", "ENCSR030EDP", "ENCSR067EDP"], # The accessions code for the desired biosamples
)
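Conceptually, a multi-accession query is a single-accession query repeated over a list. The sketch below shows that pattern with a stand-in query function so that it runs offline. The names `query_many` and `fake_query` are hypothetical, and returning a dictionary keyed by accession is an assumption: the source does not state what `biosamples` returns when no dataframe is requested.

```python
def query_many(accessions, query):
    """Call `query` once per distinct accession and return {accession: response}."""
    responses = {}
    for accession in accessions:
        if accession not in responses:  # skip repeated accession codes
            responses[accession] = query(accession)
    return responses


def fake_query(accession):
    """Stand-in for a real single-accession query; performs no network access."""
    return {"accession": accession}


responses = query_many(["ENCSR000EDP", "ENCSR030EDP", "ENCSR000EDP"], fake_query)
print(list(responses))
```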

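Filters

Since the response files can be large and hard to read, I have also prepared a few filter functions.

To extract the accession codes from an experiment response, you can use: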

from encodeproject import accessions

codes = accessions(my_experiment_query_response)
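For readers curious about what such a filter involves, here is a minimal sketch. It assumes the response is a parsed JSON search result whose items sit under an "@graph" key, each carrying an "accession" field; the function name `extract_accessions` and the sample response are illustrative only and are not the library's code.

```python
def extract_accessions(response):
    """Collect the "accession" value of every item under the "@graph" key."""
    return [
        item["accession"]
        for item in response.get("@graph", [])
        if "accession" in item
    ]


# Hand-written stand-in for a parsed search response.
fake_response = {
    "@graph": [
        {"accession": "ENCSR000EDP"},
        {"accession": "ENCSR030EDP"},
        {"note": "an item without an accession is skipped"},
    ]
}
print(extract_accessions(fake_response))
```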

To extract the download URLs from a biosample response, you can use:

from encodeproject import download_urls

codes = download_urls(my_biosample_query_response)
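A rough sketch of the same idea for download links follows. It assumes the response holds a "files" list whose entries carry a site-relative "href" field, which is then joined to the site's base URL. The function name `extract_download_urls` and the sample data are hypothetical, and the actual response layout used by the library may differ.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.encodeproject.org"


def extract_download_urls(response):
    """Turn the relative "href" of each file entry into an absolute URL."""
    return [
        urljoin(BASE_URL, file_entry["href"])
        for file_entry in response.get("files", [])
        if "href" in file_entry
    ]


# Hand-written stand-in for a parsed response.
fake_response = {
    "files": [
        {"href": "/files/ENCFF000XOW/@@download/ENCFF000XOW.bigWig"},
        {"status": "an entry without an href is skipped"},
    ]
}
print(extract_download_urls(fake_response))
```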

Utilities

Download utility

I have also added a method for downloading from a given URL that shows a loading bar, based on an answer from StackOverflow.

from encodeproject import download

download("https://encode-public.s3.amazonaws.com/2012/07/01/074e1b37-2be1-4f6a-aa42-6c512fd1834b/ENCFF000XOW.bigWig")
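The general technique behind a download with a loading bar is to read the response in chunks and report progress after each one. The sketch below shows this with the standard library only. It is not the package's implementation: `copy_with_progress`, `print_bar`, and `download_file` are names invented here, and the demonstration at the end copies in-memory bytes so that it runs without network access.

```python
import io
import sys
from urllib.request import urlopen


def copy_with_progress(source, target, total=None, chunk_size=64 * 1024, report=None):
    """Copy `source` to `target` in chunks, calling report(done, total) after each."""
    done = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        target.write(chunk)
        done += len(chunk)
        if report is not None:
            report(done, total)
    return done


def print_bar(done, total, width=30):
    """Draw a one-line text progress bar on stderr."""
    if total:
        filled = int(width * done / total)
        bar = "#" * filled + "-" * (width - filled)
        sys.stderr.write(f"\r[{bar}] {done}/{total} bytes")
    else:
        sys.stderr.write(f"\r{done} bytes")
    sys.stderr.flush()


def download_file(url, path):
    """Download `url` to `path` with a progress bar (needs network access)."""
    with urlopen(url) as response, open(path, "wb") as target:
        length = response.headers.get("Content-Length")
        total = int(length) if length else None
        return copy_with_progress(response, target, total, report=print_bar)


# Offline demonstration on in-memory data instead of a real URL.
payload = b"x" * 200_000
buffer = io.BytesIO()
copied = copy_with_progress(io.BytesIO(payload), buffer,
                            total=len(payload), report=print_bar)
```

Keeping the chunked copy separate from the network call makes the progress logic easy to exercise on any file-like object.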

Converting a biosample to a DataFrame

A utility that converts a biosample into a relatively simple pandas DataFrame.

from encodeproject import biosample_to_dataframe

df = biosample_to_dataframe(my_biosample_query_response)
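To illustrate what flattening a nested response into a table can look like, the sketch below turns each file entry into a flat row. The function `files_to_rows`, the chosen columns, and the sample response are assumptions made for this example; the columns that `biosample_to_dataframe` actually returns are not listed in this description.

```python
def files_to_rows(response, columns=("accession", "file_type", "assembly",
                                     "output_type", "status")):
    """Turn each file entry of a response into a flat dict with the chosen columns."""
    return [
        {column: file_entry.get(column) for column in columns}
        for file_entry in response.get("files", [])
    ]


# Hand-written stand-in for a parsed response.
fake_response = {
    "files": [
        {"accession": "ENCFF000XOW", "file_type": "bigWig", "assembly": "hg19",
         "output_type": "fold change over control", "status": "released",
         "extra": {"nested": "fields outside the chosen columns are dropped"}},
    ]
}
rows = files_to_rows(fake_response)
print(rows)
```

A list of flat dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` to obtain a frame with one column per key.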

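Issues and feature requests

This library was originally created to write a few queries against encodeproject. If you need a specific feature that the library does not yet offer, please open a pull request (the fastest way: add the feature yourself and push it to the library), or open an issue and I will deal with it when I have time.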

Release history

1.0.28 (this version): April 27, 2022
1.0.27: November 27, 2020
1.0.26: November 25, 2020
1.0.25: November 25, 2020
1.0.24: November 25, 2020
1.0.22: August 5, 2020
1.0.21: July 28, 2020
1.0.20: July 28, 2020
1.0.19: July 19, 2020
1.0.17: April 6, 2020
1.0.16: April 6, 2020
1.0.15: April 5, 2020
1.0.14: April 5, 2020
1.0.13: March 25, 2020
1.0.12: March 25, 2020
1.0.11: March 25, 2020
1.0.10: March 25, 2020
1.0.9: March 15, 2020
1.0.8: February 8, 2020
1.0.7: January 7, 2020
1.0.6: November 3, 2019
1.0.5: November 2, 2019
1.0.4: November 2, 2019
1.0.3: October 24, 2019
1.0.2: October 24, 2019
1.0.1: October 24, 2019
1.0.0: October 23, 2019

Download files

Source distribution

encodeproject-1.0.28.tar.gz (8.5 kB), April 27, 2022, source

Hashes for encodeproject-1.0.28.tar.gz
Algorithm	Hash digest
SHA256	`bfb8ba7331c385d23ea74d7065a75d265951d2308c20db86b7f3dfc42a35f1d0`
MD5	`4a4d45a8ec175c5e4bf3cc75aaa7e051`
BLAKE2b-256	`4bfdab49b0aa09189be113e20d2161986328df604e2e2935ffe35390cca6ea89`
