A short description of the package.

Project description

dbcooper-py

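The dbcooper package turns a database connection into a collection of functions. It handles the logic of keeping track of connections and lets you take advantage of autocompletion when exploring a database.

It is especially helpful when authoring database-specific Python packages, for instance an internal company package or one wrapping a public data source.

For the R version see dgrtwo/dbcooper.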

Installation

pip install dbcooper

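Examples

Initializing the functions

The dbcooper package asks you to create the connection first. The examples below use the Lahman baseball database (lahman), loaded into sqlite through dbcooper.data.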

from sqlalchemy import create_engine
from dbcooper.data import lahman_sqlite

# connect to sqlite
engine = create_engine("sqlite://")

# load the lahman data into the "lahman" schema
lahman_sqlite(engine)

Next we'll set up dbcooper.

from dbcooper import DbCooper

dbc = DbCooper(engine)

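The DbCooper object contains two important things:

Accessors to fetch specific tables.
Functions for interacting with the underlying database.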
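To make the accessor idea concrete, here is a minimal sketch using only the standard library's sqlite3 module. MiniCooper is a hypothetical toy class, not dbcooper's implementation: it only illustrates how table names can be exposed as attributes (via `__getattr__`) and advertised to tab completion (via `__dir__`), alongside an ordinary function such as list().

```python
import sqlite3


class MiniCooper:
    """Toy stand-in for a connection wrapper; not dbcooper's real class."""

    def __init__(self, conn):
        self._conn = conn

    def list(self):
        # Names of all tables in the main sqlite schema.
        rows = self._conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [name for (name,) in rows]

    def __dir__(self):
        # Advertising table names here is what lets tab completion find them.
        return sorted(set(super().__dir__()) | set(self.list()))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_") or name not in self.list():
            raise AttributeError(name)
        # Return a function that fetches the table's rows when called.
        return lambda: self._conn.execute(f'SELECT * FROM "{name}"').fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (playerID TEXT, salary BIGINT)")
conn.execute("INSERT INTO salaries VALUES ('barkele01', 870000)")

mini = MiniCooper(conn)
print(mini.list())
print("salaries" in dir(mini))
print(mini.salaries())
```

Unlike this toy, the real accessors return lazy siuba tables rather than fetched rows, as the following sections show.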

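Using table accessors

The examples below use the "Lahman"."Salaries" table. By default, dbcooper makes it accessible as .lahman_salaries.

Plain .lahman_salaries prints out table and column info, including types and descriptions.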

# show table and column descriptions
dbc.lahman_salaries

salaries

(No table description.)

name	type	description
index	BIGINT
yearID	BIGINT
teamID	TEXT
lgID	TEXT
playerID	TEXT
salary	BIGINT

Note that sqlite doesn't support table and column descriptions, so these sections are empty.

Calling .lahman_salaries() fetches a lazy version of the data.
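As a rough illustration of why the description column is empty, the sketch below asks sqlite directly for column metadata using the standard library's sqlite3 module. This is not how dbcooper performs the lookup (it goes through sqlalchemy); the describe helper and the toy table are assumptions made for the example. The rows returned by PRAGMA table_info carry a name and a declared type but no description field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy table with the same column names and types as the listing above.
conn.execute(
    'CREATE TABLE salaries ("index" BIGINT, yearID BIGINT, teamID TEXT, '
    "lgID TEXT, playerID TEXT, salary BIGINT)"
)


def describe(conn, table):
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk);
    # there is no description field to report.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [(name, col_type) for _cid, name, col_type, *_rest in rows]


for name, col_type in describe(conn, "salaries"):
    print(f"{name}\t{col_type}")
```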

dbc.lahman_salaries()

# Source: lazy query
# DB Conn: Engine(sqlite://)
# Preview:
   index  yearID teamID lgID   playerID  salary
0      0    1985    ATL   NL  barkele01  870000
1      1    1985    ATL   NL  bedrost01  550000
2      2    1985    ATL   NL  benedbr01  545000
3      3    1985    ATL   NL   campri01  633333
4      4    1985    ATL   NL  ceronri01  625000
# .. may have more rows

Note that this data is a siuba LazyTbl object, which you can use to analyze the data.

from siuba import _, count

dbc.lahman_salaries() >> count(over_100k = _.salary > 100_000)

# Source: lazy query
# DB Conn: Engine(sqlite://)
# Preview:
   over_100k      n
0       True  25374
1      False   1054
# .. may have more rows
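Because the table is lazy, a pipeline like the one above is run by the database rather than in Python. The sketch below shows roughly what kind of SQL such a count corresponds to, using plain sqlite3. It is an assumption about the shape of the query, not the exact SQL siuba generates, and the rows are toy data rather than the real Lahman salaries. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (playerID TEXT, salary BIGINT)")
# Toy rows for illustration only; not the real Lahman data.
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [("a", 870000), ("b", 550000), ("c", 60000)],
)

# Group rows by whether salary exceeds 100,000 and count each group.
rows = conn.execute(
    """
    SELECT salary > 100000 AS over_100k, COUNT(*) AS n
    FROM salaries
    GROUP BY over_100k
    ORDER BY n DESC
    """
).fetchall()
print(rows)
```

sqlite reports the comparison as 1 or 0 rather than True or False.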

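Using database functions

.list(): Get a list of tables.
.tbl(): Access a table that can be worked with using siuba.
.query(): Perform a SQL query and work with the result.
._engine: Get the underlying sqlalchemy engine.

For instance, we could start by finding the names of the tables in the Lahman database.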

dbc.list()

['lahman.allstar_full',
 'lahman.appearances',
 'lahman.awards_managers',
 'lahman.awards_players',
 'lahman.awards_share_managers',
 'lahman.awards_share_players',
 'lahman.batting',
 'lahman.batting_post',
 'lahman.college_playing',
 'lahman.fielding',
 'lahman.fielding_of',
 'lahman.fielding_ofsplit',
 'lahman.fielding_post',
 'lahman.hall_of_fame',
 'lahman.home_games',
 'lahman.managers',
 'lahman.managers_half',
 'lahman.parks',
 'lahman.people',
 'lahman.pitching',
 'lahman.pitching_post',
 'lahman.salaries',
 'lahman.schools',
 'lahman.series_post',
 'lahman.teams',
 'lahman.teams_franchises',
 'lahman.teams_half']

We can access one of these tables with dbc.tbl(), then put it through any kind of siuba operation.

dbc.tbl("Salaries")

# Source: lazy query
# DB Conn: Engine(sqlite://)
# Preview:
   index  yearID teamID lgID   playerID  salary
0      0    1985    ATL   NL  barkele01  870000
1      1    1985    ATL   NL  bedrost01  550000
2      2    1985    ATL   NL  benedbr01  545000
3      3    1985    ATL   NL   campri01  633333
4      4    1985    ATL   NL  ceronri01  625000
# .. may have more rows

from siuba import _, count
dbc.tbl("Salaries") >> count(_.yearID, sort=True)

# Source: lazy query
# DB Conn: Engine(sqlite://)
# Preview:
   yearID     n
0    1999  1006
1    1998   998
2    1995   986
3    1996   931
4    1997   925
# .. may have more rows

If you'd rather start from a SQL query, use the .query() method.

dbc.query("""
    SELECT
        playerID,
        sum(AB) as AB
    FROM Batting
    GROUP BY playerID
""")

# Source: lazy query
# DB Conn: Engine(sqlite://)
# Preview:
    playerID     AB
0  aardsda01      4
1  aaronha01  12364
2  aaronto01    944
3   aasedo01      5
4   abadan01     21
# .. may have more rows

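For anything else you might want to do, the sqlalchemy Engine object is available. For example, the code below shows how you can set its .echo attribute, which tells sqlalchemy to provide useful logs.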

dbc._engine.echo = True
table_names = dbc.list()

2022-03-20 22:49:37,553 INFO sqlalchemy.engine.Engine PRAGMA database_list
2022-03-20 22:49:37,554 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-03-20 22:49:37,555 INFO sqlalchemy.engine.Engine SELECT name FROM "main".sqlite_master WHERE type='table' ORDER BY name
2022-03-20 22:49:37,555 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-03-20 22:49:37,556 INFO sqlalchemy.engine.Engine SELECT name FROM "lahman".sqlite_master WHERE type='table' ORDER BY name
2022-03-20 22:49:37,557 INFO sqlalchemy.engine.Engine [raw sql] ()

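Note that the log messages above show that the .list() method executed two table-listing queries: one to list tables in the "main" schema (which is empty), and one to list tables in the "lahman" schema.

Advanced Configuration

⚠️: These behaviors are well tested, but dbcooper's internals and API may change.

dbcooper can be configured in three ways, each corresponding to a class interface:

TableFinder: Which tables will be used by dbcooper.
AccessorBuilder: How table names are turned into accessors.
DbcDocumentedTable: The class that defines what an accessor will return.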
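The queries in the log can be reproduced by hand with the standard library's sqlite3 module. The sketch below is an illustration under stated assumptions, not dbcooper's code: list_tables is a hypothetical helper, the two tables are empty stand-ins, and the schema.table output format simply mirrors the list shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database under the schema name "lahman".
conn.execute("ATTACH DATABASE ':memory:' AS lahman")
conn.execute("CREATE TABLE lahman.salaries (playerID TEXT, salary BIGINT)")
conn.execute("CREATE TABLE lahman.batting (playerID TEXT, AB BIGINT)")


def list_tables(conn):
    names = []
    # PRAGMA database_list yields one (seq, name, file) row per schema.
    for _seq, schema, _file in conn.execute("PRAGMA database_list").fetchall():
        # One sqlite_master query per schema, as in the log above.
        rows = conn.execute(
            f'SELECT name FROM "{schema}".sqlite_master '
            "WHERE type = 'table' ORDER BY name"
        )
        names.extend(f"{schema}.{name}" for (name,) in rows)
    return names


print(list_tables(conn))
```

The "main" schema contributes nothing here because no tables were created in it, which matches the empty result described above.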

from sqlalchemy import create_engine
from dbcooper.data import lahman_sqlite
from dbcooper import DbCooper, AccessorBuilder

engine = create_engine("sqlite://")
lahman_sqlite(engine)

Excluding a schema

from dbcooper import TableFinder

finder = TableFinder(exclude_schemas=["lahman"])
dbc_no_lahman = DbCooper(engine, table_finder=finder)
dbc_no_lahman.list()

[]

Formatting table names

from dbcooper import AccessorBuilder

# omits schema, and keeps only table name
# e.g. `salaries`, rather than `lahman_salaries`
builder = AccessorBuilder(format_from_part="table")

tbl_flat = DbCooper(engine, accessor_builder=builder)
tbl_flat.salaries()

# Source: lazy query
# DB Conn: Engine(sqlite://)
# Preview:
   index  yearID teamID lgID   playerID  salary
0      0    1985    ATL   NL  barkele01  870000
1      1    1985    ATL   NL  bedrost01  550000
2      2    1985    ATL   NL  benedbr01  545000
3      3    1985    ATL   NL   campri01  633333
4      4    1985    ATL   NL  ceronri01  625000
# .. may have more rows
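The naming rule can be pictured with a small standalone function. This is a sketch only: accessor_name and its parts parameter are hypothetical and are not part of dbcooper's API. It shows one way a schema and table name could be combined into a lowercase, attribute-safe accessor name, with or without the schema part.

```python
def accessor_name(schema, table, parts=("schema", "table")):
    """Join the requested name parts into a lowercase, attribute-safe name."""
    values = {"schema": schema, "table": table}
    raw = "_".join(values[part] for part in parts)
    # Replace anything that is not valid in a Python identifier.
    return "".join(ch if ch.isalnum() or ch == "_" else "_" for ch in raw).lower()


print(accessor_name("lahman", "Salaries"))                    # lahman_salaries
print(accessor_name("lahman", "Salaries", parts=("table",)))  # salaries
```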

Grouping tables by schema

from dbcooper import AccessorHierarchyBuilder

tbl_nested = DbCooper(engine, accessor_builder=AccessorHierarchyBuilder())

# note the form: <schema>.<table>
tbl_nested.lahman.salaries()

# Source: lazy query
# DB Conn: Engine(sqlite://)
# Preview:
   index  yearID teamID lgID   playerID  salary
0      0    1985    ATL   NL  barkele01  870000
1      1    1985    ATL   NL  bedrost01  550000
2      2    1985    ATL   NL  benedbr01  545000
3      3    1985    ATL   NL   campri01  633333
4      4    1985    ATL   NL  ceronri01  625000
# .. may have more rows
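The nested form can be sketched with the standard library's SimpleNamespace. This is not how AccessorHierarchyBuilder is implemented; build_hierarchy is a hypothetical helper that only illustrates grouping schema.table names so that each schema becomes an attribute holding its tables.

```python
from types import SimpleNamespace


def build_hierarchy(qualified_names):
    """Group 'schema.table' names into one namespace per schema."""
    schemas = {}
    for qualified in qualified_names:
        schema, table = qualified.split(".", 1)
        # Store the qualified name; a real accessor would return a table object.
        schemas.setdefault(schema, {})[table] = qualified
    return SimpleNamespace(
        **{schema: SimpleNamespace(**tables) for schema, tables in schemas.items()}
    )


tbl_tree = build_hierarchy(["lahman.salaries", "lahman.batting"])
print(tbl_tree.lahman.salaries)  # lahman.salaries
```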

Don't show table documentation

from dbcooper import DbcSimpleTable

dbc_no_doc = DbCooper(engine, table_factory=DbcSimpleTable)
dbc_no_doc.lahman_salaries

DbcSimpleTable(..., 'salaries', 'lahman')

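Note that sqlalchemy dialects like snowflake-sqlalchemy cannot look up things like table and column descriptions as well as other dialects, so DbcSimpleTable may be needed to connect to snowflake (see this issue).

Development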

# install with development dependencies
pip install -e ".[dev]"

# or install from requirements file
pip install -r requirements/dev.txt

Testing

# run all tests, see pytest section of pyproject.toml
pytest

# run specific backends
pytest -m 'not snowflake and not bigquery'

# stop on first failure, drop into debugger
pytest -x --pdb

Release

# set version number
git tag v0.0.1

# (optional) push to github
git push origin --tags

# check version
python -m setuptools_scm

Project details

Release history

0.0.5 (this version): March 23, 2023
0.0.4: June 9, 2022
0.0.3: March 21, 2022
0.0.2: March 20, 2022
0.0.1: March 20, 2022
