python-benedict is a dict subclass with keylist/keypath/keyattr support, normalized I/O operations (base64, csv, ini, json, pickle, plist, query-string, toml, xls, xml, yaml) and many utilities... for humans, obviously.
python-benedict
python-benedict is a dict subclass with keylist/keypath/keyattr support, I/O shortcuts (base64, cli, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml) and many utilities... for humans, obviously.
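Features
- 100% backward-compatible, you can safely wrap existing dictionaries.
- NEW Keyattr support for getting/setting items using keys as attributes.
- Keylist support using a list of keys as key.
- Keypath support using a keypath separator (dot syntax by default).
- Keypath list-index support (also negative) using the standard [n] suffix.
- Normalized I/O operations with the most common formats: base64, cli, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml.
- Multiple I/O operation backends: file-system (read/write), url (read-only), s3 (read/write).
- Many utility and parse methods to retrieve data as needed (see the API section).
- Well tested. ;)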
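Installation
If you want to install everything:
- Run pip install "python-benedict[all]"
Alternatively, you can install the main package:
- Run pip install python-benedict, then install only the optional requirements you need.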
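Optional requirements
Here is the hierarchy of possible installation targets available when running pip install "python-benedict[...]" (each target installs all its sub-targets):
- [all]
    - [io]
        - [html]
        - [toml]
        - [xls]
        - [xml]
        - [yaml]
    - [parse]
    - [s3]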
Usage
Basics
benedict is a dict subclass, so it can be used as a normal dictionary (you can just cast an existing dict).
from benedict import benedict
# create a new empty instance
d = benedict()
# or cast an existing dict
d = benedict(existing_dict)
# or create from data source (filepath, url or data-string) in a supported format:
# Base64, CSV, JSON, TOML, XML, YAML, query-string
d = benedict("https://localhost:8000/data.json", format="json")
# or in a Django view
params = benedict(request.GET.items())
page = params.get_int("page", 1)
Keyattr
It is possible to get/set items using keys as attributes (dotted notation).
d = benedict(keyattr_dynamic=True) # default False
d.profile.firstname = "Fabio"
d.profile.lastname = "Caccamo"
print(d) # -> { "profile":{ "firstname":"Fabio", "lastname":"Caccamo" } }
By default, if keyattr_dynamic is not explicitly set to True, this functionality works only for getting/setting items that already exist.
Disable keyattr functionality
You can disable the keyattr functionality by passing the keyattr_enabled=False option in the constructor.
d = benedict(existing_dict, keyattr_enabled=False) # default True
or by using the getter/setter property.
d.keyattr_enabled = False
Dynamic keyattr functionality
You can enable dynamic attribute access by passing keyattr_dynamic=True in the constructor.
d = benedict(existing_dict, keyattr_dynamic=True) # default False
or by using the getter/setter property.
d.keyattr_dynamic = True
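Warning - even if this feature is very useful, it has some obvious limitations: it works only for string keys that are unprotected (not starting with an _) and that don't clash with the names of the currently supported methods.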
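The difference between the default and the dynamic mode, and the method-name limitation, can be pictured with a small sketch on a plain dict subclass. AttrDict and its dynamic flag are hypothetical names made up for this illustration; this is not benedict's implementation, only one way such behavior could be modeled.

```python
class AttrDict(dict):
    """Toy dict subclass with attribute access (illustrative, not benedict's code)."""

    def __init__(self, *args, dynamic=False, **kwargs):
        super().__init__(*args, **kwargs)
        # Store the flag on the instance itself, bypassing our __setattr__.
        object.__setattr__(self, "_dynamic", dynamic)

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails,
        # so real attributes and methods (e.g. items) always win over keys.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self:
            return self[name]
        if self._dynamic:
            # Dynamic mode: create the missing nested item on first access.
            child = AttrDict(dynamic=True)
            self[name] = child
            return child
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self or self._dynamic:
            self[name] = value
        else:
            raise AttributeError(name)


d = AttrDict(dynamic=True)
d.profile.firstname = "Fabio"
print(d)  # {'profile': {'firstname': 'Fabio'}}

s = AttrDict({"items": 1})
print(callable(s.items))  # True: the dict method shadows the "items" key
```

In this sketch a key named like a dict method can still be reached with s["items"], which is the same workaround the warning above implies for clashing keys.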
Keylist
Wherever a key is used, it is also possible to use a list (or a tuple) of keys.
d = benedict()
# set values by keys list
d["profile", "firstname"] = "Fabio"
d["profile", "lastname"] = "Caccamo"
print(d) # -> { "profile":{ "firstname":"Fabio", "lastname":"Caccamo" } }
print(d["profile"]) # -> { "firstname":"Fabio", "lastname":"Caccamo" }
# check if keypath exists in dict
print(["profile", "lastname"] in d) # -> True
# delete value by keys list
del d["profile", "lastname"]
print(d["profile"]) # -> { "firstname":"Fabio" }
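Conceptually, a key list is just a path walked one key at a time. The following sketch shows that idea on plain dicts; get_by_keys and set_by_keys are hypothetical helpers written for this illustration and are not part of benedict.

```python
from functools import reduce


def get_by_keys(data, keys):
    """Follow a list of keys through nested dicts and return the final value."""
    return reduce(lambda node, key: node[key], keys, data)


def set_by_keys(data, keys, value):
    """Set a value at the given key list, creating intermediate dicts as needed."""
    node = data
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


data = {}
set_by_keys(data, ["profile", "firstname"], "Fabio")
set_by_keys(data, ["profile", "lastname"], "Caccamo")
print(data)  # {'profile': {'firstname': 'Fabio', 'lastname': 'Caccamo'}}
print(get_by_keys(data, ["profile", "lastname"]))  # Caccamo
```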
Keypath
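. is the default keypath separator.
If you cast an existing dict and its keys contain the keypath separator, a ValueError will be raised.
In this case you should use a custom keypath separator or disable the keypath functionality.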
d = benedict()
# set values by keypath
d["profile.firstname"] = "Fabio"
d["profile.lastname"] = "Caccamo"
print(d) # -> { "profile":{ "firstname":"Fabio", "lastname":"Caccamo" } }
print(d["profile"]) # -> { "firstname":"Fabio", "lastname":"Caccamo" }
# check if keypath exists in dict
print("profile.lastname" in d) # -> True
# delete value by keypath
del d["profile.lastname"]
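Custom keypath separator
You can customize the keypath separator by passing the keypath_separator argument in the constructor.
If you pass an existing dict to the constructor and its keys contain the keypath separator, an Exception will be raised.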
d = benedict(existing_dict, keypath_separator="/")
Change keypath separator
You can change the keypath_separator at any time using the getter/setter property.
If any existing key contains the new keypath_separator, an Exception will be raised.
d.keypath_separator = "/"
Disable keypath functionality
You can disable the keypath functionality by passing the keypath_separator=None option in the constructor.
d = benedict(existing_dict, keypath_separator=None)
or by using the getter/setter property.
d.keypath_separator = None
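List index support
List indexes are supported: keypaths can include indexes (also negative) using [n], which makes it quick to reach items nested inside lists:
# Eg. get last location coordinates of the first result: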
loc = d["results[0].locations[-1].coordinates"]
lat = loc.get_decimal("latitude")
lng = loc.get_decimal("longitude")
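To show what a keypath with [n] suffixes stands for, here is a rough sketch that splits such a keypath into plain keys and integer indexes and then walks them. split_keypath and get_by_keypath are hypothetical helpers, and the regular expression is an assumption about the syntax; benedict's own parser may differ.

```python
import re

# A key name followed by zero or more [n] index suffixes (n may be negative).
_PART = re.compile(r"^([^\[\]]+)((?:\[-?\d+\])*)$")


def split_keypath(keypath, separator="."):
    """Split 'a[0].b[-1]' into ['a', 0, 'b', -1]."""
    keys = []
    for part in keypath.split(separator):
        match = _PART.match(part)
        if match is None:
            raise ValueError(f"invalid keypath part: {part!r}")
        keys.append(match.group(1))
        keys.extend(int(i) for i in re.findall(r"-?\d+", match.group(2)))
    return keys


def get_by_keypath(data, keypath, separator="."):
    """Walk nested dicts and lists following the parsed keypath."""
    node = data
    for key in split_keypath(keypath, separator):
        node = node[key]  # works for both dict keys and list indexes
    return node


data = {
    "results": [
        {"locations": [{"coordinates": {"latitude": "41.9"}},
                       {"coordinates": {"latitude": "45.4"}}]}
    ]
}
print(split_keypath("results[0].locations[-1].coordinates"))
# ['results', 0, 'locations', -1, 'coordinates']
print(get_by_keypath(data, "results[0].locations[-1].coordinates.latitude"))  # 45.4
```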
I/O
To simplify I/O operations, benedict supports a variety of input/output methods with the most common formats: base64, cli, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml.
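Input via constructor
You can create a benedict instance directly from a data source (filepath, url, s3 or data string) by passing the data source and the data format (optional, default "json") in the constructor.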
# filepath
d = benedict("/root/data.yml", format="yaml")
# url
d = benedict("https://localhost:8000/data.xml", format="xml")
# s3
d = benedict("s3://my-bucket/data.xml", s3_options={"aws_access_key_id": "...", "aws_secret_access_key": "..."})
# data
d = benedict('{"a": 1, "b": 2, "c": 3, "x": 7, "y": 8, "z": 9}')
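Input methods
- All input methods can be accessed as class methods and are prefixed by from_* followed by the format name.
- In all input methods, the first argument can represent a source: file path, url, s3 url, or data string.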
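Input sources
All supported sources (file, url, s3, data) are allowed by default, but when the input data comes from an untrusted origin it can be useful to restrict the allowed sources with the sources argument.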
# url
d = benedict("https://localhost:8000/data.json", sources=["url"]) # -> ok
d = benedict.from_json("https://localhost:8000/data.json", sources=["url"]) # -> ok
# s3
d = benedict("s3://my-bucket/data.json", sources=["url"]) # -> raise ValueError
d = benedict.from_json("s3://my-bucket/data.json", sources=["url"]) # -> raise ValueError
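Output methods
- All output methods can be accessed as instance methods and are prefixed by to_* followed by the format name.
- In all output methods, if the filepath="..." keyword argument is specified, the output will also be saved at the specified file path.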
Supported formats
Here are the details of the supported formats, operations and extra options docs.
Format | Input | Output | Extra options docs |
---|---|---|---|
base64 | ✅ | ✅ | - |
cli | ✅ | ❌ | argparse |
csv | ✅ | ✅ | csv |
html | ✅ | ❌ | bs4 (Beautiful Soup 4) |
ini | ✅ | ✅ | configparser |
json | ✅ | ✅ | json |
pickle | ✅ | ✅ | pickle |
plist | ✅ | ✅ | plistlib |
query-string | ✅ | ✅ | - |
toml | ✅ | ✅ | toml |
xls | ✅ | ❌ | openpyxl - xlrd |
xml | ✅ | ✅ | xmltodict |
yaml | ✅ | ✅ | PyYAML |
API
- Utility methods
- I/O methods
- Parse methods
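Utility methods
These methods are common utilities that will speed up your everyday work.
Utilities that accept key argument(s) also support keypath(s).
Utilities that return a dictionary always return a new benedict instance.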
clean
# Clean the current dict instance removing all empty values: None, "", {}, [], ().
# If strings or collections (dict, list, set, tuple) flags are False,
# related empty values will not be deleted.
d.clean(strings=True, collections=True)
clone
# Return a clone (deepcopy) of the dict.
c = d.clone()
dump
# Return a readable representation of any dict/list.
# This method can be used both as static method or instance method.
s = benedict.dump(d.keypaths())
print(s)
# or
d = benedict()
print(d.dump())
filter
# Return a filtered dict using the given predicate function.
# Predicate function receives key, value arguments and should return a bool value.
predicate = lambda k, v: v is not None
f = d.filter(predicate)
find
# Return the first match searching for the given keys/keypaths.
# If no result found, default value is returned.
keys = ["a.b.c", "m.n.o", "x.y.z"]
f = d.find(keys, default=0)
flatten
# Return a new flattened dict using the given separator to join nested dict keys to flatten keypaths.
f = d.flatten(separator="_")
groupby
# Group a list of dicts at key by the value of the given by_key and return a new dict.
g = d.groupby("cities", by_key="country_code")
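To make the shape of the grouped result concrete, here is a minimal sketch of the grouping idea on a plain list of dicts. The standalone groupby function below is a hypothetical helper that takes the list directly, whereas the method above takes the key under which the list is stored; the city data is invented.

```python
def groupby(items, by_key):
    """Group a list of dicts by the value each dict holds at by_key."""
    groups = {}
    for item in items:
        groups.setdefault(item[by_key], []).append(item)
    return groups


cities = [
    {"name": "Rome", "country_code": "IT"},
    {"name": "Milan", "country_code": "IT"},
    {"name": "Paris", "country_code": "FR"},
]
groups = groupby(cities, "country_code")
print(sorted(groups))  # ['FR', 'IT']
print([city["name"] for city in groups["IT"]])  # ['Rome', 'Milan']
```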
invert
# Return an inverted dict where values become keys and keys become values.
# Since multiple keys could have the same value, each value will be a list of keys.
# If flat is True each value will be a single value (use this only if values are unique).
i = d.invert(flat=False)
items_sorted_by_keys
# Return items (key/value list) sorted by keys.
# If reverse is True, the list will be reversed.
items = d.items_sorted_by_keys(reverse=False)
items_sorted_by_values
# Return items (key/value list) sorted by values.
# If reverse is True, the list will be reversed.
items = d.items_sorted_by_values(reverse=False)
keypaths
# Return a list of all keypaths in the dict.
# If indexes is True, the output will include list values indexes.
k = d.keypaths(indexes=False)
match
# Return a list of all values whose keypath matches the given pattern (a regex or string).
# If pattern is string, wildcard can be used (eg. [*] can be used to match all list indexes).
# If indexes is True, the pattern will be matched also against list values.
m = d.match(pattern, indexes=True)
merge
# Merge one or more dictionary objects into current instance (deepupdate).
# Sub-dictionaries keys will be merged together.
# If overwrite is False, existing values will not be overwritten.
# If concat is True, list values will be concatenated together.
d.merge(a, b, c, overwrite=True, concat=False)
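The overwrite and concat flags interact in a way that is easier to see on a small example. The sketch below is a plain-Python approximation of the rules described in the comments above; deep_merge is a hypothetical function for illustration, not benedict's code, and it does not copy the merged values.

```python
def deep_merge(target, *others, overwrite=True, concat=False):
    """Merge dicts into target in place, recursing into sub-dictionaries."""
    for other in others:
        for key, value in other.items():
            current = target.get(key)
            if isinstance(current, dict) and isinstance(value, dict):
                # Both sides are dicts: merge their keys together.
                deep_merge(current, value, overwrite=overwrite, concat=concat)
            elif concat and isinstance(current, list) and isinstance(value, list):
                # Both sides are lists and concat is on: extend the existing list.
                current.extend(value)
            elif key not in target or overwrite:
                # New key, or existing key that may be overwritten.
                target[key] = value
    return target


a = {"user": {"name": "Fabio", "tags": ["a"]}, "n": 1}
b = {"user": {"lang": "it", "tags": ["b"]}, "n": 2}
print(deep_merge(a, b, overwrite=False, concat=True))
# {'user': {'name': 'Fabio', 'tags': ['a', 'b'], 'lang': 'it'}, 'n': 1}
```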
move
# Move an item from key_src to key_dst.
# It can be used to rename a key.
# If key_dst exists, its value will be overwritten.
d.move("a", "b", overwrite=True)
nest
# Nest a list of dicts at the given key and return a new nested list
# using the specified keys to establish the correct items hierarchy.
d.nest("values", id_key="id", parent_id_key="parent_id", children_key="children")
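As an illustration of what nesting by id and parent id means, the following sketch builds a tree from a flat list of dicts. The standalone nest function and the sample rows are hypothetical; it takes the list directly instead of a key, and the exact output of the library method may differ in details.

```python
def nest(items, id_key="id", parent_id_key="parent_id", children_key="children"):
    """Turn a flat list of dicts into a list of root items with nested children."""
    # Copy every item and give it an empty children list.
    by_id = {item[id_key]: {**item, children_key: []} for item in items}
    roots = []
    for item in by_id.values():
        parent = by_id.get(item.get(parent_id_key))
        if parent is None:
            roots.append(item)  # no known parent: top-level item
        else:
            parent[children_key].append(item)
    return roots


values = [
    {"id": 1, "parent_id": None, "name": "root"},
    {"id": 2, "parent_id": 1, "name": "child"},
    {"id": 3, "parent_id": 2, "name": "grandchild"},
]
tree = nest(values)
print(tree[0]["children"][0]["children"][0]["name"])  # grandchild
```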
remove
# Remove multiple keys from the dict.
# It is possible to pass a single key or more keys (as list or *args).
d.remove(["firstname", "lastname", "email"])
rename
# Rename a dict item key from "key" to "key_new".
# If key_new exists, a KeyError will be raised.
d.rename("first_name", "firstname")
search
# Search and return a list of items (dict, key, value) matching the given query.
r = d.search("hello", in_keys=True, in_values=True, exact=False, case_sensitive=False)
standardize
# Standardize all dict keys, e.g. "Location Latitude" -> "location_latitude".
d.standardize()
subset
# Return a dict subset for the given keys.
# It is possible to pass a single key or more keys (as list or *args).
s = d.subset(["firstname", "lastname", "email"])
swap
# Swap items values at the given keys.
d.swap("firstname", "lastname")
traverse
# Traverse a dict passing each item (dict, key, value) to the given callback function.
def f(d, key, value):
    print(f"dict: {d} - key: {key} - value: {value}")
d.traverse(f)
unflatten
# Return a new unflattened dict using the given separator to split dict keys to nested keypaths.
u = d.unflatten(separator="_")
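flatten and unflatten are inverse operations, which a small plain-Python sketch can make visible. The two functions below are hypothetical stand-ins written for this illustration, not the library's implementation; note that in this sketch keys that already contain the separator would not survive a round trip.

```python
def flatten(data, separator="_", prefix=""):
    """Join nested dict keys with separator into a single-level dict."""
    flat = {}
    for key, value in data.items():
        new_key = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            flat.update(flatten(value, separator, new_key))
        else:
            flat[new_key] = value
    return flat


def unflatten(flat, separator="_"):
    """Split keys on separator and rebuild the nested dict."""
    nested = {}
    for key, value in flat.items():
        *parents, last = key.split(separator)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[last] = value
    return nested


data = {"a": 1, "b": {"c": 2, "d": {"e": 3}}}
flat = flatten(data)
print(flat)  # {'a': 1, 'b_c': 2, 'b_d_e': 3}
print(unflatten(flat) == data)  # True
```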
unique
# Remove duplicated values from the dict.
d.unique()
I/O methods
These methods are available for input/output operations.
from_base64
# Try to load/decode a base64 encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to choose the subformat used under the hood:
# ('csv', 'json', 'query-string', 'toml', 'xml', 'yaml'), default: 'json'.
# It's possible to choose the encoding, default 'utf-8'.
# A ValueError is raised in case of failure.
d = benedict.from_base64(s, subformat="json", encoding="utf-8", **kwargs)
from_cli
# Load and decode data from a string of CLI arguments.
# ArgumentParser specific options can be passed using kwargs:
# https://docs.pythonlang.cn/3/library/argparse.html#argparse.ArgumentParser
# Return a new dict instance. A ValueError is raised in case of failure.
d = benedict.from_cli(s, **kwargs)
from_csv
# Try to load/decode a csv encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to specify the columns list, default: None (in this case the first row values will be used as keys).
# It's possible to pass decoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/csv.html
# A ValueError is raised in case of failure.
d = benedict.from_csv(s, columns=None, columns_row=True, **kwargs)
from_html
# Try to load/decode a html data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# https://beautiful-soup-4.readthedocs.io/
# A ValueError is raised in case of failure.
d = benedict.from_html(s, **kwargs)
from_ini
# Try to load/decode a ini encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/configparser.html
# A ValueError is raised in case of failure.
d = benedict.from_ini(s, **kwargs)
from_json
# Try to load/decode a json encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/json.html
# A ValueError is raised in case of failure.
d = benedict.from_json(s, **kwargs)
from_pickle
# Try to load/decode a pickle encoded in Base64 format and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/pickle.html
# A ValueError is raised in case of failure.
d = benedict.from_pickle(s, **kwargs)
from_plist
# Try to load/decode a p-list encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/plistlib.html
# A ValueError is raised in case of failure.
d = benedict.from_plist(s, **kwargs)
from_query_string
# Try to load/decode a query-string and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# A ValueError is raised in case of failure.
d = benedict.from_query_string(s, **kwargs)
from_toml
# Try to load/decode a toml encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# https://pypi.ac.cn/project/toml/
# A ValueError is raised in case of failure.
d = benedict.from_toml(s, **kwargs)
from_xls
# Try to load/decode a xls file (".xls", ".xlsx", ".xlsm") from url, filepath or data-string.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# - https://openpyxl.readthedocs.io/ (for .xlsx and .xlsm files)
# - https://pypi.ac.cn/project/xlrd/ (for .xls files)
# A ValueError is raised in case of failure.
d = benedict.from_xls(s, sheet=0, columns=None, columns_row=True, **kwargs)
from_xml
# Try to load/decode a xml encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# https://github.com/martinblech/xmltodict
# A ValueError is raised in case of failure.
d = benedict.from_xml(s, **kwargs)
from_yaml
# Try to load/decode a yaml encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs:
# https://pyyaml.org/wiki/PyYAMLDocumentation
# A ValueError is raised in case of failure.
d = benedict.from_yaml(s, **kwargs)
to_base64
# Return the dict instance encoded in base64 format and optionally save it at the specified 'filepath'.
# It's possible to choose the subformat used under the hood:
# ('csv', 'json', 'query-string', 'toml', 'xml', 'yaml'), default: 'json'.
# It's possible to choose the encoding, default 'utf-8'.
# It's possible to pass encoder specific options using kwargs.
# A ValueError is raised in case of failure.
s = d.to_base64(subformat="json", encoding="utf-8", **kwargs)
to_csv
# Return a list of dicts in the current dict encoded in csv format and optionally save it at the specified filepath.
# It's possible to specify the key of the item (list of dicts) to encode, default: 'values'.
# It's possible to specify the columns list, default: None (in this case the keys of the first item will be used).
# A ValueError is raised in case of failure.
s = d.to_csv(key="values", columns=None, columns_row=True, **kwargs)
to_ini
# Return the dict instance encoded in ini format and optionally save it at the specified filepath.
# It's possible to pass encoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/configparser.html
# A ValueError is raised in case of failure.
s = d.to_ini(**kwargs)
to_json
# Return the dict instance encoded in json format and optionally save it at the specified filepath.
# It's possible to pass encoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/json.html
# A ValueError is raised in case of failure.
s = d.to_json(**kwargs)
to_pickle
# Return the dict instance as pickle encoded in Base64 format and optionally save it at the specified filepath.
# The pickle protocol used by default is 2.
# It's possible to pass encoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/pickle.html
# A ValueError is raised in case of failure.
s = d.to_pickle(**kwargs)
to_plist
# Return the dict instance encoded in p-list format and optionally save it at the specified filepath.
# It's possible to pass encoder specific options using kwargs:
# https://docs.pythonlang.cn/3/library/plistlib.html
# A ValueError is raised in case of failure.
s = d.to_plist(**kwargs)
to_query_string
# Return the dict instance as query-string and optionally save it at the specified filepath.
# A ValueError is raised in case of failure.
s = d.to_query_string(**kwargs)
to_toml
# Return the dict instance encoded in toml format and optionally save it at the specified filepath.
# It's possible to pass encoder specific options using kwargs:
# https://pypi.ac.cn/project/toml/
# A ValueError is raised in case of failure.
s = d.to_toml(**kwargs)
to_xml
# Return the dict instance encoded in xml format and optionally save it at the specified filepath.
# It's possible to pass encoder specific options using kwargs:
# https://github.com/martinblech/xmltodict
# A ValueError is raised in case of failure.
s = d.to_xml(**kwargs)
to_yaml
# Return the dict instance encoded in yaml format.
# If filepath option is passed the output will be saved at the specified filepath.
# It's possible to pass encoder specific options using kwargs:
# https://pyyaml.org/wiki/PyYAMLDocumentation
# A ValueError is raised in case of failure.
s = d.to_yaml(**kwargs)
Parse methods
These methods are wrappers of the get method: they parse the data and try to return it in the expected type.
get_bool
# Get value by key or keypath trying to return it as bool.
# Values like `1`, `true`, `yes`, `on`, `ok` will be returned as `True`.
d.get_bool(key, default=False)
get_bool_list
# Get value by key or keypath trying to return it as list of bool values.
# If separator is specified and value is a string, it will be split.
d.get_bool_list(key, default=[], separator=",")
get_date
# Get value by key or keypath trying to return it as date.
# If format is not specified it will be autodetected.
# If choices is given and value is in choices, return value, otherwise return default.
d.get_date(key, default=None, format=None, choices=[])
get_date_list
# Get value by key or keypath trying to return it as list of date values.
# If separator is specified and value is a string, it will be split.
d.get_date_list(key, default=[], format=None, separator=",")
get_datetime
# Get value by key or keypath trying to return it as datetime.
# If format is not specified it will be autodetected.
# If choices is given and value is in choices, return value, otherwise return default.
d.get_datetime(key, default=None, format=None, choices=[])
get_datetime_list
# Get value by key or keypath trying to return it as list of datetime values.
# If separator is specified and value is a string, it will be split.
d.get_datetime_list(key, default=[], format=None, separator=",")
get_decimal
# Get value by key or keypath trying to return it as Decimal.
# If choices is given and value is in choices, return value, otherwise return default.
d.get_decimal(key, default=Decimal("0.0"), choices=[])
get_decimal_list
# Get value by key or keypath trying to return it as list of Decimal values.
# If separator is specified and value is a string, it will be split.
d.get_decimal_list(key, default=[], separator=",")
get_dict
# Get value by key or keypath trying to return it as dict.
# If value is a json string it will be automatically decoded.
d.get_dict(key, default={})
get_email
# Get email by key or keypath and return it.
# If value is blacklisted it will be automatically ignored.
# If check_blacklist is False, it will not be ignored even if blacklisted.
d.get_email(key, default="", choices=None, check_blacklist=True)
get_float
# Get value by key or keypath trying to return it as float.
# If choices is given and value is in choices, return value, otherwise return default.
d.get_float(key, default=0.0, choices=[])
get_float_list
# Get value by key or keypath trying to return it as list of float values.
# If separator is specified and value is a string, it will be split.
d.get_float_list(key, default=[], separator=",")
get_int
# Get value by key or keypath trying to return it as int.
# If choices is given and value is in choices, return value, otherwise return default.
d.get_int(key, default=0, choices=[])
get_int_list
# Get value by key or keypath trying to return it as list of int values.
# If separator is specified and value is a string, it will be split.
d.get_int_list(key, default=[], separator=",")
get_list
# Get value by key or keypath trying to return it as list.
# If separator is specified and value is a string, it will be split.
d.get_list(key, default=[], separator=",")
get_list_item
# Get list by key or keypath and return value at the specified index.
# If separator is specified and list value is a string, it will be split.
d.get_list_item(key, index=0, default=None, separator=",")
get_phonenumber
# Get phone number by key or keypath and return a dict with different formats (e164, international, national).
# If country code is specified (alpha 2 code), it will be used to parse phone number correctly.
d.get_phonenumber(key, country_code=None, default=None)
get_slug
# Get value by key or keypath trying to return it as slug.
# If choices is given and value is in choices, return value, otherwise return default.
d.get_slug(key, default="", choices=[])
get_slug_list
# Get value by key or keypath trying to return it as list of slug values.
# If separator is specified and value is a string, it will be split.
d.get_slug_list(key, default=[], separator=",")
get_str
# Get value by key or keypath trying to return it as string.
# Encoding issues will be automatically fixed.
# If choices is given and value is in choices, return value, otherwise return default.
d.get_str(key, default="", choices=[])
get_str_list
# Get value by key or keypath trying to return it as list of str values.
# If separator is specified and value is a string, it will be split.
d.get_str_list(key, default=[], separator=",")
get_uuid
# Get value by key or keypath trying to return it as valid uuid.
# If choices is given and value is in choices, return value, otherwise return default.
d.get_uuid(key, default="", choices=[])
get_uuid_list
# Get value by key or keypath trying to return it as list of valid uuid values.
# If separator is specified and value is a string, it will be split.
d.get_uuid_list(key, default=[], separator=",")
Testing
# clone repository
git clone https://github.com/fabiocaccamo/python-benedict.git && cd python-benedict
# create virtualenv and activate it
python -m venv venv && . venv/bin/activate
# upgrade pip
python -m pip install --upgrade pip
# install requirements
pip install -r requirements.txt -r requirements-test.txt
# install pre-commit to run formatters and linters
pre-commit install --install-hooks
# run tests using tox
tox
# or run tests using unittest
python -m unittest
License
Released under the MIT License.
See also
- python-fontbro - friendly font operations on top of fontTools.
- python-fsutil - high-level file-system operations for lazy devs.