Map objects within an HDF file and create a dataset namespace

hdfmap

Map objects within an HDF file and create a dataset namespace.

Version 0.5

By Dan Porter
Diamond Light Source
2024

TL;DR - Usage

from hdfmap import create_nexus_map, load_hdf

# HdfMap from NeXus file:
m = create_nexus_map('file.nxs')
m['energy']  # >> '/entry/instrument/monochromator/energy'
m['signal']  # >> '/entry/measurement/sum'
m['axes']  # >> '/entry/measurement/theta'
m.get_image_path()  # >> '/entry/instrument/pil3_100k/data'

with load_hdf('file.nxs') as nxs:
    path = m.get_path('scan_command')
    cmd = nxs[path][()]  # returns bytes data direct from file
    cmd = m.get_data(nxs, 'scan_command')  # returns converted str output
    string = m.format_hdf(nxs, "the energy is {energy:.2f} keV")
    d = m.get_dataholder(nxs)  # classic data table, d.scannable, d.metadata

# Shortcuts - single file reloader class
from hdfmap import NexusLoader

scan = NexusLoader('file.hdf')
[data1, data2] = scan.get_data(['dataset_name_1', 'dataset_name_2'])
data = scan.eval('dataset_name_1 * 100 + 2')
string = scan.format('my data is {dataset_name_1:.2f}')

# Shortcuts - multifile load data (generate map from first file)
from hdfmap import hdf_data, hdf_eval, hdf_format, hdf_image

filenames = [f'file{n}.nxs' for n in range(100)]  # list of files with the same structure
all_data = hdf_data(filenames, 'dataset_name')
normalised_data = hdf_eval(filenames, 'total / Transmission / (rc / 300.)')
descriptions = hdf_format(filenames, 'Energy: {en:5.3f} keV')
image_stack = hdf_image(filenames, index=31)

Installation

Requirements: Python >=3.10, NumPy, h5py

python -m pip install --upgrade git+https://github.com/DanPorter/hdfmap.git

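Description

This is another generic HDF reader, but the idea here is to build a namespace dict of {'name': 'path'} for every dataset, then group the names in ways that are likely to be useful.

Objects within an HDF file are separated into groups and datasets. Each object has a defined 'path' and 'name' parameter, as well as other attributes.

path -> '/entry/measurement/data' -> the location of an object within the file
name -> 'data' -> the path expressed as a simple variable name

Paths are unique locations within a file, but can be used to identify similar objects in other files. Names are generated from the path and may not be unique within a file.

	name	path
Description	simple identifier of a dataset	HDF path built from the position in the file
Example	`'scan_command'`	`'/entry/scan_command'`

Names of the different types of dataset are stored for arrays (size > 0) and values (size 0). Names for scannables relate to all arrays of a particular size. A combined list of names is also provided, in which scannables take precedence over arrays, and arrays over values.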
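The naming scheme can be illustrated without opening a file. The following is a minimal sketch, not hdfmap's own implementation: the `datasets` table, the helper `name_from_path` and the value of `scan_length` are hypothetical, "size 0" is assumed to mean a scalar dataset with an empty shape, and the name is assumed to be the final element of the path.

```python
import math
import posixpath

# Hypothetical {path: shape} table standing in for the contents of an HDF file
datasets = {
    '/entry/scan_command': (),
    '/entry/instrument/monochromator/energy': (),
    '/entry/measurement/theta': (21,),
    '/entry/measurement/sum': (21,),
    '/entry/instrument/pil3_100k/data': (21, 195, 487),
    '/entry/instrument/pil3_100k/sum': (21,),
}


def name_from_path(path):
    """Name of a dataset: the final element of its HDF path."""
    return posixpath.basename(path)


arrays, values = {}, {}
for path, shape in datasets.items():
    target = arrays if len(shape) > 0 else values
    # Names are not unique: a repeated name ('sum') overwrites the earlier path
    target[name_from_path(path)] = path

scan_length = 21  # assumed number of points in the scan
scannables = {
    name: path for name, path in arrays.items()
    if math.prod(datasets[path]) == scan_length
}

# Later dicts win on duplicate keys, giving scannables > arrays > values
combined = {**values, **arrays, **scannables}
```

Here `'sum'` appears under two groups, so only the last path visited is kept under that name, while both paths remain distinct keys of `datasets`.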

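HdfMap Attributes


map.groups	stores attributes of each group, by path
map.classes	stores a list of group paths, by nx_class
map.datasets	stores attributes of each dataset, by path
map.arrays	stores array dataset paths, by name
map.values	stores value dataset paths, by name
map.scannables	stores paths of arrays with a given size, by name
map.combined	stores array and value paths (arrays overwrite values)
map.image_data	stores dataset paths of image data

For example: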

map.groups = {'/hdf/group': ('class', 'name', {attrs}, [datasets])}
map.classes = {'class_name': ['/hdf/group1', '/hdf/group2']}
map.datasets = {'/hdf/group/dataset': ('name', size, shape, {attrs})}
map.arrays = {'name': '/hdf/group/dataset'}
map.values = {'name': '/hdf/group/dataset'}
map.scannables = {'name': '/hdf/group/dataset'}
map.image_data = {'name': '/hdf/group/dataset'}

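HdfMap Methods


`map.populate(h5py.File)`	populates the dictionaries using the given file
`map.generate_scannables(array_size)`	populates the scannables namespace with arrays of the same size
`map.most_common_size()`	returns the most common dataset size > 1
`map.get_size('name_or_path')`	returns the dataset size
`map.get_shape('name_or_path')`	returns the dataset shape
`map.get_attr('name_or_path', 'attr')`	returns the value of a dataset attribute
`map.get_path('name_or_group_or_class')`	returns the path of the object with the given name
`map.get_image_path()`	returns the default path of the detector dataset (or the largest dataset)
`map.get_group_path('name_or_path_or_class')`	returns the path of the group with the given class
`map.get_group_datasets('name_or_path_or_class')`	returns a list of dataset paths in the class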
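The behaviour described for `most_common_size` and `generate_scannables` can be sketched with the standard library alone. This is an illustration under stated assumptions rather than hdfmap's code: the `datasets` table is hypothetical, and the two functions are stand-alone versions that take the table as an argument instead of being methods of a map.

```python
from collections import Counter

# Hypothetical {path: (name, size)} table standing in for a populated map
datasets = {
    '/entry/measurement/theta': ('theta', 21),
    '/entry/measurement/sum': ('sum', 21),
    '/entry/measurement/roi': ('roi', 21),
    '/entry/instrument/slits/gaps': ('gaps', 4),
    '/entry/scan_command': ('scan_command', 1),
    '/entry/title': ('title', 1),
    '/entry/start_time': ('start_time', 1),
    '/entry/end_time': ('end_time', 1),
}


def most_common_size(datasets):
    """Most frequent dataset size, ignoring sizes of 1 or less (0 if there are none)."""
    counts = Counter(size for _, size in datasets.values() if size > 1)
    return counts.most_common(1)[0][0] if counts else 0


def generate_scannables(datasets, array_size):
    """Namespace {name: path} of every dataset with exactly array_size elements."""
    return {
        name: path
        for path, (name, size) in datasets.items()
        if size == array_size
    }


size = most_common_size(datasets)
scannables = generate_scannables(datasets, size)
```

Single values are the most numerous datasets in this table, which is why sizes of 1 are excluded before counting: the scan length is taken as the most frequent size among genuine arrays.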

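HdfMap File Methods


`map.get_metadata(h5py.File)`	returns a dict of value datasets
`map.get_scannables(h5py.File)`	returns a dict of scannable datasets
`map.get_scannables_array(h5py.File)`	returns a numpy array of scannable datasets
`map.get_dataholder(h5py.File)`	returns a dict-like object with metadata and scannables
`map.get_image(h5py.File, index)`	returns image data
`map.get_data(h5py.File, 'name')`	returns data from a dataset
`map.eval(h5py.File, 'expression')`	returns the output of an expression using dataset names as variables
`map.format_hdf(h5py.File, 'string {name}')`	returns the output of a str expression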
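One way to picture expression evaluation and string formatting over a namespace of dataset names is shown here. This is a sketch of the general idea only: `namespace`, `eval_expression` and `format_expression` are hypothetical, the values are invented for illustration, and hdfmap may load and convert data differently.

```python
# Hypothetical data already read from a file, keyed by dataset name
namespace = {
    'total': 1200.0,
    'Transmission': 0.5,
    'rc': 300.0,
    'en': 8.0,
    'scan_command': 'scan theta 0 10 0.5',
}


def eval_expression(expression, namespace):
    """Evaluate a Python expression with dataset names available as variables."""
    # Builtins are removed so that only the supplied names can be used.
    # This is not a security sandbox: only evaluate trusted expressions.
    return eval(expression, {'__builtins__': {}}, dict(namespace))


def format_expression(template, namespace):
    """Fill a str.format-style template from the same namespace."""
    return template.format(**namespace)


normalised = eval_expression('total / Transmission / (rc / 300.)', namespace)
description = format_expression('Energy: {en:5.3f} keV', namespace)
print(normalised)   # 2400.0
print(description)  # Energy: 8.000 keV
```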

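NeXus Files

Files using the NeXus format can generate special NexusMap objects. These work in the same way as general HdfMaps, but contain additional special names in the namespace:


`'axes'`	returns the path of the default NXaxes
`'signal'`	returns the path of the default NXsignal

In addition, the map.scannables dict is populated automatically from the names given in the "scan_fields" dataset, or from the dataset names in the first NXdata group. The default image data is taken from the first NXdetector dataset.

Examples

Scan data & metadata

Separate the datasets in a NeXus file into Diamond's classic scannables and metadata, similar to the contents of the old '*.dat' files.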
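In the NeXus convention, the default plot is found by following `default` attributes down to an NXdata group, whose `signal` and `axes` attributes name the datasets to plot. This sketch resolves those attributes on a hypothetical in-memory stand-in for a file; `groups` and `default_plot_paths` are illustrative names, only the first axis is returned, and it is not a description of how NexusMap is implemented.

```python
# Hypothetical stand-in for a NeXus file: {group path: attributes}
groups = {
    '/': {'default': 'entry'},
    '/entry': {'NX_class': 'NXentry', 'default': 'measurement'},
    '/entry/measurement': {
        'NX_class': 'NXdata',
        'signal': 'sum',
        'axes': ['theta'],
    },
}


def default_plot_paths(groups):
    """Follow 'default' attributes to the NXdata group, then read signal and axes."""
    path = '/'
    while groups[path].get('NX_class') != 'NXdata':
        child = groups[path]['default']  # KeyError if the file defines no default
        path = path.rstrip('/') + '/' + child
    attrs = groups[path]
    return {
        'signal': f"{path}/{attrs['signal']}",
        'axes': f"{path}/{attrs['axes'][0]}",  # first axis only, for simplicity
    }


plot_paths = default_plot_paths(groups)
```

With this layout, `plot_paths['signal']` is `'/entry/measurement/sum'` and `plot_paths['axes']` is `'/entry/measurement/theta'`.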

from hdfmap import create_nexus_map, load_hdf

# HdfMap from NeXus file:
hmap = create_nexus_map('file.nxs')
with load_hdf('file.nxs') as nxs:
    scannables = hmap.get_scannables_array(nxs)  # creates 2D numpy array
    labels = scannables.dtype.names
    metadata = hmap.get_metadata(nxs)  # {'name': value}
    d = hmap.get_dataholder(nxs)  # classic data table, d.scannable, d.metadata
d.theta == d['theta']  # scannable array 'theta'
d.metadata.scan_command == d.metadata['scan_command']  # single value 'scan_command'

# OR, use the shortcut:
from hdfmap import nexus_data_block

d = nexus_data_block('file.nxs')

# The data loader class removes the need to open the files:
from hdfmap import NexusLoader

scan = NexusLoader('file.nxs')
metadata = scan.get_metadata()
scannables = scan.get_scannables()

Automatic default plot axes

If defined in the NeXus file, 'axes' and 'signal' are populated automatically.

import matplotlib.pyplot as plt
from hdfmap import create_nexus_map, load_hdf

# HdfMap from NeXus file:
hmap = create_nexus_map('file.nxs')
with load_hdf('file.nxs') as nxs:
    axes = hmap.get_data(nxs, 'axes')
    signal = hmap.get_data(nxs, 'signal')
    title = hmap.format_hdf(nxs, "{entry_identifier}\n{scan_command}")
axes_label = hmap.get_path('axes')
signal_label = hmap.get_path('signal')
# plot the data (e.g. using matplotlib)
plt.figure()
plt.plot(axes, signal)
plt.xlabel(axes_label)
plt.ylabel(signal_label)
plt.title(title)

# Or, using NexusLoader:
from hdfmap import NexusLoader

scan = NexusLoader('file.nxs')
axes, signal = scan('axes, signal')
axes_label, signal_label = scan('_axes, _signal')
title = scan.format("{entry_identifier}\n{scan_command}")

Automatic image data

Get images from the first detector in a NeXus file.

from hdfmap import create_nexus_map, load_hdf

# HdfMap from NeXus file:
hmap = create_nexus_map('file.nxs')
image_location = hmap.get_image_path()  # returns the hdf path chosen for the default detector
with load_hdf('file.nxs') as nxs:
    middle_image = hmap.get_image(nxs)  # returns single image from index len(dataset)//2
    first_image = hmap.get_image(nxs, 0)  # returns single image from dataset[0, :, :]
    volume = hmap.get_image(nxs, ())  # returns whole volume as array
    roi = hmap.get_image(nxs, (0, slice(5, 10, 1), slice(5, 10, 1)))  # returns part of dataset

# Or, using NexusLoader:
from hdfmap import NexusLoader

scan = NexusLoader('file.nxs')
image = scan.get_image(index=0)  # using index as defined above
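The index conventions in the comments above (no index gives the middle frame, an integer gives one frame, `()` gives the whole volume, a tuple of slices gives a region) can be expressed as a small normalisation step. The helper `resolve_image_index` is hypothetical and only mirrors those comments; it is not taken from hdfmap.

```python
def resolve_image_index(index, n_frames):
    """Turn a user-supplied index into a tuple usable as dataset[index]."""
    if index is None:
        return (n_frames // 2,)  # default: the middle frame of the scan
    if isinstance(index, int):
        return (index,)          # a single frame, i.e. dataset[index, :, :]
    return tuple(index)          # () selects everything; slices select a region


middle = resolve_image_index(None, 31)
first = resolve_image_index(0, 31)
whole = resolve_image_index((), 31)
roi = resolve_image_index((0, slice(5, 10, 1), slice(5, 10, 1)), 31)
```

An h5py dataset accepts each of these tuples directly, for example `dataset[()]` reads the full array.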

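Multi-scan metadata string

Generate a metadata string from every file in a directory very quickly. The HdfMap is only created for the first file; the remaining files are treated as having the same structure.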

from hdfmap import list_files, hdf_format

format_string = "#{entry_identifier}: {start_time} : E={incident_energy:.3f} keV : {scan_command}"
files = list_files('/directory/path', extension='.nxs')
strings_list = hdf_format(files, format_string)
print('\n'.join(strings_list))

# other multi-file readers:
from hdfmap import hdf_data, hdf_image, hdf_eval

data_list = hdf_data(files, 'incident_energy')
image_list = hdf_image(files, index=0)
eval_list = hdf_eval(files, 'signal / Transmission')  # separate name so data_list is not overwritten

Release history

0.5.1	Oct 3, 2024 (this version)
0.5	Sep 26, 2024
0.4	Aug 29, 2024

Download files

Download the file for your platform.

Source distribution

hdfmap-0.5.1.tar.gz (38.6 kB), uploaded Oct 3, 2024, source

Built distribution

hdfmap-0.5.1-py3-none-any.whl (33.8 kB), uploaded Oct 3, 2024, Python 3

Hashes for hdfmap-0.5.1.tar.gz

Algorithm	Hash digest
SHA256	`6aa724cdf370711265a461d075d744d028f101d80e4703194c880e2a62bda618`
MD5	`62d9477246e3161bc735ca90cfb0390b`
BLAKE2b-256	`4396e8a3cd0cc64c324f16a2669e665ea6dea65a0f5bad2539a4d4b690841089`

Hashes for hdfmap-0.5.1-py3-none-any.whl

Algorithm	Hash digest
SHA256	`a435ff1c7b83b3b24166b6bdc73483c8ab87f7d3b50649bec0f2f780ada84ea9`
MD5	`5f8c2f605f703ee8e5ef65b3939a9f3f`
BLAKE2b-256	`75f13e06eafe367cfdeb85883c2d59cab99c3cc57a1d2f899c007d92ee408ef7`
