ndx-binned-spikes Extension for NWB
Installation
The extension is available on PyPI and can be installed with pip. The following command installs the latest release of the extension:
pip install -U ndx-binned-spikes
To install the development version of the extension, you can install it directly from the GitHub repository. The following command installs the development version:
pip install -U git+https://github.com/catalystneuro/ndx-binned-spikes.git
Usage
The BinnedAlignedSpikes object is designed to store spike counts around a set of events (e.g. stimuli or behavioral events such as licks). The events are characterized by their timestamps and by a binned data structure that holds the spike counts around each event timestamp. A BinnedAlignedSpikes object keeps a separate set of counts for each unit (i.e. neuron); in other words, the spikes of the units are counted separately but aligned to the same set of events.
Simple example
The following code illustrates a minimal usage example of this extension:
import numpy as np
from ndx_binned_spikes import BinnedAlignedSpikes
data = np.array(
[
[ # Data of unit with index 0
[5, 1, 3, 2], # Bin counts around the first timestamp
[6, 3, 4, 3], # Bin counts around the second timestamp
[4, 2, 1, 4], # Bin counts around the third timestamp
],
[ # Data of unit with index 1
[8, 4, 0, 2], # Bin counts around the first timestamp
[3, 3, 4, 2], # Bin counts around the second timestamp
[2, 7, 4, 1], # Bin counts around the third timestamp
],
],
dtype="uint64",
)
event_timestamps = np.array([0.25, 5.0, 12.25]) # The timestamps to which we align the counts
milliseconds_from_event_to_first_bin = -50.0 # The first bin is 50 ms before the event
bin_width_in_milliseconds = 100.0 # Each bin is 100 ms wide
binned_aligned_spikes = BinnedAlignedSpikes(
data=data,
event_timestamps=event_timestamps,
bin_width_in_milliseconds=bin_width_in_milliseconds,
milliseconds_from_event_to_first_bin=milliseconds_from_event_to_first_bin
)
The resulting object is usually added to a processing module in an NWB file. The following code demonstrates how to add the BinnedAlignedSpikes object to an NWB file: we first create an nwbfile, then add the BinnedAlignedSpikes object to a processing module, and finally write the nwbfile to disk.
from datetime import datetime
from zoneinfo import ZoneInfo
from pynwb import NWBHDF5IO, NWBFile
session_description = "A session of data where a PSTH structure was produced"
session_start_time = datetime.now(ZoneInfo("Asia/Ulaanbaatar"))
identifier = "a_session_identifier"
nwbfile = NWBFile(
session_description=session_description,
session_start_time=session_start_time,
identifier=identifier,
)
ecephys_processing_module = nwbfile.create_processing_module(
name="ecephys", description="Intermediate data derived from extracellular electrophysiology recordings."
)
ecephys_processing_module.add(binned_aligned_spikes)
with NWBHDF5IO("binned_aligned_spikes.nwb", "w") as io:
io.write(nwbfile)
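Once written, the file can be read back with pynwb to check that the object round-trips. This is a minimal sketch, assuming that importing ndx_binned_spikes registers the extension namespace and that the object kept its default name "BinnedAlignedSpikes" inside the "ecephys" processing module:
from pynwb import NWBHDF5IO
import ndx_binned_spikes  # noqa: F401  (assumed to register the extension namespace on import)
with NWBHDF5IO("binned_aligned_spikes.nwb", "r") as io:
    read_nwbfile = io.read()
    read_object = read_nwbfile.processing["ecephys"]["BinnedAlignedSpikes"]  # assumed default name
    print(read_object.data[:].shape)  # (2, 3, 4): units x events x bins for the example above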
Parameters and data structure
The structure of the bins is described by the following parameters:
milliseconds_from_event_to_first_bin: the time in milliseconds from the event to the start of the first bin. A negative value indicates that the first bin starts before the event, whereas a positive value indicates that it starts after the event.
bin_width_in_milliseconds: the width of each bin in milliseconds.
Note that in the simple example above, milliseconds_from_event_to_first_bin is negative.
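As a quick orientation, these two parameters determine the temporal edges of the bins relative to each event. Here is a minimal sketch in plain numpy (not part of the extension API), using the values from the simple example:
import numpy as np

milliseconds_from_event_to_first_bin = -50.0
bin_width_in_milliseconds = 100.0
number_of_bins = 4

# Start and center of every bin, in milliseconds relative to the event
bin_starts = milliseconds_from_event_to_first_bin + bin_width_in_milliseconds * np.arange(number_of_bins)
bin_centers = bin_starts + bin_width_in_milliseconds / 2.0
print(bin_starts)   # [-50.  50. 150. 250.]
print(bin_centers)  # [  0. 100. 200. 300.]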
The data argument passed to BinnedAlignedSpikes stores the counts for all units across all event timestamps. The data is a three-dimensional array: the first dimension indexes the units, the second dimension indexes the event timestamps, and the third dimension indexes the bins that hold the counts. The shape of the data is (number_of_units, number_of_events, number_of_bins).
The event_timestamps argument stores the timestamps of the events and should have the same length as the second dimension of data. Note that event_timestamps must be non-decreasing; in other words, the events are expected to appear in increasing temporal order.
The first dimension of data works almost like a dictionary: you can select a specific unit by indexing the first dimension. For example, data[0] returns the data of the first unit. Within each unit, the data is organized with time as the first axis, following the NWB convention; as a consequence, the data of each unit is contiguous in memory.
The following sketch illustrates this data structure for a concrete example.
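The checks below are plain numpy (not part of the extension API) and reuse the data array from the simple example above:
assert data.shape == (2, 3, 4)  # (number_of_units, number_of_events, number_of_bins)
data_of_first_unit = data[0]  # shape (number_of_events, number_of_bins)
counts_of_first_unit_around_second_event = data[0, 1]
assert counts_of_first_unit_around_second_event.tolist() == [6, 3, 4, 3]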
Linking to the units table
One way to make the information stored in the BinnedAlignedSpikes object more useful to future users is to state exactly which units (i.e. neurons) the first dimension of the data attribute corresponds to. This is optional but recommended, as it makes the data more meaningful and easier to interpret. In NWB, units are commonly stored in a Units table. To illustrate how to create this link, let's first build a toy Units table:
import numpy as np
from pynwb.misc import Units
num_units = 5
max_spikes_per_unit = 10
units_table = Units(name="units")
units_table.add_column(name="unit_name", description="name of the unit")
rng = np.random.default_rng(seed=0)
times = rng.random(size=(num_units, max_spikes_per_unit)).cumsum(axis=1)
spikes_per_unit = rng.integers(1, max_spikes_per_unit, size=num_units)
for unit_index in range(num_units):
# Not all units have the same number of spikes
spike_times = times[unit_index, : spikes_per_unit[unit_index]]
unit_name = f"unit_{unit_index}"
units_table.add_unit(spike_times=spike_times, unit_name=unit_name)
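As an optional check, the toy table can be inspected as a pandas DataFrame (to_dataframe is inherited from hdmf's DynamicTable):
print(units_table.to_dataframe())  # one row per unit, with spike_times and unit_name columns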
This creates a Units table with 5 units. We can then link the BinnedAlignedSpikes object to this table by creating a DynamicTableRegion object, which allows us to be very specific about which units the data in the BinnedAlignedSpikes object corresponds to. In the following code, the units described by the BinnedAlignedSpikes object correspond to the units at indices 1 and 3 of the Units table. The rest of the steps are the same as before.
from ndx_binned_spikes import BinnedAlignedSpikes
from hdmf.common import DynamicTableRegion
# Now we create the BinnedAlignedSpikes object and link it to the units table
data = np.array(
[
[ # Data of the unit 1 in the units table
[5, 1, 3, 2], # Bin counts around the first timestamp
[6, 3, 4, 3], # Bin counts around the second timestamp
[4, 2, 1, 4], # Bin counts around the third timestamp
],
[ # Data of the unit 3 in the units table
[8, 4, 0, 2], # Bin counts around the first timestamp
[3, 3, 4, 2], # Bin counts around the second timestamp
[2, 7, 4, 1], # Bin counts around the third timestamp
],
],
)
region_indices = [1, 3]
units_region = DynamicTableRegion(
data=region_indices, table=units_table, description="region of units table", name="units_region"
)
event_timestamps = np.array([0.25, 5.0, 12.25])
milliseconds_from_event_to_first_bin = -50.0 # The first bin is 50 ms before the event
bin_width_in_milliseconds = 100.0
name = "BinnedAignedSpikesForMyPurpose"
description = "Spike counts that is binned and aligned to events."
binned_aligned_spikes = BinnedAlignedSpikes(
data=data,
event_timestamps=event_timestamps,
bin_width_in_milliseconds=bin_width_in_milliseconds,
milliseconds_from_event_to_first_bin=milliseconds_from_event_to_first_bin,
description=description,
name=name,
units_region=units_region,
)
As in the previous example, this can be added to a processing module in an NWB file and then written to disk with exactly the same code as before.
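To show what the link provides, here is a small sketch of how the referenced units could be resolved after construction; it assumes the constructor argument is exposed as the units_region attribute and only uses the data and table attributes of DynamicTableRegion:
linked_indices = binned_aligned_spikes.units_region.data  # [1, 3]
linked_table = binned_aligned_spikes.units_region.table  # the Units table created above
for row_position, table_index in enumerate(linked_indices):
    unit_name = linked_table["unit_name"][table_index]
    print(f"data[{row_position}] holds the counts of {unit_name}")
# data[0] holds the counts of unit_1
# data[1] holds the counts of unit_3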
Storing data from multiple conditions (i.e. multiple stimuli)
BinnedAlignedSpikes can also be used to store data aggregated across multiple conditions while keeping track of which condition each set of counts corresponds to. This is useful when you want to store spike counts for several conditions (e.g. different stimuli, behavioral events, etc.) in a single structure. Because each condition may not occur the same number of times (e.g. different stimuli do not appear with the same frequency), a homogeneous per-condition data structure is not possible. Instead, an extra variable, condition_indices, indicates which condition each set of counts corresponds to.
from ndx_binned_spikes import BinnedAlignedSpikes
binned_aligned_spikes = BinnedAlignedSpikes(
bin_width_in_milliseconds=bin_width_in_milliseconds,
milliseconds_from_event_to_first_bin=milliseconds_from_event_to_first_bin,
data=data, # Shape (number_of_units, number_of_events, number_of_bins)
    event_timestamps=event_timestamps,  # Shape (number_of_events,)
condition_indices=condition_indices, # Shape (number_of_events,)
condition_labels=condition_labels, # Shape (number_of_conditions,) or np.unique(condition_indices).size
)
Note that number_of_events here refers to the total number of repetitions across all the aggregated conditions. For example, if the data were aggregated from two stimuli, where the first stimulus appeared twice and the second three times, number_of_events would be 5.
condition_indices is an indicator vector that should be built so that data[:, condition_indices == condition_index, :] corresponds to the binned spike counts of the condition with the given condition index. You can also retrieve the same data with the convenience method binned_aligned_spikes.get_data_for_condition(condition_index).
The condition_labels argument is optional and can be used to store the labels of the conditions. Its purpose is to help understand the nature of the conditions.
Note that the timestamps must be in ascending order and must correspond, one-to-one, to the condition indices and to the second dimension of the data; otherwise a ValueError is raised. To organize the data properly you can use the convenience method BinnedAlignedSpikes.sort_data_by_event_timestamps(data=data, event_timestamps=event_timestamps, condition_indices=condition_indices), which ensures the data is correctly sorted. It can be used like this:
sorted_data, sorted_event_timestamps, sorted_condition_indices = BinnedAlignedSpikes.sort_data_by_event_timestamps(data=data, event_timestamps=event_timestamps, condition_indices=condition_indices)
binned_aligned_spikes = BinnedAlignedSpikes(
bin_width_in_milliseconds=bin_width_in_milliseconds,
milliseconds_from_event_to_first_bin=milliseconds_from_event_to_first_bin,
data=sorted_data,
event_timestamps=sorted_event_timestamps,
condition_indices=sorted_condition_indices,
condition_labels=condition_labels
)
The same can also be achieved with the following script:
sorted_indices = np.argsort(event_timestamps)
sorted_data = data[:, sorted_indices, :]
sorted_event_timestamps = event_timestamps[sorted_indices]
sorted_condition_indices = condition_indices[sorted_indices]
Example of building a BinnedAlignedSpikes object for two conditions
To better understand how this object works, let's look at a concrete example. Suppose we have data for two different stimuli together with their associated timestamps:
import numpy as np
# Two units and 4 bins
data_for_first_stimuli = np.array(
[
# Unit 1
[
[0, 1, 2, 3], # Bin counts around the first timestamp
[4, 5, 6, 7], # Bin counts around the second timestamp
],
# Unit 2
[
[8, 9, 10, 11], # Bin counts around the first timestamp
[12, 13, 14, 15], # Bin counts around the second timestamp
],
],
)
# Also two units and 4 bins but this condition occurred three times
data_for_second_stimuli = np.array(
[
# Unit 1
[
[0, 1, 2, 3], # Bin counts around the first timestamp
[4, 5, 6, 7], # Bin counts around the second timestamp
[8, 9, 10, 11], # Bin counts around the third timestamp
],
# Unit 2
[
[12, 13, 14, 15], # Bin counts around the first timestamp
[16, 17, 18, 19], # Bin counts around the second timestamp
[20, 21, 22, 23], # Bin counts around the third timestamp
],
]
)
timestamps_first_stimuli = [5.0, 15.0]
timestamps_second_stimuli = [1.0, 10.0, 20.0]
The data for the BinnedAlignedSpikes object is then built as follows:
from ndx_binned_spikes import BinnedAlignedSpikes
bin_width_in_milliseconds = 100.0
milliseconds_from_event_to_first_bin = -50.0
data = np.concatenate([data_for_first_stimuli, data_for_second_stimuli], axis=1)
event_timestamps = np.concatenate([timestamps_first_stimuli, timestamps_second_stimuli])
condition_indices = np.concatenate([np.zeros(2), np.ones(3)])
condition_labels = ["a", "b"]
sorted_data, sorted_event_timestamps, sorted_condition_indices = BinnedAlignedSpikes.sort_data_by_event_timestamps(data=data, event_timestamps=event_timestamps, condition_indices=condition_indices)
binned_aligned_spikes = BinnedAlignedSpikes(
bin_width_in_milliseconds=bin_width_in_milliseconds,
milliseconds_from_event_to_first_bin=milliseconds_from_event_to_first_bin,
data=sorted_data,
event_timestamps=sorted_event_timestamps,
    condition_indices=sorted_condition_indices,
    condition_labels=condition_labels,
)
We can then recover the original data by calling the get_data_for_condition method:
retrieved_data_for_first_stimuli = binned_aligned_spikes.get_data_for_condition(condition_index=0)
np.testing.assert_array_equal(retrieved_data_for_first_stimuli, data_for_first_stimuli)
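The same call with condition_index=1 recovers the data for the second stimulus:
retrieved_data_for_second_stimuli = binned_aligned_spikes.get_data_for_condition(condition_index=1)
np.testing.assert_array_equal(retrieved_data_for_second_stimuli, data_for_second_stimuli)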
This extension was created using ndx-template.