xarray Datasets from CASA Tables

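Constructs xarray Datasets from CASA Tables via python-casacore. The Variables in each Dataset are dask arrays backed by deferred calls to casacore.tables.table.getcol.

Supports writing Variables back to the corresponding column in the Table.

This package is intended to support the Measurement Set as a data source and sink for parallel, distributed radio astronomy algorithms.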
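The idea of a deferred getcol call can be illustrated without any dependencies. The sketch below is not how dask-ms is implemented: FakeTable is a hypothetical stand-in for casacore.tables.table, deferred_column is an invented helper, and plain callables take the place of dask tasks. It only shows that building the per-chunk reads touches no data until a chunk is actually requested.

```python
from functools import partial


class FakeTable:
    """Hypothetical stand-in for casacore.tables.table, backed by a dict of lists."""

    def __init__(self, columns):
        self._columns = columns
        self.reads = []  # log of (column, startrow, nrow) requests

    def nrows(self):
        return len(next(iter(self._columns.values())))

    def getcol(self, columnname, startrow=0, nrow=-1):
        # Same leading arguments as python-casacore's getcol.
        self.reads.append((columnname, startrow, nrow))
        stop = None if nrow < 0 else startrow + nrow
        return self._columns[columnname][startrow:stop]


def deferred_column(table, column, row_chunk):
    """Return one zero-argument callable per row chunk; nothing is read yet."""
    nrows = table.nrows()
    return [
        partial(table.getcol, column, start, min(row_chunk, nrows - start))
        for start in range(0, nrows, row_chunk)
    ]


table = FakeTable({"ANTENNA1": [0, 0, 0, 1, 1, 2]})
tasks = deferred_column(table, "ANTENNA1", row_chunk=4)
assert table.reads == []  # building the tasks touched no data

first_chunk = tasks[0]()  # only now is getcol called, for rows 0 to 3
print(first_chunk)
print(table.reads)
```

In dask-ms the deferred reads are dask tasks rather than bare callables, so they run only when a result is computed and can be scheduled in parallel.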

Installation

To install with xarray support:

$ pip install dask-ms[xarray]

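Without xarray, dask-ms itself provides a similar Dataset with reduced functionality. Expert users may wish to use this option to reduce Python package dependencies.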

$ pip install dask-ms

Documentation

https://dask-ms.readthedocs.io

Gitter Page

https://gitter.im/dask-ms/community

Example Usage

  import dask.array as da
  from daskms import xds_from_table, xds_to_table

  # Create xarray datasets from Measurement Set "WSRT.MS"
  ds = xds_from_table("WSRT.MS")
  # Set the flag Variable on the first Dataset to its inverse
  ds[0]['flag'] = (ds[0].flag.dims, da.logical_not(ds[0].flag))
  # Write the flag column back to the Measurement Set
  xds_to_table(ds, "WSRT.MS", "FLAG").compute()

  print(ds)

[<xarray.Dataset>
 Dimensions:         (chan: 64, corr: 4, row: 6552, uvw: 3)
 Coordinates:
     ROWID           (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
 Dimensions without coordinates: chan, corr, row, uvw
 Data variables:
     IMAGING_WEIGHT  (row, chan) float32 dask.array<shape=(6552, 64), chunksize=(6552, 64)>
     ANTENNA1        (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     STATE_ID        (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     EXPOSURE        (row) float64 dask.array<shape=(6552,), chunksize=(6552,)>
     MODEL_DATA      (row, chan, corr) complex64 dask.array<shape=(6552, 64, 4), chunksize=(6552, 64, 4)>
     FLAG_ROW        (row) bool dask.array<shape=(6552,), chunksize=(6552,)>
     CORRECTED_DATA  (row, chan, corr) complex64 dask.array<shape=(6552, 64, 4), chunksize=(6552, 64, 4)>
     PROCESSOR_ID    (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     WEIGHT          (row, corr) float32 dask.array<shape=(6552, 4), chunksize=(6552, 4)>
     FLAG            (row, chan, corr) bool dask.array<shape=(6552, 64, 4), chunksize=(6552, 64, 4)>
     TIME            (row) float64 dask.array<shape=(6552,), chunksize=(6552,)>
     SIGMA           (row, corr) float32 dask.array<shape=(6552, 4), chunksize=(6552, 4)>
     SCAN_NUMBER     (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     INTERVAL        (row) float64 dask.array<shape=(6552,), chunksize=(6552,)>
     OBSERVATION_ID  (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     TIME_CENTROID   (row) float64 dask.array<shape=(6552,), chunksize=(6552,)>
     ARRAY_ID        (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     ANTENNA2        (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     DATA            (row, chan, corr) complex64 dask.array<shape=(6552, 64, 4), chunksize=(6552, 64, 4)>
     FEED1           (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     FEED2           (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     UVW             (row, uvw) float64 dask.array<shape=(6552, 3), chunksize=(6552, 3)>
 Attributes:
     FIELD_ID:      0
     DATA_DESC_ID:  0]

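Limitations

  1. Many Measurement Set columns are defined as variably shaped, but the actual data is fixed. dask-ms infers the shape of the data from the first row, and every other row must have the same shape. This can be a problem when, for example, the Measurement Set contains multiple Spectral Windows with differing numbers of channels per SPW.

    dask-ms works around this by partitioning the Measurement Set into multiple datasets. The shape of each partition is inferred from its first row. Thus, when multiple Spectral Windows are present, the Measurement Set can be partitioned by DATA_DESC_ID to create one dataset per Spectral Window.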
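The partitioning workaround can be sketched in plain Python. This is an illustration under stated assumptions rather than dask-ms's own code: the row data is made up, and partition_rows and infer_partition_shapes are hypothetical helpers. Rows are grouped by DATA_DESC_ID, and each group's shape is taken from its first row and checked against the remaining rows.

```python
from collections import defaultdict

# Hypothetical rows: (DATA_DESC_ID, shape of that row's DATA cell as (chan, corr)).
rows = [
    (0, (64, 4)),
    (1, (128, 4)),
    (0, (64, 4)),
    (1, (128, 4)),
    (0, (64, 4)),
]


def partition_rows(rows):
    """Group row indices by DATA_DESC_ID, preserving row order within each group."""
    groups = defaultdict(list)
    for row_id, (ddid, _shape) in enumerate(rows):
        groups[ddid].append(row_id)
    return dict(groups)


def infer_partition_shapes(rows, groups):
    """Take each partition's cell shape from its first row and check the rest agree."""
    shapes = {}
    for key, row_ids in groups.items():
        first_shape = rows[row_ids[0]][1]
        if any(rows[r][1] != first_shape for r in row_ids):
            raise ValueError(f"inconsistent row shapes in partition {key!r}")
        shapes[key] = (len(row_ids),) + first_shape
    return shapes


groups = partition_rows(rows)
shapes = infer_partition_shapes(rows, groups)
print(groups)
print(shapes)
```

Treating all five rows as a single partition would raise ValueError here, which mirrors the limitation described above. In dask-ms itself, grouping is requested through the group_cols argument of xds_from_table, for example group_cols=["DATA_DESC_ID"]; the documentation has the details.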
