raster2dggs

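Python-based CLI tool to index raster files to DGGS in parallel, writing out to Parquet.

Currently only the H3 DGGS is supported, and there are probably other limitations, since the tool was developed for a specific internal use case, though it is intended as a general-purpose abstraction. Contributions, suggestions, bug reports and strongly worded letters are all welcome.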

Example use case for raster2dggs, showing how an input raster can be indexed at different H3 resolutions, while retaining information in separate, named bands

Installation

pip install raster2dggs

Usage

raster2dggs h3 --help

Usage: raster2dggs h3 [OPTIONS] RASTER_INPUT OUTPUT_DIRECTORY

  Ingest a raster image and index it to the H3 DGGS.

  RASTER_INPUT is the path to input raster data; prepend with protocol like
  s3:// or hdfs:// for remote data. OUTPUT_DIRECTORY should be a directory,
  not a file, as it will be the write location for an Apache Parquet data
  store, with partitions equivalent to parent cells of target cells at a fixed
  offset. However, this can also be remote (use the appropriate prefix, e.g.
  s3://).

Options:
  -v, --verbosity LVL             Either CRITICAL, ERROR, WARNING, INFO or
                                  DEBUG  [default: INFO]
  -r, --resolution [0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15]
                                  H3 resolution to index  [required]
  -pr, --parent_res [0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15]
                                  H3 Parent resolution to index and aggregate
                                  to. Defaults to resolution - 6
  -u, --upscale INTEGER           Upscaling factor, used to upsample input
                                  data on the fly; useful when the raster
                                  resolution is lower than the target DGGS
                                  resolution. Default (1) applies no
                                  upscaling. The resampling method controls
                                  interpolation.  [default: 1]
  -c, --compression [snappy|gzip|zstd]
                                  Name of the compression to use when writing
                                  to Parquet.  [default: snappy]
  -t, --threads INTEGER           Number of threads to use when running in
                                  parallel. The default is determined based
                                  dynamically as the total number of available
                                  cores, minus one.  [default: 7]
  -a, --aggfunc [count|mean|sum|prod|std|var|min|max|median]
                                  Numpy aggregate function to apply when
                                  aggregating cell values after DGGS indexing,
                                  in case of multiple pixels mapping to the
                                  same DGGS cell.  [default: mean]
  -d, --decimals INTEGER          Number of decimal places to round values
                                  when aggregating. Use 0 for integer output.
                                  [default: 1]
  -o, --overwrite
  --warp_mem_limit INTEGER        Input raster may be warped to EPSG:4326 if
                                  it is not already in this CRS. This setting
                                  specifies the warp operation's memory limit
                                  in MB.  [default: 12000]
  --resampling [nearest|bilinear|cubic|cubic_spline|lanczos|average|mode|gauss|max|min|med|q1|q3|sum|rms]
                                  Input raster may be warped to EPSG:4326 if
                                  it is not already in this CRS. Or, if the
                                  upscale parameter is greater than 1, there
                                  is a need to resample. This setting
                                  specifies this resampling algorithm.
                                  [default: average]
  --version                       Show the version and exit.
  --help                          Show this message and exit.
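
The --aggfunc and --decimals options are easier to picture with a toy example. The following is a minimal sketch, not raster2dggs's actual implementation: it uses the Python standard library as a stand-in for the Numpy aggregate functions, and the function name aggregate, the AGGFUNCS table and the pixel values are all hypothetical. It groups pixel values that fall in the same DGGS cell, applies the chosen aggregate, and rounds the result.

```python
from collections import defaultdict
from statistics import fmean, median

# Stand-ins for a subset of the --aggfunc choices (assumption: the real tool
# uses Numpy aggregates; these are standard-library equivalents).
AGGFUNCS = {
    "count": len,
    "mean": fmean,
    "sum": sum,
    "min": min,
    "max": max,
    "median": median,
}


def aggregate(pixels, aggfunc="mean", decimals=1):
    """Collapse (cell_id, value) pairs to one rounded value per cell."""
    grouped = defaultdict(list)
    for cell, value in pixels:
        grouped[cell].append(value)

    func = AGGFUNCS[aggfunc]
    result = {}
    for cell, values in grouped.items():
        rounded = round(func(values), decimals)
        # Mirrors the help text: "Use 0 for integer output."
        result[cell] = int(rounded) if decimals == 0 else rounded
    return result


# Hypothetical pixels: three land in the first cell, one in the second.
pixels = [
    ("89bb0981003ffff", 9),
    ("89bb0981003ffff", 10),
    ("89bb0981003ffff", 12),
    ("89bb0981007ffff", 30),
]

print(aggregate(pixels))              # defaults: mean, 1 decimal place
print(aggregate(pixels, decimals=0))  # integer output
print(aggregate(pixels, "max", 0))
```

With the defaults (mean, one decimal place) the three pixels in the first cell collapse to a single value; with -d 0 the same value is written as an integer.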

Visualising output

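Output is in the Apache Parquet format, with one file per partition. Partitions are based on parent cell IDs, and the parent resolution is derived from the target DGGS resolution (by default, the target resolution minus 6; see --parent_res).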
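
To make the partitioning concrete, here is a minimal sketch of how a parent cell at a fixed offset can be derived from an H3 cell ID. It relies only on the published H3 index bit layout (4 resolution bits at bit 52, then 15 three-bit digits, with unused digits set to 7) and is not taken from raster2dggs itself. The helper names default_parent_res and h3_parent are hypothetical, and clamping the parent resolution at 0 is an assumption of this sketch. In practice you would call the parent function of the h3 library rather than manipulate bits directly.

```python
H3_MAX_RES = 15
# From the --parent_res help text: "Defaults to resolution - 6".
DEFAULT_PARENT_OFFSET = 6


def default_parent_res(resolution: int) -> int:
    """Parent resolution at the default offset.

    Clamping at 0 is an assumption of this sketch, not documented behaviour.
    """
    return max(0, resolution - DEFAULT_PARENT_OFFSET)


def h3_parent(cell: str, parent_res: int) -> str:
    """Return the ancestor of an H3 cell (hex string) at parent_res."""
    index = int(cell, 16)
    res = (index >> 52) & 0xF  # resolution is stored in bits 52-55
    if not 0 <= parent_res <= res:
        raise ValueError(f"parent_res must be between 0 and {res}")
    # Overwrite the resolution field with the parent resolution.
    index = (index & ~(0xF << 52)) | (parent_res << 52)
    # Mark every digit finer than parent_res as unused (all three bits set).
    index |= (1 << (3 * (H3_MAX_RES - parent_res))) - 1
    return format(index, "x")


# Cell IDs taken from the resolution 9 sample output in this README.
cells = ["89bb0981003ffff", "89bb0981007ffff", "89bb0d6eea7ffff"]
partitions = {cell: h3_parent(cell, default_parent_res(9)) for cell in cells}
for cell, parent in partitions.items():
    print(cell, "->", parent)
```

Under these assumptions, the first two cells share a resolution 3 parent and would land in the same partition, while the third falls in a different one.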

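For a quick view of the output, you can read the Apache Parquet with pandas, then use h3-pandas and geopandas to convert it to a GeoPackage for visualisation in a desktop GIS such as QGIS. The Apache Parquet output is indexed by the DGGS column, so it should be ready to join with other data prepared in the same DGGS.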

>>> import pandas as pd
>>> import h3pandas
>>> o = pd.read_parquet('./tests/data/output/9/Sen2_Test')
>>> o
band             B02  B03  B04  B05  B06  B07  B08  B8A  B11  B12
h3_09                                                            
89bb0981003ffff    9   27   16   62  175  197  228  247  102   36
89bb0981007ffff   10   30   17   66  185  212  238  261  113   40
89bb098100bffff   10   26   15   60  169  190  228  241  103   37
89bb098100fffff   11   29   17   66  181  203  243  257  109   39
89bb0981013ffff    8   26   16   58  172  199  220  244   98   34
...              ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
89bb0d6eea7ffff   10   18   15   41  106  120  140  146  102   47
89bb0d6eeabffff   12   19   15   39   95  107  125  131   84   39
89bb0d6eeafffff   12   21   17   43  101  115  134  141  111   51
89bb0d6eeb7ffff   10   20   14   45  120  137  160  165  111   48
89bb0d6eebbffff   15   28   20   56  146  166  198  202  108   47

[5656 rows x 10 columns]
>>> o.h3.h3_to_geo_boundary().to_file('~/Downloads/Sen2_Test_h3-9.gpkg', driver='GPKG')
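
Because the output is indexed by the DGGS column, joining it with other data in the same DGGS reduces to an ordinary index join. The sketch below illustrates this with two small in-memory frames rather than real Parquet output. The B02 and B03 values are copied from the sample output above, while the elevation frame and its values are hypothetical.

```python
import pandas as pd

# Two bands for the first two cells of the resolution 9 sample output.
spectral = pd.DataFrame(
    {"B02": [9, 10], "B03": [27, 30]},
    index=pd.Index(["89bb0981003ffff", "89bb0981007ffff"], name="h3_09"),
)

# Hypothetical second dataset indexed to the same DGGS and resolution.
elevation = pd.DataFrame(
    {"elevation": [12.5, 13.1]},
    index=pd.Index(["89bb0981003ffff", "89bb098100bffff"], name="h3_09"),
)

# An inner join keeps only the cells present in both datasets.
joined = spectral.join(elevation, how="inner")
print(joined)
```

An inner join is used here so that only cells covered by both datasets survive; how="left" or how="outer" would keep cells that are missing from one side, with NaN in the gaps.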

Installation

Development

In brief, to get started:

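  • Install Poetry
  • Install GDAL
    • If you are on Windows, pip install gdal may be necessary before running the subsequent commands.
    • On Linux, install GDAL 3.6+ according to your platform-specific instructions, including the development headers, i.e. libgdal-dev.
  • Create the virtual environment with poetry install. This will install the necessary dependencies.
  • Subsequently, the virtual environment can be re-activated with poetry shell.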

If you run poetry install, the CLI tool will be aliased, so you can simply use raster2dggs rather than poetry run raster2dggs, which is the alternative if you do not run poetry install.

Code formatting

Code style: black

Please run black . before committing.

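Testing

Two sample files have been uploaded to an S3 bucket with s3:GetObject public permission.

  • s3://raster2dggs-test-data/Sen2_Test.tif (sample Sentinel 2 imagery, 10 bands, rectangular, Int16, LZW compression, ~10x10 m pixels, 68.6 MB)
  • s3://raster2dggs-test-data/TestDEM.tif (sample LiDAR-derived DEM, 1 band, irregularly shaped with null data, Float32, uncompressed, 10x10 m pixels, 183.5 MB)

You may use these for testing. However, you can also test with local files, which will be faster.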

Example commands

raster2dggs h3 --resolution 11 -d 0 s3://raster2dggs-test-data/Sen2_Test.tif ./tests/data/output/11/Sen2_Test
raster2dggs h3 --resolution 13 --compression zstd --resampling nearest -a median -d 1 -u 2 s3://raster2dggs-test-data/TestDEM.tif ./tests/data/output/13/TestDEM
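
To index the same raster at several H3 resolutions, commands like those above can be generated programmatically. The following is a minimal sketch: the helper build_h3_command is hypothetical, it only assembles argument lists from the long option names listed in the --help output, and it does not run anything. The output directory layout (./tests/data/output/<resolution>/<name>) simply follows the examples in this README.

```python
import shlex


def build_h3_command(raster_input, output_directory, resolution, **options):
    """Assemble an argument list for `raster2dggs h3`.

    Keyword arguments are passed through as long options, e.g.
    decimals=0 becomes `--decimals 0`.
    """
    argv = ["raster2dggs", "h3", "--resolution", str(resolution)]
    for flag, value in options.items():
        argv += [f"--{flag}", str(value)]
    argv += [raster_input, output_directory]
    return argv


commands = [
    build_h3_command(
        "s3://raster2dggs-test-data/Sen2_Test.tif",
        f"./tests/data/output/{res}/Sen2_Test",
        res,
        decimals=0,
    )
    for res in (9, 10, 11)
]

for argv in commands:
    # Print a copy-pasteable shell command for each resolution.
    print(shlex.join(argv))
    # To execute instead (requires raster2dggs to be installed):
    # subprocess.run(argv, check=True)
```

Building an argument list rather than a single string avoids shell-quoting problems if a path ever contains spaces.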

Citation

@software{raster2dggs,
  title={{raster2dggs}},
  author={Ardo, James and Law, Richard},
  url={https://github.com/manaakiwhenua/raster2dggs},
  version={0.2.6},
  date={2023-02-09}
}

APA/Harvard

Ardo, J., & Law, R. (2023). raster2dggs (0.2.6) [Computer software]. https://github.com/manaakiwhenua/raster2dggs
