Utilities for extending dask-jobqueue with appropriate settings for NCAR's clusters
Project description
ncar-jobqueue
ncar-jobqueue provides utilities for configuring dask-jobqueue with appropriate default settings for NCAR's clusters.
The following compute servers are supported:
- Cheyenne (cheyenne.ucar.edu)
- Casper (DAV) (casper.ucar.edu)
- Hobart (hobart.cgd.ucar.edu)
- Izumi (izumi.unified.ucar.edu)
Installation
NCAR-jobqueue can be installed from PyPI with pip:
python -m pip install ncar-jobqueue
NCAR-jobqueue is also available from conda-forge for conda installations:
conda install -c conda-forge ncar-jobqueue
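After either install route, it can be useful to confirm which version the current Python environment sees. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name chosen for illustration and is not part of ncar-jobqueue.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "ncar-jobqueue" is the distribution name used in the install commands above.
    print(installed_version("ncar-jobqueue"))
```

A result of `None` suggests the package was installed into a different environment than the one running the interpreter.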
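Configuration
ncar-jobqueue provides a configuration file with appropriate default settings for the different clusters. This configuration file is located at ~/.config/dask/ncar-jobqueue.yaml: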
ncar-jobqueue.yaml
cheyenne:
  pbs:
    #project: XXXXXXXX
    name: dask-worker-cheyenne
    cores: 18 # Total number of cores per job
    memory: '109GB' # Total amount of memory per job
    processes: 18 # Number of Python processes per job
    interface: ib0 # Network interface to use like eth0 or ib0
    queue: regular
    walltime: '01:00:00'
    resource-spec: select=1:ncpus=36:mem=109GB
    log-directory: '/glade/scratch/${USER}/dask/cheyenne/logs'
    local-directory: '/glade/scratch/${USER}/dask/cheyenne/local-dir'
    job-extra: []
    env-extra: []
    death-timeout: 60
casper-dav:
  pbs:
    #project: XXXXXXXX
    name: dask-worker-casper-dav
    cores: 2 # Total number of cores per job
    memory: '25GB' # Total amount of memory per job
    processes: 1 # Number of Python processes per job
    interface: ib0
    walltime: '01:00:00'
    resource-spec: select=1:ncpus=1:mem=25GB
    queue: casper
    log-directory: '/glade/scratch/${USER}/dask/casper-dav/logs'
    local-directory: '/glade/scratch/${USER}/dask/casper-dav/local-dir'
    job-extra: []
    env-extra: []
    death-timeout: 60
hobart:
  pbs:
    name: dask-worker-hobart
    cores: 10 # Total number of cores per job
    memory: '96GB' # Total amount of memory per job
    processes: 10 # Number of Python processes per job
    # interface: null # ib0 doesn't seem to be working on Hobart
    queue: medium
    walltime: '08:00:00'
    resource-spec: nodes=1:ppn=48
    log-directory: '/scratch/cluster/${USER}/dask/hobart/logs'
    local-directory: '/scratch/cluster/${USER}/dask/hobart/local-dir'
    job-extra: ['-r n']
    env-extra: []
    death-timeout: 60
izumi:
  pbs:
    name: dask-worker-izumi
    cores: 10 # Total number of cores per job
    memory: '96GB' # Total amount of memory per job
    processes: 10 # Number of Python processes per job
    # interface: null # ib0 doesn't seem to be working on Hobart
    queue: medium
    walltime: '08:00:00'
    resource-spec: nodes=1:ppn=48
    log-directory: '/scratch/cluster/${USER}/dask/izumi/logs'
    local-directory: '/scratch/cluster/${USER}/dask/izumi/local-dir'
    job-extra: ['-r n']
    env-extra: []
    death-timeout: 60
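To make the `cores`, `processes`, and `memory` keys above more concrete, the sketch below works out how one batch job would be divided among its worker processes. It assumes dask-jobqueue's usual behavior of splitting a job's cores and memory evenly across `processes`; `worker_layout` and `DEFAULTS` are hypothetical names used only for illustration, and the numbers are copied from the configuration file shown above.

```python
def worker_layout(cores, processes, memory_gb):
    """Split one job's cores and memory evenly across its worker processes."""
    threads_per_worker = cores // processes
    memory_per_worker_gb = memory_gb / processes
    return threads_per_worker, memory_per_worker_gb


# Values copied from the default configuration above (memory in GB).
DEFAULTS = {
    "cheyenne": dict(cores=18, processes=18, memory_gb=109),
    "casper-dav": dict(cores=2, processes=1, memory_gb=25),
    "hobart": dict(cores=10, processes=10, memory_gb=96),
    "izumi": dict(cores=10, processes=10, memory_gb=96),
}

if __name__ == "__main__":
    for name, settings in DEFAULTS.items():
        threads, mem = worker_layout(**settings)
        print(
            f"{name}: {settings['processes']} worker(s) per job, "
            f"{threads} thread(s) and {mem:.1f} GB each"
        )
```

Under that assumption, the Casper defaults give a single two-threaded worker per job, while the other clusters give many single-threaded workers per job.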
Note
- To configure the default project account that dask-jobqueue uses when submitting batch jobs, uncomment the project key/line in ~/.config/dask/ncar-jobqueue.yaml and set it to an appropriate value.
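Editing the file by hand is all the note requires. For anyone who prefers to script the change, here is a rough sketch that works on the file's text so that the existing comments are preserved. `set_project` is a hypothetical helper, not something ncar-jobqueue provides, and `P12345678` is a made-up placeholder account. The sketch assumes cluster names sit at column 0 and the commented `#project:` line is indented beneath them; it returns the new text rather than writing to disk.

```python
import re


def set_project(yaml_text, cluster, project):
    """Uncomment the `project` key under `cluster` and set its value."""
    out = []
    current = None  # name of the top-level section currently being scanned
    for line in yaml_text.splitlines():
        top = re.match(r"^([\w-]+):\s*$", line)
        if top:
            current = top.group(1)
        commented = re.match(r"^(\s*)#\s*project:.*$", line)
        if commented and current == cluster:
            # Keep the original indentation, drop the comment marker.
            line = f"{commented.group(1)}project: {project}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "cheyenne:\n"
        "  pbs:\n"
        "    #project: XXXXXXXX\n"
        "    queue: regular\n"
    )
    # P12345678 is a placeholder, not a real project account.
    print(set_project(sample, "cheyenne", "P12345678"))
```

Only the named cluster's section is touched; other clusters keep their commented placeholder.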
Usage
Note
⚠️ See the online documentation for dask-jobqueue for further details. ⚠️
Casper
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
Cheyenne
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
Hobart
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
Izumi
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
Non-NCAR machines
On non-NCAR machines, ncar-jobqueue warns the user and falls back to distributed.LocalCluster:
>>> from ncar_jobqueue import NCARCluster
.../ncar_jobqueue/cluster.py:17: UserWarning: Unable to determine which NCAR cluster you are running on... Returning a `distributed.LocalCluster` class.
warn(message)
>>> from dask.distributed import Client
>>> cluster = NCARCluster()
>>> cluster
LocalCluster(3a7dd0f6, 'tcp://127.0.0.1:64184', workers=4, threads=8, memory=17.18 GB)
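The fallback above depends on recognizing which machine the code is running on. The sketch below shows one way such a check could look; it is an assumption about the general approach, not ncar-jobqueue's actual implementation. `identify_cluster` and `HOST_PATTERNS` are hypothetical names, and the patterns are simply taken from the hostnames in the supported-server list, so real login or compute node names may differ.

```python
import re
import socket
import warnings

# Hostname fragments taken from the supported-server list; the matching rule
# itself is an assumption made for illustration.
HOST_PATTERNS = {
    "cheyenne": r"cheyenne",
    "casper-dav": r"casper",
    "hobart": r"hobart",
    "izumi": r"izumi",
}


def identify_cluster(hostname=None):
    """Return the config section name for `hostname`, or None if unrecognized."""
    if hostname is None:
        hostname = socket.getfqdn()
    for name, pattern in HOST_PATTERNS.items():
        if re.search(pattern, hostname):
            return name
    warnings.warn("Unable to determine which NCAR cluster you are running on.")
    return None


if __name__ == "__main__":
    print(identify_cluster("casper.ucar.edu"))
    print(identify_cluster())  # whatever machine this runs on
```

A caller could treat a `None` result as the signal to use `distributed.LocalCluster`, mirroring the behavior described above.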