Python bindings and utilities for PACE and the fitting code "pacemaker"

pyace

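pyace is the Python implementation of the Atomic Cluster Expansion (ACE). It provides basic functionality for analysis, potential conversion and fitting. !!! THIS IS A LIMITED-FUNCTIONALITY VERSION OF pyace !!!

If you want the fully functional version, please contact us by email at yury.lysogorskiy@rub.de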

Installation

pip install pyace-lite

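(optional) Installation of tensorpotential

If you want to use the TensorFlow implementation of the atomic cluster expansion (made by Dr. Anton Bochkarev), please contact us by email.

(!) Known issues

If you encounter segmentation fault errors, try upgrading numpy with the following command: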

pip install --upgrade numpy --force 

Directory structure

  • lib/: contains the extra libraries for pyace
  • src/pyace/: bindings

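Utilities

Potential conversion

There are two basic formats of ACE potentials:

  1. B-basis set in YAML format, e.g. 'Al.pbe.yaml'. This is the complete internal format for developers.
  2. Ctilde-basis set in plain text format, e.g. 'Al.pbe.ace'. This format is obtained by an irreversible conversion from the B-basis set, is intended for public distribution of potentials, and is used by LAMMPS.

To convert a potential, you can use the following utility, which is installed into your executable path together with the pyace package: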

  • YAML to ace: pace_yaml2ace. Usage:
  pace_yaml2ace [-h] [-o OUTPUT] input

Pacemaker

pacemaker is a utility for fitting atomic cluster expansion potentials. Usage:

pacemaker [-h] [-o OUTPUT] [-p POTENTIAL] [-ip INITIAL_POTENTIAL]
                 [-b BACKEND] [-d DATA] [--query-data] [--prepare-data]
                 [-l LOG]
                 input

Fitting utility for atomic cluster expansion potentials

positional arguments:
  input                 input YAML file

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        output B-basis YAML file name, default:
                        output_potential.yaml
  -p POTENTIAL, --potential POTENTIAL
                        input potential YAML file name, will override input
                        file 'potential' section
  -ip INITIAL_POTENTIAL, --initial-potential INITIAL_POTENTIAL
                        initial potential YAML file name, will override input
                        file 'potential::initial_potential' section
  -b BACKEND, --backend BACKEND
                        backend evaluator, will override section
                        'backend::evaluator' from input file
  -d DATA, --data DATA  data file, will override section 'YAML:fit:filename'
                        from input file
  --query-data          query the training data from database, prepare and
                        save them
  --prepare-data        prepare and save training data only
  -l LOG, --log LOG     log filename (default log.txt)
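
When pacemaker is driven from a Python script, it can help to assemble the command line in one place. The following is a minimal sketch that uses only the flags listed in the help text above; the helper name pacemaker_command and its keyword arguments are hypothetical and are not part of pyace.

```python
import shlex


def pacemaker_command(input_yaml, output=None, potential=None,
                      initial_potential=None, backend=None, data=None,
                      log=None, query_data=False, prepare_data=False):
    """Build the argument list for a `pacemaker` call (hypothetical helper)."""
    cmd = ["pacemaker"]
    # Options that take a value, in the order used by the help text
    valued = [("-o", output), ("-p", potential), ("-ip", initial_potential),
              ("-b", backend), ("-d", data), ("-l", log)]
    for flag, value in valued:
        if value is not None:
            cmd += [flag, str(value)]
    # Boolean switches
    if query_data:
        cmd.append("--query-data")
    if prepare_data:
        cmd.append("--prepare-data")
    cmd.append(input_yaml)  # positional input YAML file goes last
    return cmd


cmd = pacemaker_command("input.yaml", backend="tensorpot", query_data=True)
print(shlex.join(cmd))  # requires Python 3.8+
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where pacemaker is installed.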

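The required settings are provided in the input YAML file. The main sections are:

1. Cutoff and (optional) metadata

  • The global cutoff for the fit is set as
cutoff: 10.0
  • Metadata (optional)

This is an arbitrary set of key (string) - value (string) pairs that will be added to the potential YAML file: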

metadata:
  info: some info
  comment: some comment
  purpose: some purpose

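In addition, the starttime and user fields will be added automatically.

2. Dataset specification section

The fitting dataset can be queried automatically from structdb (if the corresponding structdborm package is installed and the database connection is configured, see the structdb.ini file in your home directory). Alternatively, the dataset can be saved to a file as a pickled pandas dataframe with special column names: #TODO: add column names

Example: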

data: # dataset specification section
  # data configuration section
  config:
    element: Al                    # element name
    calculator: FHI-aims/PBE/tight # calculator type from `structdb` 
    # ref_energy: -1.234           # single atom reference energy
                                   # if not specified, then it will be queried from database

  # seed: 42                       # random seed for shuffling the data  
  # query_limit: 1000              # limiting number of entries to query from `structdb`
                                   # ignored if reading from cache

  # parallel: 3                    # number of parallel workers to preprocess the data, `pandarallel` package required
                                   # if not specified, serial mode will be used 
  # cache_ref_df: True             # whether to store the queried or modified dataset into file, default - True
  # filename: some.pckl.gzip       # force to read reference pickled dataframe from given file
  # ignore_weights: False          # whether to ignore energy and force weighting columns in dataframe
  # datapath: ../data              # path to folder with cache files with pickled dataframes 

Alternatively, instead of the data::config section, you can specify just the cache file containing the pickled dataframe as data::filename:

data: 
  filename: small_df_tf_atoms.pckl
  datapath: ../tests/

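An example of creating and saving a subselection of the fitting dataframe is given in notebooks/data_preprocess.ipynb

An example of generating custom energy/force weights is given in notebooks/data_custom_weights.ipynb

Querying data

You can query and preprocess the data without running the potential fit. Here is a minimal input YAML: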

# input.yaml file

cutoff: 10.0  # use larger cutoff to have excess neighbour list
data: # dataset specification section
  config:
    element: Al                    # element name
    calculator: FHI-aims/PBE/tight # calculator type from `structdb`
  seed: 42
  parallel: 3                      # parallel data processing. WARNING! higher memory usage is possible
  datapath: ../data                # path to the directory with cache files
  # query_limit: 100               # number of entries to query  

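Then run pacemaker --query-data input.yaml to query the data and build the dataset with pyace neighbour lists. To build both pyace and tensorpot neighbour lists, use pacemaker --query-data input.yaml -b tensorpot

Preparing data / building neighbour lists

You can take an existing .pckl.gzip dataset and generate all the necessary columns for it, including neighbour lists. Here is a minimal input YAML: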

# input.yaml file

cutoff: 10.

data:
  filename: my_dataset.pckl.gzip

backend:
  evaluator: tensorpot  # pyace, tensorpot

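Then run pacemaker --prepare-data input.yaml. An example of generating my_dataset.pckl.gzip from, e.g., pyiron is shown in notebooks/convert-pyiron-to-pacemaker.ipynb

3. Interatomic potential (or B-basis) configuration

The initial interatomic potential configuration can be defined as: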

potential:
  deltaSplineBins: 0.001
  element: Al
  fs_parameters: [1, 1, 1, 0.5]
  npot: FinnisSinclair
  NameOfCutoffFunction: cos

  rankmax: 3
  nradmax: [ 4, 3, 3 ]  # per-rank values of nradmax
  lmax: [ 0, 1, 1 ]     # per-rank values of lmax,  lmax=0 for first rank always!

  ndensity: 2
  rcut: 8.7
  dcut: 0.01
  radparameters: [ 5.25 ]
  radbase: ChebExpCos

  ## hard-core repulsion (optional)
  # core-repulsion: [500, 10]
  # rho_core_cut: 50
  # drho_core_cut: 20

  # basisdf:  /some/path/to/pyace_bbasisfunc_df.pckl      # path to the dataframe with "white list" of basis functions to use in fit
  # initial_potential: whatever.yaml                      # in "ladder" fitting scheme, potential from which to start fit
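
The comments above say that nradmax and lmax hold per-rank values and that lmax must be 0 for the first rank. A quick consistency check before starting a long fit could look like the sketch below. It assumes the potential section has already been loaded into a plain Python dict and that each list has exactly rankmax entries; the function check_potential_section is hypothetical and does not reproduce pacemaker's own validation.

```python
def check_potential_section(potential):
    """Return a list of problems found in a `potential` mapping (empty if none)."""
    problems = []
    rankmax = potential["rankmax"]
    # Assumption: one entry per rank in each per-rank list
    for key in ("nradmax", "lmax"):
        values = potential[key]
        if len(values) != rankmax:
            problems.append(
                f"{key} has {len(values)} entries, expected rankmax={rankmax}"
            )
    # Stated in the config comments: lmax=0 for the first rank, always
    if potential["lmax"] and potential["lmax"][0] != 0:
        problems.append("lmax for the first rank must be 0")
    return problems


potential = {"rankmax": 3, "nradmax": [4, 3, 3], "lmax": [0, 1, 1]}
print(check_potential_section(potential))
```

An empty list means that the three settings agree with each other; any other result lists the mismatches in readable form.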

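If you want to continue fitting an existing potential stored in the potential.yaml file, specify

potential: potential.yaml

Alternatively, you can use the pacemaker ... -p potential.yaml option

4. Fitting settings

An example of the fit section: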

fit:
  loss: { kappa: 0, L1_coeffs: 0,  L2_coeffs: 0,  w1_coeffs: 0, w2_coeffs: 0,
          w0_rad: 0, w1_rad: 0, w2_rad: 0 }

  weighting:
    type: EnergyBasedWeightingPolicy
    nfit: 10000
    cutoff: 10
    DElow: 1.0
    DEup: 10.0
    DE: 1.0
    DF: 1.0
    wlow: 0.75
    seed: 42

  optimizer: BFGS # L-BFGS-B # Nelder-Mead
  maxiter: 1000

  # fit_cycles: 2               # (optional) number of consecutive runs of fitting algorithm,
                                # that helps convergence
  # noise_relative_sigma: 1e-2   # applying Gaussian noise with specified relative sigma/mean ratio to all potential optimizable coefficients
  # noise_absolute_sigma: 1e-3   # applying Gaussian noise with specified absolute sigma to all potential optimizable coefficients
  # ladder_step: [10, 0.02]     # Possible values:
                                #  - integer >= 1 - number of basis functions to add in ladder scheme,
                                #  - float between 0 and 1 - relative ladder step size wrt. current basis step
                                #  - list of both above values - select maximum between two possibilities on each iteration 
                                # see the "Ladder scheme fitting" section for more info
  # ladder_type: body_order     # default
                                # Possible values:
                                # body_order  -  new basis functions are added according to the body-order, i.e., a function with higher body-order
                                #                will not be added until the list of functions of the previous body-order is exhausted
                                # power_order -  the order of adding new basis functions is defined by the "power rank" p of a function.
                                #                p = len(ns) + sum(ns) + sum(ls). Functions with the smallest p are added first  

If these are not specified, the following defaults are used: uniform weights, energy-only fit (kappa=0), fit_cycles=1 and noise_relative_sigma = 0.
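
The "power rank" formula from the ladder_type comment, p = len(ns) + sum(ns) + sum(ls), can be checked by hand with a few lines of Python. The sketch below is illustrative only: the four basis functions and their names are made up, and len(ns) is used as a stand-in for the body order of a function.

```python
def power_rank(ns, ls):
    """p = len(ns) + sum(ns) + sum(ls), as in the `power_order` comment."""
    return len(ns) + sum(ns) + sum(ls)


# Hypothetical basis functions described by radial (ns) and angular (ls) indices
functions = [
    {"name": "A", "ns": [6], "ls": [0]},              # p = 1 + 6 + 0 = 7
    {"name": "B", "ns": [1, 1], "ls": [0, 0]},        # p = 2 + 2 + 0 = 4
    {"name": "C", "ns": [1], "ls": [0]},              # p = 1 + 1 + 0 = 2
    {"name": "D", "ns": [1, 1, 1], "ls": [1, 1, 0]},  # p = 3 + 3 + 2 = 8
]

# body_order: all functions with fewer indices come first (ties keep list order)
by_body_order = sorted(functions, key=lambda f: len(f["ns"]))
# power_order: functions with the smallest p come first
by_power_order = sorted(functions, key=lambda f: power_rank(f["ns"], f["ls"]))

print([f["name"] for f in by_body_order])   # ['A', 'C', 'B', 'D']
print([f["name"] for f in by_power_order])  # ['C', 'B', 'A', 'D']
```

Note how function B, which has two indices, is added before the single-index function A under power_order, while body_order exhausts the single-index functions first.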

5. Backend specification

backend:
  evaluator: pyace  # pyace, tensorpot

  ## for `pyace` evaluator, following options are available:
  # parallel_mode: process    # process, serial  - parallelization mode for `pyace` evaluator
  # n_workers: 4              # number of parallel workers for `process` parallelization mode

  ## for `tensorpot` evaluator, following options are available:
  # batch_size: 10            # batch size for loss function evaluation, default is 10 
  # display_step: 20          # frequency of detailed metric calculation and printing  

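Alternatively, the backend can be selected on the command line as pacemaker ... -b tensorpot

Ladder scheme fitting

In the ladder scheme, the potential is fitted by adding new portions of basis functions step by step, forming a "ladder" from the initial potential to the final potential. The following settings should be added to the input YAML file:

  • Specify the shape of the final potential by providing the potential section: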
potential:
  deltaSplineBins: 0.001
  element: Al
  fs_parameters: [1, 1, 1, 0.5]
  npot: FinnisSinclair
  NameOfCutoffFunction: cos
  rankmax: 3

  nradmax: [4, 1, 1]
  lmax: [0, 1, 1]

  ndensity: 2
  rcut: 8.7
  dcut: 0.01
  radparameters: [5.25]
  radbase: ChebExpCos 
  • Specify the initial or intermediate potential by providing the initial_potential option in the potential section:
potential:

    ...

    initial_potential: some_start_or_interim_potential.yaml    # potential to start fit from

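If the initial or intermediate potential is not specified, the fit starts from an empty potential. Alternatively, you can specify the initial or intermediate potential with a command-line option: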

pacemaker ... -ip some_start_or_interim_potential.yaml

  • Specify ladder_step in the fit section:
fit:

    ...

  ladder_step: [10, 0.02]       # Possible values:
                                #  - integer >= 1 - number of basis functions to add in ladder scheme,
                                #  - float between 0 and 1 - relative ladder step size wrt. current basis step
                                #  - list of both above values - select maximum between two possibilities on each iteration 
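
The three accepted forms of ladder_step can be read as a rule for how many basis functions are added per iteration. The sketch below illustrates that rule under two assumptions that are not stated in the text: the relative value is taken as a fraction of the current number of basis functions, and fractional results are rounded up. The function name ladder_step_size is hypothetical.

```python
import math


def ladder_step_size(ladder_step, current_size):
    """Number of basis functions to add in the next ladder iteration (sketch)."""
    def single(value):
        if isinstance(value, int):
            return value  # absolute number of functions to add
        # Assumption: relative step is a fraction of the current basis size,
        # rounded up to a whole number of functions
        return math.ceil(value * current_size)

    if isinstance(ladder_step, (list, tuple)):
        # List form: take the maximum of the two possibilities
        return max(single(v) for v in ladder_step)
    return single(ladder_step)


print(ladder_step_size([10, 0.02], 100))   # 10
print(ladder_step_size([10, 0.02], 1000))  # 20
```

With ladder_step: [10, 0.02], the absolute step of 10 functions dominates while the basis is small, and the relative 2% step takes over once the basis exceeds 500 functions.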
