SOAP-BPNN

This is a Behler-Parrinello type neural network [1] which, instead of the original atom-centered symmetry functions, uses the Smooth Overlap of Atomic Positions (SOAP) [2] as atomic descriptors, computed with torch-spex.

Installation

To install this architecture along with the metatrain package, run:

pip install metatrain[soap_bpnn]

where the square brackets indicate that you want to install the optional dependencies required for soap_bpnn.

Default Hyperparameters

A description of all the hyperparameters used in soap_bpnn is provided further down this page. As a convenient starting point for creating your own hyperparameter files, here is a YAML file containing all the default hyperparameters:

architecture:
  name: soap_bpnn
  model:
    soap:
      max_angular: 6
      max_radial: 7
      cutoff:
        radius: 5.0
        width: 0.5
    legacy: true
    bpnn:
      num_hidden_layers: 2
      num_neurons_per_layer: 32
      layernorm: true
    add_lambda_basis: true
    heads: {}
    zbl: false
    long_range:
      enable: false
      use_ewald: false
      smearing: 1.4
      kspace_resolution: 1.33
      interpolation_nodes: 5
  training:
    distributed: false
    distributed_port: 39591
    batch_size: 8
    num_epochs: 100
    warmup_fraction: 0.01
    learning_rate: 0.001
    log_interval: 5
    checkpoint_interval: 25
    atomic_baseline: {}
    scale_targets: true
    fixed_scaling_weights: {}
    per_structure_targets: []
    num_workers: null
    log_mae: false
    log_separate_blocks: false
    best_model_metric: rmse_prod
    loss: mse

Model hyperparameters

The parameters that go under the architecture.model section of the config file are the following:

ModelHypers.soap: SOAPConfig = {'cutoff': {'radius': 5.0, 'width': 0.5}, 'max_angular': 6, 'max_radial': 7}

Configuration of the SOAP descriptors.

ModelHypers.legacy: bool = True

If true, uses the legacy implementation without chemical embedding and with one MLP head per atomic species.

ModelHypers.bpnn: BPNNConfig = {'layernorm': True, 'num_hidden_layers': 2, 'num_neurons_per_layer': 32}

Configuration of the neural network architecture.

ModelHypers.add_lambda_basis: bool = True

Whether to add a spherical expansion term of the same angular order as the targets when they are tensorial.

ModelHypers.heads: dict[str, Literal['mlp', 'linear']] = {}

The type of head (“linear” or “mlp”) to use for each target (e.g. heads: {“energy”: “linear”, “mtt::dipole”: “mlp”}). All omitted targets will use an MLP (multi-layer perceptron) head. MLP heads consist of one hidden layer with as many neurons as the SOAP-BPNN layers (see BPNNConfig.num_neurons_per_layer).
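
For instance, a configuration selecting a linear head for the energy and an MLP head for a dipole target (the target names are only illustrative) could look as follows:

architecture:
  model:
    heads:
      energy: linear
      "mtt::dipole": mlp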

ModelHypers.zbl: bool = False

Whether to use the ZBL short-range repulsion as the baseline for the model. May be needed to achieve a better description of the close-contact, repulsive regime.

ModelHypers.long_range: LongRangeHypers = {'enable': False, 'interpolation_nodes': 5, 'kspace_resolution': 1.33, 'smearing': 1.4, 'use_ewald': False}

Parameters related to long-range interactions.

May be needed to describe important long-range effects not captured by the short-range SOAP-BPNN model.
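
As a sketch, enabling the long-range component with the default values shown above would look like this (only the enable flag differs from the defaults):

architecture:
  model:
    long_range:
      enable: true
      use_ewald: false
      smearing: 1.4
      kspace_resolution: 1.33
      interpolation_nodes: 5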

The following definitions are needed to fully understand some of the parameters above:

class metatrain.soap_bpnn.documentation.SOAPConfig[source]

Configuration for the SOAP descriptors.

max_angular: int = 6

Maximum angular momentum channel of the spherical harmonics used when computing the SOAP descriptors.

max_radial: int = 7

Maximum number of radial channels of the radial basis used when computing the SOAP descriptors.

cutoff: SOAPCutoffConfig = {'radius': 5.0, 'width': 0.5}

Determines the cutoff routine of the atomic environment.

class metatrain.soap_bpnn.documentation.SOAPCutoffConfig[source]

Cutoff configuration for the SOAP descriptor.

radius: float = 5.0

Should be set to a value beyond which most of the interatomic interactions are expected to be negligible. Note that this value is defined in the position units of your dataset.

width: float = 0.5

The radial cutoff of the atomic environments is applied smoothly over an additional distance defined by this parameter.
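
As an illustration (the numerical values are arbitrary and not recommendations), a SOAP section with a larger and more smoothly truncated environment could read:

architecture:
  model:
    soap:
      max_angular: 6
      max_radial: 7
      cutoff:
        radius: 6.0
        width: 1.0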

class metatrain.soap_bpnn.documentation.BPNNConfig[source]

Configuration for the BPNN architecture.

num_hidden_layers: int = 2

Controls the depth of the neural network. Increasing this generally leads to better accuracy thanks to the increased expressive power, but comes at the cost of increased training and evaluation time.

num_neurons_per_layer: int = 32

Controls the width of the neural network. Increasing this generally leads to better accuracy thanks to the increased expressive power, but comes at the cost of increased training and evaluation time.

layernorm: bool = True

Whether to use layer normalization before the neural network. Setting this hyperparameter to false will lead to slower convergence of training, but might lead to better generalization outside of the training set distribution.
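
For example, a deeper and wider network without layer normalization could be requested as follows (the values are purely illustrative):

architecture:
  model:
    bpnn:
      num_hidden_layers: 3
      num_neurons_per_layer: 64
      layernorm: false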

Trainer hyperparameters

The parameters that go under the architecture.training section of the config file are the following:

TrainerHypers.distributed: bool = False

Whether to use distributed training.

TrainerHypers.distributed_port: int = 39591

Port for distributed communication among processes.
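
For example, distributed training on a custom port could be enabled as follows (the port number here is arbitrary):

architecture:
  training:
    distributed: true
    distributed_port: 12345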

TrainerHypers.batch_size: int = 8

The number of samples to use in each batch of training. This hyperparameter controls the tradeoff between training speed and memory usage. In general, larger batch sizes will lead to faster training, but might require more memory.

TrainerHypers.num_epochs: int = 100

Number of epochs.

TrainerHypers.warmup_fraction: float = 0.01

Fraction of training steps used for learning rate warmup.

TrainerHypers.learning_rate: float = 0.001

Learning rate.

TrainerHypers.log_interval: int = 5

Interval to log metrics.

TrainerHypers.checkpoint_interval: int = 25

Interval to save checkpoints.

TrainerHypers.atomic_baseline: dict[str, float | dict[int, float]] = {}

The baselines for each target.

By default, metatrain will fit a linear model (CompositionModel) to compute the least squares baseline for each atomic species for each target.

However, this hyperparameter allows you to provide your own baselines. The value of the hyperparameter should be a dictionary where the keys are the target names, and the values are either (1) a single baseline to be used for all atomic types, or (2) a dictionary mapping atomic types to their baselines. For example:

  • atomic_baseline: {"energy": {1: -0.5, 6: -10.0}} will fix the energy baseline for hydrogen (Z=1) to -0.5 and for carbon (Z=6) to -10.0, while fitting the baselines for the energy of all other atomic types, as well as fitting the baselines for all other targets.

  • atomic_baseline: {"energy": -5.0} will fix the energy baseline for all atomic types to -5.0.

  • atomic_baseline: {"mtt::dos": 0.0} sets the baseline for the “mtt::dos” target to 0.0, effectively disabling the atomic baseline for that target.

This atomic baseline is subtracted from the targets during training, which avoids the need for the main model to learn atomic contributions and likely makes training easier. When the model is used in evaluation mode, the atomic baseline is added on top of the model predictions automatically.
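
In a hyperparameter file, the first of the examples above would be written as:

architecture:
  training:
    atomic_baseline:
      energy:
        1: -0.5
        6: -10.0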

Note

This atomic baseline is a per-atom contribution. Therefore, if the property you are predicting is a sum over all atoms (e.g., total energy), the contribution of the atomic baseline to the total property will be the atomic baseline multiplied by the number of atoms of that type in the structure.
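
For example, with an energy baseline of -0.5 for hydrogen (Z=1), a structure containing 10 hydrogen atoms receives a baseline contribution of 10 × (-0.5) = -5.0 to its total energy, on top of the contributions of the other atomic types present.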

TrainerHypers.scale_targets: bool = True

Normalize targets to unit standard deviation during training.

TrainerHypers.fixed_scaling_weights: dict[str, float | dict[int, float]] = {}

Weights for target scaling.

This is passed to the fixed_weights argument of Scaler.train_model; see its documentation to understand exactly what to pass here.
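
As a structural sketch only (the exact semantics are defined by the fixed_weights argument of Scaler.train_model, and the value here is arbitrary), a fixed per-target scaling weight could be specified as:

architecture:
  training:
    fixed_scaling_weights:
      energy: 2.0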

TrainerHypers.per_structure_targets: list[str] = []

Targets to calculate per-structure losses.

TrainerHypers.num_workers: int | None = None

Number of workers for data loading. If not provided, it is set automatically.

TrainerHypers.log_mae: bool = False

Log MAE alongside RMSE.

TrainerHypers.log_separate_blocks: bool = False

Log per-block error.

TrainerHypers.best_model_metric: Literal['rmse_prod', 'mae_prod', 'loss'] = 'rmse_prod'

Metric used to select the best checkpoint (e.g., rmse_prod).

TrainerHypers.loss: str | dict[str, LossSpecification | str] = 'mse'

The loss function to be used. See the Loss functions section for more details.
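
Based on the declared type, the loss can be given either as a single string applied to all targets or as a dictionary keyed by target name; refer to the Loss functions documentation for all accepted specifications. A minimal sketch:

architecture:
  training:
    # a single loss applied to all targets
    loss: mse
    # or, per target (sketch):
    # loss:
    #   energy: mse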

References