NanoPET (deprecated)¶
Warning
This is a deprecated model. You should not use it for anything important, and support for it will be removed in future versions of metatrain. Please use the PET model instead.
Installation¶
To install this architecture along with the metatrain package, run:
pip install metatrain[nanopet]
where the square brackets indicate that you want to install the optional
dependencies required for nanopet.
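Note that some shells (zsh, for example) interpret the square brackets as glob patterns; if the command above fails for that reason, quoting the argument should help:
pip install "metatrain[nanopet]"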
Default Hyperparameters¶
The description of all the hyperparameters used in nanopet is provided
further down this page. However, here we provide you with a yaml file containing all
the default hyperparameters, which might be convenient as a starting point to
create your own hyperparameter files:
architecture:
  name: deprecated.nanopet
  model:
    cutoff: 5.0
    cutoff_width: 0.5
    d_pet: 128
    num_heads: 4
    num_attention_layers: 2
    num_gnn_layers: 2
    heads: {}
    zbl: false
    long_range:
      enable: false
      use_ewald: false
      smearing: 1.4
      kspace_resolution: 1.33
      interpolation_nodes: 5
  training:
    distributed: false
    distributed_port: 39591
    batch_size: 16
    num_epochs: 10000
    learning_rate: 0.0003
    scheduler_patience: 100
    scheduler_factor: 0.8
    log_interval: 10
    checkpoint_interval: 100
    atomic_baseline: {}
    scale_targets: true
    fixed_scaling_weights: {}
    per_structure_targets: []
    num_workers: null
    log_mae: false
    log_separate_blocks: false
    best_model_metric: rmse_prod
    loss: mse
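If you only want to change a few values, it is usually enough to keep a trimmed-down version of this file. The following is a minimal sketch (the numbers are arbitrary), assuming that options omitted from the file keep the defaults listed above:
architecture:
  name: deprecated.nanopet
  model:
    cutoff: 6.0        # larger cutoff than the default 5.0
    d_pet: 256         # wider network than the default 128
  training:
    batch_size: 32
    learning_rate: 0.0001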
Model hyperparameters¶
The parameters that go under the architecture.model section of the config file
are the following:
- ModelHypers.cutoff: float = 5.0¶
Cutoff radius for neighbor search.
This should be set to a value beyond which most of the interactions between atoms are expected to be negligible. A lower cutoff will lead to faster models.
- ModelHypers.d_pet: int = 128¶
Dimension of the edge features.
This hyperparameter controls the width of the neural network. In general, increasing it might lead to better accuracy, especially on larger datasets, at the cost of increased training and evaluation time.
- ModelHypers.num_attention_layers: int = 2¶
The number of attention layers in each layer of the graph neural network. Depending on the dataset, increasing this hyperparameter might lead to better accuracy, at the cost of increased training and evaluation time.
- ModelHypers.num_gnn_layers: int = 2¶
The number of graph neural network layers.
In general, decreasing this hyperparameter to 1 will lead to much faster models, at the expense of accuracy. Increasing it may or may not lead to better accuracy, depending on the dataset, at the cost of increased training and evaluation time.
- ModelHypers.heads: dict[str, Literal['linear', 'mlp']] = {}¶
The type of head (“linear” or “mlp”) to use for each target (e.g.
heads: {"energy": "linear", "mtt::dipole": "mlp"}). All omitted targets will use an MLP (multi-layer perceptron) head. MLP heads consist of two hidden layers with dimensionality d_pet. A configuration sketch using this option is shown after this list.
- ModelHypers.long_range: LongRangeHypers = {'enable': False, 'interpolation_nodes': 5, 'kspace_resolution': 1.33, 'smearing': 1.4, 'use_ewald': False}¶
Long-range Coulomb interactions parameters.
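As an illustration of the heads and long_range options above, the following sketch (using a hypothetical mtt::dipole target) selects a linear head for the energy, an MLP head for the dipole, and switches on the long-range Coulomb part while keeping the other long-range values at their defaults:
architecture:
  name: deprecated.nanopet
  model:
    cutoff: 5.0
    d_pet: 128
    heads:
      energy: linear          # linear head for the energy
      "mtt::dipole": mlp      # MLP head for a hypothetical custom target
    long_range:
      enable: true            # turn on the long-range Coulomb interactions
      use_ewald: false
      smearing: 1.4
      kspace_resolution: 1.33
      interpolation_nodes: 5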
Trainer hyperparameters¶
The parameters that go under the architecture.training section of the config file
are the following:
- TrainerHypers.batch_size: int = 16¶
The number of samples to use in each batch of training. This hyperparameter controls the tradeoff between training speed and memory usage. In general, larger batch sizes will lead to faster training, but might require more memory.
- TrainerHypers.atomic_baseline: dict[str, float | dict[int, float]] = {}¶
The baselines for each target.
By default, metatrain will fit a linear model (CompositionModel) to compute the least-squares baseline for each atomic species for each target. However, this hyperparameter allows you to provide your own baselines. The value of the hyperparameter should be a dictionary where the keys are the target names, and the values are either (1) a single baseline to be used for all atomic types, or (2) a dictionary mapping atomic types to their baselines. For example:
- atomic_baseline: {"energy": {1: -0.5, 6: -10.0}} will fix the energy baseline for hydrogen (Z=1) to -0.5 and for carbon (Z=6) to -10.0, while fitting the baselines for the energy of all other atomic types, as well as fitting the baselines for all other targets.
- atomic_baseline: {"energy": -5.0} will fix the energy baseline for all atomic types to -5.0.
- atomic_baseline: {"mtt::dos": 0.0} sets the baseline for the “mtt::dos” target to 0.0, effectively disabling the atomic baseline for that target.
This atomic baseline is subtracted from the targets during training, which spares the main model from having to learn the atomic contributions and likely makes training easier. When the model is used in evaluation mode, the atomic baseline is added on top of the model predictions automatically. A configuration sketch using this option is shown at the end of this section.
Note
This atomic baseline is a per-atom contribution. Therefore, if the property you are predicting is a sum over all atoms (e.g., total energy), the contribution of the atomic baseline to the total property will be the atomic baseline multiplied by the number of atoms of that type in the structure.
- TrainerHypers.fixed_scaling_weights: dict[str, float | dict[int, float]] = {}¶
Weights for target scaling.
This is passed to the fixed_weights argument of Scaler.train_model; see its documentation to understand exactly what to pass here.
- TrainerHypers.num_workers: int | None = None¶
Number of workers for data loading. If not provided, it is set automatically.
- TrainerHypers.best_model_metric: Literal['rmse_prod', 'mae_prod', 'loss'] = 'rmse_prod'¶
Metric used to select the best checkpoint (e.g., rmse_prod).
- TrainerHypers.loss: str | dict[str, LossSpecification | str] = 'mse'¶
This section describes the loss function to be used. See the Loss functions documentation for more details.
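To tie a few of the training options above together, here is a sketch of a trimmed training section; the numbers are purely illustrative, and it assumes that omitted options keep their defaults:
architecture:
  name: deprecated.nanopet
  training:
    batch_size: 32
    num_epochs: 5000
    atomic_baseline:
      energy:
        1: -0.5      # fixed energy baseline for hydrogen (Z=1)
        6: -10.0     # fixed energy baseline for carbon (Z=6)
    best_model_metric: mae_prod   # one of the allowed values listed above
    loss: mse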