Trainset Handler

ReaxFF training set definition (TRAINSET) handler.

This module provides a handler for parsing ReaxFF TRAINSET-style files, which define reference data, weights, and targets used during force-field parameter optimization.

TRAINSET files are sectioned and heterogeneous by design, containing distinct blocks for charges, heats of formation, geometries, cell parameters, and energies.

TrainsetHandler

Bases: BaseHandler

Parser for ReaxFF training set definition files (TRAINSET).

This class parses TRAINSET files and exposes their contents as section-specific tables, one per training target category.

Parsed Data

Summary table: The main dataframe() is intentionally empty, because TRAINSET files do not have a single global tabular representation.

Section tables: Returned via metadata()["tables"] or convenience accessors, with one table per section:

- ``CHARGE``:
  Charge fitting targets, with columns:
  ["section", "iden", "weight", "atom", "lit",
   "inline_comment", "group_comment"]

- ``HEATFO``:
  Heats of formation targets, with columns:
  ["section", "iden", "weight", "lit",
   "inline_comment", "group_comment"]

- ``GEOMETRY``:
  Geometry-related targets (bond, angle, torsion, RMSG), with columns:
  ["section", "iden", "weight", "at1", "at2", "at3", "at4",
   "lit", "inline_comment", "group_comment"]
  (atom index columns are optional depending on the entry type)

- ``CELL_PARAMETERS``:
  Cell and lattice targets, with columns:
  ["section", "iden", "weight", "type", "lit",
   "inline_comment", "group_comment"]

- ``ENERGY``:
  Composite energy expressions, with dynamically generated columns:
  ["section", "weight",
   "op1", "id1", "n1",
   "op2", "id2", "n2", ...,
   "lit", "inline_comment"]
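
The dynamically generated ENERGY columns can be pictured with a small standalone sketch. It is not the handler's implementation: the helper name `energy_row` and the assumed line syntax (`weight`, then repeated `op id /n` terms, then the literature value, then an optional `!` comment) are illustrative assumptions inferred from the column names above.

```python
import re

# Hypothetical helper, not part of TrainsetHandler. It flattens one ENERGY
# entry into the op{i}/id{i}/n{i} column layout listed above.
# Assumed term syntax: a sign, an identifier, then "/" and a divisor.
TERM = re.compile(r"([+-])\s*(\S+?)\s*/\s*(\d+(?:\.\d+)?)")


def energy_row(line: str) -> dict:
    # Split off the inline comment that follows "!" (if any).
    body, _, comment = line.partition("!")
    weight, rest = body.strip().split(None, 1)

    row = {"section": "ENERGY", "weight": float(weight)}
    last_end = 0
    # One (op, id, n) column triple per term, numbered from 1.
    for i, match in enumerate(TERM.finditer(rest), start=1):
        row[f"op{i}"] = match.group(1)
        row[f"id{i}"] = match.group(2)
        row[f"n{i}"] = float(match.group(3))
        last_end = match.end()

    # Whatever follows the last term is taken as the literature value.
    row["lit"] = float(rest[last_end:].split()[0])
    row["inline_comment"] = comment.strip() or None
    return row


row = energy_row("1.5 + chexane_chair /1 - chexane_boat /1  -5.73 ! chair vs boat")
```

Because the number of terms varies per entry, rows built this way have different key sets; collecting them into one table leaves the unused `op`/`id`/`n` columns empty for shorter expressions.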

Metadata: Returned by metadata(), containing:

    {
        "sections": list[str],            # names of the sections present
        "tables": dict[str, DataFrame],   # section name → parsed table
    }
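
As a rough illustration of this metadata shape, the sketch below assembles a comparable dictionary from parsed blocks. The function `build_metadata` is hypothetical, and plain lists of row dicts stand in for the DataFrames the handler returns; with pandas, the merge step would typically be a `pd.concat` of the per-block frames.

```python
def build_metadata(blocks):
    """Hypothetical sketch: blocks is an iterable of (section_name, rows)
    pairs in file order, where rows is a list of dicts."""
    tables = {}
    for name, rows in blocks:
        # A section that appears more than once is appended to the
        # table already collected under the same name.
        tables.setdefault(name, []).extend(rows)
    return {"sections": list(tables), "tables": tables}


meta = build_metadata([
    ("CHARGE", [{"iden": "h2o", "weight": 0.1}]),
    ("ENERGY", [{"weight": 1.5}]),
    ("CHARGE", [{"iden": "nh3", "weight": 0.1}]),
])
```

Here `meta["sections"]` lists each section name once, in order of first appearance, and `meta["tables"]["CHARGE"]` holds the rows from both CHARGE blocks.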

Notes
  • Consecutive # comment lines are grouped and stored as group_comment using " /// " as a separator.
  • Inline comments following ! are preserved verbatim.
  • Sections appearing multiple times are concatenated automatically.
  • This handler is not frame-based; n_frames() always returns 0.
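
The comment rules in these notes can be sketched in a few lines. This is an assumption-laden illustration rather than the handler's code: the function `parse_lines` is hypothetical, and attaching a comment group only to the next data line is a guess, since the notes do not say how far a group extends.

```python
def parse_lines(lines):
    """Hypothetical sketch of the comment rules; not the handler's code."""
    pending = []  # consecutive '#' comment lines seen since the last data line
    rows = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Collect the comment text without its leading '#' marker.
            pending.append(line.lstrip("#").strip())
            continue
        # Everything after the first '!' is kept exactly as written.
        data, sep, inline = line.partition("!")
        rows.append({
            "data": data.strip(),
            "inline_comment": inline if sep else None,
            "group_comment": " /// ".join(pending) if pending else None,
        })
        pending = []  # assumption: a group applies to the next data line only
    return rows


rows = parse_lines([
    "# water charges",
    "# from QM",
    "h2o 1.0 1 -0.8 ! check sign",
    "h2o 1.0 2 0.4",
])
```

The first row carries both grouped comment lines joined by " /// " and the inline text after "!", while the second row has neither.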

section(name)

Return table for a given section (case-insensitive).
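
A case-insensitive lookup of this kind might work as in the sketch below. The function `lookup_section` is a hypothetical stand-in that operates on a plain dictionary of tables; raising `KeyError` for an unknown name is an assumption, as the documentation does not state what section(name) does in that case.

```python
def lookup_section(tables: dict, name: str):
    """Hypothetical stand-in for section(name): case-insensitive key lookup."""
    wanted = name.strip().upper()
    for key, table in tables.items():
        if key.upper() == wanted:
            return table
    # Assumed behaviour for a missing section; the real handler may differ.
    raise KeyError(f"no such section: {name!r}")


# Lists of row dicts stand in for the DataFrames held in metadata()["tables"].
tables = {"CHARGE": [{"iden": "h2o"}], "CELL_PARAMETERS": []}
charge = lookup_section(tables, "charge")
cell = lookup_section(tables, "Cell_Parameters")
```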