MDFactory

Quick Start

Build a simple mixed-box system with the current CLI

This guide walks through building a simple mixed-box system (water and ethanol) with the mdfactory CLI, from initial configuration to GROMACS-ready files.

Set up configuration

Run the interactive configuration wizard:

mdfactory config init

This creates a config file at ~/.config/mdfactory/config.ini (or the platform-appropriate location) and sets up the database backend and data directories.

You can verify the configuration afterwards:

mdfactory config show

Define your system

Create a file called system.yaml:

engine: gromacs
simulation_type: mixedbox
parametrization: smirnoff
system:
  species:
    - smiles: "O"
      resname: SOL
      count: 900
    - smiles: "CCO"
      resname: ETH
      count: 100
  target_density: 1.0

This defines a mixed box of 900 water molecules (SMILES O, residue name SOL) and 100 ethanol molecules (SMILES CCO, residue name ETH), packed to the given target density.
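
As a rough sanity check on this composition, the box size implied by the counts and the target density can be estimated by hand. The sketch below is not part of MDFactory. It assumes that target_density is given in g/cm^3 (the YAML does not state a unit) and uses standard molar masses for water and ethanol.

```python
# Back-of-the-envelope box size for the system.yaml above.
# Assumption: target_density is in g/cm^3 (the unit is not stated in the YAML).

AVOGADRO = 6.02214076e23  # molecules per mole

# resname -> (molecule count, molar mass in g/mol)
species = {
    "SOL": (900, 18.015),  # water
    "ETH": (100, 46.069),  # ethanol
}
target_density = 1.0  # g/cm^3 (assumed unit)

# Total mass of all molecules in grams.
total_mass_g = sum(count * mass for count, mass in species.values()) / AVOGADRO

# Volume that mass occupies at the target density; 1 cm^3 = 1e21 nm^3.
volume_nm3 = total_mass_g / target_density * 1e21

# Edge length of a cubic box with that volume.
edge_nm = volume_nm3 ** (1.0 / 3.0)

print(f"volume ~ {volume_nm3:.1f} nm^3, cubic edge ~ {edge_nm:.2f} nm")
```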

Prepare bulk input (optional)

If you have multiple systems defined in a CSV file, convert them to individual YAML files:

mdfactory prepare-build sample_input.csv output_systems

Each row in the CSV becomes a separate hash directory containing its own YAML file. To validate the CSV before building, run mdfactory check-csv sample_input.csv.
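
The validate-then-convert order can be scripted. The sketch below wraps the two commands from this section with Python's subprocess module. The helper names are hypothetical, and the sketch assumes that mdfactory check-csv exits with a non-zero status when the CSV is invalid, which this page does not confirm.

```python
import shutil
import subprocess


def bulk_prepare_commands(csv_path, out_dir):
    """Return the two CLI invocations from this section, validation first."""
    return [
        ["mdfactory", "check-csv", csv_path],
        ["mdfactory", "prepare-build", csv_path, out_dir],
    ]


def run_bulk_prepare(csv_path, out_dir):
    """Run the commands in order and stop at the first failure.

    Assumption: check-csv signals an invalid CSV via a non-zero exit code.
    """
    if shutil.which("mdfactory") is None:
        raise RuntimeError("mdfactory is not on PATH")
    for cmd in bulk_prepare_commands(csv_path, out_dir):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

Calling run_bulk_prepare("sample_input.csv", "output_systems") would then mirror the manual steps above.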

Build the simulation

Generate simulation files from the YAML:

mkdir -p simulation_dir
mdfactory build system.yaml simulation_dir

This writes GROMACS-ready build output into simulation_dir.

Run with Nextflow (optional)

For high-throughput runs, use the checked-in workflows:

Build many systems from CSV:

nextflow run workflows/build.nf \
  --csv_file sample_input.csv \
  --output_dir output_systems

Run the GROMACS chain using the generated summary YAML:

nextflow run workflows/simulate.nf \
  -c workflows/simulate.config \
  --base_dir output_systems \
  --config_yaml output_systems/sample_input.yaml
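
When scripting both workflows, the two command lines can be assembled from the CSV path and the output directory. The sketch below only builds the argument lists and does not run Nextflow. The function name is hypothetical, and it assumes that the summary YAML is named after the CSV file stem inside the output directory, as in the example above.

```python
from pathlib import Path


def nextflow_commands(csv_file, output_dir):
    """Build the argument lists for the build and simulate workflows."""
    csv = Path(csv_file)
    out = Path(output_dir)
    # Assumption: the summary YAML is <output_dir>/<csv stem>.yaml,
    # inferred from the example (sample_input.csv -> sample_input.yaml).
    summary_yaml = out / f"{csv.stem}.yaml"

    build = [
        "nextflow", "run", "workflows/build.nf",
        "--csv_file", csv.as_posix(),
        "--output_dir", out.as_posix(),
    ]
    simulate = [
        "nextflow", "run", "workflows/simulate.nf",
        "-c", "workflows/simulate.config",
        "--base_dir", out.as_posix(),
        "--config_yaml", summary_yaml.as_posix(),
    ]
    return build, simulate


build_cmd, simulate_cmd = nextflow_commands("sample_input.csv", "output_systems")
print(" ".join(build_cmd))
print(" ".join(simulate_cmd))
```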

Cluster-specific configuration

The shipped simulate.config contains SLURM settings tuned for a specific cluster. Edit this file to match your HPC environment before running. See Running on HPC Clusters for details.

Common build outputs

After mdfactory build, the output directory typically contains:

  • system.pdb
  • topology.top
  • em.mdp
  • nvt.mdp
  • npt.mdp
  • md.mdp

The first run also fills the configured parameter store so later builds can reuse previously parametrized molecules.
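
The file list above can double as a quick post-build checklist. The sketch below reports which of those files are missing from a build directory. The function name is hypothetical, and because the list is described as typical, a missing file is a prompt to look closer, not proof of a failed build.

```python
from pathlib import Path

# File names taken from the list above; the page says "typically",
# so treat this as a checklist rather than a guarantee.
EXPECTED_OUTPUTS = (
    "system.pdb",
    "topology.top",
    "em.mdp",
    "nvt.mdp",
    "npt.mdp",
    "md.mdp",
)


def missing_outputs(build_dir):
    """Return the expected file names that are absent from build_dir."""
    root = Path(build_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]
```

For example, missing_outputs("simulation_dir") returns an empty list when all six files are present.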
