push | MDFactory

func find_yaml_in_folder(folder) → Path | None

Find the BuildInput YAML file in a folder, using a fixed priority order.

Priority order:

1. <hash>.yaml, where <hash> is the folder name (tried only if the folder name looks like a hash)
2. build.yaml (common convention)
3. Any other .yaml file (first in alphabetical order)

param folder: Path

Directory to search for YAML files

Returns

Path | None

Path to YAML file if found, None otherwise

func determine_simulation_status(folder) → str

Determine the simulation status from the output files present in the folder.

Status hierarchy (checked in this order):

1. "completed": prod.gro exists
2. "production": prod.xtc exists (but no prod.gro)
3. "equilibrated": min.gro AND nvt.gro AND npt.gro all exist
4. "build": valid build folder but no simulation outputs

Note

This function delegates to Simulation.status for consistency. In new code, prefer Simulation(folder).status directly.

param folder: Path

Simulation directory to check

Returns

str

Status string ("completed", "production", "equilibrated", or "build")

func load_models_from_csv(csv_path) → tuple[list[BuildInput], dict[int, str]]

Load BuildInput models from a CSV file.

param csv_path: Path

Path to CSV file

Returns

tuple

Tuple of (models, errors) where errors is a dict mapping row indices to error messages
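The (models, errors) return shape lets a caller keep the valid rows while reporting the bad ones. The sketch below shows that pattern with a toy model: `ToyModel`, its fields, and `load_rows_sketch` are hypothetical stand-ins, because the BuildInput fields are not documented here. It also assumes row indices count data rows from 0, which the documentation does not specify.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class ToyModel:
    """Hypothetical stand-in for BuildInput; the real fields are not shown here."""
    name: str
    temperature: float


def load_rows_sketch(text: str) -> tuple[list[ToyModel], dict[int, str]]:
    """Parse CSV text, collecting valid models and per-row error messages."""
    models: list[ToyModel] = []
    errors: dict[int, str] = {}
    # Assumption: indices are 0-based and count data rows, not the header.
    for index, row in enumerate(csv.DictReader(io.StringIO(text))):
        try:
            models.append(
                ToyModel(name=row["name"], temperature=float(row["temperature"]))
            )
        except (KeyError, ValueError, TypeError) as exc:
            # Record the failure and continue with the remaining rows.
            errors[index] = f"{type(exc).__name__}: {exc}"
    return models, errors
```

Collecting errors per row, rather than stopping at the first failure, means one malformed line does not block the rest of a large CSV.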

func search_folders_for_hash(hash_value, base_path=Path('.')) → Path | None

Recursively search for a folder whose name exactly matches the given hash value.

param hash_value: str

Hash string to search for (folder name)

param base_path: Path = Path('.')

Starting directory for recursive search, by default Path(".")

Returns

Path | None

Path to matching folder if found, None otherwise

func discover_simulation_folders(source=None, csv=None, csv_root=None) → list[tuple[Path, BuildInput]]

Discover and validate simulation folders from various input modes.

Exactly one of source or csv must be provided.

param source: Path = None

Directory path, glob pattern, or summary YAML file (auto-detected)

param csv: Path = None

CSV file with build specifications

param csv_root: Path = None

Root directory to search for hash folders when csv is provided

Returns

list

List of (folder_path, build_input) tuples for valid simulations
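The "exactly one of source or csv" rule can be checked up front, before any discovery work starts. The sketch below is illustrative: `pick_mode_sketch` is a hypothetical helper, and raising `ValueError` is an assumption, since the documentation does not say how the real function reports a violation.

```python
from typing import Optional


def pick_mode_sketch(source: Optional[str] = None, csv: Optional[str] = None) -> str:
    """Hypothetical guard for mutually exclusive input modes."""
    # Both None, or both set, violates the exactly-one rule.
    if (source is None) == (csv is None):
        # Assumption: the real function may signal this differently.
        raise ValueError("Provide exactly one of 'source' or 'csv'")
    return "source" if source is not None else "csv"
```

Comparing the two `is None` results catches both failure cases (neither given, both given) in a single condition.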

func prepare_upload_data(simulations) → list[dict[str, Any]]

Convert a list of simulations to database records.

param simulations: list[tuple[Path, BuildInput]]

List of (folder_path, build_input) tuples

Returns

list

List of database records ready for upload

func upload_simulations(records, db_type='RUN_DATABASE', force=False, diff=False) → int

Upload records to the database with duplicate handling.

param records: list[dict[str, Any]]

Database records to upload

param db_type: str = 'RUN_DATABASE'

Database type to upload to, by default "RUN_DATABASE"

param force: bool = False

Delete existing records before uploading, by default False

param diff: bool = False

Only upload records not already in database, by default False

Returns

int
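The diff behaviour (upload only records not already in the database) amounts to filtering by a unique key. The sketch below shows that filter in isolation. `filter_new_records` is a hypothetical helper, and using a "hash" field as the unique key is an assumption, since the record schema is not documented here.

```python
from typing import Any


def filter_new_records(
    records: list[dict[str, Any]],
    existing_keys: set[str],
    key: str = "hash",  # Assumption: records carry a unique "hash" field.
) -> list[dict[str, Any]]:
    """Keep only the records whose key is not already present in the database."""
    return [record for record in records if record[key] not in existing_keys]


# Usage: two candidate records, one of which is already stored.
candidates = [{"hash": "abc123"}, {"hash": "def456"}]
new_records = filter_new_records(candidates, existing_keys={"abc123"})
```

Here `new_records` holds only the "def456" record, so a repeated push of the same folders would upload nothing.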

func push_systems(source=None, csv=None, csv_root=None, force=False, diff=False) → int

Push simulation metadata to RUN_DATABASE.

Discovers simulation folders and uploads their metadata to the database.

param source: Path = None

Directory path, glob pattern, or summary YAML file (auto-detected)

param csv: Path = None

Input CSV file; hashes are extracted from it and the matching folders are searched for under csv_root

param csv_root: Path = None

Root directory to search for hash folders when using --csv mode

param force: bool = False

Delete existing records before uploading, by default False

param diff: bool = False

Only upload records not already in database, by default False

Returns

int

Number of records uploaded

func get_placeholder_record() → dict[str, Any]

Create a placeholder record with the correct schema, for use when initializing the database.

Returns

dict[str, typing.Any]

func init_systems_database(reset=False) → dict[str, bool]

Initialize RUN_DATABASE.

Creates the database table (SQLite) or the Foundry dataset, depending on the configuration.

param reset: bool = False

If True, drop and recreate the table or dataset even if it already exists. By default False.

Returns

dict

{table_name: was_created}
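For the SQLite case, the `{table_name: was_created}` return shape and the `reset` flag can be sketched with the standard `sqlite3` module. This is an illustration only: `init_table_sketch` is a hypothetical name, the two-column schema is invented for the example, and the Foundry path is not covered.

```python
import sqlite3


def init_table_sketch(
    conn: sqlite3.Connection, table: str = "RUN_DATABASE", reset: bool = False
) -> dict[str, bool]:
    """Hypothetical create-or-reset helper returning {table_name: was_created}."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone() is not None

    # Leave an existing table untouched unless a reset was requested.
    if exists and not reset:
        return {table: False}

    if exists:
        conn.execute(f'DROP TABLE "{table}"')
    # Assumption: a toy two-column schema; the real schema is not documented here.
    conn.execute(f'CREATE TABLE "{table}" (hash TEXT PRIMARY KEY, status TEXT)')
    conn.commit()
    return {table: True}
```

Calling it twice on the same connection returns `{"RUN_DATABASE": True}` and then `{"RUN_DATABASE": False}`; passing `reset=True` drops and recreates the table and reports `True` again.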