MDFactory

push_analysis

Helper functions for pushing analysis data to the database.

attribute __all__ = ['serialize_dataframe_to_csv', 'deserialize_csv_to_dataframe', 'prepare_analysis_record', 'prepare_artifact_record', 'prepare_overview_record', 'query_existing_hashes', 'upload_analysis_data', 'update_overview_records', 'discover_and_prepare_analysis_data', 'push_analysis', 'init_analysis_database', 'init_artifact_database', 'get_analysis_table_name', 'get_artifact_table_name', 'get_all_analysis_names', 'get_all_artifact_names', 'get_analyses_for_simulation_type', 'get_artifacts_for_simulation_type']

func get_overview_placeholder() -> dict[str, Any]

Create a placeholder record for ANALYSIS_OVERVIEW schema initialization.

Returns

dict[str, typing.Any]

func get_analysis_placeholder() -> dict[str, Any]

Create a placeholder record for ANALYSIS_* table schema initialization.

Note on data storage fields:

  • data_csv: Stores full serialized CSV data for database-only retrieval. Enables pulling complete analysis data without filesystem access.
  • data_path: Stores relative path to local parquet file (e.g., ".analysis/apl.parquet"). Used for filesystem-based workflows when working with local files.

Both fields serve complementary purposes: data_csv enables centralized database access while data_path references the canonical local storage location.
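As a rough illustration of the two storage fields, the sketch below builds a record by hand and reads the embedded CSV back without touching the filesystem. Only the field names `data_csv` and `data_path` come from the description above; the helper `rows_from_record`, the column names, and the values are hypothetical and exist only for this example.

```python
import csv
import io
from typing import Any

# Hypothetical record. Only the keys data_csv and data_path are documented;
# the CSV columns and values are made up for illustration.
record: dict[str, Any] = {
    "data_csv": "frame,apl\n0,0.61\n1,0.63\n",
    "data_path": ".analysis/apl.parquet",
}


def rows_from_record(rec: dict[str, Any]) -> list[dict[str, str]]:
    """Parse the embedded CSV payload; no local parquet file is needed."""
    return list(csv.DictReader(io.StringIO(rec["data_csv"])))


rows = rows_from_record(record)
```

A filesystem-based workflow would instead resolve `data_path` relative to the simulation folder and read the parquet file there.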

Returns

dict[str, typing.Any]

func get_artifact_placeholder() -> dict[str, Any]

Create a placeholder record for ARTIFACT_* table schema initialization.

Returns

dict[str, typing.Any]
func serialize_dataframe_to_csv(df) -> str

Serialize a DataFrame to CSV string.

param df: pd.DataFrame

DataFrame to serialize

Returns

str

CSV-formatted string

func deserialize_csv_to_dataframe(csv_string) -> pd.DataFrame

Deserialize a CSV string to DataFrame.

param csv_string: str

CSV-formatted string

Returns

pandas.DataFrame

DataFrame
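A CSV round trip of this kind can be sketched with plain pandas calls. The helper names `to_csv_string` and `from_csv_string` are stand-ins, and dropping the index with `index=False` is an assumption; the module's own functions may handle the index or dtypes differently.

```python
import io

import pandas as pd


def to_csv_string(df: pd.DataFrame) -> str:
    # Assumption: the index is not part of the payload.
    return df.to_csv(index=False)


def from_csv_string(csv_string: str) -> pd.DataFrame:
    # read_csv accepts any file-like object, so wrap the string in StringIO.
    return pd.read_csv(io.StringIO(csv_string))


# Illustrative values only.
df = pd.DataFrame({"frame": [0, 1], "apl": [0.61, 0.63]})
restored = from_csv_string(to_csv_string(df))
```

Note that CSV carries no dtype information, so columns such as datetimes or categoricals would come back as inferred types unless parsed explicitly.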

func prepare_analysis_record(sim, analysis_name) -> dict[str, Any] | None

Prepare a single analysis record for upload.

param sim: Simulation

Simulation instance

param analysis_name: str

Analysis name

Returns

dict[str, Any] | None

Record dict ready for database, or None if analysis not completed

func prepare_artifact_record(sim, artifact_name) -> dict[str, Any] | None

Prepare a single artifact record for upload.

param sim: Simulation

Simulation instance

param artifact_name: str

Artifact name

Returns

dict[str, Any] | None

Record dict ready for database, or None if artifact not completed

func prepare_overview_record(sim, item_type, item_name, status, row_count=0, file_count=0) -> dict[str, Any]

Prepare an overview record.

param sim: Simulation

Simulation instance

param item_type: str

"analysis" or "artifact"

param item_name: str

Name of the analysis or artifact

param status: str

"completed" or "not_yet_run"

param row_count: int = 0

Row count for analyses (0 if not applicable)

param file_count: int = 0

File count for artifacts (0 if not applicable)

Returns

dict

Record dict matching OVERVIEW_COLUMNS schema
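The parameter contract above can be pictured with a small stand-in. This is only a sketch: the function `overview_record_sketch`, the `hash` key, and the validation step are assumptions, since the actual OVERVIEW_COLUMNS schema and any checks performed by `prepare_overview_record` are not shown here.

```python
from typing import Any

# Allowed values as listed in the parameter descriptions.
ITEM_TYPES = {"analysis", "artifact"}
STATUSES = {"completed", "not_yet_run"}


def overview_record_sketch(
    sim_hash: str,
    item_type: str,
    item_name: str,
    status: str,
    row_count: int = 0,
    file_count: int = 0,
) -> dict[str, Any]:
    """Build an overview-style record; validation is an assumption."""
    if item_type not in ITEM_TYPES:
        raise ValueError(f"unknown item_type: {item_type!r}")
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return {
        "hash": sim_hash,  # hypothetical identifier column
        "item_type": item_type,
        "item_name": item_name,
        "status": status,
        "row_count": row_count,
        "file_count": file_count,
    }


rec = overview_record_sketch("abc123", "analysis", "apl", "completed", row_count=10)
```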

func upload_analysis_data(records, table_name, force=False, diff=False) -> int

Upload analysis records to database.

param records: list[dict[str, Any]]

Records to upload

param table_name: str

Table name

param force: bool = False

Delete existing records before uploading

param diff: bool = False

Skip records that already exist

Returns

int

Number of records uploaded

func update_overview_records(records, force=False, diff=False) -> int

Update overview table with records.

Conflict behavior is controlled explicitly by the flags:

  • force=True: overwrite existing composite keys
  • diff=True: skip existing keys unless upgrading status (e.g. not_yet_run -> completed)
  • default: insert new keys and allow status upgrades, but raise on other duplicates

param records: list[dict[str, Any]]

Overview records

param force: bool = False

Overwrite existing entries

param diff: bool = False

Skip records that already exist, but allow status upgrades

Returns

int

Number of records processed
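The three conflict modes listed for `update_overview_records` can be modelled against an in-memory dict. This is a minimal sketch, not the module's implementation: the composite key `(hash, item_type, item_name)`, the status ranking, the `ValueError`, and the choice to count only written records are all assumptions.

```python
from typing import Any

# Assumed ordering: a higher rank counts as a status upgrade.
STATUS_RANK = {"not_yet_run": 0, "completed": 1}


def key_of(rec: dict[str, Any]) -> tuple[str, str, str]:
    # Hypothetical composite key; the real key columns are not documented here.
    return (rec["hash"], rec["item_type"], rec["item_name"])


def merge_overview(
    existing: dict[tuple[str, str, str], dict[str, Any]],
    records: list[dict[str, Any]],
    force: bool = False,
    diff: bool = False,
) -> int:
    written = 0
    for rec in records:
        key = key_of(rec)
        old = existing.get(key)
        is_upgrade = (
            old is not None
            and STATUS_RANK[rec["status"]] > STATUS_RANK[old["status"]]
        )
        if old is None or force or is_upgrade:
            existing[key] = rec  # new key, forced overwrite, or upgrade
            written += 1
        elif diff:
            continue  # existing key, no upgrade: skip silently
        else:
            raise ValueError(f"duplicate overview key: {key}")
    return written


store: dict[tuple[str, str, str], dict[str, Any]] = {}
pending = {"hash": "abc123", "item_type": "analysis",
           "item_name": "apl", "status": "not_yet_run"}
done = dict(pending, status="completed")
```

In this model, pushing `pending` and then `done` succeeds in every mode, while pushing `done` a second time is skipped under `diff=True` and raises by default.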

func discover_and_prepare_analysis_data(simulations, analysis_name=None) -> tuple[dict[str, list[dict]], list[dict]]

Discover analysis data from simulation folders.

param simulations: list[tuple[Path, BuildInput]]

List of (folder_path, build_input) tuples

param analysis_name: str | None = None

Specific analysis to discover, or None for all

Returns

tuple

(analysis_records_by_table, overview_records), where analysis_records_by_table maps table_name to a list of records
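The shape of the first tuple element can be illustrated by grouping prepared records under their table names. The helper `group_by_table`, the table names, and the record contents are hypothetical; only the `table_name -> list of records` mapping is taken from the description above.

```python
from collections import defaultdict
from typing import Any, Iterable


def group_by_table(
    prepared: Iterable[tuple[str, dict[str, Any]]],
) -> dict[str, list[dict[str, Any]]]:
    """Collect (table_name, record) pairs into a table_name -> records map."""
    by_table: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for table_name, record in prepared:
        by_table[table_name].append(record)
    return dict(by_table)


# Hypothetical table names following the ANALYSIS_* pattern.
pairs = [
    ("ANALYSIS_APL", {"hash": "abc123"}),
    ("ANALYSIS_THICKNESS", {"hash": "abc123"}),
    ("ANALYSIS_APL", {"hash": "def456"}),
]
analysis_records_by_table = group_by_table(pairs)
```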

func push_analysis(source=None, csv=None, csv_root=None, analysis_name=None, force=False, diff=False) -> dict[str, int]

Push analysis data to database.

param source: Path | None = None

Directory path, glob pattern, or summary YAML file (auto-detected)

param csv: Path | None = None

CSV file with build specifications

param csv_root: Path | None = None

Root directory for CSV hash search

param analysis_name: str | None = None

Specific analysis to push, or None for all

param force: bool = False

Delete existing records before uploading

param diff: bool = False

Skip records that already exist

Returns

dict

{table_name: count_uploaded}
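How `source` might be auto-detected is not spelled out above. One plausible approach, shown purely as an assumption, is to classify the path by glob characters and file suffix. The function `classify_source` and its return labels are hypothetical and are not part of the module.

```python
from pathlib import Path

GLOB_CHARS = set("*?[")


def classify_source(source: Path) -> str:
    """Guess the kind of source from the path alone (no filesystem access)."""
    # Check for glob characters first so patterns ending in .yaml stay globs.
    if GLOB_CHARS & set(str(source)):
        return "glob"
    if source.suffix.lower() in {".yaml", ".yml"}:
        return "summary_yaml"
    return "directory"


kinds = [classify_source(Path(p)) for p in ("runs", "runs/*", "summary.yaml")]
```

A real implementation would likely also check that the path exists and that a directory is in fact a directory before proceeding.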

func _build_analysis_table_list() -> list[tuple[str, dict, list[str]]]

Build table list for analysis database initialization.

Excludes ANALYSIS_OVERVIEW because both init_analysis_database and init_artifact_database handle it separately via _init_overview_table.

Returns

list

List of (table_name, placeholder_record, columns) tuples

func _build_artifact_table_list() -> list[tuple[str, dict, list[str]]]

Build table list for artifact database initialization.

Intentionally excludes ANALYSIS_OVERVIEW so artifact-only reset cannot remove analysis overview state.

Returns

list

List of (table_name, placeholder_record, columns) tuples

func _init_overview_table(database_type) -> dict[str, bool]

Ensure ANALYSIS_OVERVIEW exists for the active backend.

param database_type: str

Backend type ("sqlite", "csv", or "foundry")

Returns

dict

{table_name: was_created}

func _clear_overview_item_type(item_type) -> None

Delete rows from ANALYSIS_OVERVIEW for a specific item type.

param item_type: str

Item type to clear ("analysis" or "artifact")

Returns

None

func init_analysis_database(reset=False) -> dict[str, bool]

Initialize analysis database tables.

Creates tables (SQLite) or datasets (Foundry) for all registered analysis types and the overview table. On reset, only overview rows with item_type='analysis' are cleared.

param reset: bool = False

Recreate tables even if they exist

Returns

dict

{table_name: was_created}
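For the SQLite case, the `{table_name: was_created}` return value and the effect of `reset` can be sketched with the standard `sqlite3` module. The function `init_tables`, the table name, and the column list are assumptions for illustration; the real function derives its schema from placeholder records and also supports other backends.

```python
import sqlite3


def init_tables(
    conn: sqlite3.Connection, table_names: list[str], reset: bool = False
) -> dict[str, bool]:
    """Create missing tables; with reset=True, drop and recreate existing ones."""
    created: dict[str, bool] = {}
    for name in table_names:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (name,),
        ).fetchone() is not None
        if exists and reset:
            conn.execute(f'DROP TABLE "{name}"')
            exists = False
        if not exists:
            # Hypothetical columns; the real schema is not shown here.
            conn.execute(
                f'CREATE TABLE "{name}" (hash TEXT, data_csv TEXT, data_path TEXT)'
            )
        created[name] = not exists
    return created


conn = sqlite3.connect(":memory:")
first = init_tables(conn, ["ANALYSIS_APL"])
second = init_tables(conn, ["ANALYSIS_APL"])
after_reset = init_tables(conn, ["ANALYSIS_APL"], reset=True)
```

In this sketch the first call reports the table as created, the second reports it as already present, and the reset call reports it as created again.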

func init_artifact_database(reset=False) -> dict[str, bool]

Initialize artifact database tables.

Creates tables (SQLite) or datasets (Foundry) for all registered artifact types. On reset, only overview rows with item_type='artifact' are cleared.

param reset: bool = False

Recreate tables even if they exist

Returns

dict

{table_name: was_created}
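The scoped clearing of overview rows on reset can be pictured as a filtered delete. The sketch below uses SQLite directly; the table name ANALYSIS_OVERVIEW and the `item_type` column come from the descriptions above, while the other columns, the sample rows, and the helper `clear_overview_item_type` are assumptions.

```python
import sqlite3


def clear_overview_item_type(conn: sqlite3.Connection, item_type: str) -> None:
    """Delete only the overview rows that belong to one item type."""
    conn.execute(
        "DELETE FROM ANALYSIS_OVERVIEW WHERE item_type = ?", (item_type,)
    )


conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema for the overview table.
conn.execute(
    "CREATE TABLE ANALYSIS_OVERVIEW (item_type TEXT, item_name TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO ANALYSIS_OVERVIEW VALUES (?, ?, ?)",
    [
        ("analysis", "apl", "completed"),
        ("artifact", "snapshot", "completed"),
    ],
)

clear_overview_item_type(conn, "artifact")
remaining = conn.execute(
    "SELECT item_type, item_name FROM ANALYSIS_OVERVIEW"
).fetchall()
```

Because the delete is filtered by `item_type`, an artifact reset leaves the analysis rows in the shared overview table untouched, and vice versa.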