push_analysis | MDFactory

Helper functions for pushing analysis data to database.

attribute __all__ = ['serialize_dataframe_to_csv', 'deserialize_csv_to_dataframe', 'prepare_analysis_record', 'prepare_artifact_record', 'prepare_overview_record', 'query_existing_hashes', 'upload_analysis_data', 'update_overview_records', 'discover_and_prepare_analysis_data', 'push_analysis', 'init_analysis_database', 'init_artifact_database', 'get_analysis_table_name', 'get_artifact_table_name', 'get_all_analysis_names', 'get_all_artifact_names', 'get_analyses_for_simulation_type', 'get_artifacts_for_simulation_type']

func get_overview_placeholder() → dict[str, Any]

Create a placeholder record for ANALYSIS_OVERVIEW schema initialization.

Returns

dict[str, typing.Any]

func get_analysis_placeholder() → dict[str, Any]

Create a placeholder record for ANALYSIS_* table schema initialization.

Note on data storage fields:

data_csv: Stores full serialized CSV data for database-only retrieval. Enables pulling complete analysis data without filesystem access.
data_path: Stores relative path to local parquet file (e.g., ".analysis/apl.parquet"). Used for filesystem-based workflows when working with local files.

Both fields serve complementary purposes: data_csv enables centralized database access while data_path references the canonical local storage location.

Returns

dict[str, typing.Any]
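
The split between data_csv and data_path can be illustrated with a small reader that prefers the in-record CSV and falls back to the local parquet file. This is only a sketch: load_analysis_frame is a hypothetical helper, not part of MDFactory, the record holds toy values, and resolving data_path against the simulation folder is an assumption.

```python
import io
from pathlib import Path

import pandas as pd


def load_analysis_frame(record: dict, sim_root: Path) -> pd.DataFrame:
    """Return the analysis data for one record (hypothetical helper)."""
    csv_text = record.get("data_csv")
    if csv_text:
        # Database-only path: the record itself carries the full data.
        return pd.read_csv(io.StringIO(csv_text))
    # Filesystem path: data_path is relative, so resolve it against the
    # simulation folder (assumed base directory).
    return pd.read_parquet(sim_root / record["data_path"])


# Toy record; the column names and values are made up for illustration.
record = {
    "data_csv": "frame,apl\n0,1.0\n1,2.0\n",
    "data_path": ".analysis/apl.parquet",
}
df = load_analysis_frame(record, Path("."))
```

Because data_csv is populated here, the parquet file is never touched, which is the database-only retrieval case described above.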

func get_artifact_placeholder() → dict[str, Any]

Create a placeholder record for ARTIFACT_* table schema initialization.

Returns

dict[str, typing.Any]

func serialize_dataframe_to_csv(df) → str

Serialize a DataFrame to a CSV string.

param df: pd.DataFrame

DataFrame to serialize

Returns

str

CSV-formatted string

func deserialize_csv_to_dataframe(csv_string) → pd.DataFrame

Deserialize a CSV string to a DataFrame.

param csv_string: str

CSV-formatted string

Returns

pandas.DataFrame

DataFrame
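
A CSV round trip of this kind can be sketched with plain pandas calls. The helper names below are illustrative stand-ins, and dropping the index with index=False is an assumption about the stored format, not confirmed behavior of serialize_dataframe_to_csv.

```python
import io

import pandas as pd


def to_csv_string(df: pd.DataFrame) -> str:
    # With no path argument, DataFrame.to_csv returns the CSV text as a str.
    # index=False is an assumption: the row index is not stored.
    return df.to_csv(index=False)


def from_csv_string(csv_string: str) -> pd.DataFrame:
    # read_csv expects a path or file-like object, so wrap the text in StringIO.
    return pd.read_csv(io.StringIO(csv_string))


original = pd.DataFrame({"frame": [0, 1, 2], "value": [0.5, 1.5, 2.5]})
csv_text = to_csv_string(original)
restored = from_csv_string(csv_text)
```

CSV carries no dtype information, so columns such as datetimes or categoricals come back as plain strings or numbers unless they are converted again after reading.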

func prepare_analysis_record(sim, analysis_name) → dict[str, Any] | None

Prepare a single analysis record for upload.

param sim: Simulation

Simulation instance

param analysis_name: str

Analysis name

Returns

dict[str, Any] | None

Record dict ready for the database, or None if the analysis has not completed

func prepare_artifact_record(sim, artifact_name) → dict[str, Any] | None

Prepare a single artifact record for upload.

param sim: Simulation

Simulation instance

param artifact_name: str

Artifact name

Returns

dict[str, Any] | None

Record dict ready for the database, or None if the artifact has not completed

func prepare_overview_record(sim, item_type, item_name, status, row_count=0, file_count=0) → dict[str, Any]

Prepare an overview record.

param sim: Simulation

Simulation instance

param item_type: str

"analysis" or "artifact"

param item_name: str

Name of the analysis or artifact

param status: str

"completed" or "not_yet_run"

param row_count: int = 0

Row count for analyses (0 if not applicable)

param file_count: int = 0

File count for artifacts (0 if not applicable)

Returns

dict

Record dict matching OVERVIEW_COLUMNS schema

func upload_analysis_data(records, table_name, force=False, diff=False) → int

Upload analysis records to database.

param records: list[dict[str, Any]]

Records to upload

param table_name: str

Table name

param force: bool = False

Delete existing records before uploading

param diff: bool = False

Skip records that already exist

Returns

int

Number of records uploaded

func update_overview_records(records, force=False, diff=False) → int

Update overview table with records.

Conflict behavior is controlled explicitly by the flags:

force=True: overwrite existing composite keys
diff=True: skip existing keys unless upgrading status (e.g. not_yet_run -> completed)
default: insert new keys and allow status upgrades, but raise on other duplicates

param records: list[dict[str, Any]]

Overview records

param force: bool = False

Overwrite existing entries

param diff: bool = False

Skip records that already exist, but allow status upgrades

Returns

int

Number of records processed
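
The three conflict modes can be modeled on an in-memory dict. This is a sketch of the documented rules, not the actual implementation: the composite key fields, the status ordering, and the choice to count only written records are all assumptions.

```python
from typing import Any

# Assumed ordering; only these two status values appear in this reference.
STATUS_RANK = {"not_yet_run": 0, "completed": 1}

# Hypothetical composite key; the real key columns are not listed here.
KEY_FIELDS = ("hash", "item_type", "item_name")


def merge_overview(
    existing: dict[tuple, dict[str, Any]],
    incoming: list[dict[str, Any]],
    force: bool = False,
    diff: bool = False,
) -> int:
    """Apply incoming records to `existing` in place; return how many were written."""
    written = 0
    for rec in incoming:
        key = tuple(rec[field] for field in KEY_FIELDS)
        old = existing.get(key)
        is_upgrade = (
            old is not None
            and STATUS_RANK[rec["status"]] > STATUS_RANK[old["status"]]
        )
        if old is None or force or is_upgrade:
            # New key, forced overwrite, or status upgrade: write the record.
            existing[key] = rec
            written += 1
        elif diff:
            # Duplicate without an upgrade: skip silently.
            continue
        else:
            # Default mode: any other duplicate is an error.
            raise ValueError(f"duplicate overview key: {key}")
    return written


table: dict[tuple, dict[str, Any]] = {}
pending = {"hash": "h1", "item_type": "analysis", "item_name": "apl", "status": "not_yet_run"}
done = dict(pending, status="completed")

inserted = merge_overview(table, [pending])           # new key
upgraded = merge_overview(table, [done])              # status upgrade, default mode
skipped = merge_overview(table, [pending], diff=True)  # existing key, no upgrade
```

In this model a status upgrade is accepted in every mode, while a downgrade or same-status duplicate is written only with force=True.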

func discover_and_prepare_analysis_data(simulations, analysis_name=None) → tuple[dict[str, list[dict]], list[dict]]

Discover analysis data from simulation folders.

param simulations: list[tuple[Path, BuildInput]]

List of (folder_path, build_input) tuples

param analysis_name: str | None = None

Specific analysis to discover, or None for all

Returns

tuple

(analysis_records_by_table, overview_records), where analysis_records_by_table maps each table_name to its list of records
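
The shape of the returned tuple can be illustrated with a small grouping routine. Everything here is a sketch under assumptions: table_name_for stands in for get_analysis_table_name with a guessed naming scheme, "thickness" is a made-up analysis name, and emitting a not_yet_run overview entry for a missing record is an assumed behavior.

```python
from collections import defaultdict
from typing import Any, Iterable, Optional


def table_name_for(analysis_name: str) -> str:
    # Assumed naming scheme; the real mapping is get_analysis_table_name().
    return f"ANALYSIS_{analysis_name.upper()}"


def group_prepared(
    prepared: Iterable[tuple[str, Optional[dict[str, Any]]]],
) -> tuple[dict[str, list[dict]], list[dict]]:
    """Group (analysis_name, record_or_None) pairs into the two return values."""
    by_table: dict[str, list[dict]] = defaultdict(list)
    overview: list[dict] = []
    for name, record in prepared:
        status = "completed" if record is not None else "not_yet_run"
        overview.append({"item_type": "analysis", "item_name": name, "status": status})
        if record is not None:
            by_table[table_name_for(name)].append(record)
    return dict(by_table), overview


records_by_table, overview_records = group_prepared(
    [("apl", {"hash": "h1"}), ("apl", {"hash": "h2"}), ("thickness", None)]
)
```

Each key of records_by_table can then be passed as table_name to an upload call, one table at a time.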

func push_analysis(source=None, csv=None, csv_root=None, analysis_name=None, force=False, diff=False) → dict[str, int]

Push analysis data to database.

param source: Path | None = None

Directory path, glob pattern, or summary YAML file (auto-detected)

param csv: Path | None = None

CSV file with build specifications

param csv_root: Path | None = None

Root directory for CSV hash search

param analysis_name: str | None = None

Specific analysis to push, or None for all

param force: bool = False

Delete existing records before uploading

param diff: bool = False

Skip records that already exist

Returns

dict

{table_name: count_uploaded}
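
How source might be auto-detected is not documented here. One plausible approach, shown purely as a sketch, is to classify the path by glob characters, file suffix, and directory status; classify_source and its return labels are hypothetical.

```python
from pathlib import Path

GLOB_CHARS = set("*?[")


def classify_source(source: Path) -> str:
    """Guess what kind of source a path refers to (hypothetical helper)."""
    text = str(source)
    if any(ch in GLOB_CHARS for ch in text):
        # Glob metacharacters take priority: the pattern itself need not exist.
        return "glob"
    if source.suffix.lower() in {".yaml", ".yml"}:
        return "summary_yaml"
    if source.is_dir():
        return "directory"
    raise ValueError(f"cannot determine source type for {text!r}")


kinds = [
    classify_source(Path("runs/*/build")),
    classify_source(Path("summary.yaml")),
    classify_source(Path(".")),
]
```

Checking for glob characters first avoids touching the filesystem for patterns, at the cost of misclassifying real paths that contain "*", "?" or "[".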

func _build_analysis_table_list() → list[tuple[str, dict, list[str]]]

Build table list for analysis database initialization.

Excludes ANALYSIS_OVERVIEW because both init_analysis_database and init_artifact_database handle it separately via _init_overview_table.

Returns

list

List of (table_name, placeholder_record, columns) tuples

func _build_artifact_table_list() → list[tuple[str, dict, list[str]]]

Build table list for artifact database initialization.

Intentionally excludes ANALYSIS_OVERVIEW so artifact-only reset cannot remove analysis overview state.

Returns

list

List of (table_name, placeholder_record, columns) tuples

func _init_overview_table(database_type) → dict[str, bool]

Ensure ANALYSIS_OVERVIEW exists for the active backend.

param database_type: str

Backend type ("sqlite", "csv", or "foundry")

Returns

dict

{table_name: was_created}

func _clear_overview_item_type(item_type) → None

Delete rows from ANALYSIS_OVERVIEW for a specific item type.

param item_type: str

Item type to clear ("analysis" or "artifact")

Returns

None
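
For the SQLite backend, a scoped delete of this kind could look like the sketch below. It uses only the standard sqlite3 module; the three-column table layout is a simplified assumption, and clear_overview_item_type is an illustrative stand-in for the private helper.

```python
import sqlite3


def clear_overview_item_type(conn: sqlite3.Connection, item_type: str) -> None:
    """Delete only the overview rows of one item type (illustrative stand-in)."""
    if item_type not in {"analysis", "artifact"}:
        raise ValueError(f"unknown item_type: {item_type!r}")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "DELETE FROM ANALYSIS_OVERVIEW WHERE item_type = ?", (item_type,)
        )


conn = sqlite3.connect(":memory:")
# Simplified schema (assumption); the real table has more columns.
conn.execute(
    "CREATE TABLE ANALYSIS_OVERVIEW (item_type TEXT, item_name TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO ANALYSIS_OVERVIEW VALUES (?, ?, ?)",
    [("analysis", "apl", "completed"), ("artifact", "snapshot", "completed")],
)
conn.commit()

clear_overview_item_type(conn, "artifact")
remaining = [row[0] for row in conn.execute("SELECT item_type FROM ANALYSIS_OVERVIEW")]
```

Filtering on item_type is what lets an artifact-only reset leave the analysis overview rows intact.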

func init_analysis_database(reset=False) → dict[str, bool]

Initialize analysis database tables.

Creates tables (SQLite) or datasets (Foundry) for all registered analysis types and the overview table. On reset, only overview rows with item_type='analysis' are cleared.

param reset: bool = False

Recreate tables even if they exist

Returns

dict

{table_name: was_created}
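
The {table_name: was_created} return value can be sketched for SQLite as follows. The helper name, the ANALYSIS_APL table, its columns, and the rule that a reset reports was_created=True are assumptions made for illustration.

```python
import sqlite3


def init_tables(
    conn: sqlite3.Connection,
    schemas: dict[str, list[str]],
    reset: bool = False,
) -> dict[str, bool]:
    """Create missing tables and report which ones were (re)created."""
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    created: dict[str, bool] = {}
    with conn:
        for table, columns in schemas.items():
            if reset and table in existing:
                # Identifiers come from a trusted schema dict, never user input.
                conn.execute(f'DROP TABLE "{table}"')
            needs_create = reset or table not in existing
            if needs_create:
                column_sql = ", ".join(f'"{col}" TEXT' for col in columns)
                conn.execute(f'CREATE TABLE "{table}" ({column_sql})')
            created[table] = needs_create
    return created


conn = sqlite3.connect(":memory:")
schemas = {"ANALYSIS_APL": ["hash", "data_csv", "data_path"]}  # hypothetical table

first_run = init_tables(conn, schemas)
second_run = init_tables(conn, schemas)
after_reset = init_tables(conn, schemas, reset=True)
```

A caller can use the returned mapping to tell a fresh initialization apart from a no-op on an already initialized database.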

func init_artifact_database(reset=False) → dict[str, bool]

Initialize artifact database tables.

Creates tables (SQLite) or datasets (Foundry) for all registered artifact types. On reset, only overview rows with item_type='artifact' are cleared.

param reset: bool = False

Recreate tables even if they exist

Returns

dict

{table_name: was_created}