MDFactory

FoundryDataSource

Foundry data source backed by Palantir Foundry datasets.

Attributes

attribute dataset_path
= dataset_path
attribute ctx
= FoundryContext()
attribute dataset
= None
attribute rid
= None
attribute table_exists

Functions

func dataset_exists(cls, dataset_path) -> bool

Check whether a Foundry dataset exists without auto-creating it.

param cls
param dataset_path: str

Foundry dataset path to check

Returns

bool

True if the dataset exists in Foundry

func __init__(self, dataset_path)

param self
param dataset_path: str

Returns

None

func _init_dataset(self)

Initialize the dataset on Foundry.

param self

Returns

None

func _check_dataset_exists(self) -> bool

Check whether the dataset has a schema; the dataset itself may be empty.

param self

Returns

bool

func wait_for_row_count(self, expected_rows, timeout_seconds=90, interval_seconds=2) -> bool

Wait until the dataset reports at least expected_rows rows.

param self
param expected_rows: int

Minimum number of rows to wait for

param timeout_seconds: int
= 90

Maximum time to wait in seconds

param interval_seconds: int
= 2

Polling interval in seconds

Returns

bool

True if expected row count was reached before timeout
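The polling behaviour described by these parameters can be sketched as a standalone function. This is only an illustration under stated assumptions, not the library's implementation: `get_row_count` is a hypothetical callable standing in for whatever the class uses to read the current row count from Foundry.

```python
import time
from typing import Callable


def wait_for_row_count(
    get_row_count: Callable[[], int],  # hypothetical: returns the current row count
    expected_rows: int,
    timeout_seconds: float = 90,
    interval_seconds: float = 2,
) -> bool:
    """Poll until get_row_count() reports at least expected_rows, or time out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if get_row_count() >= expected_rows:
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval_seconds, remaining))
```

Using `time.monotonic()` rather than wall-clock time keeps the timeout stable if the system clock is adjusted while waiting.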

func _empty_schema_frame(self) -> pd.DataFrame

Return an empty DataFrame with the dataset schema (if available).

param self

Returns

pandas.DataFrame

Empty DataFrame with column names from the dataset schema

func init_schema(self)

Add a schema inferred by the Foundry inference service. This is a one-time operation.

param self

Returns

None

func load_data(self) -> pd.DataFrame

Load all data from the Foundry dataset.

param self

Returns

pandas.DataFrame

func save_data(self, data, overwrite=False)

Save data to the Foundry dataset. Accepts a single dict, a list of dicts, or a DataFrame.

param self
param data: Union[Dict[str, Any], List[Dict[str, Any]], pd.DataFrame]
param overwrite: bool
= False

Returns

None
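Because save_data accepts three input shapes, a caller may find it useful to see how they could be normalized into a single DataFrame before writing. The helper below is a minimal sketch of that idea; `to_frame` and the `Records` alias are hypothetical names, and the library may handle the conversion differently.

```python
from typing import Any, Dict, List, Union

import pandas as pd

# Hypothetical alias mirroring the documented type of the `data` parameter.
Records = Union[Dict[str, Any], List[Dict[str, Any]], pd.DataFrame]


def to_frame(data: Records) -> pd.DataFrame:
    """Normalize a dict, a list of dicts, or a DataFrame into a DataFrame."""
    if isinstance(data, pd.DataFrame):
        return data.copy()
    if isinstance(data, dict):
        return pd.DataFrame([data])  # one dict becomes one row
    if isinstance(data, list):
        return pd.DataFrame(data)  # each dict becomes one row
    raise TypeError(f"Unsupported data type: {type(data).__name__}")
```

With this sketch, `to_frame({"a": 1})` and `to_frame([{"a": 1}])` both produce a one-row frame with a single column `a`.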

func query_data(self, conditions) -> pd.DataFrame

Query data based on conditions.

param self
param conditions: Dict[str, Any]

Returns

pandas.DataFrame
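The reference does not say how conditions are interpreted. One plausible reading, shown here purely as an assumption, is that each key is a column name and each value must match exactly, with all conditions combined by AND. The function name `filter_by_conditions` and the column names in the usage note are hypothetical.

```python
from typing import Any, Dict

import pandas as pd


def filter_by_conditions(df: pd.DataFrame, conditions: Dict[str, Any]) -> pd.DataFrame:
    """Return rows where every column in `conditions` equals its given value."""
    # Start with all rows selected, then narrow the mask one condition at a time.
    mask = pd.Series(True, index=df.index)
    for column, value in conditions.items():
        mask &= df[column] == value
    return df[mask]
```

For a frame with `system` and `status` columns, `filter_by_conditions(df, {"system": "a", "status": "done"})` would keep only rows matching both values; an empty dict keeps every row.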

func update_data(self, conditions, updates)

Update existing data.

param self
param conditions: Dict[str, Any]
param updates: T

Returns

None

func delete_data(self, conditions)

Delete data based on conditions.

param self
param conditions: Dict[str, Any]

Returns

None

func grab_column(self, column_name) -> pd.Series

Retrieve a specific column.

param self
param column_name: str

Returns

pandas.Series

func grab_row(self, index) -> pd.Series

Retrieve a specific row.

param self
param index: int

Returns

pandas.Series

func coalesce_parquets(self)

Combine Parquet files into a single file.

param self

Returns

None
