FoundryDataSource
Foundry data source backed by Palantir Foundry datasets.
Attributes
attribute dataset_path = dataset_path
attribute ctx = FoundryContext()
attribute dataset = None
attribute rid = None
attribute table_exists
Functions
func dataset_exists(cls, dataset_path) → bool
Check if a Foundry dataset exists without auto-creating.
param cls
param dataset_path str
Foundry dataset path to check
Returns
bool
True if the dataset exists in Foundry
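The phrase "without auto-creating" suggests that constructing a data source may create a missing dataset as a side effect, while dataset_exists only looks. The sketch below illustrates that distinction with a hypothetical in-memory stand-in. InMemoryDataSource, its _datasets registry, and the example path are all assumptions made for illustration; none of this is the real FoundryDataSource implementation.

```python
class InMemoryDataSource:
    """Hypothetical stand-in; not the real FoundryDataSource."""

    # Maps dataset path -> list of rows, simulating remote datasets.
    _datasets: dict = {}

    def __init__(self, dataset_path: str) -> None:
        self.dataset_path = dataset_path
        # Assumption: constructing a source creates the dataset if it is missing.
        self._datasets.setdefault(dataset_path, [])

    @classmethod
    def dataset_exists(cls, dataset_path: str) -> bool:
        # Pure lookup: never creates anything as a side effect.
        return dataset_path in cls._datasets


path = "/hypothetical/project/events"
before = InMemoryDataSource.dataset_exists(path)  # nothing has been created yet
InMemoryDataSource(path)                          # the constructor creates it
after = InMemoryDataSource.dataset_exists(path)
```

Because the check is a class method, callers can test for a dataset before deciding whether to instantiate anything.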
func __init__(self, dataset_path)
param self
param dataset_path str
Returns
None
func _init_dataset(self)
Initialize dataset on Foundry.
param self
Returns
None
func _check_dataset_exists(self) → bool
Check if dataset has schema (may be empty).
param self
Returns
bool
func wait_for_row_count(self, expected_rows, timeout_seconds=90, interval_seconds=2) → bool
Wait for dataset to report at least expected_rows.
param self
param expected_rows int
Minimum number of rows to wait for
param timeout_seconds int = 90
Maximum time to wait in seconds
param interval_seconds int = 2
Polling interval in seconds
Returns
bool
True if expected row count was reached before timeout
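The polling behavior described for wait_for_row_count can be sketched roughly as follows. This is a minimal illustration built only from the documented parameters, not the library's implementation: get_row_count is a hypothetical callable standing in for whatever Foundry call reports the current row count, and the loop structure is an assumption.

```python
import time
from typing import Callable


def wait_for_row_count(
    get_row_count: Callable[[], int],  # hypothetical stand-in for the row-count lookup
    expected_rows: int,
    timeout_seconds: float = 90,
    interval_seconds: float = 2,
) -> bool:
    """Poll until get_row_count() reports at least expected_rows, or time out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if get_row_count() >= expected_rows:
            return True
        # Give up once the deadline has passed.
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval_seconds, remaining))


# Usage with a fake counter that grows by 5 rows on each poll.
counts = iter([0, 5, 10, 15])
reached = wait_for_row_count(lambda: next(counts), expected_rows=10,
                             timeout_seconds=1, interval_seconds=0.01)

# A counter that never grows makes the call time out.
timed_out = wait_for_row_count(lambda: 0, expected_rows=1,
                               timeout_seconds=0.05, interval_seconds=0.01)
```

Using time.monotonic rather than wall-clock time keeps the timeout unaffected by system clock adjustments.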
func _empty_schema_frame(self) → pd.DataFrame
Return an empty DataFrame with the dataset schema (if available).
param self
Returns
pandas.DataFrame
Empty DataFrame with column names from the dataset schema
func init_schema(self)
Add a schema based on the Foundry inference service. Intended as a one-time operation.
param self
Returns
None
func load_data(self) → pd.DataFrame
Load all data from the Foundry dataset.
param self
Returns
pandas.DataFrame
func save_data(self, data, overwrite=False)
Save data to the Foundry dataset. Accepts a single dict, a list of dicts, or a DataFrame.
param self
param data Union[Dict[str, Any], List[Dict[str, Any]], pd.DataFrame]
param overwrite bool = False
Returns
None
func query_data(self, conditions) → pd.DataFrame
Query data based on conditions.
param self
param conditions Dict[str, Any]
Returns
pandas.DataFrame
func update_data(self, conditions, updates)
Update existing data.
param self
param conditions Dict[str, Any]
param updates T
Returns
None
func delete_data(self, conditions)
Delete data based on conditions.
param self
param conditions Dict[str, Any]
Returns
None
func grab_column(self, column_name) → pd.Series
Retrieve a specific column.
param self
param column_name str
Returns
pandas.Series
func grab_row(self, index) → pd.Series
Retrieve a specific row.
param self
param index int
Returns
pandas.Series
func coalesce_parquets(self)
Combine Parquet files into one file.
param self
Returns
None
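This reference does not say how the conditions argument of query_data, update_data, and delete_data is interpreted. One plausible reading, sketched below, is that every key in conditions must equal the corresponding column value (a logical AND of equality checks). That interpretation is an assumption, as are the helper names matches, query_rows, update_rows, and delete_rows; plain dicts stand in for DataFrame rows so the example has no dependencies.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def matches(row: Row, conditions: Dict[str, Any]) -> bool:
    # Assumption: a row matches when every condition key equals the row's value.
    return all(key in row and row[key] == value for key, value in conditions.items())


def query_rows(rows: List[Row], conditions: Dict[str, Any]) -> List[Row]:
    # Return copies of the rows that satisfy all conditions.
    return [dict(row) for row in rows if matches(row, conditions)]


def update_rows(rows: List[Row], conditions: Dict[str, Any], updates: Row) -> List[Row]:
    # Apply updates to matching rows; other rows are copied unchanged.
    return [{**row, **updates} if matches(row, conditions) else dict(row) for row in rows]


def delete_rows(rows: List[Row], conditions: Dict[str, Any]) -> List[Row]:
    # Keep only the rows that do not satisfy the conditions.
    return [dict(row) for row in rows if not matches(row, conditions)]


rows = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "done"},
    {"id": 3, "status": "new"},
]
queried = query_rows(rows, {"status": "new"})
updated = update_rows(rows, {"id": 2}, {"status": "archived"})
remaining = delete_rows(rows, {"status": "new"})
```

Each helper returns a new list rather than mutating its input, which keeps the three operations easy to compare side by side.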