foundry_dev_tools.foundry_api_client module#

Contains FoundryRestClient, FoundrySqlClient, and exception classes.

One of the goals of this module is to be self-contained so that it can be dropped into any Python installation, with a minimal dependency on ‘requests’. Optional dependencies for the SQL functionality to work are pandas and pyarrow.

class foundry_dev_tools.foundry_api_client.FoundryRestClient[source]#

Bases: object

Create an instance of FoundryRestClient.

Parameters:
  • config – config dictionary which is parsed into the v2 configuration, kept for backwards compatibility

  • ctx – alternatively, pass a v2 FoundryContext directly instead of the ‘old’ configuration; in that case the config dict is ignored

Examples

>>> fc = FoundryRestClient()
>>> fc = FoundryRestClient(config={"jwt": "<token>"})
>>> fc = FoundryRestClient(config={"client_id": "<client_id>"})
>>> ctx = FoundryContext()
>>> fc = FoundryRestClient(ctx=ctx)
__init__(config=None, ctx=None)[source]#

Create an instance of FoundryRestClient.

Parameters:
  • config (dict | None) – config dictionary which is parsed into the v2 configuration, kept for backwards compatibility

  • ctx (FoundryContext | None) – alternatively, pass a v2 FoundryContext directly instead of the ‘old’ configuration; in that case the config dict is ignored

Examples

>>> fc = FoundryRestClient()
>>> fc = FoundryRestClient(config={"jwt": "<token>"})
>>> fc = FoundryRestClient(config={"client_id": "<client_id>"})
>>> ctx = FoundryContext()
>>> fc = FoundryRestClient(ctx=ctx)
create_dataset(dataset_path)[source]#

Creates an empty dataset in Foundry.

Parameters:

dataset_path (str) – Path in Foundry where this empty dataset should be created, for example: /Global/Foundry Operations/Foundry Support/iris_new

Returns:

with keys rid and fileSystemId. The key rid contains the dataset_rid which is the unique identifier of a dataset.

Return type:

dict
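
Example

A minimal sketch, assuming an authenticated client and that the parent folder of the target path exists:

>>> fc = FoundryRestClient()
>>> dataset = fc.create_dataset("/Global/Foundry Operations/Foundry Support/iris_new")
>>> dataset["rid"]
'ri.foundry.main.dataset.<...>'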

get_dataset(dataset_rid)[source]#

Gets dataset_rid and fileSystemId.

Parameters:

dataset_rid (str) – Dataset rid

Returns:

with the keys rid and fileSystemId

Return type:

dict

Raises:

DatasetNotFoundError – if dataset does not exist

delete_dataset(dataset_rid)[source]#

Deletes a dataset in Foundry and moves it to trash.

Parameters:

dataset_rid (str) – Unique identifier of the dataset

Raises:

DatasetNotFoundError – if dataset does not exist

move_resource_to_trash(rid)[source]#

Moves a Compass resource (e.g. dataset or folder) to trash.

Parameters:

rid (str) – rid of the resource

create_branch(dataset_rid, branch, parent_branch_id=None, parent_branch=None)[source]#

Creates a new branch in a dataset.

If the dataset is ‘new’, only the parameters dataset_rid and branch are required.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – The branch name to create

  • parent_branch (str | None) – The transaction rid to branch off from

  • parent_branch_id (str | None) – The name of the parent branch; if empty, creates a new root branch

Returns:

the response as a json object

Return type:

dict

update_branch(dataset_rid, branch, parent_branch=None)[source]#

Updates the latest transaction of branch ‘branch’ to the latest transaction of branch ‘parent_branch’.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – The branch to update (e.g. master)

  • parent_branch (str | None) – the name of the branch to copy the last transaction from, or a transaction rid

Returns:

the branch response; see the example structure below

Return type:

dict

{
    "id": "..",
    "rid": "ri.foundry.main.branch...",
    "ancestorBranchIds": [],
    "creationTime": "",
    "transactionRid": "ri.foundry.main.transaction....",
}
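
Example

A hedged sketch of a typical branching flow; dataset_rid is a placeholder and the dataset is assumed to already have a master branch:

>>> fc.create_branch(dataset_rid, branch="develop", parent_branch_id="master")
>>> # ... later, after new transactions were committed on master ...
>>> fc.update_branch(dataset_rid, branch="develop", parent_branch="master")
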
get_branches(dataset_rid)[source]#

Returns a list of branches available on a dataset.

Parameters:

dataset_rid (str) – Unique identifier of the dataset

Returns:

list of dataset branch names

Return type:

list[str]

get_branch(dataset_rid, branch)[source]#

Returns branch information.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – Branch name

Returns:

with keys id (name) and rid (unique id) of the branch.

Return type:

dict

Raises:

BranchNotFoundError – if branch does not exist.

open_transaction(dataset_rid, mode='SNAPSHOT', branch='master')[source]#

Opens a new transaction on a dataset.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • mode (str) – APPEND: append files; SNAPSHOT: replace all files; UPDATE: replace a file if it exists, keep other existing files

  • branch (str) – dataset branch

Returns:

the transaction ID

Return type:

str

remove_dataset_file(dataset_rid, transaction_id, logical_path, recursive=False)[source]#

Removes the given file from an open transaction.

If the logical path matches a file exactly then only that file will be removed, regardless of the value of recursive. If the logical path represents a directory, then all files prefixed with the logical path followed by ‘/’ will be removed when recursive is true and no files will be removed when recursive is false. If the given logical path does not match a file or directory then this call is ignored and does not throw an exception.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_id (str) – transaction rid

  • logical_path (str) – logical path in the backing filesystem

  • recursive (bool) – recurse into subdirectories

add_files_to_delete_transaction(dataset_rid, transaction_id, logical_paths)[source]#

Adds files to an open DELETE transaction.

Files added to DELETE transactions affect the dataset view by removing files from the view.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_id (str) – transaction rid

  • logical_paths (List[str]) – files in the dataset to delete
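
Example

A sketch of removing files from the dataset view via a DELETE transaction; dataset_rid is a placeholder, mode='DELETE' is assumed to be accepted by open_transaction, and commit_transaction is documented below:

>>> transaction_id = fc.open_transaction(dataset_rid, mode="DELETE", branch="master")
>>> fc.add_files_to_delete_transaction(dataset_rid, transaction_id, ["old/part-0000.parquet"])
>>> fc.commit_transaction(dataset_rid, transaction_id)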

commit_transaction(dataset_rid, transaction_id)[source]#

Commits a transaction, should be called after file upload is complete.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_id (str) – transaction id

Raises:

KeyError – when there was an issue with committing
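
Example

For orientation, a minimal sketch of the upload lifecycle these methods are designed for; dataset_rid and the local path are placeholders, and upload_dataset_file is documented below:

>>> transaction_id = fc.open_transaction(dataset_rid, mode="SNAPSHOT", branch="master")
>>> fc.upload_dataset_file(dataset_rid, transaction_id, "/tmp/iris.csv", "iris.csv")
>>> fc.commit_transaction(dataset_rid, transaction_id)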

abort_transaction(dataset_rid, transaction_id)[source]#

Aborts a transaction. Dataset will remain on transaction N-1.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_id (str) – transaction id

Raises:

KeyError – When abort transaction fails

get_dataset_transactions(dataset_rid, branch='master', last=50, include_open_exclusive_transaction=False)[source]#

Returns the transactions of a dataset / branch combination.

Returns the last 50 transactions by default; pagination is not implemented.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – Branch

  • last (int) – maximum number of transactions to return

  • include_open_exclusive_transaction (bool) – whether to include the open exclusive transaction, if one exists

Returns:

dict of transaction information.

Return type:

dict

get_dataset_last_transaction(dataset_rid, branch='master')[source]#

Returns the last transaction of a dataset / branch combination.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – Branch

Returns:

response from transaction API or None if dataset has no transaction.

Return type:

dict | None

get_dataset_last_transaction_rid(dataset_rid, branch='master')[source]#

Returns the last transaction rid of a dataset / branch combination.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – Branch

Returns:

transaction rid or None if dataset has no transaction.

Return type:

str | None

upload_dataset_file(dataset_rid, transaction_rid, path_or_buf, path_in_foundry_dataset)[source]#

Uploads a file-like object to a path in a Foundry dataset.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_rid (str) – transaction rid

  • path_or_buf (str | Path | IO[AnyStr]) – a local file path (str or Path) or a file-like object

  • path_in_foundry_dataset (str) – The path in the dataset, to which the file is uploaded.

Return type:

requests.Response

upload_dataset_files(dataset_rid, transaction_rid, path_file_dict, parallel_processes=None)[source]#

Uploads multiple local files to a foundry dataset.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_rid (str) – transaction rid

  • parallel_processes (int | None) – Set number of threads for upload

  • path_file_dict (dict) – A dictionary with the following structure:

{
    '<path_in_foundry_dataset>': '<local_file_path>',
    ...
}
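
Example

A sketch of a parallel upload within an open transaction; dataset_rid and the local paths are placeholders:

>>> transaction_rid = fc.open_transaction(dataset_rid, mode="UPDATE", branch="master")
>>> fc.upload_dataset_files(
...     dataset_rid,
...     transaction_rid,
...     path_file_dict={
...         "spark/data1.parquet": "/tmp/data1.parquet",
...         "spark/data2.parquet": "/tmp/data2.parquet",
...     },
...     parallel_processes=2,
... )
>>> fc.commit_transaction(dataset_rid, transaction_rid)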
get_dataset_details(dataset_path_or_rid)[source]#

Returns the resource information of a dataset.

Parameters:

dataset_path_or_rid (str) – The full path or rid to the dataset

Returns:

the json response of the api

Return type:

dict

Raises:

DatasetNotFoundError – if dataset not found

get_child_objects_of_folder(folder_rid, page_size=None)[source]#

Returns the child objects of a compass folder.

Parameters:
  • folder_rid (str) – Compass folder rid, e.g. ri.compass.main.folder.f549ae09-9534-44c7-967a-6c86b2339231

  • page_size (int) – control the pageSize of each request manually

Yields:

dict – information about child objects

Raises:

FolderNotFoundError – if folder not found

Return type:

Iterator[dict]
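
Example

A minimal sketch; the rid below is the placeholder from the parameter description, and the keys rid and name are assumed to be present on each child object:

>>> for child in fc.get_child_objects_of_folder(
...     "ri.compass.main.folder.f549ae09-9534-44c7-967a-6c86b2339231"
... ):
...     print(child["rid"], child["name"])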

create_folder(name, parent_id)[source]#

Creates an empty folder in compass.

Parameters:
  • name (str) – name of the new folder

  • parent_id (str) – rid of the parent folder, e.g. ri.compass.main.folder.aca0cce9-2419-4978-bb18-d4bc6e50bd7e

Returns:

with keys rid and name and other properties.

Return type:

dict

get_dataset_rid(dataset_path)[source]#

Returns the rid of a dataset, given its dataset_path.

Parameters:

dataset_path (str) – The full path to the dataset

Returns:

the dataset_rid

Return type:

str

get_dataset_path(dataset_rid)[source]#

Returns the path of a dataset as str.

Parameters:

dataset_rid (str) – The rid of the dataset

Returns:

the dataset_path

Return type:

str

Raises:

DatasetNotFoundError – if dataset was not found

get_dataset_paths(dataset_rids)[source]#

Returns the paths for a list of dataset rids.

Parameters:

dataset_rids (list) – The rids of the datasets

Returns:

a dict mapping each dataset rid to its dataset path

Return type:

dict

is_dataset_in_trash(dataset_path)[source]#

Returns true if the dataset is in the Compass trash.

Parameters:

dataset_path (str) – The path to the dataset

Returns:

true if dataset is in trash

Return type:

bool

get_dataset_schema(dataset_rid, transaction_rid=None, branch='master')[source]#

Returns the foundry dataset schema for a dataset, transaction, branch combination.

Parameters:
  • dataset_rid (str) – The rid of the dataset

  • transaction_rid (str) – The rid of the transaction

  • branch (str) – The branch

Returns:

with foundry dataset schema

Return type:

dict

upload_dataset_schema(dataset_rid, transaction_rid, schema, branch='master')[source]#

Uploads the foundry dataset schema for a dataset, transaction, branch combination.

Parameters:
  • dataset_rid (str) – The rid of the dataset

  • transaction_rid (str) – The rid of the transaction

  • schema (dict) – The foundry schema

  • branch (str) – The branch

Return type:

requests.Response

infer_dataset_schema(dataset_rid, branch='master')[source]#

Calls the foundry-schema-inference service to infer the dataset schema.

Returns a dict with the Foundry schema if status == SUCCESS.

Parameters:
  • dataset_rid (str) – The dataset rid

  • branch (str) – The branch

Returns:

with the dataset schema, which can be used to call upload_dataset_schema

Return type:

dict

Raises:

ValueError – if foundry schema inference failed
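
Example

A sketch that chains schema inference with upload_dataset_schema; dataset_rid is a placeholder and the dataset is assumed to already have a committed transaction:

>>> schema = fc.infer_dataset_schema(dataset_rid, branch="master")
>>> transaction_rid = fc.get_dataset_last_transaction_rid(dataset_rid, branch="master")
>>> fc.upload_dataset_schema(dataset_rid, transaction_rid, schema, branch="master")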

get_dataset_identity(dataset_path_or_rid, branch='master', check_read_access=True)[source]#

Returns the identity of this dataset (dataset_path, dataset_rid, last_transaction_rid, last_transaction).

Parameters:
  • dataset_path_or_rid (str) – Path to dataset (e.g. /Global/…) or rid of dataset (e.g. ri.foundry.main.dataset…)

  • branch (str) – branch of the dataset

  • check_read_access (bool) – default is True; checks whether the user has read access (‘gatekeeper:view-resource’) to the dataset, otherwise an exception is thrown

Returns:

with the keys ‘dataset_path’, ‘dataset_rid’, ‘last_transaction_rid’, ‘last_transaction’

Return type:

dict

Raises:

DatasetNoReadAccessError – if you have no read access for that dataset
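
Example

A minimal sketch; the path is the placeholder used elsewhere in these docs:

>>> identity = fc.get_dataset_identity("/Global/Foundry Operations/Foundry Support/iris")
>>> identity["dataset_rid"], identity["last_transaction_rid"]
('ri.foundry.main.dataset.<...>', 'ri.foundry.main.transaction.<...>')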

list_dataset_files(dataset_rid, exclude_hidden_files=True, view='master', logical_path=None, detail=False, *, include_open_exclusive_transaction=False)[source]#

Returns a list of internal filenames of a dataset.

Parameters:
  • dataset_rid (str) – the dataset rid

  • exclude_hidden_files (bool) – if hidden files should be excluded (e.g. _log files)

  • view (str) – branch or transaction rid of the dataset

  • logical_path (str) – If logical_path is absent, returns all files in the view. If logical_path matches a file exactly, returns just that file. Otherwise, returns all files in the “directory” of logical_path (a slash is added to the end of logical_path if necessary and a prefix match is performed)

  • detail (bool) – if passed as True, returns complete response from catalog API, otherwise only returns logicalPath

  • include_open_exclusive_transaction (bool) – if files added in open transaction should be returned as well in the response

Returns:

filenames

Return type:

list

Raises:

DatasetNotFoundError – if dataset was not found
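
Example

A minimal sketch; dataset_rid is a placeholder and the returned filenames are illustrative:

>>> fc.list_dataset_files(dataset_rid, exclude_hidden_files=True, view="master")
['spark/part-00000-<...>.snappy.parquet']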

get_dataset_stats(dataset_rid, view='master')[source]#

Returns the response from the Foundry catalog stats endpoint.

Parameters:
  • dataset_rid (str) – the dataset rid

  • view (str) – branch or transaction rid of the dataset

Returns:

with the keys sizeInBytes, numFiles, hiddenFilesSizeInBytes, numHiddenFiles, numTransactions

Return type:

dict

foundry_stats(dataset_rid, end_transaction_rid, branch='master')[source]#

Returns row counts and size of the dataset/view.

Parameters:
  • dataset_rid (str) – The dataset RID.

  • end_transaction_rid (str) – The specific transaction RID, which will be used to return the statistics.

  • branch (str) – The branch to query

Returns:

With the following structure:

{
    "datasetRid": str,
    "branch": str,
    "endTransactionRid": str,
    "schemaId": str,
    "computedDatasetStats": {
        "rowCount": str | None,
        "sizeInBytes": str,
        "columnStats": {
            "...": {
                "nullCount": str | None,
                "uniqueCount": str | None,
                "avgLength": str | None,
                "maxLength": str | None,
            }
        },
    },
}

Return type:

dict

download_dataset_file(dataset_rid, output_directory, foundry_file_path, view='master')[source]#

Downloads a single foundry dataset file into a directory.

Creates subfolders if necessary.

Parameters:
  • dataset_rid (str) – the dataset rid

  • output_directory (str | None) – the local output directory for the file, or None; if None is passed, the byte content of the file is returned

  • foundry_file_path (str) – the file_path on the foundry file system

  • view (str) – branch or transaction rid of the dataset

Returns:

local file path if output_directory was passed, otherwise the file content as bytes

Return type:

str | bytes

Raises:

ValueError – If download failed

download_dataset_files(dataset_rid, output_directory, files=None, view='master', parallel_processes=None)[source]#

Downloads files of a dataset (in parallel) to a local output directory.

Parameters:
  • dataset_rid (str) – the dataset rid

  • files (list | None) – list of files or None, in which case all files are downloaded

  • output_directory (str) – the output directory for the files

  • view (str) – branch or transaction rid of the dataset

  • parallel_processes (int | None) – set the number of download threads; default value is calculated as multiprocessing.cpu_count() - 1

Returns:

paths of the downloaded files

Return type:

List[str]
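
Example

A sketch downloading a whole dataset in parallel; dataset_rid and the output directory are placeholders:

>>> fc.download_dataset_files(
...     dataset_rid, output_directory="/tmp/iris", view="master", parallel_processes=4
... )
['/tmp/iris/spark/part-00000-<...>.snappy.parquet']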

download_dataset_files_temporary(dataset_rid, files=None, view='master', parallel_processes=None)[source]#

Downloads all files of a dataset to a temporary directory.

The directory is deleted when the context exits. The function yields the path of the temporary directory. Example usage:

>>> import pandas as pd
>>> from pathlib import Path
>>> from pyarrow import parquet
>>> with client.download_dataset_files_temporary(
...     dataset_rid='ri.foundry.main.dataset.1', view='master'
... ) as temp_folder:
...     # Read using pandas
...     df = pd.read_parquet(temp_folder)
...     # Read using pyarrow; pass only the files, which are normally in subfolder 'spark'
...     pq = parquet.ParquetDataset(
...         path_or_paths=[x for x in Path(temp_folder).glob('**/*') if x.is_file()]
...     )
Parameters:
  • dataset_rid (str) – the dataset rid

  • files (List[str]) – list of files or None, in which case all files are downloaded

  • view (str) – branch or transaction rid of the dataset

  • parallel_processes (int) – Set number of threads for download

Yields:

Iterator[str] – path to temporary folder containing root of dataset files

Return type:

Iterator[str]

get_dataset_as_raw_csv(dataset_rid, branch='master')[source]#

Uses the csv API to download a dataset as csv.

Parameters:
  • dataset_rid (str) – the dataset rid

  • branch (str) – branch of the dataset

Returns:

with the csv stream; it can be converted to a pandas DataFrame as shown below.

Return type:

Response
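
Example

The conversion mentioned above, spelled out with its imports; dataset_rid is a placeholder:

>>> import io
>>> import pandas as pd
>>> response = fc.get_dataset_as_raw_csv(dataset_rid, branch="master")
>>> df = pd.read_csv(io.BytesIO(response.content))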

query_foundry_sql_legacy(query: str, return_type: Literal['pandas'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pd.core.frame.DataFrame[source]#
query_foundry_sql_legacy(query: str, return_type: Literal['spark'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pyspark.sql.DataFrame
query_foundry_sql_legacy(query: str, return_type: Literal['arrow'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pa.Table
query_foundry_sql_legacy(query: str, return_type: Literal['raw'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) tuple[dict, list[list]]
query_foundry_sql_legacy(query: str, return_type: api_types.SQLReturnType = 'raw', branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) tuple[dict, list[list]] | pd.core.frame.DataFrame | pa.Table | pyspark.sql.DataFrame

Queries the dataproxy query API with spark SQL.

Example

query_foundry_sql_legacy(
    query="SELECT * FROM `/Global/Foundry Operations/Foundry Support/iris`",
    branch="master",
)

Parameters:
  • query – the SQL query in Foundry Spark dialect

  • return_type – See foundry_dev_tools.foundry_api_client.SQLReturnType

  • branch – the dataset branch

  • sql_dialect – the SQL dialect

  • timeout – query timeout, default value is 600 seconds
Returns:

(foundry_schema, data)

data contains the data matrix, foundry_schema the Foundry schema (fieldSchemaList key). It can be converted to a pandas DataFrame, see below

foundry_schema, data = self.query_foundry_sql_legacy(query, branch)
df = pd.DataFrame(
    data=data, columns=[e["name"] for e in foundry_schema["fieldSchemaList"]]
)

Return type:

tuple (dict, list)

query_foundry_sql(query: str, return_type: Literal['pandas'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pd.core.frame.DataFrame[source]#
query_foundry_sql(query: str, return_type: Literal['spark'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pyspark.sql.DataFrame
query_foundry_sql(query: str, return_type: Literal['arrow'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pa.Table
query_foundry_sql(query: str, return_type: Literal['raw'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) tuple[dict, list[list]]
query_foundry_sql(query: str, return_type: api_types.SQLReturnType = 'pandas', branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) tuple[dict, list[list]] | pd.core.frame.DataFrame | pa.Table | pyspark.sql.DataFrame

Queries the Foundry SQL server with the Spark SQL dialect.

Uses Arrow IPC to communicate with the Foundry SQL Server Endpoint.

Falls back to query_foundry_sql_legacy in case pyarrow is not installed or the query does not return Arrow format.

Example

>>> df1 = client.query_foundry_sql(
...     "SELECT * FROM `/Global/Foundry Operations/Foundry Support/iris`"
... )
>>> query = (
...     "SELECT col1 FROM {start_transaction_rid}:{end_transaction_rid}@{branch}.`{dataset_path_or_rid}`"
...     " WHERE filterColumns = 'value1' LIMIT 1"
... )
>>> df2 = client.query_foundry_sql(query)

Parameters:
  • query – The SQL query in Foundry Spark dialect (use backticks instead of quotes)

  • branch – the dataset branch

  • sql_dialect – the sql dialect

  • return_type – See foundry_dev_tools.foundry_api_client.SQLReturnType

  • timeout – Query Timeout, default value is 600 seconds

Returns:

A pandas DataFrame, Spark DataFrame or pyarrow.Table with the result.

Return type:

pd.DataFrame | pa.Table | pyspark.sql.DataFrame

Raises:

ValueError – Only direct read eligible queries can be returned as arrow Table.

get_user_info()[source]#

Returns the multipass user info.

Return type:

dict

{
    "id": "<multipass-id>",
    "username": "<username>",
    "attributes": {
        "multipass:email:primary": ["<email>"],
        "multipass:given-name": ["<given-name>"],
        "multipass:organization": ["<your-org>"],
        "multipass:organization-rid": ["ri.multipass..organization. ..."],
        "multipass:family-name": ["<family-name>"],
        "multipass:upn": ["<upn>"],
        "multipass:realm": ["<your-company>"],
        "multipass:realm-name": ["<your-org>"],
    },
}
get_group(group_id)[source]#

Returns the multipass group information.

Returns:

The API response

Return type:

dict

Parameters:

group_id (str)

{
    'id': '<id>',
    'name': '<groupname>',
    'attributes': {
        'multipass:realm': ['palantir-internal-realm'],
        'multipass:organization': ['<your-org>'],
        'multipass:organization-rid': ['ri.multipass..organization.<...>'],
        'multipass:realm-name': ['Palantir Internal'],
    },
}
delete_group(group_id)[source]#

Deletes a multipass group.

Parameters:

group_id (str) – the group id to delete

Return type:

requests.Response

create_third_party_application(client_type, display_name, description, grant_types, redirect_uris, logo_uri, organization_rid, allowed_organization_rids=None, resources=None, operations=None, marking_ids=None, role_set_id=None, role_grants=None, **kwargs)[source]#

Creates Foundry Third Party application (TPA).

See https://www.palantir.com/docs/foundry/platform-security-third-party/third-party-apps-overview/. The user must have ‘Manage OAuth 2.0 clients’ workflow permissions.

Parameters:
  • client_type (Literal['CONFIDENTIAL', 'PUBLIC']) – Server Application (CONFIDENTIAL) or Native or single-page application (PUBLIC)

  • display_name (str) – Display name of the TPA

  • description (str | None) – Long description of the TPA

  • grant_types (list[Literal['AUTHORIZATION_CODE', 'CLIENT_CREDENTIALS', 'REFRESH_TOKEN']]) – Usually, [“AUTHORIZATION_CODE”, “REFRESH_TOKEN”] (authorization code grant) or [“REFRESH_TOKEN”, “CLIENT_CREDENTIALS”] (client credentials grant)

  • redirect_uris (list | None) – Redirect URLs of TPA, used in combination with AUTHORIZATION_CODE grant

  • logo_uri (str | None) – URI or embedded image ‘data:image/png;base64,<…>’

  • organization_rid (str) – Parent Organization of this TPA

  • allowed_organization_rids (list | None) – Passing None or empty list means TPA is activated for all Foundry organizations

  • resources (list[str] | None) – Resources allowed to access by the client, otherwise no resource restrictions

  • operations (list[str] | None) – Operations the client can be granted, otherwise no operation restrictions

  • marking_ids (list[str] | None) – Markings allowed to access by the client, otherwise no marking restrictions

  • role_set_id (str | None) – roles allowed for this client, defaults to oauth2-client

  • role_grants (dict[str, list[str]] | None) – mapping between role ids and principal ids (dict[role_id, list[principal_id]])

  • **kwargs – gets passed to APIClient.api_request()

Return type:

dict

See below for the structure

{
    "clientId":"<...>",
    "clientSecret":"<...>",
    "clientType":"<CONFIDENTIAL/PUBLIC>",
    "organizationRid":"<...>",
    "displayName":"<...>",
    "description":null,
    "logoUri":null,
    "grantTypes":[<"AUTHORIZATION_CODE","REFRESH_TOKEN","CLIENT_CREDENTIALS">],
    "redirectUris":[],
    "allowedOrganizationRids":[]
}
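
Example

A hedged sketch of registering a client-credentials TPA; the organization rid is a placeholder and the grant types follow the parameter description above:

>>> app = fc.create_third_party_application(
...     client_type="CONFIDENTIAL",
...     display_name="my-service",
...     description=None,
...     grant_types=["REFRESH_TOKEN", "CLIENT_CREDENTIALS"],
...     redirect_uris=None,
...     logo_uri=None,
...     organization_rid="ri.multipass..organization.<...>",
... )
>>> client_id, client_secret = app["clientId"], app["clientSecret"]
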
delete_third_party_application(client_id)[source]#

Deletes a Third Party Application.

Parameters:

client_id (str) – The unique identifier of the TPA.

Return type:

requests.Response

update_third_party_application(client_id, client_type, display_name, description, grant_types, redirect_uris, logo_uri, organization_rid, allowed_organization_rids=None, resources=None, operations=None, marking_ids=None, role_set_id=None, **kwargs)[source]#

Updates Foundry Third Party application (TPA).

See https://www.palantir.com/docs/foundry/platform-security-third-party/third-party-apps-overview/. The user must have ‘Manage OAuth 2.0 clients’ workflow permissions.

Parameters:
  • client_id (str) – The unique identifier of the TPA.

  • client_type (Literal['CONFIDENTIAL', 'PUBLIC']) – Server Application (CONFIDENTIAL) or Native or single-page application (PUBLIC)

  • display_name (str) – Display name of the TPA

  • description (str | None) – Long description of the TPA

  • grant_types (list[Literal['AUTHORIZATION_CODE', 'CLIENT_CREDENTIALS', 'REFRESH_TOKEN']]) – Usually, [“AUTHORIZATION_CODE”, “REFRESH_TOKEN”] (authorization code grant) or [“REFRESH_TOKEN”, “CLIENT_CREDENTIALS”] (client credentials grant)

  • redirect_uris (list | None) – Redirect URLs of TPA, used in combination with AUTHORIZATION_CODE grant

  • logo_uri (str | None) – URI or embedded image ‘data:image/png;base64,<…>’

  • organization_rid (str) – Parent Organization of this TPA

  • allowed_organization_rids (list | None) – Passing None or empty list means TPA is activated for all Foundry organizations

  • resources (list[str] | None) – Resources allowed to access by the client, otherwise no resource restrictions

  • operations (list[str] | None) – Operations the client can be granted, otherwise no operation restrictions

  • marking_ids (list[str] | None) – Markings allowed to access by the client, otherwise no marking restrictions

  • role_set_id (str | None) – roles allowed for this client, defaults to oauth2-client

  • **kwargs – gets passed to APIClient.api_request()

Return type:

dict

Response in the following structure:

{
    "clientId":"<...>",
    "clientType":"<CONFIDENTIAL/PUBLIC>",
    "organizationRid":"<...>",
    "displayName":"<...>",
    "description":null,
    "logoUri":null,
    "grantTypes":[<"AUTHORIZATION_CODE","REFRESH_TOKEN","CLIENT_CREDENTIALS">],
    "redirectUris":[],
    "allowedOrganizationRids":[]
}
rotate_third_party_application_secret(client_id)[source]#

Rotates Foundry Third Party application (TPA) secret.

Parameters:

client_id (str) – The unique identifier of the TPA.

Returns:

See below for the structure

Return type:

dict

{
    "clientId":"<...>",
    "clientSecret": "<...>",
    "clientType":"<CONFIDENTIAL/PUBLIC>",
    "organizationRid":"<...>",
    "displayName":"<...>",
    "description":null,
    "logoUri":null,
    "grantTypes":[<"AUTHORIZATION_CODE","REFRESH_TOKEN","CLIENT_CREDENTIALS">],
    "redirectUris":[],
    "allowedOrganizationRids":[]
}
enable_third_party_application(client_id, operations=None, resources=None, marking_ids=None, grant_types=None, require_consent=True, **kwargs)[source]#

Enables Foundry Third Party application (TPA).

Parameters:
  • client_id (str) – The unique identifier of the TPA.

  • operations (list | None) – Scopes that this TPA is allowed to use (To be confirmed) if None or empty list is passed, all scopes will be activated.

  • resources (list | None) – Compass Project RID’s that this TPA is allowed to access, if None or empty list is passed, unrestricted access will be given.

  • marking_ids (list[str] | None) – Marking Ids that this TPA is allowed to access, if None or empty list is passed, unrestricted access will be given.

  • grant_types (list[Literal['AUTHORIZATION_CODE', 'CLIENT_CREDENTIALS', 'REFRESH_TOKEN']] | None) – Grant types that this TPA is allowed to use to access resources; if None is passed, there are no grant type restrictions; if an empty list is passed, no grant types are allowed for this TPA

  • require_consent (bool) – Whether users need to provide consent for this application to act on their behalf; defaults to True

  • **kwargs – gets passed to APIClient.api_request()

Return type:

dict

Response with the following structure:

{
    "client": {
        "clientId": "<...>",
        "organizationRid": "ri.multipass..organization.<...>",
        "displayName": "<...>",
        "description": None,
        "logoUri": None,
    },
    "installation": {"resources": [], "operations": [], "markingIds": None},
}
start_checks_and_build(repository_id, ref_name, commit_hash, file_paths)[source]#

Starts checks and builds.

Parameters:
  • repository_id (str) – the repository id where the transform is located

  • ref_name (str) – the git ref_name for the branch

  • commit_hash (str) – the git commit hash

  • file_paths (List[str]) – a list of python transform files

Returns:

the JSON API response

Return type:

dict

get_build(build_rid)[source]#

Get information about the build.

Parameters:

build_rid (str) – the build RID

Returns:

the JSON API response

Return type:

dict

get_job_report(job_rid)[source]#

Get the report for a job.

Parameters:

job_rid (str) – the job RID

Returns:

the job report response

Return type:

dict

get_s3fs_storage_options()[source]#

Get the Foundry S3 credentials in the s3fs storage_options format.

Example

>>> import pandas as pd
>>> fc = FoundryRestClient()
>>> storage_options = fc.get_s3fs_storage_options()
>>> df = pd.read_parquet(
...     "s3://ri.foundry.main.dataset.<uuid>/spark", storage_options=storage_options
... )
Return type:

dict

get_boto3_s3_client(**kwargs)[source]#

Returns the boto3 s3 client with credentials applied and endpoint url set.

See foundry_dev_tools.clients.s3_client.api_assume_role_with_webidentity.

Example

>>> from foundry_dev_tools import FoundryRestClient
>>> fc = FoundryRestClient()
>>> s3_client = fc.get_boto3_s3_client()
>>> s3_client
Parameters:

**kwargs – gets passed to boto3.session.Session.client(), endpoint_url will be overwritten

get_boto3_s3_resource(**kwargs)[source]#

Returns boto3 s3 resource with credentials applied and endpoint url set.

Parameters:

**kwargs – gets passed to boto3.session.Session.resource(), endpoint_url will be overwritten
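
Example

A sketch listing the files of a dataset through the S3-compatible API; as in the s3fs example above, the dataset rid is assumed to act as the bucket name:

>>> s3 = fc.get_boto3_s3_resource()
>>> bucket = s3.Bucket("ri.foundry.main.dataset.<uuid>")
>>> for obj in bucket.objects.all():
...     print(obj.key)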

get_s3_credentials(expiration_duration=3600)[source]#

Parses the AssumeRoleWithWebIdentity response and caches the credentials.

See foundry_dev_tools.clients.s3_client.api_assume_role_with_webidentity.

Parameters:

expiration_duration (int) – seconds the credentials are valid for, defaults to 3600

Return type:

dict