foundry_dev_tools.foundry_api_client module#

Contains FoundryRestClient, FoundrySqlClient, and exception classes.

One of the goals of this module is to be self-contained so that it can be dropped into any Python installation, with a minimal dependency on ‘requests’. Optional dependencies for the SQL functionality to work are pandas and pyarrow.

class foundry_dev_tools.foundry_api_client.FoundryRestClient[source]#

Bases: object

Create an instance of FoundryRestClient.

Parameters:
  • config – config dictionary which is parsed into the v2 configuration, kept for backwards compatibility

  • ctx – alternatively, pass a v2 FoundryContext directly instead of the ‘old’ configuration; in that case the config dict is ignored

Examples

>>> fc = FoundryRestClient()
>>> fc = FoundryRestClient(config={"jwt": "<token>"})
>>> fc = FoundryRestClient(config={"client_id": "<client_id>"})
>>> ctx = FoundryContext()
>>> fc = FoundryRestClient(ctx=ctx)
__init__(config=None, ctx=None)[source]#

Create an instance of FoundryRestClient.

Parameters:
  • config (dict | None) – config dictionary which is parsed into the v2 configuration, kept for backwards compatibility

  • ctx (FoundryContext | None) – alternatively, pass a v2 FoundryContext directly instead of the ‘old’ configuration; in that case the config dict is ignored

Examples

>>> fc = FoundryRestClient()
>>> fc = FoundryRestClient(config={"jwt": "<token>"})
>>> fc = FoundryRestClient(config={"client_id": "<client_id>"})
>>> ctx = FoundryContext()
>>> fc = FoundryRestClient(ctx=ctx)
create_dataset(dataset_path)[source]#

Creates an empty dataset in Foundry.

Parameters:

dataset_path (str) – Path in Foundry where this empty dataset should be created, for example: /Global/Foundry Operations/Foundry Support/iris_new

Returns:

with keys rid and fileSystemId. The key rid contains the dataset_rid which is the unique identifier of a dataset.

Return type:

dict
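
Example

A minimal sketch, assuming an authenticated client and that the parent folder of the target path exists:

>>> fc = FoundryRestClient()
>>> dataset = fc.create_dataset("/Global/Foundry Operations/Foundry Support/iris_new")
>>> dataset["rid"]
'ri.foundry.main.dataset.<...>'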

get_dataset(dataset_rid)[source]#

Gets dataset_rid and fileSystemId.

Parameters:

dataset_rid (str) – Dataset rid

Returns:

with the keys rid and fileSystemId

Return type:

dict

Raises:

DatasetNotFoundError – if dataset does not exist

delete_dataset(dataset_rid)[source]#

Deletes a dataset in Foundry and moves it to trash.

Parameters:

dataset_rid (str) – Unique identifier of the dataset

Raises:

DatasetNotFoundError – if dataset does not exist

move_resource_to_trash(rid)[source]#

Moves a Compass resource (e.g. dataset or folder) to trash.

Parameters:

rid (str) – rid of the resource

create_branch(dataset_rid, branch, parent_branch_id=None, parent_branch=None)[source]#

Creates a new branch in a dataset.

If the dataset is ‘new’, only the parameters dataset_rid and branch are required.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – The branch name to create

  • parent_branch (str | None) – The transaction rid to branch off from

  • parent_branch_id (str | None) – The name of the parent branch; if empty, creates a new root branch

Returns:

the response as a json object

Return type:

dict

update_branch(dataset_rid, branch, parent_branch=None)[source]#

Updates the latest transaction of branch ‘branch’ to the latest transaction of branch ‘parent_branch’.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – The branch to update (e.g. master)

  • parent_branch (str | None) – the name of the branch to copy the last transaction from, or a transaction rid

Returns:

the branch response; see the example structure below

Return type:

dict

{
    "id": "..",
    "rid": "ri.foundry.main.branch...",
    "ancestorBranchIds": [],
    "creationTime": "",
    "transactionRid": "ri.foundry.main.transaction....",
}
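
Example

A hedged sketch of a typical branching flow; dataset_rid is a placeholder and the dataset is assumed to already have a master branch:

>>> fc.create_branch(dataset_rid, branch="develop", parent_branch_id="master")
>>> # ... later, after new transactions were committed on master ...
>>> fc.update_branch(dataset_rid, branch="develop", parent_branch="master")
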
get_branches(dataset_rid)[source]#

Returns a list of branches available on a dataset.

Parameters:

dataset_rid (str) – Unique identifier of the dataset

Returns:

list of dataset branch names

Return type:

list[str]

get_branch(dataset_rid, branch)[source]#

Returns branch information.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – Branch name

Returns:

with keys id (name) and rid (unique id) of the branch.

Return type:

dict

Raises:

BranchNotFoundError – if branch does not exist.

open_transaction(dataset_rid, mode='SNAPSHOT', branch='master')[source]#

Opens a new transaction on a dataset.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • mode (str) – APPEND: append files; SNAPSHOT: replace all files; UPDATE: replace a file if it exists, keep other existing files

  • branch (str) – dataset branch

Returns:

the transaction ID

Return type:

str

remove_dataset_file(dataset_rid, transaction_id, logical_path, recursive=False)[source]#

Removes the given file from an open transaction.

If the logical path matches a file exactly then only that file will be removed, regardless of the value of recursive. If the logical path represents a directory, then all files prefixed with the logical path followed by ‘/’ will be removed when recursive is true and no files will be removed when recursive is false. If the given logical path does not match a file or directory then this call is ignored and does not throw an exception.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_id (str) – transaction rid

  • logical_path (str) – logical path in the backing filesystem

  • recursive (bool) – recurse into subdirectories

add_files_to_delete_transaction(dataset_rid, transaction_id, logical_paths)[source]#

Adds files to an open DELETE transaction.

Files added to DELETE transactions affect the dataset view by removing files from the view.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_id (str) – transaction rid

  • logical_paths (List[str]) – files in the dataset to delete
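
Example

A sketch of removing files from the dataset view via a DELETE transaction; dataset_rid is a placeholder, mode='DELETE' is assumed to be accepted by open_transaction, and commit_transaction is documented below:

>>> transaction_id = fc.open_transaction(dataset_rid, mode="DELETE", branch="master")
>>> fc.add_files_to_delete_transaction(dataset_rid, transaction_id, ["old/part-0000.parquet"])
>>> fc.commit_transaction(dataset_rid, transaction_id)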

commit_transaction(dataset_rid, transaction_id)[source]#

Commits a transaction, should be called after file upload is complete.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_id (str) – transaction id

Raises:

KeyError – when there was an issue with committing
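
Example

For orientation, a minimal sketch of the upload lifecycle these methods are designed for; dataset_rid and the local path are placeholders, and upload_dataset_file is documented below:

>>> transaction_id = fc.open_transaction(dataset_rid, mode="SNAPSHOT", branch="master")
>>> fc.upload_dataset_file(dataset_rid, transaction_id, "/tmp/iris.csv", "iris.csv")
>>> fc.commit_transaction(dataset_rid, transaction_id)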

abort_transaction(dataset_rid, transaction_id)[source]#

Aborts a transaction. Dataset will remain on transaction N-1.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_id (str) – transaction id

Raises:

KeyError – When abort transaction fails

get_dataset_transactions(dataset_rid, branch='master', last=50, include_open_exclusive_transaction=False)[source]#

Returns the transactions of a dataset / branch combination.

Returns the last 50 transactions by default; pagination is not implemented.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – Branch

  • last (int) – maximum number of transactions to return

  • include_open_exclusive_transaction (bool) – whether to include the open exclusive transaction, if one exists

Returns:

dict of transaction information.

Return type:

dict

get_dataset_last_transaction(dataset_rid, branch='master')[source]#

Returns the last transaction of a dataset / branch combination.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – Branch

Returns:

response from transaction API or None if dataset has no transaction.

Return type:

dict | None

get_dataset_last_transaction_rid(dataset_rid, branch='master')[source]#

Returns the last transaction rid of a dataset / branch combination.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • branch (str) – Branch

Returns:

transaction rid or None if dataset has no transaction.

Return type:

str | None

upload_dataset_file(dataset_rid, transaction_rid, path_or_buf, path_in_foundry_dataset)[source]#

Uploads a file-like object to a path in a Foundry dataset.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_rid (str) – transaction rid

  • path_or_buf (str | Path | IO[AnyStr]) – a local file path (str or Path) or a file-like object

  • path_in_foundry_dataset (str) – The path in the dataset, to which the file is uploaded.

Return type:

requests.Response

upload_dataset_files(dataset_rid, transaction_rid, path_file_dict, parallel_processes=None)[source]#

Uploads multiple local files to a foundry dataset.

Parameters:
  • dataset_rid (str) – Unique identifier of the dataset

  • transaction_rid (str) – transaction rid

  • parallel_processes (int | None) – Set number of threads for upload

  • path_file_dict (dict) – A dictionary with the following structure:

{
    '<path_in_foundry_dataset>': '<local_file_path>',
    ...
}
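
Example

A sketch of a parallel upload within an open transaction; dataset_rid and the local paths are placeholders:

>>> transaction_rid = fc.open_transaction(dataset_rid, mode="UPDATE", branch="master")
>>> fc.upload_dataset_files(
...     dataset_rid,
...     transaction_rid,
...     path_file_dict={
...         "spark/data1.parquet": "/tmp/data1.parquet",
...         "spark/data2.parquet": "/tmp/data2.parquet",
...     },
...     parallel_processes=2,
... )
>>> fc.commit_transaction(dataset_rid, transaction_rid)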
get_dataset_details(dataset_path_or_rid)[source]#

Returns the resource information of a dataset.

Parameters:

dataset_path_or_rid (str) – The full path or rid to the dataset

Returns:

the json response of the api

Return type:

dict

Raises:

DatasetNotFoundError – if dataset not found

get_child_objects_of_folder(folder_rid, page_size=None)[source]#

Returns the child objects of a compass folder.

Parameters:
  • folder_rid (str) – Compass folder rid, e.g. ri.compass.main.folder.f549ae09-9534-44c7-967a-6c86b2339231

  • page_size (int) – control the pageSize of each request manually

Yields:

dict – information about child objects

Raises:

FolderNotFoundError – if folder not found

Return type:

Iterator[dict]
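
Example

A minimal sketch; the rid below is the placeholder from the parameter description, and the keys rid and name are assumed to be present on each child object:

>>> for child in fc.get_child_objects_of_folder(
...     "ri.compass.main.folder.f549ae09-9534-44c7-967a-6c86b2339231"
... ):
...     print(child["rid"], child["name"])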

create_folder(name, parent_id)[source]#

Creates an empty folder in compass.

Parameters:
  • name (str) – name of the new folder

  • parent_id (str) – rid of the parent folder, e.g. ri.compass.main.folder.aca0cce9-2419-4978-bb18-d4bc6e50bd7e

Returns:

with keys rid and name and other properties.

Return type:

dict

get_dataset_rid(dataset_path)[source]#

Returns the rid of a dataset, given its dataset_path.

Parameters:

dataset_path (str) – The full path to the dataset

Returns:

the dataset_rid

Return type:

str

get_dataset_path(dataset_rid)[source]#

Returns the path of a dataset as str.

Parameters:

dataset_rid (str) – The rid of the dataset

Returns:

the dataset_path

Return type:

str

Raises:

DatasetNotFoundError – if dataset was not found

get_dataset_paths(dataset_rids)[source]#

Returns the paths for a list of dataset rids.

Parameters:

dataset_rids (list) – The rids of the datasets

Returns:

a dict mapping each dataset rid to its dataset path

Return type:

dict

is_dataset_in_trash(dataset_path)[source]#

Returns true if the dataset is in the Compass trash.

Parameters:

dataset_path (str) – The path to the dataset

Returns:

true if dataset is in trash

Return type:

bool

get_dataset_schema(dataset_rid, transaction_rid=None, branch='master')[source]#

Returns the foundry dataset schema for a dataset, transaction, branch combination.

Parameters:
  • dataset_rid (str) – The rid of the dataset

  • transaction_rid (str) – The rid of the transaction

  • branch (str) – The branch

Returns:

with foundry dataset schema

Return type:

dict

upload_dataset_schema(dataset_rid, transaction_rid, schema, branch='master')[source]#

Uploads the foundry dataset schema for a dataset, transaction, branch combination.

Parameters:
  • dataset_rid (str) – The rid of the dataset

  • transaction_rid (str) – The rid of the transaction

  • schema (dict) – The foundry schema

  • branch (str) – The branch

Return type:

requests.Response

infer_dataset_schema(dataset_rid, branch='master')[source]#

Calls the foundry-schema-inference service to infer the dataset schema.

Returns a dict with the Foundry schema if status == SUCCESS.

Parameters:
  • dataset_rid (str) – The dataset rid

  • branch (str) – The branch

Returns:

with the dataset schema, which can be used to call upload_dataset_schema

Return type:

dict

Raises:

ValueError – if foundry schema inference failed
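
Example

A sketch that chains schema inference with upload_dataset_schema; dataset_rid is a placeholder and the dataset is assumed to already have a committed transaction:

>>> schema = fc.infer_dataset_schema(dataset_rid, branch="master")
>>> transaction_rid = fc.get_dataset_last_transaction_rid(dataset_rid, branch="master")
>>> fc.upload_dataset_schema(dataset_rid, transaction_rid, schema, branch="master")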

get_dataset_identity(dataset_path_or_rid, branch='master', check_read_access=True)[source]#

Returns the identity of this dataset (dataset_path, dataset_rid, last_transaction_rid, last_transaction).

Parameters:
  • dataset_path_or_rid (str) – Path to dataset (e.g. /Global/…) or rid of dataset (e.g. ri.foundry.main.dataset…)

  • branch (str) – branch of the dataset

  • check_read_access (bool) – default is True; checks whether the user has read access (‘gatekeeper:view-resource’) to the dataset, otherwise an exception is thrown

Returns:

with the keys ‘dataset_path’, ‘dataset_rid’, ‘last_transaction_rid’, ‘last_transaction’

Return type:

dict

Raises:

DatasetNoReadAccessError – if you have no read access for that dataset
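
Example

A minimal sketch; the path is the placeholder used elsewhere in these docs:

>>> identity = fc.get_dataset_identity("/Global/Foundry Operations/Foundry Support/iris")
>>> identity["dataset_rid"], identity["last_transaction_rid"]
('ri.foundry.main.dataset.<...>', 'ri.foundry.main.transaction.<...>')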

list_dataset_files(dataset_rid, exclude_hidden_files=True, view='master', logical_path=None, detail=False, *, include_open_exclusive_transaction=False)[source]#

Returns a list of internal filenames of a dataset.

Parameters:
  • dataset_rid (str) – the dataset rid

  • exclude_hidden_files (bool) – if hidden files should be excluded (e.g. _log files)

  • view (str) – branch or transaction rid of the dataset

  • logical_path (str) – If logical_path is absent, returns all files in the view. If logical_path matches a file exactly, returns just that file. Otherwise, returns all files in the “directory” of logical_path (a slash is added to the end of logical_path if necessary and a prefix match is performed)

  • detail (bool) – if passed as True, returns complete response from catalog API, otherwise only returns logicalPath

  • include_open_exclusive_transaction (bool) – if files added in open transaction should be returned as well in the response

Returns:

filenames

Return type:

list

Raises:

DatasetNotFoundError – if dataset was not found
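
Example

A minimal sketch; dataset_rid is a placeholder and the returned filenames are illustrative:

>>> fc.list_dataset_files(dataset_rid, exclude_hidden_files=True, view="master")
['spark/part-00000-<...>.snappy.parquet']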

get_dataset_stats(dataset_rid, view='master')[source]#

Returns the response from the Foundry catalog stats endpoint.

Parameters:
  • dataset_rid (str) – the dataset rid

  • view (str) – branch or transaction rid of the dataset

Returns:

with the keys sizeInBytes, numFiles, hiddenFilesSizeInBytes, numHiddenFiles, numTransactions

Return type:

dict

foundry_stats(dataset_rid, end_transaction_rid, branch='master')[source]#

Returns row counts and size of the dataset/view.

Parameters:
  • dataset_rid (str) – The dataset RID.

  • end_transaction_rid (str) – The specific transaction RID, which will be used to return the statistics.

  • branch (str) – The branch to query

Returns:

With the following structure:

{
    "datasetRid": str,
    "branch": str,
    "endTransactionRid": str,
    "schemaId": str,
    "computedDatasetStats": {
        "rowCount": str | None,
        "sizeInBytes": str,
        "columnStats": {
            "...": {
                "nullCount": str | None,
                "uniqueCount": str | None,
                "avgLength": str | None,
                "maxLength": str | None,
            }
        },
    },
}

Return type:

dict

download_dataset_file(dataset_rid, output_directory, foundry_file_path, view='master')[source]#

Downloads a single foundry dataset file into a directory.

Creates subfolders if necessary.

Parameters:
  • dataset_rid (str) – the dataset rid

  • output_directory (str | None) – the local output directory for the file, or None; if None is passed, the byte content of the file is returned

  • foundry_file_path (str) – the file_path on the foundry file system

  • view (str) – branch or transaction rid of the dataset

Returns:

local file path if output_directory was passed, otherwise the file content as bytes

Return type:

str | bytes

Raises:

ValueError – If download failed

download_dataset_files(dataset_rid, output_directory, files=None, view='master', parallel_processes=None)[source]#

Downloads files of a dataset (in parallel) to a local output directory.

Parameters:
  • dataset_rid (str) – the dataset rid

  • files (list | None) – list of files or None, in which case all files are downloaded

  • output_directory (str) – the output directory for the files

  • view (str) – branch or transaction rid of the dataset

  • parallel_processes (int | None) – set the number of download threads; default value is calculated as multiprocessing.cpu_count() - 1

Returns:

paths of the downloaded files

Return type:

List[str]
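
Example

A sketch downloading a whole dataset in parallel; dataset_rid and the output directory are placeholders:

>>> fc.download_dataset_files(
...     dataset_rid, output_directory="/tmp/iris", view="master", parallel_processes=4
... )
['/tmp/iris/spark/part-00000-<...>.snappy.parquet']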

download_dataset_files_temporary(dataset_rid, files=None, view='master', parallel_processes=None)[source]#

Downloads all files of a dataset to a temporary directory.

The directory is deleted when the context exits. The function yields the path of the temporary directory. Example usage:

>>> import pandas as pd
>>> from pathlib import Path
>>> from pyarrow import parquet
>>> with client.download_dataset_files_temporary(
...     dataset_rid='ri.foundry.main.dataset.1', view='master'
... ) as temp_folder:
...     # Read using pandas
...     df = pd.read_parquet(temp_folder)
...     # Read using pyarrow; pass only the files, which are normally in subfolder 'spark'
...     pq = parquet.ParquetDataset(
...         path_or_paths=[x for x in Path(temp_folder).glob('**/*') if x.is_file()]
...     )
Parameters:
  • dataset_rid (str) – the dataset rid

  • files (List[str]) – list of files or None, in which case all files are downloaded

  • view (str) – branch or transaction rid of the dataset

  • parallel_processes (int) – Set number of threads for download

Yields:

Iterator[str] – path to temporary folder containing root of dataset files

Return type:

Iterator[str]

get_dataset_as_raw_csv(dataset_rid, branch='master')[source]#

Uses the csv API to download a dataset as csv.

Parameters:
  • dataset_rid (str) – the dataset rid

  • branch (str) – branch of the dataset

Returns:

with the csv stream; it can be converted to a pandas DataFrame as shown below.

Return type:

Response
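
Example

The conversion mentioned above, spelled out with its imports; dataset_rid is a placeholder:

>>> import io
>>> import pandas as pd
>>> response = fc.get_dataset_as_raw_csv(dataset_rid, branch="master")
>>> df = pd.read_csv(io.BytesIO(response.content))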

query_foundry_sql_legacy(query: str, return_type: Literal['pandas'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pd.core.frame.DataFrame[source]#
query_foundry_sql_legacy(query: str, return_type: Literal['spark'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pyspark.sql.DataFrame
query_foundry_sql_legacy(query: str, return_type: Literal['arrow'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pa.Table
query_foundry_sql_legacy(query: str, return_type: Literal['raw'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) tuple[dict, list[list]]
query_foundry_sql_legacy(query: str, return_type: api_types.SQLReturnType = 'raw', branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) tuple[dict, list[list]] | pd.core.frame.DataFrame | pa.Table | pyspark.sql.DataFrame

Queries the dataproxy query API with spark SQL.

Example

query_foundry_sql_legacy(
    query="SELECT * FROM `/Global/Foundry Operations/Foundry Support/iris`",
    branch="master",
)

Parameters:
  • query – the SQL query in Foundry Spark dialect

  • return_type – See foundry_dev_tools.foundry_api_client.SQLReturnType

  • branch – the dataset branch

  • sql_dialect – the SQL dialect

  • timeout – query timeout, default value is 600 seconds
Returns:

(foundry_schema, data)

data contains the data matrix, foundry_schema the Foundry schema (fieldSchemaList key). It can be converted to a pandas DataFrame, see below

foundry_schema, data = self.query_foundry_sql_legacy(query, branch)
df = pd.DataFrame(
    data=data, columns=[e["name"] for e in foundry_schema["fieldSchemaList"]]
)

Return type:

tuple (dict, list)

query_foundry_sql(query: str, return_type: Literal['pandas'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pd.core.frame.DataFrame[source]#
query_foundry_sql(query: str, return_type: Literal['spark'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pyspark.sql.DataFrame
query_foundry_sql(query: str, return_type: Literal['arrow'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) pa.Table
query_foundry_sql(query: str, return_type: Literal['raw'], branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) tuple[dict, list[list]]
query_foundry_sql(query: str, return_type: api_types.SQLReturnType = 'pandas', branch: api_types.Ref = 'master', sql_dialect: api_types.SqlDialect = 'SPARK', timeout: int = 600) tuple[dict, list[list]] | pd.core.frame.DataFrame | pa.Table | pyspark.sql.DataFrame

Queries the Foundry SQL server with the Spark SQL dialect.

Uses Arrow IPC to communicate with the Foundry SQL Server Endpoint.

Falls back to query_foundry_sql_legacy in case pyarrow is not installed or the query does not return Arrow format.

Example

>>> df1 = client.query_foundry_sql(
...     "SELECT * FROM `/Global/Foundry Operations/Foundry Support/iris`"
... )
>>> query = (
...     "SELECT col1 FROM {start_transaction_rid}:{end_transaction_rid}@{branch}.`{dataset_path_or_rid}`"
...     " WHERE filterColumns = 'value1' LIMIT 1"
... )
>>> df2 = client.query_foundry_sql(query)

Parameters:
  • query – The SQL query in Foundry Spark dialect (use backticks instead of quotes)

  • branch – the dataset branch

  • sql_dialect – the sql dialect

  • return_type – See foundry_dev_tools.foundry_api_client.SQLReturnType

  • timeout – Query Timeout, default value is 600 seconds

Returns:

A pandas DataFrame, Spark DataFrame or pyarrow.Table with the result.

Return type:

pd.DataFrame | pa.Table | pyspark.sql.DataFrame

Raises:

ValueError – Only direct read eligible queries can be returned as arrow Table.

get_user_info()[source]#

Returns the multipass user info.

Return type:

dict

{
    "id": "<multipass-id>",
    "username": "<username>",
    "attributes": {
        "multipass:email:primary": ["<email>"],
        "multipass:given-name": ["<given-name>"],
        "multipass:organization": ["<your-org>"],
        "multipass:organization-rid": ["ri.multipass..organization. ..."],
        "multipass:family-name": ["<family-name>"],
        "multipass:upn": ["<upn>"],
        "multipass:realm": ["<your-company>"],
        "multipass:realm-name": ["<your-org>"],
    },
}
get_group(group_id)[source]#

Returns the multipass group information.

Returns:

The API response

Return type:

dict

Parameters:

group_id (str)

{
    'id': '<id>',
    'name': '<groupname>',
    'attributes': {
        'multipass:realm': ['palantir-internal-realm'],
        'multipass:organization': ['<your-org>'],
        'multipass:organization-rid': ['ri.multipass..organization.<...>'],
        'multipass:realm-name': ['Palantir Internal'],
    },
}
delete_group(group_id)[source]#

Deletes a multipass group.

Parameters:

group_id (str) – the group id to delete

Return type:

requests.Response

create_third_party_application(client_type, display_name, description, grant_types, redirect_uris, logo_uri, organization_rid, allowed_organization_rids=None, resources=None, operations=None, marking_ids=None, role_set_id=None, role_grants=None, **kwargs)[source]#

Creates Foundry Third Party application (TPA).

See https://www.palantir.com/docs/foundry/platform-security-third-party/third-party-apps-overview/. The user must have ‘Manage OAuth 2.0 clients’ workflow permissions.

Parameters:
  • client_type (Literal['CONFIDENTIAL', 'PUBLIC']) – Server Application (CONFIDENTIAL) or Native or single-page application (PUBLIC)

  • display_name (str) – Display name of the TPA

  • description (str | None) – Long description of the TPA

  • grant_types (list[Literal['AUTHORIZATION_CODE', 'CLIENT_CREDENTIALS', 'REFRESH_TOKEN']]) – Usually, [“AUTHORIZATION_CODE”, “REFRESH_TOKEN”] (authorization code grant) or [“REFRESH_TOKEN”, “CLIENT_CREDENTIALS”] (client credentials grant)

  • redirect_uris (list | None) – Redirect URLs of TPA, used in combination with AUTHORIZATION_CODE grant

  • logo_uri (str | None) – URI or embedded image ‘data:image/png;base64,<…>’

  • organization_rid (str) – Parent Organization of this TPA

  • allowed_organization_rids (list | None) – Passing None or empty list means TPA is activated for all Foundry organizations

  • resources (list[str] | None) – Resources allowed to access by the client, otherwise no resource restrictions

  • operations (list[str] | None) – Operations the client can be granted, otherwise no operation restrictions

  • marking_ids (list[str] | None) – Markings allowed to access by the client, otherwise no marking restrictions

  • role_set_id (str | None) – roles allowed for this client, defaults to oauth2-client

  • role_grants (dict[str, list[str]] | None) – mapping between role ids and principal ids (dict[role_id, list[principal_id]])

  • **kwargs – gets passed to APIClient.api_request()

Return type:

dict

See below for the structure

{
    "clientId":"<...>",
    "clientSecret":"<...>",
    "clientType":"<CONFIDENTIAL/PUBLIC>",
    "organizationRid":"<...>",
    "displayName":"<...>",
    "description":null,
    "logoUri":null,
    "grantTypes":[<"AUTHORIZATION_CODE","REFRESH_TOKEN","CLIENT_CREDENTIALS">],
    "redirectUris":[],
    "allowedOrganizationRids":[]
}
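
Example

A hedged sketch of registering a client-credentials TPA; the organization rid is a placeholder and the grant types follow the parameter description above:

>>> app = fc.create_third_party_application(
...     client_type="CONFIDENTIAL",
...     display_name="my-service",
...     description=None,
...     grant_types=["REFRESH_TOKEN", "CLIENT_CREDENTIALS"],
...     redirect_uris=None,
...     logo_uri=None,
...     organization_rid="ri.multipass..organization.<...>",
... )
>>> client_id, client_secret = app["clientId"], app["clientSecret"]
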
delete_third_party_application(client_id)[source]#

Deletes a Third Party Application.

Parameters:

client_id (str) – The unique identifier of the TPA.

Return type:

requests.Response

update_third_party_application(client_id, client_type, display_name, description, grant_types, redirect_uris, logo_uri, organization_rid, allowed_organization_rids=None, resources=None, operations=None, marking_ids=None, role_set_id=None, **kwargs)[source]#

Updates Foundry Third Party application (TPA).

See https://www.palantir.com/docs/foundry/platform-security-third-party/third-party-apps-overview/. The user must have ‘Manage OAuth 2.0 clients’ workflow permissions.

Parameters:
  • client_id (str) – The unique identifier of the TPA.

  • client_type (Literal['CONFIDENTIAL', 'PUBLIC']) – Server Application (CONFIDENTIAL) or Native or single-page application (PUBLIC)

  • display_name (str) – Display name of the TPA

  • description (str | None) – Long description of the TPA

  • grant_types (list[Literal['AUTHORIZATION_CODE', 'CLIENT_CREDENTIALS', 'REFRESH_TOKEN']]) – Usually, [“AUTHORIZATION_CODE”, “REFRESH_TOKEN”] (authorization code grant) or [“REFRESH_TOKEN”, “CLIENT_CREDENTIALS”] (client credentials grant)

  • redirect_uris (list | None) – Redirect URLs of TPA, used in combination with AUTHORIZATION_CODE grant

  • logo_uri (str | None) – URI or embedded image ‘data:image/png;base64,<…>’

  • organization_rid (str) – Parent Organization of this TPA

  • allowed_organization_rids (list | None) – Passing None or empty list means TPA is activated for all Foundry organizations

  • resources (list[str] | None) – Resources allowed to access by the client, otherwise no resource restrictions

  • operations (list[str] | None) – Operations the client can be granted, otherwise no operation restrictions

  • marking_ids (list[str] | None) – Markings allowed to access by the client, otherwise no marking restrictions

  • role_set_id (str | None) – roles allowed for this client, defaults to oauth2-client

  • **kwargs – gets passed to APIClient.api_request()

Return type:

dict

Response in the following structure:

{
    "clientId":"<...>",
    "clientType":"<CONFIDENTIAL/PUBLIC>",
    "organizationRid":"<...>",
    "displayName":"<...>",
    "description":null,
    "logoUri":null,
    "grantTypes":[<"AUTHORIZATION_CODE","REFRESH_TOKEN","CLIENT_CREDENTIALS">],
    "redirectUris":[],
    "allowedOrganizationRids":[]
}
rotate_third_party_application_secret(client_id)[source]#

Rotates Foundry Third Party application (TPA) secret.

Parameters:

client_id (str) – The unique identifier of the TPA.

Returns:

See below for the structure

Return type:

dict

{
    "clientId":"<...>",
    "clientSecret": "<...>",
    "clientType":"<CONFIDENTIAL/PUBLIC>",
    "organizationRid":"<...>",
    "displayName":"<...>",
    "description":null,
    "logoUri":null,
    "grantTypes":[<"AUTHORIZATION_CODE","REFRESH_TOKEN","CLIENT_CREDENTIALS">],
    "redirectUris":[],
    "allowedOrganizationRids":[]
}
enable_third_party_application(client_id, operations=None, resources=None, marking_ids=None, grant_types=None, require_consent=True, **kwargs)[source]#

Enables Foundry Third Party application (TPA).

Parameters:
  • client_id (str) – The unique identifier of the TPA.

  • operations (list | None) – Scopes that this TPA is allowed to use (To be confirmed) if None or empty list is passed, all scopes will be activated.

  • resources (list | None) – Compass Project RID’s that this TPA is allowed to access, if None or empty list is passed, unrestricted access will be given.

  • marking_ids (list[str] | None) – Marking Ids that this TPA is allowed to access, if None or empty list is passed, unrestricted access will be given.

  • grant_types (list[Literal['AUTHORIZATION_CODE', 'CLIENT_CREDENTIALS', 'REFRESH_TOKEN']] | None) – Grant types that this TPA is allowed to use to access resources; if None is passed, there are no grant type restrictions; if an empty list is passed, no grant types are allowed for this TPA

  • require_consent (bool) – Whether users need to provide consent for this application to act on their behalf; defaults to True

  • **kwargs – gets passed to APIClient.api_request()

Return type:

dict

Response with the following structure:

{
    "client": {
        "clientId": "<...>",
        "organizationRid": "ri.multipass..organization.<...>",
        "displayName": "<...>",
        "description": None,
        "logoUri": None,
    },
    "installation": {"resources": [], "operations": [], "markingIds": None},
}
start_checks_and_build(repository_id, ref_name, commit_hash, file_paths)[source]#

Starts checks and builds.

Parameters:
  • repository_id (str) – the repository id where the transform is located

  • ref_name (str) – the git ref_name for the branch

  • commit_hash (str) – the git commit hash

  • file_paths (List[str]) – a list of python transform files

Returns:

the JSON API response

Return type:

dict

get_build(build_rid)[source]#

Get information about the build.

Parameters:

build_rid (str) – the build RID

Returns:

the JSON API response

Return type:

dict

get_job_report(job_rid)[source]#

Get the report for a job.

Parameters:

job_rid (str) – the job RID

Returns:

the job report response

Return type:

dict

get_s3fs_storage_options()[source]#

Get the Foundry S3 credentials in the s3fs storage_options format.

Example

>>> import pandas as pd
>>> fc = FoundryRestClient()
>>> storage_options = fc.get_s3fs_storage_options()
>>> df = pd.read_parquet(
...     "s3://ri.foundry.main.dataset.<uuid>/spark", storage_options=storage_options
... )
Return type:

dict

get_boto3_s3_client(**kwargs)[source]#

Returns the boto3 s3 client with credentials applied and endpoint url set.

See foundry_dev_tools.clients.s3_client.api_assume_role_with_webidentity.

Example

>>> from foundry_dev_tools import FoundryRestClient
>>> fc = FoundryRestClient()
>>> s3_client = fc.get_boto3_s3_client()
>>> s3_client
Parameters:

**kwargs – gets passed to boto3.session.Session.client(), endpoint_url will be overwritten

get_boto3_s3_resource(**kwargs)[source]#

Returns boto3 s3 resource with credentials applied and endpoint url set.

Parameters:

**kwargs – gets passed to boto3.session.Session.resource(), endpoint_url will be overwritten
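
Example

A sketch listing the files of a dataset through the S3-compatible API; as in the s3fs example above, the dataset rid is assumed to act as the bucket name:

>>> s3 = fc.get_boto3_s3_resource()
>>> bucket = s3.Bucket("ri.foundry.main.dataset.<uuid>")
>>> for obj in bucket.objects.all():
...     print(obj.key)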

get_s3_credentials(expiration_duration=3600)[source]#

Parses the AssumeRoleWithWebIdentity response and caches the credentials.

See foundry_dev_tools.clients.s3_client.api_assume_role_with_webidentity.

Parameters:

expiration_duration (int) – seconds the credentials are valid for, defaults to 3600

Return type:

dict