foundry_dev_tools.clients.s3_client module

S3Client for the S3-compatible dataset API.

class foundry_dev_tools.clients.s3_client.S3Client

Bases: object

Client for the S3-compatible dataset API.

__init__(context)
Parameters:

context (FoundryContext)

get_url()

Return the S3 endpoint URL.

Return type:

str

get_s3fs_storage_options()

Get the Foundry S3 credentials in the s3fs storage_options format.

Example

>>> import pandas as pd
>>> from foundry_dev_tools import FoundryContext
>>> ctx = FoundryContext()
>>> storage_options = ctx.s3.get_s3fs_storage_options()
>>> df = pd.read_parquet(
...     "s3://ri.foundry.main.dataset.<uuid>/spark", storage_options=storage_options
... )

Return type:

dict
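Because storage_options dictionaries are, by fsspec convention, forwarded as keyword arguments to the filesystem class, the same dictionary can probably also be used to build an s3fs filesystem directly. The sketch below assumes that convention holds here; dataset_path and list_dataset_files are hypothetical helper names, not part of foundry_dev_tools.

```python
def dataset_path(dataset_rid: str, *parts: str) -> str:
    """Join a dataset RID and path segments into a bucket-style path (no scheme)."""
    # Drop empty segments and stray slashes so the result has single separators.
    cleaned = [p.strip("/") for p in parts if p.strip("/")]
    return "/".join([dataset_rid, *cleaned])


def list_dataset_files(ctx, dataset_rid: str, prefix: str = "") -> list:
    """List object paths in a dataset (hypothetical helper, needs s3fs installed)."""
    # Imported lazily so dataset_path stays usable without s3fs.
    import s3fs

    # Assumption: the storage options are valid S3FileSystem keyword arguments.
    fs = s3fs.S3FileSystem(**ctx.s3.get_s3fs_storage_options())
    return fs.ls(dataset_path(dataset_rid, prefix), detail=False)
```

With a configured context, list_dataset_files(ctx, "ri.foundry.main.dataset.<uuid>", "spark") would list the files under the spark prefix.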

get_polars_storage_options()

Get the Foundry S3 credentials in the format that polars expects.

See https://docs.rs/object_store/latest/object_store/aws/enum.AmazonS3ConfigKey.html for the configuration keys.

Example

>>> import polars as pl
>>> from foundry_dev_tools import FoundryContext
>>> ctx = FoundryContext()
>>> storage_options = ctx.s3.get_polars_storage_options()
>>> df = pl.read_parquet(
...     "s3://ri.foundry.main.dataset.<uuid>/**/*.parquet", storage_options=storage_options
... )

Return type:

dict
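For larger datasets, a lazy scan may be preferable to reading everything eagerly. The following is a minimal sketch that assumes the returned options are also accepted by pl.scan_parquet, which takes a storage_options argument; dataset_parquet_glob and scan_dataset are hypothetical helper names.

```python
def dataset_parquet_glob(dataset_rid: str) -> str:
    """Build the glob that matches every parquet file of a dataset."""
    return f"s3://{dataset_rid}/**/*.parquet"


def scan_dataset(ctx, dataset_rid: str, columns=None):
    """Return a polars LazyFrame over a dataset (hypothetical helper)."""
    # Imported lazily so the glob helper works without polars installed.
    import polars as pl

    lazy_frame = pl.scan_parquet(
        dataset_parquet_glob(dataset_rid),
        storage_options=ctx.s3.get_polars_storage_options(),
    )
    if columns is not None:
        # Selecting columns before collecting lets polars skip unused data.
        lazy_frame = lazy_frame.select(columns)
    return lazy_frame
```

Calling scan_dataset(ctx, "ri.foundry.main.dataset.<uuid>", ["id"]).collect() would then materialize only the selected column.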

get_duckdb_create_secret_string()

Returns a CREATE SECRET SQL string with the Foundry configuration.

See https://duckdb.org/docs/extensions/httpfs/s3api.html#config-provider

Example

>>> import duckdb
>>> from foundry_dev_tools import FoundryContext
>>> ctx = FoundryContext()
>>> con = duckdb.connect()
>>> con.execute(ctx.s3.get_duckdb_create_secret_string())
>>> df = con.execute(
...     "SELECT * FROM read_parquet('s3://ri.foundry.main.dataset.<uuid>/**/*.parquet') LIMIT 1;"
... ).df()

Return type:

str

get_boto3_client(**kwargs)

Returns a boto3 S3 client with credentials applied and the endpoint URL set.

See foundry_dev_tools.clients.s3_client.api_assume_role_with_webidentity.

Example

>>> from foundry_dev_tools import FoundryContext
>>> ctx = FoundryContext()
>>> s3_client = ctx.s3.get_boto3_client()
>>> s3_client

Parameters:

**kwargs – passed to boto3.session.Session.client(); any endpoint_url given here is overwritten
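Once the client exists, it can be used like any other boto3 S3 client. The sketch below lists every object key in a dataset with the standard list_objects_v2 paginator; it assumes the dataset RID serves as the bucket name, as in the s3:// URLs used elsewhere on this page, and that the endpoint supports ListObjectsV2. list_dataset_keys is a hypothetical helper name.

```python
def list_dataset_keys(s3_client, dataset_rid: str, prefix: str = "") -> list:
    """Collect all object keys under a prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    # Assumption: the dataset RID is used as the bucket name.
    for page in paginator.paginate(Bucket=dataset_rid, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

For example, list_dataset_keys(ctx.s3.get_boto3_client(), "ri.foundry.main.dataset.<uuid>", "spark/").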

get_boto3_resource(**kwargs)

Returns a boto3 S3 resource with credentials applied and the endpoint URL set.

Parameters:

**kwargs – passed to boto3.session.Session.resource(); any endpoint_url given here is overwritten
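The resource interface offers an object-oriented view of the same data. As a minimal sketch, assuming the dataset RID is again used as the bucket name, the total size of the files under a prefix could be computed like this; dataset_size_bytes is a hypothetical helper name.

```python
def dataset_size_bytes(s3_resource, dataset_rid: str, prefix: str = "") -> int:
    """Sum the sizes of all objects under a prefix using the boto3 resource API."""
    # Assumption: the dataset RID is used as the bucket name.
    bucket = s3_resource.Bucket(dataset_rid)
    # Each item yielded by the filter is an ObjectSummary with a size in bytes.
    return sum(obj.size for obj in bucket.objects.filter(Prefix=prefix))
```

For example, dataset_size_bytes(ctx.s3.get_boto3_resource(), "ri.foundry.main.dataset.<uuid>").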

get_credentials(expiration_duration=3600)

Parses the AssumeRoleWithWebIdentity response and caches the credentials.

See foundry_dev_tools.clients.s3_client.api_assume_role_with_webidentity.

Parameters:

expiration_duration (int) – number of seconds the credentials should remain valid

Return type:

dict
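To illustrate what caching temporary credentials can look like, here is a generic, expiry-aware cache. It is not the implementation used by S3Client; the class name CachedCredentials, the refresh_margin parameter, and the injected clock are assumptions made for this sketch.

```python
import time


class CachedCredentials:
    """Cache a fetched value and refetch it shortly before it expires."""

    def __init__(self, fetch, lifetime=3600, refresh_margin=60, clock=time.monotonic):
        self._fetch = fetch            # callable taking the lifetime in seconds
        self._lifetime = lifetime      # how long fetched credentials stay valid
        self._margin = refresh_margin  # refetch this many seconds before expiry
        self._clock = clock            # injectable for testing
        self._value = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Refetch on first use or once inside the refresh margin.
        if self._value is None or now >= self._expires_at - self._margin:
            self._value = self._fetch(self._lifetime)
            self._expires_at = now + self._lifetime
        return self._value
```

Refreshing slightly early avoids handing out credentials that expire in the middle of a request.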

api_assume_role_with_webidentity(expiration_duration=3600)

Calls the AssumeRoleWithWebIdentity API to get temporary S3 credentials.

Parameters:

expiration_duration (int) – number of seconds the credentials should remain valid; defaults to 3600, which is the upper bound

Return type:

requests.Response
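The raw response can also be inspected by hand. The sketch below assumes the body follows the XML layout of the AWS STS AssumeRoleWithWebIdentity response, with AccessKeyId, SecretAccessKey, SessionToken and Expiration elements; that layout is an assumption about this endpoint, and parse_sts_credentials is a hypothetical helper rather than the parser the library uses.

```python
import xml.etree.ElementTree as ET


def parse_sts_credentials(xml_text: str) -> dict:
    """Extract the temporary credential fields from an STS-style XML body."""
    wanted = {"AccessKeyId", "SecretAccessKey", "SessionToken", "Expiration"}
    found = {}
    for element in ET.fromstring(xml_text).iter():
        # Strip a "{namespace}" prefix so the lookup works with or without one.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in wanted:
            found[local_name] = (element.text or "").strip()
    missing = wanted - found.keys()
    if missing:
        raise ValueError(f"response is missing fields: {sorted(missing)}")
    return found
```

With a configured context, this could be called as parse_sts_credentials(ctx.s3.api_assume_role_with_webidentity().text).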