API documentation#

Read and download STAC items, item collections, collections, and assets.

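The main entry points are free functions such as download_item(), download_item_collection(), and download_collection(), all documented below.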

Use Config to configure how assets are downloaded. Every client inherits from Client, which defines a common interface for accessing assets. Writing items, item collections, collections, and assets is currently unsupported, but is on the roadmap.

exception stac_asset.AssetOverwriteError(hrefs: List[str])#

Raised when an asset would be overwritten during download.

class stac_asset.Client#

An abstract base class for all clients.

async assert_href_exists(href: str) None#

Asserts that an href exists.

The default implementation naïvely opens the href and reads one chunk. Clients may implement specialized behavior.

Parameters:

href – The href to open

Raises:

Exception – The underlying error when trying to open the file.

async close() None#

Close this client.

async download_href(href: str, path: PathLike[Any] | str, clean: bool = True, content_type: str | None = None, messages: Queue[Message] | None = None) None#

Downloads a file to the local filesystem.

Parameters:
  • href – The input href

  • path – The output file path

  • clean – If an error occurs, delete the output file if it exists

  • content_type – The expected content type

  • messages – An optional queue to use for progress reporting
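The effect of the clean flag can be illustrated with a minimal, standard-library-only sketch. It is not the library's implementation: write_chunks and failing_source are hypothetical names, and the chunk source stands in for whatever a client yields when it opens an href.

```python
import asyncio
import os
import tempfile
from typing import AsyncIterator


async def write_chunks(
    chunks: AsyncIterator[bytes], path: str, clean: bool = True
) -> None:
    """Write chunks to path; on error, optionally remove the partial file."""
    try:
        with open(path, "wb") as f:
            async for chunk in chunks:
                f.write(chunk)
    except Exception:
        # Mirrors the documented flag: delete the output file if it exists.
        if clean and os.path.exists(path):
            os.remove(path)
        raise


async def failing_source() -> AsyncIterator[bytes]:
    """Hypothetical source that fails after yielding one chunk."""
    yield b"partial"
    raise ConnectionError("connection dropped")


def partial_file_remains(clean: bool) -> bool:
    """Return True if a partial file is left behind after a failed write."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "asset.bin")
        try:
            asyncio.run(write_chunks(failing_source(), path, clean=clean))
        except ConnectionError:
            pass
        return os.path.exists(path)
```

With clean=True the half-written file is removed before the error propagates; with clean=False it stays on disk for inspection.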

async classmethod from_config(config: Config) T#

Creates a client using the provided configuration.

Needed because some client setups require async operations.

Returns:

A new client

Return type:

T

async href_exists(href: str) bool#

Returns true if the href exists.

The default implementation naïvely opens the href and reads one chunk. Clients may implement specialized behavior.

Parameters:

href – The href to open

Returns:

Whether the href exists

Return type:

bool
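The "open the href and read one chunk" default described above can be sketched as follows. This is an illustration under assumptions, not the library's code: SketchClient and InMemoryClient are hypothetical stand-ins, and the sketch assumes the boolean check simply wraps the asserting one.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import AsyncIterator, Dict


class SketchClient(ABC):
    """Illustrative stand-in for a client; not the library's Client class."""

    @abstractmethod
    def open_href(self, href: str) -> AsyncIterator[bytes]:
        """Yield the bytes behind href in chunks."""

    async def assert_href_exists(self, href: str) -> None:
        # Naive check: open the href and read at most one chunk.
        async for _ in self.open_href(href):
            break

    async def href_exists(self, href: str) -> bool:
        # Assumption: any error while opening counts as "does not exist".
        try:
            await self.assert_href_exists(href)
        except Exception:
            return False
        return True


class InMemoryClient(SketchClient):
    """Hypothetical client that serves bytes from a dictionary."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = files

    async def open_href(self, href: str) -> AsyncIterator[bytes]:
        data = self.files[href]  # raises KeyError for unknown hrefs
        for start in range(0, len(data), 4):
            yield data[start : start + 4]


client = InMemoryClient({"memory://a": b"0123456789"})
```

A specialized client can override the asserting method with something cheaper, such as a metadata request, without changing the boolean wrapper.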

async open_href(href: str, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes]#

Opens an href and yields an iterator over its bytes.

Parameters:
  • href – The input href

  • content_type – The expected content type

  • messages – An optional queue to use for progress reporting

Yields:

AsyncIterator[bytes] – An iterator over chunks of the read file

abstract async open_url(url: URL, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes]#

Opens a url and yields an iterator over its bytes.

This is the core method that all clients must implement.

Parameters:
  • url – The input url

  • content_type – The expected content type, to be checked by the client implementations

  • messages – An optional queue to use for progress reporting

Yields:

AsyncIterator[bytes] – An iterator over chunks of the read file

class stac_asset.Config(alternate_assets: List[str] = <factory>, file_name_strategy: FileNameStrategy = FileNameStrategy.FILE_NAME, warn: bool = False, fail_fast: bool = False, error_strategy: ErrorStrategy = ErrorStrategy.DELETE, exclude: List[str] = <factory>, include: List[str] = <factory>, make_directory: bool = True, clean: bool = True, overwrite: bool = False, earthdata_token: str | None = None, s3_region_name: str = 'us-west-2', s3_requester_pays: bool = False, s3_retry_mode: str = 'adaptive', s3_max_attempts: int = 10)#

Configuration for downloading items and their assets.

alternate_assets: List[str]#

Alternate asset keys to prefer, if available.

clean: bool = True#

If true, delete the partially downloaded file when an error occurs during download.

copy() Config#

Returns a deep copy of this config.

Returns:

A deep copy of this config.

Return type:

Config

earthdata_token: str | None = None#

A token for logging in to Earthdata.

error_strategy: ErrorStrategy = ErrorStrategy.DELETE#

The strategy to use when errors occur during download.

exclude: List[str]#

Assets to exclude from the download.

Mutually exclusive with include.

fail_fast: bool = False#

If an error occurs during download, fail immediately.

By default, all downloads are completed before raising/warning any errors. Mutually exclusive with warn.

file_name_strategy: FileNameStrategy = FileNameStrategy.FILE_NAME#

The file name strategy to use when downloading assets.

include: List[str]#

Assets to include in the download.

Mutually exclusive with exclude.

make_directory: bool = True#

Whether to create the output directory.

If False, and the output directory does not exist, an error will be raised.

overwrite: bool = False#

Download files even if they already exist locally.

s3_max_attempts: int = 10#

The maximum number of attempts when downloading assets from s3.

s3_region_name: str = 'us-west-2'#

Default s3 region.

s3_requester_pays: bool = False#

If using the s3 client, enable requester pays.

s3_retry_mode: str = 'adaptive'#

The retry mode to use for s3 requests.

validate() None#

Validates this configuration.

Raises:

CannotIncludeAndExclude – include and exclude are mutually exclusive
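The mutual-exclusion rules can be sketched with a small dataclass. SketchConfig and SketchConfigError are hypothetical stand-ins, and checking fail_fast against warn inside validate() is an assumption: the reference above only states that the two options are mutually exclusive, not where that is enforced.

```python
from dataclasses import dataclass, field
from typing import List


class SketchConfigError(Exception):
    """Hypothetical stand-in for a configuration error."""


@dataclass
class SketchConfig:
    """Illustrative subset of the options; not the library's Config."""

    include: List[str] = field(default_factory=list)
    exclude: List[str] = field(default_factory=list)
    fail_fast: bool = False
    warn: bool = False

    def validate(self) -> None:
        # An asset filter is either an allow-list or a deny-list, never both.
        if self.include and self.exclude:
            raise SketchConfigError("include and exclude are mutually exclusive")
        # Assumption: the fail_fast/warn conflict is also checked here.
        if self.fail_fast and self.warn:
            raise SketchConfigError("fail_fast and warn are mutually exclusive")
```

Validating up front means a conflicting configuration fails before any network request is made.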

warn: bool = False#

If an error occurs during download, warn instead of raising the error.

exception stac_asset.ConfigError#

Raised if the configuration is not valid.

exception stac_asset.ContentTypeError(actual: str, expected: str, *args: Any, **kwargs: Any)#

The expected content type does not match the actual content type.

exception stac_asset.DownloadError(exceptions: List[Exception], *args: Any, **kwargs: Any)#

A collection of exceptions encountered while downloading.

exception stac_asset.DownloadWarning#

A warning for when something couldn’t be downloaded.

Used when we don’t want to cancel all downloads, but still inform the user about the problem.
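The difference between raising a collected error and warning can be sketched with asyncio.gather. This is a minimal illustration, not the library's implementation: SketchDownloadError, SketchDownloadWarning, run_all, ok, and broken are all hypothetical names.

```python
import asyncio
import warnings
from typing import Awaitable, List, Sequence


class SketchDownloadWarning(Warning):
    """Hypothetical stand-in for a download warning."""


class SketchDownloadError(Exception):
    """Hypothetical stand-in that carries every collected exception."""

    def __init__(self, exceptions: List[Exception]) -> None:
        super().__init__(f"{len(exceptions)} download(s) failed")
        self.exceptions = exceptions


async def run_all(jobs: Sequence[Awaitable[str]], warn: bool = False) -> List[str]:
    # Let every job finish, capturing failures instead of cancelling siblings.
    results = await asyncio.gather(*jobs, return_exceptions=True)
    errors = [r for r in results if isinstance(r, Exception)]
    if errors and warn:
        for error in errors:
            warnings.warn(str(error), SketchDownloadWarning)
    elif errors:
        raise SketchDownloadError(errors)
    return [r for r in results if not isinstance(r, Exception)]


async def ok(name: str) -> str:
    return name


async def broken(name: str) -> str:
    raise OSError(f"could not fetch {name}")
```

In the warning mode the caller still receives the successful results; in the default mode all failures arrive together in one exception.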

class stac_asset.EarthdataClient(session: ClientSession, check_content_type: bool = True)#

Access data from https://www.earthdata.nasa.gov/.

To access data, you’ll need a personal access token.

  1. Create a new personal access token by going to https://urs.earthdata.nasa.gov/profile and then clicking “Generate Token” (you’ll need to log in).

  2. Set an environment variable named EARTHDATA_PAT to your token.

  3. Use EarthdataClient.from_config() to create a new client.

You can also provide your token directly to EarthdataClient.login().

async classmethod from_config(config: Config) EarthdataClient#

Logs in to Earthdata and returns the default earthdata client.

Uses a token stored in the EARTHDATA_PAT environment variable, if the token is not provided in the config.

Parameters:

config – A configuration object.

Returns:

A logged-in Earthdata client.

Return type:

EarthdataClient

async classmethod login(token: str | None = None) EarthdataClient#

Logs in to Earthdata and returns a client.

If token is not provided, it is read from the EARTHDATA_PAT environment variable.

Parameters:

token – The Earthdata bearer token

Returns:

A client configured to use the bearer token

Return type:

EarthdataClient

class stac_asset.ErrorStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Strategy to use when encountering errors during download.

DELETE = 2#

Delete the asset from the item.

KEEP = 1#

Keep the asset on the item with its original href.
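The two strategies can be sketched on a plain dictionary of asset hrefs. SketchErrorStrategy and resolve_hrefs are hypothetical names used only for illustration; the enum values mirror the ones listed above.

```python
from enum import Enum
from typing import Dict


class SketchErrorStrategy(Enum):
    """Illustrative stand-in; values mirror the documented members."""

    KEEP = 1
    DELETE = 2


def resolve_hrefs(
    original: Dict[str, str],
    downloaded: Dict[str, str],
    strategy: SketchErrorStrategy,
) -> Dict[str, str]:
    """Return asset hrefs after a download in which some assets failed."""
    resolved: Dict[str, str] = {}
    for key, href in original.items():
        if key in downloaded:
            # Successful download: point the asset at the local file.
            resolved[key] = downloaded[key]
        elif strategy is SketchErrorStrategy.KEEP:
            # Failed download: keep the asset with its original href.
            resolved[key] = href
        # DELETE: the failed asset is dropped from the result.
    return resolved


original = {"data": "https://example.com/data.tif", "thumb": "https://example.com/t.png"}
downloaded = {"data": "./data.tif"}
```

DELETE leaves an item whose assets all resolve locally, while KEEP preserves a pointer to the remote file that could not be fetched.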

class stac_asset.FileNameStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Strategy to use for naming files.

FILE_NAME = 1#

Save the asset with the file name in its href.

Could potentially conflict with another asset with the same file name but different path.

KEY = 2#

Save the asset with its key as its file name.
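The naming conflict mentioned for FILE_NAME can be seen in a short sketch. SketchFileNameStrategy and file_name are hypothetical names, and keeping the href's extension in the KEY case is an assumption made for illustration, not documented behavior.

```python
from enum import Enum
from pathlib import PurePosixPath
from urllib.parse import urlparse


class SketchFileNameStrategy(Enum):
    """Illustrative stand-in; values mirror the documented members."""

    FILE_NAME = 1
    KEY = 2


def file_name(key: str, href: str, strategy: SketchFileNameStrategy) -> str:
    path = PurePosixPath(urlparse(href).path)
    if strategy is SketchFileNameStrategy.FILE_NAME:
        # Use the last path segment of the href.
        return path.name
    # Assumption: keep the href's extension so the file type stays recognizable.
    return key + path.suffix


assets = {
    "red": "https://example.com/B04/data.tif",
    "nir": "https://example.com/B08/data.tif",
}
by_file_name = {k: file_name(k, h, SketchFileNameStrategy.FILE_NAME) for k, h in assets.items()}
by_key = {k: file_name(k, h, SketchFileNameStrategy.KEY) for k, h in assets.items()}
```

Both assets map to the same name under FILE_NAME, so one would overwrite the other in a shared directory; asset keys are unique within an item, so KEY avoids the collision.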

class stac_asset.FilesystemClient#

A simple client for moving files around on the filesystem.

Mostly used for testing, but could be useful in some real-world cases.

async assert_href_exists(href: str) None#

Asserts that an href exists.

async open_url(url: URL, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes]#

Iterates over data from a local url.

Parameters:
  • url – The url to read bytes from

  • content_type – The expected content type. Ignored by this client, because filesystems don’t have content types.

  • messages – An optional queue to use for progress reporting

Yields:

AsyncIterator[bytes] – An iterator over the file’s bytes.

Raises:

ValueError – Raised if the url has a scheme. This behavior will change if/when we support Windows paths.

class stac_asset.HttpClient(session: ClientSession, check_content_type: bool = True)#

A simple client for making HTTP requests.

By default, doesn’t do any authentication. Configure the session to customize its behavior.

async assert_href_exists(href: str) None#

Asserts that the href exists.

Uses a HEAD request.

check_content_type: bool#

If true, check the asset’s content type against the response from the server.

See stac_asset.validate.content_type() for more information about the content type check.

async close() None#

Close this http client.

Closes the underlying session.

async classmethod from_config(config: Config) T#

Creates the default http client with a vanilla session object.

async open_url(url: URL, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes]#

Opens a url with this client’s session and iterates over its bytes.

Parameters:
  • url – The url to open

  • content_type – The expected content type

  • messages – An optional queue to use for progress reporting

Yields:

AsyncIterator[bytes] – An iterator over the file’s bytes

Raises:

aiohttp.ClientResponseError – Raised if the response is not OK

session: ClientSession#

An aiohttp session that will be used for all requests.

class stac_asset.Message#

A message about downloading.

class stac_asset.PlanetaryComputerClient(session: ClientSession, sas_token_endpoint: str = 'https://planetarycomputer.microsoft.com/api/sas/v1/token')#

Open and download assets from Microsoft’s Planetary Computer.

Heavily cribbed from microsoft/planetary-computer-sdk-for-python, thanks Tom Augspurger!

async assert_href_exists(href: str) None#

Asserts that the href exists.

Uses a HEAD request on a signed url.

async open_url(url: URL, content_type: str | None = None, messages: Queue[Any] | None = None) AsyncIterator[bytes]#

Opens a url and iterates over its bytes.

Includes functionality to sign the url with a SAS token fetched from this client’s sas_token_endpoint. Tokens are cached on a per-client basis to prevent a large number of requests when fetching many assets.

Not every URL is modified with a SAS token. We only modify the url if:

  • The url is in Azure blob storage

  • The url is not in the public thumbnail storage account

  • The url hasn’t already been signed (we check this by seeing if the url has SAS-like query parameters)
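The three conditions above can be sketched as a single predicate. This is an illustration under assumptions, not the client's code: should_sign is a hypothetical name, the thumbnail account name is a placeholder because the real one is not given here, and treating the sig, se, and sp parameters as "SAS-like" is a guess at the kind of check involved.

```python
from urllib.parse import parse_qs, urlparse

BLOB_STORAGE_SUFFIX = ".blob.core.windows.net"
# Hypothetical placeholder; the real public thumbnail account is not named here.
PUBLIC_THUMBNAIL_ACCOUNT = "examplethumbnails"
# Assumption: signature, expiry, and permissions mark an already-signed url.
SAS_QUERY_KEYS = {"sig", "se", "sp"}


def should_sign(url: str) -> bool:
    """Return True if the url would need a SAS token under the rules above."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Rule 1: only Azure blob storage urls are signed.
    if not host.endswith(BLOB_STORAGE_SUFFIX):
        return False
    # Rule 2: the public thumbnail account needs no token.
    account = host[: -len(BLOB_STORAGE_SUFFIX)]
    if account == PUBLIC_THUMBNAIL_ACCOUNT:
        return False
    # Rule 3: skip urls that already carry SAS-like query parameters.
    if SAS_QUERY_KEYS.issubset(parse_qs(parsed.query)):
        return False
    return True
```

Checking before signing keeps already-usable urls untouched and avoids token requests for public data.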

Parameters:
  • url – The url to open

  • content_type – The expected content type

  • messages – An optional queue to use for progress reporting

Yields:

AsyncIterator[bytes] – An iterator over the file’s bytes

class stac_asset.S3Client(requester_pays: bool = False, region_name: str = 'us-west-2', retry_mode: str = 'adaptive', max_attempts: int = 10)#

A client for interacting with s3 urls.

To use the requester_pays option, you need to configure your AWS credentials. See the AWS documentation for instructions.

async assert_href_exists(href: str) None#

Asserts that the href exists.

Uses head_object.

async classmethod from_config(config: Config) S3Client#

Creates an s3 client from a config.

Parameters:

config – The config object

Returns:

A new s3 client

Return type:

S3Client

async has_credentials() bool#

Returns true if the session has credentials.

max_attempts: int#

The maximum number of attempts.

async open_url(url: URL, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes]#

Opens an s3 url and iterates over its bytes.

Parameters:
  • url – The url to open

  • content_type – The expected content type

  • messages – An optional queue to use for progress reporting

Yields:

AsyncIterator[bytes] – An iterator over the file’s bytes

Raises:

SchemeError – Raised if the url’s scheme is not s3

region_name: str#

The region that all clients will be rooted in.

requester_pays: bool#

If True, enable access to requester pays buckets.

retry_mode: str#

The retry mode, one of “adaptive”, “legacy”, or “standard”.

See the boto3 docs for more information on the available modes.

session: AioSession#

The session that will be used for all s3 requests.

async stac_asset.assert_asset_exists(asset: Asset, config: Config | None = None, clients: List[Client] | None = None) None#

Asserts that an asset exists.

Raises the source error if it does not.

Parameters:
  • asset – The asset to check for existence

  • config – The download configuration to use for the existence check

  • clients – Any pre-configured clients to use for the existence check

Raises:

Exception – An exception from the underlying client.

async stac_asset.asset_exists(asset: Asset, config: Config | None = None, clients: List[Client] | None = None) bool#

Returns true if an asset exists.

Parameters:
  • asset – The asset to check for existence

  • config – The download configuration to use for the existence check

  • clients – Any pre-configured clients to use for the existence check

Returns:

Whether the asset exists or not

Return type:

bool

async stac_asset.download_asset(key: str, asset: Asset, path: Path, config: Config, messages: Queue[Message] | None = None, clients: Clients | None = None) Asset#

Downloads an asset.

Parameters:
  • key – The asset key

  • asset – The asset

  • path – The path to which the asset will be downloaded

  • config – The download configuration

  • messages – An optional queue to use for progress reporting

  • clients – An async-safe cache of clients. If not provided, a new one will be created.

Returns:

The asset with an updated href

Return type:

Asset

Raises:

ValueError – Raised if the asset does not have an absolute href

async stac_asset.download_collection(collection: Collection, directory: PathLike[Any] | str, file_name: str | None = 'collection.json', config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) Collection#

Downloads a collection to the local filesystem.

Does not download the collection’s items’ assets – use download_item_collection() to download multiple items.

Parameters:
  • collection – A pystac collection

  • directory – The destination directory

  • file_name – The name of the collection file to save. If not provided, will not be saved.

  • config – The download configuration

  • messages – An optional queue to use for progress reporting

  • clients – Pre-configured clients to use for access

  • keep_non_downloaded – Keep all assets on the collection, even if they’re not downloaded.

Returns:

The collection, with updated asset hrefs

Return type:

Collection

Raises:

CantIncludeAndExclude – Raised if both include and exclude are not None.

async stac_asset.download_item(item: Item, directory: PathLike[Any] | str, file_name: str | None = None, infer_file_name: bool = True, config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) Item#

Downloads an item to the local filesystem.

Parameters:
  • item – The pystac.Item.

  • directory – The output directory that will hold the items and assets.

  • file_name – The name of the item file to save. If not provided, will not be saved.

  • infer_file_name – If file_name is None, infer the file name from the item’s id. This argument is unused if file_name is not None.

  • config – The download configuration

  • messages – An optional queue to use for progress reporting

  • clients – Pre-configured clients to use for access

  • keep_non_downloaded – Keep all assets on the item, even if they’re not downloaded.

Returns:

The pystac.Item, with updated asset hrefs and self href.

Return type:

Item

Raises:

ValueError – Raised if the item doesn’t have any assets.

async stac_asset.download_item_collection(item_collection: ItemCollection, directory: PathLike[Any] | str, file_name: str | None = 'item-collection.json', config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) ItemCollection#

Downloads an item collection to the local filesystem.

Parameters:
  • item_collection – The item collection to download

  • directory – The destination directory

  • file_name – The name of the item collection file to save. If not provided, will not be saved.

  • config – The download configuration

  • messages – An optional queue to use for progress reporting

  • clients – Pre-configured clients to use for access

  • keep_non_downloaded – Keep all assets on each item, even if they’re not downloaded.

Returns:

The item collection, with updated asset hrefs

Return type:

ItemCollection

Raises:

CantIncludeAndExclude – Raised if both include and exclude are not None.

async stac_asset.open_href(href: str, config: Config | None = None, clients: List[Client] | None = None) AsyncIterator[bytes]#

Opens an href and yields byte chunks.

Parameters:
  • href – The href to read

  • config – The download configuration to use

  • clients – Any pre-configured clients to use

Yields:

bytes – The bytes from the href

async stac_asset.read_href(href: str, config: Config | None = None, clients: List[Client] | None = None) bytes#

Reads an href and returns its bytes.

Parameters:
  • href – The href to read

  • config – The download configuration to use

  • clients – Any pre-configured clients to use

Returns:

The bytes from the href

Return type:

bytes
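The relationship between the streaming and the read-everything entry points can be sketched with a plain async generator. open_chunks and read_all are hypothetical helpers, and the sketch assumes that reading is equivalent to concatenating the streamed chunks, which this reference does not state explicitly.

```python
import asyncio
from typing import AsyncIterator


async def open_chunks(data: bytes, chunk_size: int = 4) -> AsyncIterator[bytes]:
    """Hypothetical source: yield data in fixed-size chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start : start + chunk_size]


async def read_all(chunks: AsyncIterator[bytes]) -> bytes:
    """Collect every chunk into one bytes object."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
    return bytes(buffer)
```

Streaming suits large assets that should not be held in memory at once; reading everything is convenient for small payloads such as JSON metadata.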

stac_asset.blocking#

Blocking interfaces for functions.

These should only be used from fully synchronous code. If you have _any_ async code in your application, prefer the top-level functions.
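Why blocking wrappers do not mix with async code can be shown with a standard-library sketch. It assumes the wrappers start their own event loop, for example through asyncio.run; that is an assumption about the implementation, and fetch, fetch_blocking, and call_from_async_code are hypothetical names.

```python
import asyncio


async def fetch() -> bytes:
    """Hypothetical async operation."""
    await asyncio.sleep(0)
    return b"data"


def fetch_blocking() -> bytes:
    """Run the async operation to completion from synchronous code."""
    coroutine = fetch()
    try:
        return asyncio.run(coroutine)
    except RuntimeError:
        # asyncio.run refuses to start inside a running loop; tidy up the
        # coroutine so it is not reported as never awaited.
        coroutine.close()
        raise


async def call_from_async_code() -> str:
    """Show what happens when the blocking wrapper is used under a loop."""
    try:
        fetch_blocking()
    except RuntimeError:
        return "blocking call refused inside a running event loop"
    return "ok"
```

From synchronous code the wrapper simply returns the result; from a coroutine it fails, which is why the awaitable top-level functions are the better choice there.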

stac_asset.blocking.assert_asset_exists(asset: Asset, config: Config | None = None, clients: List[Client] | None = None) None#

Asserts that an asset exists, synchronously.

Raises the source error if it does not.

Parameters:
  • asset – The asset to check for existence

  • config – The download configuration to use for the existence check

  • clients – Any pre-configured clients to use for the existence check

Raises:

Exception – An exception from the underlying client.

stac_asset.blocking.asset_exists(asset: Asset, config: Config | None = None, clients: List[Client] | None = None) bool#

Returns true if an asset exists, synchronously.

Parameters:
  • asset – The asset to check for existence

  • config – The download configuration to use for the existence check

  • clients – Any pre-configured clients to use for the existence check

Returns:

Whether the asset exists or not

Return type:

bool

stac_asset.blocking.download_asset(key: str, asset: Asset, path: Path, config: Config, messages: Queue[Message] | None = None, clients: Clients | None = None) Asset#

Downloads an asset, synchronously.

Parameters:
  • key – The asset key

  • asset – The asset

  • path – The path to which the asset will be downloaded

  • config – The download configuration

  • messages – An optional queue to use for progress reporting

  • clients – An async-safe cache of clients. If not provided, a new one will be created.

Returns:

The asset with an updated href

Return type:

Asset

Raises:

ValueError – Raised if the asset does not have an absolute href

stac_asset.blocking.download_collection(collection: Collection, directory: PathLike[Any] | str, file_name: str | None = 'collection.json', config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) Collection#

Downloads a collection to the local filesystem, synchronously.

Does not download the collection’s items’ assets – use download_item_collection() to download multiple items.

Parameters:
  • collection – A pystac collection

  • directory – The destination directory

  • file_name – The name of the collection file to save. If not provided, will not be saved.

  • config – The download configuration

  • messages – An optional queue to use for progress reporting

  • clients – Pre-configured clients to use for access

  • keep_non_downloaded – Keep all assets on the collection, even if they’re not downloaded.

Returns:

The collection, with updated asset hrefs

Return type:

Collection

Raises:

CantIncludeAndExclude – Raised if both include and exclude are not None.

stac_asset.blocking.download_item(item: Item, directory: PathLike[Any] | str, file_name: str | None = None, infer_file_name: bool = True, config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) Item#

Downloads an item to the local filesystem, synchronously.

Parameters:
  • item – The pystac.Item.

  • directory – The output directory that will hold the items and assets.

  • file_name – The name of the item file to save. If not provided, will not be saved.

  • infer_file_name – If file_name is None, infer the file name from the item’s id. This argument is unused if file_name is not None.

  • config – The download configuration

  • messages – An optional queue to use for progress reporting

  • clients – Pre-configured clients to use for access

  • keep_non_downloaded – Keep all assets on the item, even if they’re not downloaded.

Returns:

The pystac.Item, with updated asset hrefs and self href.

Return type:

Item

Raises:

ValueError – Raised if the item doesn’t have any assets.

stac_asset.blocking.download_item_collection(item_collection: ItemCollection, directory: PathLike[Any] | str, file_name: str | None = 'item-collection.json', config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) ItemCollection#

Downloads an item collection to the local filesystem, synchronously.

Parameters:
  • item_collection – The item collection to download

  • directory – The destination directory

  • file_name – The name of the item collection file to save. If not provided, will not be saved.

  • config – The download configuration

  • messages – An optional queue to use for progress reporting

  • clients – Pre-configured clients to use for access

  • keep_non_downloaded – Keep all assets on each item, even if they’re not downloaded.

Returns:

The item collection, with updated asset hrefs

Return type:

ItemCollection

Raises:

CantIncludeAndExclude – Raised if both include and exclude are not None.

stac_asset.blocking.read_href(href: str, config: Config | None = None, clients: List[Client] | None = None) bytes#

Reads an href and returns its bytes, synchronously.

Parameters:
  • href – The href to read

  • config – The download configuration to use

  • clients – Any pre-configured clients to use

Returns:

The bytes from the href

Return type:

bytes

stac_asset.validate#

stac_asset.validate.content_type(actual: str, expected: str) None#

Validates that the actual content type matches the expected.

This is normally a simple string comparison, but has some extra rules:

  • COGs are allowed in place of TIFFs, and vice versa

  • Responses with binary/octet-stream and application/octet-stream are always allowed

Parameters:
  • actual – The actual content type

  • expected – The expected content type

Raises:

ContentTypeError – Raised if the actual doesn’t match the expected.
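The comparison rules above can be sketched in a few lines. This is an illustration, not the library's implementation: check_content_type and SketchContentTypeError are hypothetical names, and treating any two media types with the image/tiff base type as interchangeable is an assumption about how the COG/TIFF rule might be expressed.

```python
OCTET_STREAM_TYPES = {"binary/octet-stream", "application/octet-stream"}
TIFF = "image/tiff"
COG = "image/tiff; application=geotiff; profile=cloud-optimized"


class SketchContentTypeError(Exception):
    """Hypothetical stand-in for a content type mismatch error."""

    def __init__(self, actual: str, expected: str) -> None:
        super().__init__(f"expected content type {expected!r}, got {actual!r}")
        self.actual = actual
        self.expected = expected


def base_type(content_type: str) -> str:
    """Strip media type parameters, e.g. 'image/tiff; profile=x' -> 'image/tiff'."""
    return content_type.split(";", 1)[0].strip().lower()


def check_content_type(actual: str, expected: str) -> None:
    # Normally a simple string comparison.
    if actual == expected:
        return
    # Generic binary responses are always accepted.
    if base_type(actual) in OCTET_STREAM_TYPES:
        return
    # Assumption: COG and plain TIFF share the image/tiff base type.
    if base_type(actual) == TIFF and base_type(expected) == TIFF:
        return
    raise SketchContentTypeError(actual, expected)
```

The relaxed rules reflect that many servers report a generic binary type, or plain TIFF for cloud-optimized files, even when the asset metadata is more specific.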