API documentation
Read and download STAC items, item collections, collections, and assets.
The main entry points are free functions, such as download_item(), download_item_collection(), and download_collection().
Use Config to configure how assets are downloaded. Every client inherits from Client, which defines a common interface for accessing assets. Writing items, item collections, collections, and assets is currently unsupported, but is on the roadmap.
- exception stac_asset.AssetOverwriteError(hrefs: List[str])#
Raised when an asset would be overwritten during download.
- class stac_asset.Client#
An abstract base class for all clients.
- async assert_href_exists(href: str) None #
Asserts that a href exists.
The default implementation naïvely opens the href and reads one chunk. Clients may implement specialized behavior.
- Parameters:
href – The href to open
- Raises:
Exception – The underlying error when trying to open the file.
- async close() None #
Close this client.
- async download_href(href: str, path: PathLike[Any] | str, clean: bool = True, content_type: str | None = None, messages: Queue[Message] | None = None) None #
Downloads a file to the local filesystem.
- Parameters:
href – The input href
path – The output file path
clean – If an error occurs, delete the output file if it exists
content_type – The expected content type
messages – An optional queue to use for progress reporting
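The effect of clean can be pictured with a small standard-library sketch. This is not stac_asset's implementation: write_chunks, failing_chunks, and demo are hypothetical names, and the byte source is a simulated stream rather than a real client.

```python
import asyncio
import os
import tempfile
from typing import AsyncIterator


async def write_chunks(
    chunks: AsyncIterator[bytes], path: str, clean: bool = True
) -> None:
    """Write chunks to path; on error, optionally delete the partial file."""
    try:
        # Blocking file writes keep the sketch short.
        with open(path, "wb") as f:
            async for chunk in chunks:
                f.write(chunk)
    except Exception:
        # The file is already closed here, so it is safe to remove it.
        if clean and os.path.exists(path):
            os.remove(path)
        raise


async def failing_chunks() -> AsyncIterator[bytes]:
    """Simulated stream (hypothetical) that fails after one chunk."""
    yield b"partial"
    raise ConnectionError("simulated network failure")


def demo(clean: bool) -> bool:
    """Return whether the output file still exists after a failed write."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "asset.bin")
        try:
            asyncio.run(write_chunks(failing_chunks(), path, clean=clean))
        except ConnectionError:
            pass
        return os.path.exists(path)
```

With clean=True the partial file is gone after the failure; with clean=False it is left in place for inspection.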
- async classmethod from_config(config: Config) T #
Creates a client using the provided configuration.
Needed because some client setups require async operations.
- Returns:
A new client
- Return type:
T
- async href_exists(href: str) bool #
Returns true if the href exists.
The default implementation naïvely opens the href and reads one chunk. Clients may implement specialized behavior.
- Parameters:
href – The href to open
- Returns:
Whether the href exists
- Return type:
bool
- async open_href(href: str, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes] #
Opens a href and yields an iterator over its bytes.
- Parameters:
href – The input href
content_type – The expected content type
messages – An optional queue to use for progress reporting
- Yields:
AsyncIterator[bytes] – An iterator over chunks of the read file
- abstract async open_url(url: URL, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes] #
Opens a url and yields an iterator over its bytes.
This is the core method that all clients must implement.
- Parameters:
url – The input url
content_type – The expected content type, to be checked by the client implementations
messages – An optional queue to use for progress reporting
- Yields:
AsyncIterator[bytes] – An iterator over chunks of the read file
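As a rough illustration of how one core "open" method can support the existence checks described above, here is a standard-library sketch. SketchClient and InMemoryClient are hypothetical classes, not part of stac_asset, and the sketch makes open_href the abstract method for brevity, whereas the real Client requires open_url.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import AsyncIterator, Dict


class SketchClient(ABC):
    """Hypothetical stand-in for an abstract client; not stac_asset.Client."""

    @abstractmethod
    def open_href(self, href: str) -> AsyncIterator[bytes]:
        """Subclasses return an async iterator over the bytes behind href."""

    async def assert_href_exists(self, href: str) -> None:
        # Naive default: open the href and read a single chunk.
        async for _ in self.open_href(href):
            break

    async def href_exists(self, href: str) -> bool:
        # Turn "raises on failure" into a boolean answer.
        try:
            await self.assert_href_exists(href)
        except Exception:
            return False
        return True


class InMemoryClient(SketchClient):
    """Serves bytes from a dict, which keeps the sketch free of I/O."""

    def __init__(self, blobs: Dict[str, bytes], chunk_size: int = 4) -> None:
        self.blobs = blobs
        self.chunk_size = chunk_size

    async def open_href(self, href: str) -> AsyncIterator[bytes]:
        data = self.blobs[href]  # raises KeyError if the href is unknown
        for start in range(0, len(data), self.chunk_size):
            yield data[start : start + self.chunk_size]


async def read_all(client: SketchClient, href: str) -> bytes:
    """Join every chunk from the client into one bytes object."""
    return b"".join([chunk async for chunk in client.open_href(href)])
```

A specialized client would override assert_href_exists with something cheaper than reading a chunk, which is what the HTTP and s3 clients below describe with HEAD requests and head_object.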
- class stac_asset.Config(alternate_assets: ~typing.List[str] = <factory>, file_name_strategy: ~stac_asset.strategy.FileNameStrategy = FileNameStrategy.FILE_NAME, warn: bool = False, fail_fast: bool = False, error_strategy: ~stac_asset.strategy.ErrorStrategy = ErrorStrategy.DELETE, exclude: ~typing.List[str] = <factory>, include: ~typing.List[str] = <factory>, make_directory: bool = True, clean: bool = True, overwrite: bool = False, earthdata_token: str | None = None, s3_region_name: str = 'us-west-2', s3_requester_pays: bool = False, s3_retry_mode: str = 'adaptive', s3_max_attempts: int = 10)#
Configuration for downloading items and their assets.
- alternate_assets: List[str]#
Alternate asset keys to prefer, if available.
- clean: bool = True#
If true, clean up the downloaded file if it errors.
- copy() Config #
Returns a deep copy of this config.
- Returns:
A deep copy of this config.
- Return type:
Config
- earthdata_token: str | None = None#
A token for logging in to Earthdata.
- error_strategy: ErrorStrategy = 2#
The strategy to use when errors occur during download.
- exclude: List[str]#
Assets to exclude from the download.
Mutually exclusive with include.
- fail_fast: bool = False#
If an error occurs during download, fail immediately.
By default, all downloads are completed before raising/warning any errors. Mutually exclusive with warn.
- file_name_strategy: FileNameStrategy = 1#
The file name strategy to use when downloading assets.
- include: List[str]#
Assets to include in the download.
Mutually exclusive with exclude.
- make_directory: bool = True#
Whether to create the output directory.
If False, and the output directory does not exist, an error will be raised.
- overwrite: bool = False#
Download files even if they already exist locally.
- s3_max_attempts: int = 10#
The maximum number of attempts when downloading assets from s3.
- s3_region_name: str = 'us-west-2'#
Default s3 region.
- s3_requester_pays: bool = False#
If using the s3 client, enable requester pays.
- s3_retry_mode: str = 'adaptive'#
The retry mode to use for s3 requests.
- validate() None #
Validates this configuration.
- Raises:
CannotIncludeAndExclude – include and exclude are mutually exclusive
- warn: bool = False#
If an error occurs during download, warn instead of raising the error.
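The mutually exclusive options can be illustrated with a small dataclass sketch. SketchConfig and SketchConfigError are hypothetical and only mirror four of the fields above. The warn/fail_fast check is an assumption drawn from the field descriptions; validate() is only documented as checking include and exclude.

```python
from dataclasses import dataclass, field
from typing import List


class SketchConfigError(Exception):
    """Hypothetical error type for this sketch."""


@dataclass
class SketchConfig:
    include: List[str] = field(default_factory=list)
    exclude: List[str] = field(default_factory=list)
    warn: bool = False
    fail_fast: bool = False

    def validate(self) -> None:
        # Both lists non-empty is contradictory: reject it up front.
        if self.include and self.exclude:
            raise SketchConfigError("include and exclude are mutually exclusive")
        # Assumption: warn and fail_fast are rejected the same way.
        if self.warn and self.fail_fast:
            raise SketchConfigError("warn and fail_fast are mutually exclusive")

    def select(self, keys: List[str]) -> List[str]:
        """Return the asset keys that would be downloaded."""
        self.validate()
        if self.include:
            return [key for key in keys if key in self.include]
        return [key for key in keys if key not in self.exclude]
```

For example, SketchConfig(include=["data"]).select(["data", "thumbnail"]) keeps only "data", and an empty config keeps every key.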
- exception stac_asset.ConfigError#
Raised if the configuration is not valid.
- exception stac_asset.ContentTypeError(actual: str, expected: str, *args: Any, **kwargs: Any)#
The expected content type does not match the actual content type.
- exception stac_asset.DownloadError(exceptions: List[Exception], *args: Any, **kwargs: Any)#
A collection of exceptions encountered while downloading.
- exception stac_asset.DownloadWarning#
A warning for when something couldn’t be downloaded.
Used when we don’t want to cancel all downloads, but still inform the user about the problem.
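One way to picture the difference between collecting errors and warning about them is the following asyncio sketch. It is illustrative only: SketchDownloadError, SketchDownloadWarning, run_all, and fake_download are hypothetical names, and fail_fast behavior is not shown.

```python
import asyncio
import warnings
from typing import Awaitable, Iterable, List


class SketchDownloadError(Exception):
    """Hypothetical aggregate of every exception seen while downloading."""

    def __init__(self, exceptions: List[Exception]) -> None:
        super().__init__(f"{len(exceptions)} download(s) failed")
        self.exceptions = exceptions


class SketchDownloadWarning(Warning):
    """Hypothetical warning category for non-fatal download problems."""


async def run_all(
    downloads: Iterable[Awaitable[str]], warn: bool = False
) -> List[str]:
    # Let every download finish, capturing exceptions instead of raising.
    results = await asyncio.gather(*downloads, return_exceptions=True)
    errors = [r for r in results if isinstance(r, Exception)]
    if errors and not warn:
        raise SketchDownloadError(errors)
    for error in errors:
        warnings.warn(str(error), SketchDownloadWarning)
    return [r for r in results if not isinstance(r, Exception)]


async def fake_download(name: str, fail: bool = False) -> str:
    """Simulated download that optionally fails."""
    await asyncio.sleep(0)
    if fail:
        raise OSError(f"could not download {name}")
    return name


async def demo(warn: bool) -> List[str]:
    return await run_all(
        [fake_download("a"), fake_download("b", fail=True), fake_download("c")],
        warn=warn,
    )
```

With warn=True the two successful names are returned and one warning is emitted; with warn=False a single aggregate error carrying the underlying exception is raised after all downloads have finished.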
- class stac_asset.EarthdataClient(session: ClientSession, check_content_type: bool = True)#
Access data from https://www.earthdata.nasa.gov/.
To access data, you’ll need a personal access token.
Create a new personal access token by going to https://urs.earthdata.nasa.gov/profile and then clicking “Generate Token” (you’ll need to log in).
Set an environment variable named EARTHDATA_PAT to your token.
Use EarthdataClient.from_config() to create a new client.
You can also provide your token directly to EarthdataClient.login().
- async classmethod from_config(config: Config) EarthdataClient #
Logs in to Earthdata and returns the default earthdata client.
Uses a token stored in the EARTHDATA_PAT environment variable, if the token is not provided in the config.
- Parameters:
config – A configuration object.
- Returns:
A logged-in Earthdata client.
- Return type:
EarthdataClient
- async classmethod login(token: str | None = None) EarthdataClient #
Logs in to Earthdata and returns a client.
If token is not provided, it is read from the EARTHDATA_PAT environment variable.
- Parameters:
token – The Earthdata bearer token
- Returns:
A client configured to use the bearer token
- Return type:
EarthdataClient
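The token lookup order described above (explicit token first, then the EARTHDATA_PAT environment variable) might look like the following sketch. bearer_headers is a hypothetical helper, not a stac_asset function. Raising ValueError when no token is found is an assumption, and the header shape is the standard HTTP bearer scheme rather than anything confirmed by this page.

```python
import os
from typing import Dict, Mapping, Optional


def bearer_headers(
    token: Optional[str] = None, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Build an Authorization header from a token or EARTHDATA_PAT."""
    # Accept a mapping so the lookup can be exercised without touching os.environ.
    environ = os.environ if environ is None else environ
    if token is None:
        token = environ.get("EARTHDATA_PAT")
    if not token:
        # Assumption: a missing token is treated as an error in this sketch.
        raise ValueError("no token provided and EARTHDATA_PAT is not set")
    return {"Authorization": f"Bearer {token}"}
```

Headers like these could then be attached to an HTTP session, which is roughly what "a client configured to use the bearer token" suggests.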
- class stac_asset.ErrorStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#
Strategy to use when encountering errors during download.
- DELETE = 2#
Delete the asset from the item.
- KEEP = 1#
Keep the asset on the item with its original href.
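The two strategies can be sketched on a plain dict of asset hrefs. SketchErrorStrategy and resolve_assets are hypothetical, and a real item holds asset objects rather than bare strings.

```python
from enum import Enum
from typing import Dict


class SketchErrorStrategy(Enum):
    """Hypothetical mirror of the two documented strategies."""

    KEEP = 1
    DELETE = 2


def resolve_assets(
    original: Dict[str, str],
    downloaded: Dict[str, str],
    strategy: SketchErrorStrategy,
) -> Dict[str, str]:
    """Map asset keys to hrefs after a download attempt.

    `original` maps keys to remote hrefs; `downloaded` maps the keys that
    succeeded to local paths. Any key missing from `downloaded` failed.
    """
    resolved: Dict[str, str] = {}
    for key, href in original.items():
        if key in downloaded:
            resolved[key] = downloaded[key]
        elif strategy is SketchErrorStrategy.KEEP:
            # Keep the asset, still pointing at its original href.
            resolved[key] = href
        # DELETE: the failed asset is simply left out.
    return resolved
```

KEEP leaves the item pointing at a mix of local and remote hrefs, while DELETE leaves only the assets that actually exist locally.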
- class stac_asset.FileNameStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#
Strategy to use for naming files.
- FILE_NAME = 1#
Save the asset with the file name in its href.
Could potentially conflict with another asset with the same file name but different path.
- KEY = 2#
Save the asset with its key as its file name.
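The naming conflict mentioned for FILE_NAME is easy to see in a sketch. SketchFileNameStrategy and file_name_for are hypothetical, and appending the href's suffix to the key is an assumption made here for illustration.

```python
from enum import Enum
from pathlib import PurePosixPath
from urllib.parse import urlparse


class SketchFileNameStrategy(Enum):
    """Hypothetical mirror of the two documented strategies."""

    FILE_NAME = 1
    KEY = 2


def file_name_for(key: str, href: str, strategy: SketchFileNameStrategy) -> str:
    """Pick a local file name for an asset."""
    path = PurePosixPath(urlparse(href).path)
    if strategy is SketchFileNameStrategy.FILE_NAME:
        # Use the last path segment of the href.
        return path.name
    # Assumption: keep the href's extension so the file stays recognizable.
    return key + path.suffix


# Two assets whose hrefs differ only by directory.
HREFS = {
    "red": "https://example.com/red/image.tif",
    "blue": "https://example.com/blue/image.tif",
}
```

Under FILE_NAME both assets in HREFS map to image.tif and would collide in one directory; under KEY they map to red.tif and blue.tif, since asset keys are unique within an item.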
- class stac_asset.FilesystemClient#
A simple client for moving files around on the filesystem.
Mostly used for testing, but could be useful in some real-world cases.
- async assert_href_exists(href: str) None #
Asserts that an href exists.
- async open_url(url: URL, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes] #
Iterates over data from a local url.
- Parameters:
url – The url to read bytes from
content_type – The expected content type. Ignored by this client, because filesystems don’t have content types.
messages – An optional queue to use for progress reporting
- Yields:
AsyncIterator[bytes] – An iterator over the file’s bytes.
- Raises:
ValueError – Raised if the url has a scheme. This behavior will change if/when we support Windows paths.
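Reading a local file in chunks while rejecting hrefs that carry a scheme could be sketched as follows. iter_local_file, read_local, and demo are hypothetical names, and the scheme check via urllib.parse is an assumption about how such a check might be written.

```python
import asyncio
import os
import tempfile
from typing import AsyncIterator
from urllib.parse import urlparse


async def iter_local_file(href: str, chunk_size: int = 65536) -> AsyncIterator[bytes]:
    """Yield the bytes of a local file, chunk by chunk."""
    # A Windows path such as C:\data parses with scheme "c", so it would
    # also be rejected by this check.
    if urlparse(href).scheme:
        raise ValueError(f"expected a plain path, got a url with a scheme: {href}")
    # Plain blocking reads keep the sketch short; they stall the event loop.
    with open(href, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


async def read_local(href: str, chunk_size: int = 4) -> bytes:
    """Join all chunks of a local file."""
    return b"".join([chunk async for chunk in iter_local_file(href, chunk_size)])


def demo() -> bytes:
    """Write a small temporary file and read it back through the iterator."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "data.bin")
        with open(path, "wb") as f:
            f.write(b"abcdefghij")
        return asyncio.run(read_local(path))
```

There is no content type to compare against here, which matches the note that the content_type argument is ignored by this client.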
- class stac_asset.HttpClient(session: ClientSession, check_content_type: bool = True)#
A simple client for making HTTP requests.
By default, doesn’t do any authentication. Configure the session to customize its behavior.
- async assert_href_exists(href: str) None #
Asserts that the href exists.
Uses a HEAD request.
- check_content_type: bool#
If true, check the asset’s content type against the response from the server.
See stac_asset.validate.content_type() for more information about the content type check.
- async close() None #
Close this http client.
Closes the underlying session.
- async classmethod from_config(config: Config) T #
Creates the default http client with a vanilla session object.
- async open_url(url: URL, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes] #
Opens a url with this client’s session and iterates over its bytes.
- Parameters:
url – The url to open
content_type – The expected content type
messages – An optional queue to use for progress reporting
- Yields:
AsyncIterator[bytes] – An iterator over the file’s bytes
- Raises:
aiohttp.ClientResponseError – Raised if the response is not OK
- session: ClientSession#
An aiohttp session that will be used for all requests.
- class stac_asset.Message#
A message about downloading.
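The fields of Message are not documented here, so the following only sketches the general queue-based reporting pattern that the messages parameters suggest. ChunkRead, produce, and report are hypothetical, and the None sentinel is an assumption used to end the consumer loop.

```python
import asyncio
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class ChunkRead:
    """Hypothetical progress message; not a stac_asset message type."""

    href: str
    size: int


async def produce(
    href: str, chunks: Iterable[bytes], messages: Optional[asyncio.Queue]
) -> int:
    """Pretend to download chunks, reporting each one on the queue."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if messages is not None:
            await messages.put(ChunkRead(href, len(chunk)))
    if messages is not None:
        await messages.put(None)  # sentinel: nothing more to report
    return total


async def report(messages: asyncio.Queue) -> List[int]:
    """Consume progress messages until the sentinel arrives."""
    sizes: List[int] = []
    while True:
        message = await messages.get()
        if message is None:
            break
        sizes.append(message.size)
    return sizes


async def demo() -> Tuple[int, List[int]]:
    messages: asyncio.Queue = asyncio.Queue()
    total, sizes = await asyncio.gather(
        produce("mem://a", [b"ab", b"cde"], messages), report(messages)
    )
    return total, sizes
```

Because the queue is optional, the producer works the same way when no one is listening, which matches the `Queue[Message] | None = None` signatures throughout this page.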
- class stac_asset.PlanetaryComputerClient(session: ClientSession, sas_token_endpoint: str = 'https://planetarycomputer.microsoft.com/api/sas/v1/token')#
Open and download assets from Microsoft’s Planetary Computer.
Heavily cribbed from microsoft/planetary-computer-sdk-for-python, thanks Tom Augspurger!
- async assert_href_exists(href: str) None #
Asserts that the href exists.
Uses a HEAD request on a signed url.
- async open_url(url: URL, content_type: str | None = None, messages: Queue[Any] | None = None) AsyncIterator[bytes] #
Opens a url and iterates over its bytes.
Includes functionality to sign the url with a SAS token fetched from this client’s sas_token_endpoint. Tokens are cached on a per-client basis to prevent a large number of requests when fetching many assets.
Not every URL is modified with a SAS token. We only modify the url if:
The url is in Azure blob storage
The url is not in the public thumbnail storage account
The url hasn’t already been signed (we check this by seeing if the url has SAS-like query parameters)
- Parameters:
url – The url to open
content_type – The expected content type
messages – An optional queue to use for progress reporting
- Yields:
AsyncIterator[bytes] – An iterator over the file’s bytes
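The three signing conditions above can be sketched as a pure function. Everything here is illustrative: needs_signing is a hypothetical helper, PUBLIC_ACCOUNT is a placeholder rather than the real thumbnail account name, and treating the presence of the se, sp, and sig query parameters as "SAS-like" is an assumption.

```python
from urllib.parse import parse_qs, urlparse

AZURE_BLOB_SUFFIX = ".blob.core.windows.net"
PUBLIC_ACCOUNT = "examplepublicthumbnails"  # placeholder, not the real account
SAS_PARAMS = {"se", "sp", "sig"}  # assumption: these mark a signed url


def needs_signing(url: str, public_account: str = PUBLIC_ACCOUNT) -> bool:
    """Decide whether a url should get a SAS token appended."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # 1. Only Azure blob storage urls are signed.
    if not host.endswith(AZURE_BLOB_SUFFIX):
        return False
    # 2. The public storage account is readable without a token.
    account = host[: -len(AZURE_BLOB_SUFFIX)]
    if account == public_account:
        return False
    # 3. A url that already carries SAS-like parameters is left alone.
    if SAS_PARAMS.issubset(parse_qs(parsed.query)):
        return False
    return True
```

Keeping the decision separate from the token fetch makes it straightforward to cache tokens per client and to test the rules without any network access.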
- class stac_asset.S3Client(requester_pays: bool = False, region_name: str = 'us-west-2', retry_mode: str = 'adaptive', max_attempts: int = 10)#
A client for interacting with s3 urls.
To use the requester_pays option, you need to configure your AWS credentials. See the AWS documentation for instructions.
- async assert_href_exists(href: str) None #
Asserts that the href exists.
Uses head_object.
- async classmethod from_config(config: Config) S3Client #
Creates an s3 client from a config.
- Parameters:
config – The config object
- Returns:
A new s3 client
- Return type:
S3Client
- async has_credentials() bool #
Returns true if the session has credentials.
- max_attempts: int#
The maximum number of attempts.
- async open_url(url: URL, content_type: str | None = None, messages: Queue[Message] | None = None) AsyncIterator[bytes] #
Opens an s3 url and iterates over its bytes.
- Parameters:
url – The url to open
content_type – The expected content type
messages – An optional queue to use for progress reporting
- Yields:
AsyncIterator[bytes] – An iterator over the file’s bytes
- Raises:
SchemeError – Raised if the url’s scheme is not s3
- region_name: str#
The region that all clients will be rooted in.
- requester_pays: bool#
If True, enable access to requester pays buckets.
- retry_mode: str#
The retry mode, one of “adaptive”, “legacy”, or “standard”.
See the boto3 docs for more information on the available modes.
- session: AioSession#
The session that will be used for all s3 requests.
- async stac_asset.assert_asset_exists(asset: Asset, config: Config | None = None, clients: List[Client] | None = None) None #
Asserts that an asset exists.
Raises the source error if it does not.
- Parameters:
asset – The asset to check for existence
config – The download configuration to use for the existence check
clients – Any pre-configured clients to use for the existence check
- Raises:
Exception – An exception from the underlying client.
- async stac_asset.asset_exists(asset: Asset, config: Config | None = None, clients: List[Client] | None = None) bool #
Returns true if an asset exists.
- Parameters:
asset – The asset to check for existence
config – The download configuration to use for the existence check
clients – Any pre-configured clients to use for the existence check
- Returns:
Whether the asset exists or not
- Return type:
bool
- async stac_asset.download_asset(key: str, asset: Asset, path: Path, config: Config, messages: Queue[Message] | None = None, clients: Clients | None = None) Asset #
Downloads an asset.
- Parameters:
key – The asset key
asset – The asset
path – The path to which the asset will be downloaded
config – The download configuration
messages – An optional queue to use for progress reporting
clients – An async-safe cache of clients. If not provided, a new one will be created.
- Returns:
The asset with an updated href
- Return type:
Asset
- Raises:
ValueError – Raised if the asset does not have an absolute href
- async stac_asset.download_collection(collection: Collection, directory: PathLike[Any] | str, file_name: str | None = 'collection.json', config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) Collection #
Downloads a collection to the local filesystem.
Does not download the collection’s items’ assets – use download_item_collection() to download multiple items.
- Parameters:
collection – A pystac collection
directory – The destination directory
file_name – The name of the collection file to save. If not provided, will not be saved.
config – The download configuration
messages – An optional queue to use for progress reporting
clients – Pre-configured clients to use for access
keep_non_downloaded – Keep all assets on the item, even if they’re not downloaded.
- Returns:
The collection, with updated asset hrefs
- Return type:
Collection
- Raises:
CantIncludeAndExclude – Raised if both include and exclude are not None.
- async stac_asset.download_item(item: Item, directory: PathLike[Any] | str, file_name: str | None = None, infer_file_name: bool = True, config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) Item #
Downloads an item to the local filesystem.
- Parameters:
item – The pystac.Item.
directory – The output directory that will hold the items and assets.
file_name – The name of the item file to save. If not provided, will not be saved.
infer_file_name – If file_name is None, infer the file name from the item’s id. This argument is unused if file_name is not None.
config – The download configuration
messages – An optional queue to use for progress reporting
clients – Pre-configured clients to use for access
keep_non_downloaded – Keep all assets on the item, even if they’re not downloaded.
- Returns:
The pystac.Item, with the updated asset hrefs and self href.
- Return type:
Item
- Raises:
ValueError – Raised if the item doesn’t have any assets.
- async stac_asset.download_item_collection(item_collection: ItemCollection, directory: PathLike[Any] | str, file_name: str | None = 'item-collection.json', config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) ItemCollection #
Downloads an item collection to the local filesystem.
- Parameters:
item_collection – The item collection to download
directory – The destination directory
file_name – The name of the item collection file to save. If not provided, will not be saved.
config – The download configuration
messages – An optional queue to use for progress reporting
clients – Pre-configured clients to use for access
keep_non_downloaded – Keep all assets on the item, even if they’re not downloaded.
- Returns:
The item collection, with updated asset hrefs
- Return type:
ItemCollection
- Raises:
CantIncludeAndExclude – Raised if both include and exclude are not None.
- async stac_asset.open_href(href: str, config: Config | None = None, clients: List[Client] | None = None) AsyncIterator[bytes] #
Opens an href and yields byte chunks.
- Parameters:
href – The href to read
config – The download configuration to use
clients – Any pre-configured clients to use
- Yields:
bytes – The bytes from the href
- async stac_asset.read_href(href: str, config: Config | None = None, clients: List[Client] | None = None) bytes #
Reads an href and returns its bytes.
- Parameters:
href – The href to read
config – The download configuration to use
clients – Any pre-configured clients to use
- Returns:
The bytes from the href
- Return type:
bytes
stac_asset.blocking
Blocking interfaces for functions.
These should only be used from fully synchronous code. If you have _any_ async code in your application, prefer the top-level functions.
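The warning about mixing blocking and async code can be illustrated with a generic wrapper. This is one common way to wrap a coroutine for synchronous callers and is not necessarily how stac_asset.blocking is implemented; run_blocking, read_href_async, and read_href_blocking are hypothetical names.

```python
import asyncio
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


def run_blocking(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine to completion from synchronous code."""
    try:
        return asyncio.run(coro)
    except RuntimeError:
        # asyncio.run refuses to start inside a running event loop; close the
        # coroutine so it does not trigger a "never awaited" warning.
        coro.close()
        raise


async def read_href_async(href: str) -> bytes:
    """Stand-in for an async read; it just echoes the href as bytes."""
    await asyncio.sleep(0)
    return href.encode("utf-8")


def read_href_blocking(href: str) -> bytes:
    return run_blocking(read_href_async(href))


async def call_from_async() -> str:
    """Show what happens when the blocking wrapper is used from async code."""
    try:
        read_href_blocking("mem://a")
    except RuntimeError:
        return "refused"
    return "ran"
```

From plain synchronous code the wrapper works as expected. Called from inside a coroutine it fails because an event loop is already running, which is why async applications should await the top-level functions directly.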
- stac_asset.blocking.assert_asset_exists(asset: Asset, config: Config | None = None, clients: List[Client] | None = None) None #
Asserts that an asset exists, synchronously.
Raises the source error if it does not.
- Parameters:
asset – The asset to check for existence
config – The download configuration to use for the existence check
clients – Any pre-configured clients to use for the existence check
- Raises:
Exception – An exception from the underlying client.
- stac_asset.blocking.asset_exists(asset: Asset, config: Config | None = None, clients: List[Client] | None = None) bool #
Returns true if an asset exists, synchronously.
- Parameters:
asset – The asset to check for existence
config – The download configuration to use for the existence check
clients – Any pre-configured clients to use for the existence check
- Returns:
Whether the asset exists or not
- Return type:
bool
- stac_asset.blocking.download_asset(key: str, asset: Asset, path: Path, config: Config, messages: Queue[Message] | None = None, clients: Clients | None = None) Asset #
Downloads an asset, synchronously.
- Parameters:
key – The asset key
asset – The asset
path – The path to which the asset will be downloaded
config – The download configuration
messages – An optional queue to use for progress reporting
clients – An async-safe cache of clients. If not provided, a new one will be created.
- Returns:
The asset with an updated href
- Return type:
Asset
- Raises:
ValueError – Raised if the asset does not have an absolute href
- stac_asset.blocking.download_collection(collection: Collection, directory: PathLike[Any] | str, file_name: str | None = 'collection.json', config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) Collection #
Downloads a collection to the local filesystem, synchronously.
Does not download the collection’s items’ assets – use download_item_collection() to download multiple items.
- Parameters:
collection – A pystac collection
directory – The destination directory
file_name – The name of the collection file to save. If not provided, will not be saved.
config – The download configuration
messages – An optional queue to use for progress reporting
clients – Pre-configured clients to use for access
keep_non_downloaded – Keep all assets on the item, even if they’re not downloaded.
- Returns:
The collection, with updated asset hrefs
- Return type:
Collection
- Raises:
CantIncludeAndExclude – Raised if both include and exclude are not None.
- stac_asset.blocking.download_item(item: Item, directory: PathLike[Any] | str, file_name: str | None = None, infer_file_name: bool = True, config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) Item #
Downloads an item to the local filesystem, synchronously.
- Parameters:
item – The pystac.Item.
directory – The output directory that will hold the items and assets.
file_name – The name of the item file to save. If not provided, will not be saved.
infer_file_name – If file_name is None, infer the file name from the item’s id. This argument is unused if file_name is not None.
config – The download configuration
messages – An optional queue to use for progress reporting
clients – Pre-configured clients to use for access
keep_non_downloaded – Keep all assets on the item, even if they’re not downloaded.
- Returns:
The pystac.Item, with the updated asset hrefs and self href.
- Return type:
Item
- Raises:
ValueError – Raised if the item doesn’t have any assets.
- stac_asset.blocking.download_item_collection(item_collection: ItemCollection, directory: PathLike[Any] | str, file_name: str | None = 'item-collection.json', config: Config | None = None, messages: Queue[Message] | None = None, clients: List[Client] | None = None, keep_non_downloaded: bool = False) ItemCollection #
Downloads an item collection to the local filesystem, synchronously.
- Parameters:
item_collection – The item collection to download
directory – The destination directory
file_name – The name of the item collection file to save. If not provided, will not be saved.
config – The download configuration
messages – An optional queue to use for progress reporting
clients – Pre-configured clients to use for access
keep_non_downloaded – Keep all assets on the item, even if they’re not downloaded.
- Returns:
The item collection, with updated asset hrefs
- Return type:
ItemCollection
- Raises:
CantIncludeAndExclude – Raised if both include and exclude are not None.
- stac_asset.blocking.read_href(href: str, config: Config | None = None, clients: List[Client] | None = None) bytes #
Reads an href and returns its bytes.
- Parameters:
href – The href to read
config – The download configuration to use
clients – Any pre-configured clients to use
- Returns:
The bytes from the href
- Return type:
bytes
stac_asset.validate
- stac_asset.validate.content_type(actual: str, expected: str) None #
Validates that the actual content type matches the expected.
This is normally a simple string comparison, but has some extra rules:
COGs are allowed in place of TIFFs, and vice versa
Responses with binary/octet-stream and application/octet-stream are always allowed
- Parameters:
actual – The actual content type
expected – The expected content type
- Raises:
ContentTypeError – Raised if the actual doesn’t match the expected.
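The comparison rules listed above can be sketched in a few lines. This is an approximation and not the library's code: check_content_type, is_allowed, and SketchContentTypeError are hypothetical, and treating any two media types whose base type is image/tiff as interchangeable is an assumption about what "COGs in place of TIFFs" means.

```python
ALWAYS_ALLOWED = {"binary/octet-stream", "application/octet-stream"}


class SketchContentTypeError(Exception):
    """Hypothetical error raised when content types do not match."""

    def __init__(self, actual: str, expected: str) -> None:
        super().__init__(f"expected {expected!r}, got {actual!r}")
        self.actual = actual
        self.expected = expected


def base_type(content_type: str) -> str:
    """Strip parameters: 'image/tiff; profile=x' becomes 'image/tiff'."""
    return content_type.split(";", 1)[0].strip().lower()


def check_content_type(actual: str, expected: str) -> None:
    # Generic binary responses are always accepted.
    if base_type(actual) in ALWAYS_ALLOWED:
        return
    # The normal case is a plain string comparison.
    if actual == expected:
        return
    # Assumption: COG and plain TIFF media types share the image/tiff base.
    if base_type(actual) == "image/tiff" and base_type(expected) == "image/tiff":
        return
    raise SketchContentTypeError(actual, expected)


def is_allowed(actual: str, expected: str) -> bool:
    """Boolean convenience wrapper around check_content_type."""
    try:
        check_content_type(actual, expected)
    except SketchContentTypeError:
        return False
    return True
```

A check like this is only meaningful for clients that receive a content type from the server, which is why the filesystem client ignores the expected content type.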