utilities
Download and management utilities for syncing time and auxiliary files
Can list a directory on an ftp host
Can download a file from an ftp or http host
Can download a file from CDDIS via https when NASA Earthdata credentials are supplied
Checks MD5 or sha1 hashes between local and remote files
General Methods
- SMBcorr.utilities.get_data_path(relpath)[source]
Get the absolute path within a package from a relative path
- Parameters:
- relpath: str,
relative path
- SMBcorr.utilities.import_dependency(name: str, extra: str = '', raise_exception: bool = False)[source]
Import an optional dependency
Adapted from pandas.compat._optional::import_optional_dependency
- Parameters:
- name: str
Module name
- extra: str, default “”
Additional text to include in the ImportError message
- raise_exception: bool, default False
Raise an ImportError if the module is not found
- Returns:
- module: obj
Imported module
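The optional-import pattern described above can be sketched with `importlib` alone. This is an illustration, not the SMBcorr implementation: the helper name `import_optional` is hypothetical, and returning `None` for a missing module is an assumption, since the entry above does not state what is returned in that case.

```python
import importlib


def import_optional(name, extra="", raise_exception=False):
    """Hypothetical stand-in for an optional-dependency importer."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        if raise_exception:
            # surface a clearer message, appending any extra text
            message = f"Missing optional dependency '{name}'. {extra}".strip()
            raise ImportError(message) from exc
        # assumption: signal a missing module with None
        return None


# a standard-library module is always importable
json_module = import_optional("json")
# a module name chosen so that it should not exist
missing_module = import_optional("hypothetical_missing_module_xyz")
```

With `raise_exception=True`, the same call would raise an `ImportError` carrying the `extra` text instead of returning quietly.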
- SMBcorr.utilities.get_hash(local, algorithm='MD5')[source]
Get the hash value from a local file or BytesIO object
- Parameters:
- local: obj or str
BytesIO object or path to file
- algorithm: str, default ‘MD5’
hashing algorithm for checksum validation
- 'MD5': Message Digest
- 'sha1': Secure Hash Algorithm
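A minimal sketch of hashing either an in-memory buffer or a file on disk, using only `hashlib` and `io`. The helper name `file_hash` and the 8192-byte read size are assumptions made for illustration; this is not the body of `get_hash`.

```python
import hashlib
import io


def file_hash(local, algorithm="MD5"):
    """Hypothetical helper: hex digest of a BytesIO object or a file path."""
    # hashlib accepts lowercase algorithm names such as 'md5' and 'sha1'
    hasher = hashlib.new(algorithm.lower())
    if isinstance(local, io.BytesIO):
        # hash the full buffer without moving its read position
        hasher.update(local.getvalue())
    else:
        # read the file in blocks so large files are not loaded at once
        with open(local, "rb") as f:
            for block in iter(lambda: f.read(8192), b""):
                hasher.update(block)
    return hasher.hexdigest()


buffer = io.BytesIO(b"hello")
md5_value = file_hash(buffer, algorithm="MD5")
sha1_value = file_hash(buffer, algorithm="sha1")
```

Comparing such a digest against the checksum of a remote file is one way to decide whether a local copy is out of date.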
- SMBcorr.utilities.url_split(s)[source]
Recursively split a url path into a list
- Parameters:
- s: str
url string
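One way to picture a recursive url split is with `posixpath.split`, peeling off the last component until the scheme and host remain. The function name `split_url` and the set of recognized schemes are assumptions for this sketch, which may differ from what `url_split` actually does.

```python
import posixpath


def split_url(s):
    """Hypothetical helper: recursively split a url path into a list."""
    head, tail = posixpath.split(s)
    if head in ("http:", "https:", "ftp:", "s3:"):
        # reached the scheme and host: keep them together as one item
        return [s]
    if head in ("", posixpath.sep):
        # reached the top of a path that has no scheme
        return [tail]
    return split_url(head) + [tail]


parts = split_url("https://example.com/data/2020/file.nc")
# parts == ['https://example.com', 'data', '2020', 'file.nc']
```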
- SMBcorr.utilities.get_unix_time(time_string, format='%Y-%m-%d %H:%M:%S')[source]
Get the Unix timestamp value for a formatted date string
- Parameters:
- time_string: str
formatted time string to parse
- format: str, default ‘%Y-%m-%d %H:%M:%S’
format for input time string
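The conversion can be sketched with the standard library, under the assumption that the input string is in UTC. The helper name `unix_time` is hypothetical, and returning `None` for an unparseable string is a choice made for this sketch only.

```python
import calendar
import time


def unix_time(time_string, format="%Y-%m-%d %H:%M:%S"):
    """Hypothetical helper: seconds since 1970-01-01 for a UTC date string."""
    try:
        parsed = time.strptime(time_string.strip(), format)
    except (TypeError, ValueError):
        # assumption: report an unparseable string as None
        return None
    # timegm interprets the parsed fields as UTC
    return calendar.timegm(parsed)


one_day = unix_time("1970-01-02 00:00:00")       # 86400
noon_y2k = unix_time("2000-01-01 12:00:00")      # 946728000
short_form = unix_time("2000-01-01", format="%Y-%m-%d")
```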
- SMBcorr.utilities.isoformat(time_string)[source]
Reformat a date string to ISO formatting
- Parameters:
- time_string: str
formatted time string to parse
- SMBcorr.utilities.even(value)[source]
Rounds a number to an even number less than or equal to original
- Parameters:
- value: float
number to be rounded
- SMBcorr.utilities.ceil(value)[source]
Rounds a number upward to its nearest integer
- Parameters:
- value: float
number to be rounded upward
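Both rounding helpers above can be written with floor division. The sketch below uses the hypothetical names `round_down_to_even` and `round_up` and is an illustration of the described behavior rather than the library's own code.

```python
def round_down_to_even(value):
    """Largest even integer less than or equal to value."""
    return 2 * int(value // 2)


def round_up(value):
    """Smallest integer greater than or equal to value."""
    # floor division of the negated value rounds toward +infinity
    return -int(-value // 1)


even_examples = [round_down_to_even(v) for v in (7.9, 8, -3)]   # [6, 8, -4]
ceil_examples = [round_up(v) for v in (2.1, -2.1, 4.0)]         # [3, -2, 4]
```

Note that for negative inputs the even rounding moves away from zero (for example -3 becomes -4), which keeps the result less than or equal to the original.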
- SMBcorr.utilities.copy(source, destination, move=False, **kwargs)[source]
Copy or move a file with all system information
- Parameters:
- source: str
source file
- destination: str
copied destination file
- move: bool, default False
remove the source file
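A copy that preserves file metadata can be sketched with `shutil`. Here "system information" is assumed to mean permissions and timestamps, and the helper name `copy_file` is hypothetical; the actual `copy` function may handle more than this.

```python
import os
import shutil
import tempfile


def copy_file(source, destination, move=False):
    """Hypothetical helper: copy contents and metadata, optionally removing the source."""
    shutil.copyfile(source, destination)
    # carry over permission bits and timestamps
    shutil.copystat(source, destination)
    if move:
        os.remove(source)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "source.txt")
    dst = os.path.join(tmp, "copied.txt")
    moved = os.path.join(tmp, "moved.txt")
    with open(src, "w") as f:
        f.write("auxiliary data")
    # plain copy: the source stays in place
    copy_file(src, dst)
    same_mtime = int(os.stat(src).st_mtime) == int(os.stat(dst).st_mtime)
    # move: the source is removed after copying
    copy_file(src, moved, move=True)
    source_removed = not os.path.exists(src)
    with open(moved) as f:
        moved_text = f.read()
```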
- SMBcorr.utilities.check_ftp_connection(HOST, username=None, password=None)[source]
Check internet connection with ftp host
- Parameters:
- HOST: str
remote ftp host
- username: str or NoneType
ftp username
- password: str or NoneType
ftp password
- SMBcorr.utilities.ftp_list(HOST, username=None, password=None, timeout=None, basename=False, pattern=None, sort=False)[source]
List a directory on an ftp host
- Parameters:
- HOST: str or list
remote ftp host path split as list
- username: str or NoneType
ftp username
- password: str or NoneType
ftp password
- timeout: int or NoneType, default None
timeout in seconds for blocking operations
- basename: bool, default False
return the file or directory basename instead of the full path
- pattern: str or NoneType, default None
regular expression pattern for reducing list
- sort: bool, default False
sort output list
- Returns:
- output: list
items in a directory
- mtimes: list
last modification times for items in the directory
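The `basename`, `pattern` and `sort` options can be pictured as post-processing applied to a raw listing. The sketch below works on a made-up listing with no network access; the helper name `reduce_listing` and the order of the steps are assumptions, not a description of how `ftp_list` is implemented.

```python
import posixpath
import re


def reduce_listing(names, mtimes, basename=False, pattern=None, sort=False):
    """Hypothetical helper: trim, filter and order a directory listing."""
    if basename:
        names = [posixpath.basename(n) for n in names]
    if pattern:
        # keep only the items matching the regular expression
        keep = [i for i, n in enumerate(names) if re.search(pattern, n)]
        names = [names[i] for i in keep]
        mtimes = [mtimes[i] for i in keep]
    if sort:
        # order by name and apply the same order to the times
        order = sorted(range(len(names)), key=lambda i: names[i])
        names = [names[i] for i in order]
        mtimes = [mtimes[i] for i in order]
    return names, mtimes


# made-up listing: paths with placeholder modification times
names = ["/pub/b.dat", "/pub/readme.txt", "/pub/a.dat"]
mtimes = [200, 300, 100]
output, output_mtimes = reduce_listing(
    names, mtimes, basename=True, pattern=r"\.dat$", sort=True
)
# output == ['a.dat', 'b.dat'] and output_mtimes == [100, 200]
```

Keeping the two lists aligned through each step matters, because callers pair each name with its modification time.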
- SMBcorr.utilities.from_ftp(HOST, username=None, password=None, timeout=None, local=None, hash='', chunk=8192, verbose=False, fid=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, mode=509)[source]
Download a file from an ftp host
- Parameters:
- HOST: str or list
remote ftp host path
- username: str or NoneType
ftp username
- password: str or NoneType
ftp password
- timeout: int or NoneType, default None
timeout in seconds for blocking operations
- local: str or NoneType, default None
path to local file
- hash: str, default ‘’
MD5 hash of local file
- chunk: int, default 8192
chunk size for transfer encoding
- verbose: bool, default False
print file transfer information
- fid: obj, default sys.stdout
open file object to print if verbose
- mode: oct, default 0o775
permissions mode of output local file
- Returns:
- remote_buffer: obj
BytesIO representation of file
- SMBcorr.utilities.http_list(HOST, timeout=None, context=<ssl.SSLContext object>, parser=<lxml.etree.HTMLParser object>, format='%Y-%m-%d %H:%M', pattern='', sort=False)[source]
List a directory on an Apache http Server
- Parameters:
- HOST: str or list
remote http host path
- timeout: int or NoneType, default None
timeout in seconds for blocking operations
- context: obj, default ssl.SSLContext(ssl.PROTOCOL_TLS)
SSL context for urllib opener object
- parser: obj, default lxml.etree.HTMLParser()
HTML parser for lxml
- format: str, default ‘%Y-%m-%d %H:%M’
format for input time string
- pattern: str, default ‘’
regular expression pattern for reducing list
- sort: bool, default False
sort output list
- Returns:
- colnames: list
column names in a directory
- collastmod: list
last modification times for items in the directory
- SMBcorr.utilities.from_http(HOST, timeout=None, context=<ssl.SSLContext object>, local=None, hash='', chunk=16384, verbose=False, fid=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, mode=509)[source]
Download a file from an http host
- Parameters:
- HOST: str or list
remote http host path split as list
- timeout: int or NoneType, default None
timeout in seconds for blocking operations
- context: obj, default ssl.SSLContext(ssl.PROTOCOL_TLS)
SSL context for urllib opener object
- local: str or NoneType, default None
path to local file
- hash: str, default ‘’
MD5 hash of local file
- chunk: int, default 16384
chunk size for transfer encoding
- verbose: bool, default False
print file transfer information
- fid: obj, default sys.stdout
open file object to print if verbose
- mode: oct, default 0o775
permissions mode of output local file
- Returns:
- remote_buffer: obj
BytesIO representation of file
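The `chunk` and `hash` parameters suggest a chunked copy into memory followed by a checksum comparison. The sketch below imitates that flow with an in-memory stand-in for the remote response, so it runs without a network connection. The helper name `fetch_to_buffer` and the rule "rewrite the local file only when the hashes differ" are assumptions made for illustration.

```python
import hashlib
import io
import shutil


def fetch_to_buffer(response, local_hash="", chunk=16384):
    """Hypothetical helper: read a file-like response in chunks into BytesIO."""
    remote_buffer = io.BytesIO()
    # copy in fixed-size chunks rather than in a single read
    shutil.copyfileobj(response, remote_buffer, chunk)
    remote_buffer.seek(0)
    remote_hash = hashlib.md5(remote_buffer.getvalue()).hexdigest()
    # assumption: a differing hash means the local copy needs updating
    needs_update = (local_hash != remote_hash)
    return remote_buffer, needs_update


payload = b"x" * 40000
# io.BytesIO stands in for the object returned by a urllib request
remote_buffer, needs_update = fetch_to_buffer(io.BytesIO(payload), local_hash="")
# supplying the matching hash marks the local copy as current
_, needs_update_again = fetch_to_buffer(
    io.BytesIO(payload), local_hash=hashlib.md5(payload).hexdigest()
)
```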
- SMBcorr.utilities.build_opener(username, password, context=<ssl.SSLContext object>, password_manager=False, get_ca_certs=False, redirect=False, authorization_header=True, urs='https://urs.earthdata.nasa.gov')[source]
Build urllib opener for NASA Earthdata with supplied credentials
- Parameters:
- username: str or NoneType, default None
NASA Earthdata username
- password: str or NoneType, default None
NASA Earthdata password
- context: obj, default ssl.SSLContext(ssl.PROTOCOL_TLS)
SSL context for urllib opener object
- password_manager: bool, default False
Create password manager context using default realm
- get_ca_certs: bool, default False
Get list of loaded “certification authority” certificates
- redirect: bool, default False
Create redirect handler object
- authorization_header: bool, default True
Add base64 encoded authorization header to opener
- urs: str, default ‘https://urs.earthdata.nasa.gov’
Earthdata login URS 3 host
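The `authorization_header` option refers to HTTP Basic authentication, which can be sketched with `urllib.request` from the standard library. The helper name `make_opener` and the placeholder credentials are hypothetical, and the choice of handlers is an assumption; `build_opener` itself may assemble a different set depending on its flags.

```python
import base64
import ssl
import urllib.request
from http.cookiejar import CookieJar


def make_opener(username, password, context=None):
    """Hypothetical helper: urllib opener carrying a Basic authorization header."""
    if context is None:
        context = ssl.create_default_context()
    handlers = [
        # keep cookies between requests so a login session can persist
        urllib.request.HTTPCookieProcessor(CookieJar()),
        urllib.request.HTTPSHandler(context=context),
    ]
    opener = urllib.request.build_opener(*handlers)
    # base64-encode 'username:password' for the Authorization header
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    opener.addheaders = [("Authorization", f"Basic {token}")]
    return opener


# placeholder credentials, not real Earthdata values
opener = make_opener("user", "pass")
auth_header = dict(opener.addheaders)["Authorization"]
```

No request is sent when the opener is built; the header is attached to whatever requests are later made through it.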
- SMBcorr.utilities.gesdisc_list(HOST, username=None, password=None, build=False, timeout=None, urs='urs.earthdata.nasa.gov', parser=<lxml.etree.HTMLParser object>, format='%Y-%m-%d %H:%M', pattern='', sort=False)[source]
List a directory on NASA GES DISC servers
- Parameters:
- HOST: str or list
remote https host
- username: str or NoneType, default None
NASA Earthdata username
- password: str or NoneType, default None
NASA Earthdata password
- build: bool, default False
Build opener with NASA Earthdata credentials
- timeout: int or NoneType, default None
timeout in seconds for blocking operations
- urs: str, default ‘urs.earthdata.nasa.gov’
Earthdata login URS 3 host
- parser: obj, default lxml.etree.HTMLParser()
HTML parser for lxml
- format: str, default ‘%Y-%m-%d %H:%M’
format for input time string
- pattern: str, default ‘’
regular expression pattern for reducing list
- sort: bool, default False
sort output list
- Returns:
- colnames: list
column names in a directory
- collastmod: list
last modification times for items in the directory
- SMBcorr.utilities.cmr_filter_json(search_results, endpoint='data', request_type='application/x-netcdf')[source]
Filter the CMR json response for desired data files
- Parameters:
- search_results: dict
json response from CMR query
- endpoint: str, default ‘data’
url endpoint type
- 'data': NASA Earthdata https archive
- 'opendap': NASA Earthdata OPeNDAP archive
- 's3': NASA Earthdata Cumulus AWS S3 bucket
- request_type: str, default ‘application/x-netcdf’
data type for reducing CMR query
- Returns:
- granule_names: list
Model granule names
- granule_urls: list
Model granule urls
- granule_mtimes: list
Model granule modification times
- SMBcorr.utilities.cmr(short_name, version=None, start_date=None, end_date=None, provider='GES_DISC', endpoint='data', request_type='application/x-netcdf', verbose=False, fid=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
Query the NASA Common Metadata Repository (CMR) for model data
- Parameters:
- short_name: str
Model shortname in the CMR system
- version: str or NoneType, default None
Model version
- start_date: str or NoneType, default None
starting date for CMR product query
- end_date: str or NoneType, default None
ending date for CMR product query
- provider: str, default ‘GES_DISC’
CMR data provider
- 'GES_DISC': GESDISC
- 'GESDISCCLD': GESDISC Cumulus
- 'PODAAC': PO.DAAC Drive
- 'POCLOUD': PO.DAAC Cumulus
- endpoint: str, default ‘data’
url endpoint type
- 'data': NASA Earthdata https archive
- 'opendap': NASA Earthdata OPeNDAP archive
- 's3': NASA Earthdata Cumulus AWS S3 bucket
- request_type: str, default ‘application/x-netcdf’
data type for reducing CMR query
- verbose: bool, default False
print CMR query information
- fid: obj, default sys.stdout
open file object to print if verbose
- Returns:
- granule_names: list
Model granule names
- granule_urls: list
Model granule urls
- granule_mtimes: list
Model granule modification times
- SMBcorr.utilities.build_request(short_name, dataset_version, url, variables=[], format='bmM0Lw', service='L34RS_MERRA2', version='1.02', bbox=[-90, -180, 90, 180], **kwargs)[source]
Build requests for the GES DISC subsetting API
- Parameters:
- short_name: str
Model shortname in the CMR system
- url: str
url for granule returned by the CMR system
- variables: list, default []
Variables for product to subset
- format: str, default ‘bmM0Lw’
Coded output format for GES DISC subsetting API
- service: str, default ‘L34RS_MERRA2’
GES DISC subsetting API service
- version: str, default ‘1.02’
GES DISC subsetting API service version
- bbox: list, default [-90,-180,90,180]
Bounding box to spatially subset
- **kwargs: dict, default {}
Additional parameters for GES DISC subsetting API
- Returns:
- request_url: str
Formatted url for GES DISC subsetting API