mltk.utils.archive_downloader
Utilities for downloading and extracting archives
See the source code on GitHub: mltk/utils/archive_downloader.py
Functions
- download_url: Downloads the tarball or zip file from url into dst_path.
- download_verify_extract: Download an archive, verify its hash, and extract it.
- verify_extract: Verify the archive hash and extract it.
- Return True if the calculated hash of the file matches the given hash, otherwise False.
- download_verify_extract(url, dest_dir=None, dest_subdir=None, download_dir=None, archive_fname=None, show_progress=False, file_hash=None, file_hash_algorithm='auto', logger=None, extract_nested=False, remove_root_dir=False, clean_dest_dir=True, update_onchange_only=True, download_details_fname=None, extract=True, return_uptodate=False)
Download an archive, verify its hash, and extract
- Parameters:
url (str) – Download URL
dest_dir (str) – Directory to extract the archive into. If omitted, defaults to MLTK_CACHE_DIR/<dest_subdir>/ OR ~/.mltk/<dest_subdir>/
dest_subdir (str) – Destination sub-directory. If omitted, defaults to the archive path’s basename. This is only used if dest_dir is omitted
download_dir (str) – Directory to download the archive to. If omitted, defaults to MLTK_CACHE_DIR/downloads/<archive_fname> OR ~/.mltk/downloads/<archive_fname>
archive_fname (str) – Name of the downloaded archive file. If omitted, defaults to the URL filename
show_progress (bool) – Show a download progress bar
file_hash (str) – md5, sha1, or sha256 hash of the file
file_hash_algorithm (str) – File hashing algorithm. If auto, the algorithm is determined automatically
extract_nested (bool) – If the archive has a sub-archive, then extract that as well
remove_root_dir (bool) – If the archive has a root directory, then remove it from the extracted path
clean_dest_dir (bool) – Remove the destination directory BEFORE extracting
update_onchange_only (bool) – Only download and extract if the given url hasn’t been previously downloaded and extracted, otherwise return immediately
download_details_fname (str) – If update_onchange_only=True then a download details .json file is generated. This argument specifies the name of that file. If omitted, then the filename is <archive filename>-mltk.json
extract (bool) – If false, then do NOT extract the downloaded file. In this case, return the path to the downloaded file
return_uptodate – If true, then return a tuple, (path, <is up-to-date bool>)
logger (Logger) –
- Return type:
Union[str, Tuple[str, bool]]
- Returns:
If return_uptodate=False, the path to the extracted directory, or the path to the downloaded archive if extract=False. If return_uptodate=True, a tuple (<path>, <is up-to-date bool>)
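The file_hash and file_hash_algorithm parameters describe a hash check in which auto selects the algorithm. The sketch below shows one way such a check could work using only the standard library. It assumes that auto infers the algorithm from the length of the hex digest (32 characters for md5, 40 for sha1, 64 for sha256); that rule, and the helper names guess_hash_algorithm and file_matches_hash, are hypothetical illustrations and not MLTK's implementation.

```python
import hashlib

# Assumption: "auto" is resolved from the hex digest length.
# This mapping is illustrative; the library may decide differently.
_ALGORITHM_BY_HEX_LENGTH = {32: "md5", 40: "sha1", 64: "sha256"}


def guess_hash_algorithm(file_hash: str) -> str:
    """Guess the hashing algorithm from the length of a hex digest."""
    try:
        return _ALGORITHM_BY_HEX_LENGTH[len(file_hash.strip())]
    except KeyError:
        raise ValueError(
            f"Cannot infer hash algorithm from digest: {file_hash!r}"
        ) from None


def file_matches_hash(path: str, file_hash: str, algorithm: str = "auto") -> bool:
    """Return True if the hash of the file at path equals file_hash."""
    if algorithm == "auto":
        algorithm = guess_hash_algorithm(file_hash)
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so large archives are not loaded into memory at once
        for chunk in iter(lambda: f.read(1 << 16), b""):
            hasher.update(chunk)
    # Hex digests are case-insensitive, so normalize before comparing
    return hasher.hexdigest() == file_hash.strip().lower()
```

Passing an explicit algorithm name skips the length-based guess, which is the safer choice if a digest of unusual length is ever used.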
- verify_extract(archive_path, dest_dir=None, dest_subdir=None, show_progress=False, file_hash=None, file_hash_algorithm='auto', logger=None, extract_nested=False, remove_root_dir=False, clean_dest_dir=True, update_onchange_only=True, extract_details_fname=None)
Verify the archive hash and extract
- Parameters:
archive_path (str) – File path to the archive
dest_dir (str) – Directory to extract the archive into. If omitted, defaults to MLTK_CACHE_DIR/<dest_subdir>/ OR ~/.mltk/<dest_subdir>/
dest_subdir (str) – Destination sub-directory. If omitted, defaults to the archive path’s basename. This is only used if dest_dir is omitted
show_progress (bool) – Show a download progress bar
file_hash (str) – md5, sha1, or sha256 hash of the file
file_hash_algorithm (str) – File hashing algorithm. If auto, the algorithm is determined automatically
extract_nested (bool) – If the archive has a sub-archive, then extract that as well
remove_root_dir (bool) – If the archive has a root directory, then remove it from the extracted path
clean_dest_dir (bool) – Remove the destination directory BEFORE extracting
update_onchange_only (bool) – Only download and extract if the given url hasn’t been previously downloaded and extracted, otherwise return immediately
extract_details_fname (str) – If update_onchange_only=True then a details .json file is generated. This argument specifies the name of that file. If omitted, then the filename is <archive filename>-mltk.json
logger (Logger) –
- Return type:
str
- Returns:
Path to extracted directory
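To make the effect of remove_root_dir concrete, here is a minimal standard-library sketch of extracting a zip archive while dropping a single top-level directory. The function name extract_zip_without_root is hypothetical, the sketch handles zip files only, and it is an illustration of the idea rather than the code MLTK runs.

```python
import os
import shutil
import zipfile


def extract_zip_without_root(archive_path: str, dest_dir: str) -> str:
    """Extract a zip archive, dropping its single top-level directory if any."""
    with zipfile.ZipFile(archive_path) as archive:
        names = [n for n in archive.namelist() if n.strip("/")]
        top_level = {n.split("/", 1)[0] for n in names}
        # Only strip when every entry lives under one shared root directory
        has_root = len(top_level) == 1 and all("/" in n for n in names)
        dest_root = os.path.realpath(dest_dir)
        for info in archive.infolist():
            name = info.filename
            if has_root:
                name = name.split("/", 1)[1]
            if not name:
                continue  # this entry was the root directory itself
            target = os.path.realpath(os.path.join(dest_root, name))
            # Refuse entries that would escape dest_dir (e.g. "../evil")
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"Unsafe path in archive: {info.filename}")
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
    return dest_dir
```

With an archive containing pkg-1.0/a.txt and pkg-1.0/sub/b.txt, this writes a.txt and sub/b.txt directly under dest_dir. An archive with several top-level entries is extracted unchanged.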
- download_url(url, dst_path, show_progress=False, logger=None)
Downloads the tarball or zip file from url into dst_path.
If the file at dst_path is already found, then just return the local version without downloading.
- Parameters:
url (str) – The URL of a tarball or zip file.
dst_path (str) – The path where the file is downloaded
show_progress – Show a progress bar while downloading
- Return type:
str
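The "return the local version if dst_path already exists" behavior can be sketched with the standard library as follows. The function name download_if_missing and the temporary .part file are assumptions made for this illustration; they are not taken from MLTK.

```python
import os
import shutil
import urllib.request


def download_if_missing(url: str, dst_path: str) -> str:
    """Download url to dst_path unless dst_path already exists."""
    if os.path.exists(dst_path):
        return dst_path  # reuse the local copy, no network access
    os.makedirs(os.path.dirname(os.path.abspath(dst_path)), exist_ok=True)
    tmp_path = dst_path + ".part"
    # Write to a temporary name first so an interrupted download
    # is never mistaken for a complete file on the next call
    with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as out:
        shutil.copyfileobj(response, out)
    os.replace(tmp_path, dst_path)
    return dst_path
```

One consequence of this caching rule is that a changed remote file is not picked up until the local copy is deleted, which is why pairing the download with a hash check is useful.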