mltk.utils.archive

Utilities for extracting archives

See the source code on GitHub: mltk/utils/archive.py

Functions

extract_archive(archive_path, dest_dir[, ...])

Extract the given archive file to the specified directory

gzip_directory_files(src_dir[, dst_archive, ...])

Recursively gzip all files in the given directory.

gzip_file(src_path[, dst_path])

GZip file and return path to gzip archive

extract_archive(archive_path, dest_dir, extract_nested=False, remove_root_dir=False, clean_dest_dir=True)[source]

Extract the given archive file to the specified directory

Parameters:
  • archive_path (str) – Path to archive file

  • dest_dir (str) – Path to directory where archive will be extracted

  • extract_nested (bool) – If true and the given archive contains nested archives, then extract those as well

  • remove_root_dir (bool) – If the archive has a root directory, then remove it from the extracted path

  • clean_dest_dir (Union[bool, Callable]) – Clean the destination directory before extracting
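
To make the effect of remove_root_dir and clean_dest_dir concrete, here is a minimal standard-library sketch of the behavior described above. It is not MLTK's implementation: extract_sketch is a hypothetical stand-in, it treats clean_dest_dir as a plain bool, and it leaves out extract_nested entirely.

```python
import os
import shutil
import tempfile
import zipfile


def extract_sketch(archive_path, dest_dir, remove_root_dir=False, clean_dest_dir=True):
    """Hypothetical stand-in for illustration only; not MLTK's extract_archive."""
    # Assumption: "clean" means deleting the destination directory's contents first.
    if clean_dest_dir and os.path.isdir(dest_dir):
        shutil.rmtree(dest_dir)
    os.makedirs(dest_dir, exist_ok=True)

    # Unpack into a staging directory so the top level can be inspected.
    with tempfile.TemporaryDirectory() as staging:
        shutil.unpack_archive(archive_path, staging)
        source = staging
        entries = os.listdir(staging)
        only_entry = os.path.join(staging, entries[0]) if len(entries) == 1 else None
        # A single top-level directory is treated as the archive's root directory.
        if remove_root_dir and only_entry and os.path.isdir(only_entry):
            source = only_entry
        for name in os.listdir(source):
            shutil.move(os.path.join(source, name), os.path.join(dest_dir, name))


# Build a small archive whose entries all live under "root/".
work = tempfile.mkdtemp()
archive = os.path.join(work, "demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("root/a.txt", "hello")
    zf.writestr("root/sub/b.txt", "world")

dest = os.path.join(work, "out")
extract_sketch(archive, dest, remove_root_dir=True)
extracted = sorted(os.listdir(dest))  # "a.txt" and "sub" sit directly in dest
```

Without remove_root_dir, the same archive would unpack to a single root directory inside the destination.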

gzip_file(src_path, dst_path=None)[source]

GZip file and return path to gzip archive

Parameters:
  • src_path (str) – Path to local file to gzip

  • dst_path (str) – Optional path to destination gzip file. If omitted then use src_path + .gz

Return type:

str

Returns:

Path to generated .gz file
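
The default destination rule (src_path + .gz) can be illustrated with the standard library alone. The helper below, gzip_file_sketch, is a hypothetical stand-in written for this example and should not be read as MLTK's implementation.

```python
import gzip
import os
import shutil
import tempfile


def gzip_file_sketch(src_path, dst_path=None):
    """Hypothetical stand-in for illustration only; not MLTK's gzip_file."""
    # Mirror the documented default: append ".gz" to the source path.
    if dst_path is None:
        dst_path = src_path + ".gz"
    # Stream the file through gzip so large inputs are not loaded into memory.
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


work = tempfile.mkdtemp()
src = os.path.join(work, "notes.txt")
with open(src, "w") as f:
    f.write("hello gzip")

gz_path = gzip_file_sketch(src)  # path ends with "notes.txt.gz"
with gzip.open(gz_path, "rt") as f:
    round_trip = f.read()  # decompresses back to the original text
```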

gzip_directory_files(src_dir, dst_archive=None, regex=None)[source]

Recursively gzip all files in the given directory. The generated .tar.gz contains the same directory structure as src_dir.

Parameters:
  • src_dir (str) – Path to the directory from which to generate the .tar.gz archive

  • dst_archive (str) – Path to generated .tar.gz. If omitted then use src_dir + .tar.gz

  • regex (Union[str, Pattern, Callable[[str], bool]]) – Optional regex of file paths to INCLUDE in the archive. This can be a string, an re.Pattern, or a callback function. The tested path is the path relative to src_dir, with forward slashes. If a callback function is given and it returns True, the path is INCLUDED; otherwise it is excluded

Return type:

str

Returns:

Path to generated .tar.gz
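
The three accepted forms of regex can be sketched as follows. Both make_filter and gzip_directory_sketch are hypothetical names introduced for this example, not part of MLTK. The use of re.match (anchored at the start of the path) is also an assumption, since the documentation above does not say which matching mode is applied.

```python
import os
import re
import tarfile
import tempfile


def make_filter(regex):
    """Turn a str, re.Pattern, or callback into a predicate (hypothetical helper)."""
    if regex is None:
        return lambda path: True
    if callable(regex):
        return regex
    pattern = re.compile(regex) if isinstance(regex, str) else regex
    # Assumption: match from the start of the relative path.
    return lambda path: pattern.match(path) is not None


def gzip_directory_sketch(src_dir, dst_archive=None, regex=None):
    """Hypothetical stand-in for illustration only; not MLTK's gzip_directory_files."""
    if dst_archive is None:
        dst_archive = src_dir + ".tar.gz"
    include = make_filter(regex)
    with tarfile.open(dst_archive, "w:gz") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Test the path relative to src_dir, using forward slashes.
                rel = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if include(rel):
                    tar.add(full, arcname=rel)
    return dst_archive


work = tempfile.mkdtemp()
src_dir = os.path.join(work, "model")
os.makedirs(os.path.join(src_dir, "weights"))
for rel in ("readme.txt", "weights/a.bin", "weights/b.txt"):
    with open(os.path.join(src_dir, rel), "w") as f:
        f.write(rel)

# Keep only .txt files; a callback such as lambda p: p.endswith(".txt") would also work.
archive = gzip_directory_sketch(src_dir, regex=r".*\.txt$")
with tarfile.open(archive, "r:gz") as tar:
    names = sorted(tar.getnames())  # only the two .txt paths remain
```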