cached_path

The main use case of this module is on-demand caching of (possibly remote) file systems, configurable on a per-user-system level. While caching must be explicitly supported by the user code (e.g. by using get_path()), it is the user’s system setup (namely an environment variable specifying which paths to cache) that decides whether and where caching is used.

The environment variable DirCache_RootPaths can be used to specify which paths are to be cached, using a string of os.pathsep–separated root paths.

file_caching.cached_path.PERSISTENT_CACHE_ENV_KEY = 'DirCache_RootPaths'

Name of the environment variable that contains a string of os.pathsep-separated root paths.
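As an illustration of the os.pathsep-separated format, the following minimal sketch builds and parses such a value using only the standard library. The helper names set_cache_roots and read_cache_roots are hypothetical and not part of this module, and how the module itself parses the variable is an assumption here.

```python
import os

# Variable name as documented above.
ENV_KEY = "DirCache_RootPaths"


def set_cache_roots(roots, environ=None):
    """Store the given root paths as one os.pathsep-separated string (hypothetical helper)."""
    env = os.environ if environ is None else environ
    env[ENV_KEY] = os.pathsep.join(roots)
    return env[ENV_KEY]


def read_cache_roots(environ=None):
    """Split the variable back into a list of root paths, skipping empty entries (hypothetical helper)."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_KEY, "")
    return [p for p in raw.split(os.pathsep) if p]
```

On POSIX systems os.pathsep is ':', on Windows it is ';', so the same code produces a valid value on either platform.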

class file_caching.cached_path.CachingPolicy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Caching policy that can be specified for each request.

  • ‘AUTO’ means that it will use the cache if the path is located in one of the paths specified in the environment variable DirCache_RootPaths

  • ‘ALWAYS’ means that the source path will definitely be cached. If no matching persistent cache was found, a so-called ‘ad-hoc cache’ will be created to do this.

  • ‘NEVER’ means that the result path will never point to the cache, even if a matching persistent cache and/or a cached version exists.
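The three policies can be summarized as a small decision function. The sketch below is not the module's implementation: Policy, is_under and would_cache are hypothetical stand-ins, the root paths are passed in explicitly instead of being read from DirCache_RootPaths, and the creation of an ad-hoc cache for ALWAYS is not modeled.

```python
import os
from enum import Enum


class Policy(Enum):
    """Hypothetical stand-in for CachingPolicy, for illustration only."""
    AUTO = "AUTO"
    ALWAYS = "ALWAYS"
    NEVER = "NEVER"


def is_under(path, root):
    """Return True if path equals root or lies below it."""
    path = os.path.abspath(path)
    root = os.path.abspath(root)
    try:
        return os.path.commonpath([path, root]) == root
    except ValueError:
        # Raised e.g. for paths on different drives on Windows.
        return False


def would_cache(path, roots, policy):
    """Decide whether a request would use a cache under the given policy."""
    policy = Policy(policy) if isinstance(policy, str) else policy
    if policy is Policy.NEVER:
        return False  # never use the cache
    if policy is Policy.ALWAYS:
        return True   # cache regardless of the configured roots
    # AUTO: cache only if the path lies under one of the configured roots
    return any(is_under(path, root) for root in roots)
```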

file_caching.cached_path.get_responsible_cache(source_path: str, must_include_parent_dir: bool = False) DirCache | None[source]

Returns the (first) cache for which source_path is a valid source path, or None if none was found.

Parameters:
  • source_path – Source path for which the corresponding cache shall be found

  • must_include_parent_dir – If true, the function will look for source_path’s parent directory’s cache instead

Returns:

The responsible DirCache found, or None if there was none

file_caching.cached_path.get_path(path_to_dir_or_file: str, include_containing_dir: bool = False, caching_policy: str | CachingPolicy = 'AUTO', prune_target_folder: bool = True) str[source]

Returns the cache location of path_to_dir_or_file, but only if it is a sub-path of any of the registered DirCache root paths, or caching_policy is ALWAYS. If caching is applicable and the module’s log level is INFO or DEBUG, you will see additional information: INFO logs an ‘Updating cache…’ message before caching starts, and DEBUG also logs time consumption and target location after successful caching. If caching is not applicable and the log level is DEBUG, you will also get a message indicating why caching was not applicable.

Parameters:
  • path_to_dir_or_file – Path to attempt caching on.

  • include_containing_dir – If True, any caching will cache the entire parent directory of the given path (but still return the cached location of path_to_dir_or_file)

  • caching_policy – You can override the default to ALWAYS or NEVER cache (default AUTO caches if the path is cacheable by any existing DirCache). See CachingPolicy for more info.

  • prune_target_folder – If the path is a directory, source access exists, and the result path contains unexpected extra contents, the extra contents are removed on cache update. Pass False to disable this removal.

Returns:

The path to the cache resource or the path of the original resource, depending on the caching setup and the caching_policy parameter

Raises:

FileNotFoundError if the source path does not exist and there is either no cached copy, or caching_policy was NEVER
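A typical call site might wrap get_path() so that the code still works on systems where this package is not installed. The following is a minimal sketch under that assumption; resolve_path is a hypothetical helper, and only the get_path() parameters documented above are used.

```python
def resolve_path(path, caching_policy="AUTO"):
    """Return a (possibly cached) location for path.

    Hypothetical wrapper: falls back to the original path when the
    file_caching package cannot be imported.
    """
    try:
        from file_caching.cached_path import get_path
    except ImportError:
        # Package not available: use the source path directly.
        return path
    # May raise FileNotFoundError, as documented for get_path().
    return get_path(path, caching_policy=caching_policy)
```

With the default AUTO policy, whether the returned path points into a cache depends on the DirCache_RootPaths setting of the system the code runs on.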

file_caching.cached_path.will_attempt_caching(source_path: str, include_containing_dir: bool, caching_policy: str | CachingPolicy) bool[source]

Can be used to predict whether caching will be attempted if path() or get_path() is called with the same parameters, which can be useful for logging, etc.

Parameters:
  • source_path – Path to attempt caching on.

  • include_containing_dir – If True, any caching will cache the entire parent directory of the given path (but still return the cached location of source_path)

  • caching_policy – The policy to evaluate: ALWAYS or NEVER cache, or AUTO (caches if the path is cacheable by any existing DirCache). See CachingPolicy for more info.

Returns:

True if caching would be attempted based on the provided parameters.

file_caching.cached_path.path(source_path: str, include_containing_dir: bool = False, caching_policy: str | CachingPolicy = CachingPolicy.AUTO, auto_remove: bool = True) str[source]

Convenience context manager for caches that are to be auto-deleted on exit.

Parameters:
  • source_path – Path to attempt caching on.

  • include_containing_dir – If True, any caching will cache the entire parent directory of the given path (but still return the cached location of source_path)

  • caching_policy – You can specify to ALWAYS or NEVER cache (default AUTO caches if the path is cacheable by any existing DirCache). See CachingPolicy for more info.

  • auto_remove – If enabled (default), the cache is removed again on exit. Setting to False is possible for cache debugging purposes.

Yields:

The location of source_path that is to be used in the context (may point to the cache or to source_path, depending on your caching configuration and the provided caching_policy).
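The remove-on-exit behavior can be pictured with a standard-library stand-in. The sketch below is not this module's implementation: temporary_copy is a hypothetical context manager that copies the source into a temporary directory, yields the copy, and deletes it on exit unless auto_remove is False.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def temporary_copy(source_path, auto_remove=True):
    """Yield a temporary local copy of source_path (illustrative stand-in only)."""
    cache_dir = tempfile.mkdtemp(prefix="adhoc_cache_")
    try:
        # normpath strips a trailing separator so basename is never empty.
        name = os.path.basename(os.path.normpath(source_path))
        target = os.path.join(cache_dir, name)
        if os.path.isdir(source_path):
            shutil.copytree(source_path, target)
        else:
            shutil.copy2(source_path, target)
        yield target
    finally:
        # Clean up on exit, including when the with-body raised.
        if auto_remove:
            shutil.rmtree(cache_dir, ignore_errors=True)
```

Used as `with temporary_copy(src) as local: ...`, the copy exists only inside the with block, which mirrors the auto_remove default described above.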

file_caching.cached_path.remove_cached_path(cache_path: str) None[source]

Removes the folder/files at cache_path from the cache, but only if they are actually located in a cache directory (so you can safely call it on paths returned from get_path() without checking whether the files were actually cached).

file_caching.cached_path.clear_cache(source_root_path: str) None[source]

Clears the cache managing source_root_path. If no managing cache was found, nothing is deleted and an error is logged.

Parameters:

source_root_path – Source root path of the cache to clear.