file_caching

Introduction

The file_caching package helps you work with slow, possibly remote, and unreliable data sources by providing semi-transparent, on-demand caches that are configured per system.

You will mainly use the file_caching.cached_path API. User code has to opt in to caching explicitly by calling this API, but whether a particular path is actually cached depends on the user’s system setup, namely an environment variable that lists the paths to cache.

Preconfigured Caching

Via the file_caching.cached_path module, you can preconfigure a number of ‘to be cached’ root paths. For any requested path, the module returns a cached path if the request is located below one of these roots, and the original path otherwise.

To enable this, set the environment variable DirCache_RootPaths to a semicolon-separated list of root paths, for example like this on Windows:

setx DirCache_RootPaths "//gaia/functionalTestData;//gaia/imageData"
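
To make the format of the value concrete, here is a minimal sketch of how a semicolon-separated list like the one above could be read from the environment in Python. The helper name parse_root_paths is hypothetical and not part of file_caching, and stripping whitespace or skipping empty entries is an assumption about how such a value might reasonably be handled.

```python
import os


def parse_root_paths(value):
    """Split a semicolon-separated value into a list of non-empty root paths.

    Hypothetical helper for illustration only; not part of file_caching.
    """
    return [entry.strip() for entry in value.split(";") if entry.strip()]


# An unset variable yields an empty list, i.e. no caching is configured.
root_paths = parse_root_paths(os.environ.get("DirCache_RootPaths", ""))
print(root_paths)
```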

Now we can just use file_caching.cached_path.get_path() in our code:

# Import the cached_path API
from file_caching import cached_path

# [...]

#-------------------------------------------------------------
# Example 1: Access a path that is preconfigured for caching

# This may take some time to fill/sync the cache
path = cached_path.get_path("//gaia/functionalTestData/MeVisLabModules/General/ML/FastWatershed")

# The second call will return almost immediately because the cache is still up-to-date
path = cached_path.get_path("//gaia/functionalTestData/MeVisLabModules/General/ML/FastWatershed")

# path points to a local cache copy
print(path)  # --> C:\Users\TESTUS~1\AppData\Local\Temp/__DIRCACHE__/____gaia__functionalTestData/MeVisLabModules/General/ML/FastWatershed


#-------------------------------------------------------------
# Example 2: Access a path that is not configured for caching

# Request a path for which no caching is configured
path = cached_path.get_path("//gaia/fme/multimedia")

# path just points to the original path
print(path)  # --> //gaia/fme/multimedia
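
The decision illustrated by the two examples above can be sketched as follows. This is not the package's implementation. The helper names find_root and local_cache_path are hypothetical, and the mapping of a root path to a folder name (each "/" replaced by "__", placed below a __DIRCACHE__ folder) is only inferred from the example output shown above.

```python
import posixpath


def find_root(path, roots):
    """Return the configured root that contains path, or None if there is none."""
    normalized = path.replace("\\", "/").rstrip("/")
    for root in roots:
        root = root.rstrip("/")
        # Match whole path components only, so "//gaia/a" does not match "//gaia/ab".
        if normalized == root or normalized.startswith(root + "/"):
            return root
    return None


def local_cache_path(path, root, cache_base):
    """Build a cache location that mirrors path below cache_base (assumed layout)."""
    normalized = path.replace("\\", "/").rstrip("/")
    relative = normalized[len(root):].lstrip("/")
    root_dir = root.replace("/", "__")  # "//gaia/x" becomes "____gaia__x"
    return posixpath.join(cache_base, "__DIRCACHE__", root_dir, relative)


roots = ["//gaia/functionalTestData", "//gaia/imageData"]
for requested in ("//gaia/functionalTestData/MeVisLabModules", "//gaia/fme/multimedia"):
    root = find_root(requested, roots)
    # Paths outside all configured roots are returned unchanged.
    result = local_cache_path(requested, root, "/tmp") if root else requested
    print(result)
```

Matching on whole path components is a deliberate choice in this sketch: a plain prefix test would wrongly treat a sibling folder whose name merely starts with a root's name as cached.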

Explicit Caching

The file_caching.dir_cache.DirCache class handles caching for one particular source directory structure.

For example:

# Import the DirCache class
from file_caching.dir_cache import DirCache

# Set up a new cache
cache = DirCache("//gaia/functionalTestData")

# Request data, this may take some time to fill/sync the cache
cachedPath = cache.get("//gaia/functionalTestData/MeVisLabModules/General/ML/FastWatershed")

# The second call will return almost immediately because the cache is still up-to-date
cachedPath = cache.get("//gaia/functionalTestData/MeVisLabModules/General/ML/FastWatershed")

# cachedPath points to a cached path (located in a local temp folder by default)
print(cachedPath)  # --> C:\Users\TESTUS~1\AppData\Local\Temp/__DIRCACHE__/____gaia__functionalTestData/MeVisLabModules/General/ML/FastWatershed

Note that the second get call would still ‘work’ even if //gaia suddenly became inaccessible.
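
One way such fallback behavior can arise is sketched below. This is a simplified illustration under stated assumptions, not how DirCache is implemented: the function get_with_fallback is hypothetical, it copies the whole directory instead of syncing only changes, and it needs Python 3.8 or newer for the dirs_exist_ok argument of shutil.copytree.

```python
import os
import shutil


def get_with_fallback(source_dir, cache_dir):
    """Return cache_dir, refreshing it from source_dir when the source is reachable.

    Hypothetical helper for illustration only; not part of file_caching.
    """
    if os.path.isdir(source_dir):
        # Source is reachable: (re)fill the cache. A real cache would
        # compare timestamps or sizes and copy only what has changed.
        shutil.copytree(source_dir, cache_dir, dirs_exist_ok=True)
        return cache_dir
    if os.path.isdir(cache_dir):
        # Source is unreachable, but an earlier call left a copy behind.
        return cache_dir
    raise FileNotFoundError(
        "source is unavailable and nothing is cached: " + source_dir
    )
```

In this sketch, a first call with a reachable source fills the cache, and a later call still returns the cached directory after the source has disappeared. Only a request that was never cached fails.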
