file_caching¶
Introduction¶
The file_caching
package helps when working with slow, possibly remote and unreliable data sources through
semi-transparent on-demand caches configured on a per-system level.
You will mainly use the file_caching.cached_path
API. While the caching needs to be explicitly supported by the user code,
it depends on the user’s system setup (namely an environment variable specifying which paths to cache) on whether
caching is used for any particular path or not.
Preconfigured Caching¶
Via the file_caching.cached_path
module, preconfigured caching allows you to specify a number of ‘to be cached’ root paths
and return cached or original paths depending on whether they are located there.
Therefore, one has to configure the environment variable DirCache_RootPaths, e.g. like this (Windows example):
setx DirCache_RootPaths = "//gaia/functionalTestData;//gaia/imageData"
Now we can just use file_caching.cached_path.get_path()
in our code:
1 # import the cached_path api
2 from file_caching import cached_path
3
4 [...]
5
6 #-------------------------------------------------------------
7 # Example 1: Access caching-preconfigured path
8
9 # This may take some time to fill/sync the cache
10 path = cached_path.get_path( "//gaia/functionalTestData/MeVisLabModules/General/ML/FastWatershed" )
11
12 # The second call will return almost immediately because the cache is still up-to-date
13 path = cached_path.get_path( "//gaia/functionalTestData/MeVisLabModules/General/ML/FastWatershed" )
14
15 # path points to a local cache copy
16 print( path ) # --> C:\Users\TESTUS~1\AppData\Local\Temp/__DIRCACHE__/____gaia__functionalTestData/MeVisLabModules/General/ML/FastWatershed
17
18
19 #-------------------------------------------------------------
20 # Example 2: Access non-caching path:
21
22 # Request path for which no caching is confiugred
23 path = cached_path.get_path( "//gaia/fme/multimedia" )
24
25 # path just points to the original path
26 print( path ) # --> //gaia/fme/multimedia
Explicit Caching¶
The file_caching.dir_cache.DirCache
class handles caching for on particular source directory structure.
For example:
1 # Import the DirCache class
2 from file_caching.dir_cache import DirCache
3
4 # Set up a new cache
5 cache = DirCache( "//gaia/functionalTestData" )
6
7 # Request data, this may take some time to fill/sync the cache
8 cachedPath = cache.get( "//gaia/functionalTestData/MeVisLabModules/General/ML/FastWatershed" )
9
10 # The second call will return almost immediately because the cache is still up-to-date
11 cachedPath = cache.get( "//gaia/functionalTestData/MeVisLabModules/General/ML/FastWatershed" )
12
13 # cachedPath points to a cached path (located in a local temp folder by default)
14 print( cachedPath ) # --> C:\Users\TESTUS~1\AppData\Local\Temp/__DIRCACHE__/____gaia__functionalTestData/MeVisLabModules/General/ML/FastWatershed
Note that the second get call would also ‘work’ if //gaia would suddenly no longer be accessible.