A very basic caching helper for procedures:
import cacheme

type
  Bar = object
    name: string
    a, b: int

proc foo(a, b: int): Bar {.cacheMe: "/tmp/test_cache".} =
  result = Bar(a: a, b: b, name: "Hehe")

echo foo(1, 5)
echo foo(5, 3)
echo foo(1, 5)
echo foo(8, 12)
It caches calls of foo with the arguments a, b across multiple runs
of the same program. It arose from a need in my code: some functions
perform a lengthy calculation whose result is safe to cache. Instead
of writing similar caching code in each case, I decided to write a
short library to take care of it generally. Note that the cache also
works well when multiple processes (e.g. via cligen’s procpool) access
the same file: it tries to merge the changes of different processes
that write potentially different data (see more below).
The idea is to construct an HDF5 file at the prefix path given to the pragma. The pragma effectively wraps the procedure body in the following skeleton:
template injectOldBody(args, retTyp, path, body: untyped): untyped {.dirty.} =
  var theCache {.global.} = initCache[typeof args, retTyp](path)
  if args.inCache(theCache, path):
    result = args.getFromCache(theCache)
  else:
    body
    result.addToCache(args, theCache, path)
That means we construct a global (!) Table[K, V] to store the
arguments and return values at runtime. Between accesses that table
is serialized to an HDF5 file on disk using nimhdf5.
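The control flow of that skeleton can be tried out without HDF5. The following is only a minimal in-memory sketch, not the library's implementation: the names cached, cache, calls and slowAdd are made up for illustration, and a plain std/tables Table stands in for the serialized cache.

```nim
import std/[tables, hashes]

# Hypothetical stand-in for the library's cache: a plain in-memory Table,
# with no HDF5 file behind it.
var cache = initTable[(int, int), int]()
var calls = 0   # counts how often the procedure body really runs

template cached(args, body: untyped): untyped {.dirty.} =
  # Same shape as the skeleton: look up first, run the body only on a miss.
  if args in cache:
    result = cache[args]
  else:
    body
    cache[args] = result

proc slowAdd(a, b: int): int =
  cached((a, b)):
    inc calls
    result = a + b

echo slowAdd(1, 5)   # miss: runs the body and stores the result
echo slowAdd(1, 5)   # hit: served from the table
echo calls           # the body ran only once
```

The second call never reaches the body, which is the whole point of the wrapper. The real library additionally writes the table to disk so that the hit also happens in the next program run.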
Feel free to write the files to /dev/shm if you’re on Linux and
don’t want to actually suffer the IO costs of writing to disk.
One word of caution: If your procedure contains an explicit return
statement, caching won’t work correctly, because the return leaves the
procedure before the result is added to the cache! And for obvious
reasons do not use this for functions that depend on some runtime
state either!
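Why an explicit return gets in the way can be seen with a small in-memory imitation of the wrapper. This is a sketch under assumptions: store, seen, viaResult and viaReturn are hypothetical names, and a std/tables Table replaces the HDF5-backed cache.

```nim
import std/[tables, hashes]

var seen = initTable[int, int]()   # hypothetical in-memory cache

template store(arg, body: untyped): untyped {.dirty.} =
  if arg in seen:
    result = seen[arg]
  else:
    body
    seen[arg] = result   # skipped whenever `body` executes `return`

proc viaResult(x: int): int =
  store(x):
    result = x * 2

proc viaReturn(x: int): int =
  store(x):
    return x * 2         # leaves the proc before the store step above

discard viaResult(1)
discard viaReturn(2)
echo 1 in seen   # true: the value was stored
echo 2 in seen   # false: the value was never stored
```

Both procedures return the right value, but only the one that assigns to result ever fills the cache, so the other one is recomputed on every call.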
Finally, not every possible Nim type is supported, but the
serialization of nimhdf5 supports some pretty complex objects. You
may have to (or want to) write a custom serialization hook for
specific types, like here.
The only dependency is nimhdf5, because the caching is done by
serializing the arguments and return values to an HDF5 file. Other
backends could be added trivially (as long as they have serialization support).
nimble install nimhdf5
Multiple threads are a different matter, and races are possible in theory. However, if the procedure is a pure function in the sense of same input ⇒ same output, and if you are fine with a value sometimes being computed twice before the cache is synchronized again, it works well in practice!
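To see why purity makes concurrent writers harmless, consider one possible way of merging two caches. This is an assumption for illustration only: the text does not show the library's actual merge logic, and Cache and merge are hypothetical names.

```nim
import std/[tables, hashes]

type Cache = Table[(int, int), int]

proc merge(mine: var Cache, other: Cache) =
  ## Adds every entry of `other` that `mine` does not have yet.
  ## For a pure function, a key present in both maps to the same value,
  ## so keeping the existing entry loses nothing.
  for key, val in other:
    if key notin mine:
      mine[key] = val

# Two processes computed overlapping sets of calls.
var a: Cache = {(1, 5): 6, (5, 3): 8}.toTable
let b: Cache = {(1, 5): 6, (8, 12): 20}.toTable

a.merge(b)
echo a.len   # 3 distinct argument tuples after the merge
```

The call (1, 5) was computed twice, once per process. That is the only cost of the race, since both results are identical.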
- memo by @andreaferretti performs memoization of functions in a similar manner, but its cache only persists within a single program run. On the other hand it can also handle recursive functions. Different use cases require different solutions!