kazu.utils.caching¶

Module Attributes

`Ret`	A TypeVar to represent a return value.
`kazu_disk_cache`	We use the `diskcache.Cache` concept to cache expensive-to-produce resources to disk.

Classes

`CacheProtocol`
`EntityLinkingLookupCache`	A simple wrapper around LFUCache to reduce calls to expensive processes (e.g. BERT).
`Memoization`

class kazu.utils.caching.CacheProtocol[source]¶

Bases: Protocol

__init__(*args, **kwargs)[source]¶

clear()[source]¶

Return type: int

delete(key)[source]¶

Parameters: key (Any)
Return type: bool

get(key)[source]¶

Parameters: key (str)
Return type: Any

memoize(name=None, typed=False, expire=None, tag=None, ignore={})[source]¶

Parameters:

name (str | None)
typed (bool)
expire (float | None)
tag (str | None)
ignore (set[str | int])

Return type:

Callable[[Callable[[…], Ret]], Memoization[Ret]]
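As a rough illustration of the non-decorator part of this protocol, the sketch below is a hypothetical dict-backed class with `get`, `delete` and `clear` methods matching the signatures listed above. The meaning attached to the return values (`delete` reporting whether a key was removed, `clear` reporting how many entries were removed) is an assumption modelled on how `diskcache.Cache` behaves, and `set` is a helper added only so the sketch can be exercised. It does not implement `memoize`, so it is not a complete `CacheProtocol`.

```python
from typing import Any


class InMemoryCache:
    """Hypothetical dict-backed cache; not part of kazu."""

    def __init__(self) -> None:
        self._data: dict = {}

    def set(self, key: str, value: Any) -> None:
        # Helper for this sketch only; it is not listed in the protocol.
        self._data[key] = value

    def get(self, key: str) -> Any:
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: Any) -> bool:
        # Assumed semantics: True if the key existed and was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False

    def clear(self) -> int:
        # Assumed semantics: the number of entries that were removed.
        removed = len(self._data)
        self._data.clear()
        return removed


cache = InMemoryCache()
cache.set("a", 1)
cache.set("b", 2)
```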

class kazu.utils.caching.EntityLinkingLookupCache[source]¶

Bases: object

A simple wrapper around LFUCache to reduce calls to expensive processes (e.g. BERT).

__init__(lookup_cache_size=5000)[source]¶

Parameters: lookup_cache_size (int)

check_lookup_cache(entities)[source]¶

Checks the cache for linking candidates. If relevant candidates are found for an entity, the entity is updated accordingly. Otherwise, the entity is added to the returned list of cache misses (e.g. for further processing).

Parameters: entities (Iterable[Entity])
Returns: the entities that were cache misses
Return type: list[Entity]

update_candidates_lookup_cache(entity, candidates)[source]¶

Parameters:

entity (Entity)
candidates (dict[LinkingCandidate, LinkingMetrics])

Return type:

None
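The check-then-update pattern described for this class can be sketched as follows. This is a minimal stand-in under stated assumptions, not kazu's implementation: `Entity` here is a hypothetical dataclass with only a `match` string and a `candidates` dict, the cache key is simply `entity.match`, and the least-frequently-used eviction is hand-rolled rather than taken from an LFUCache library.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical stand-in holding only what this sketch needs."""
    match: str
    candidates: dict = field(default_factory=dict)


class LookupCacheSketch:
    def __init__(self, lookup_cache_size=5000):
        self.size = lookup_cache_size
        self.store = {}        # key -> cached candidates
        self.uses = Counter()  # key -> number of times the key was used

    def update_candidates_lookup_cache(self, entity, candidates):
        key = entity.match  # assumed key; the real key may differ
        if key not in self.store and len(self.store) >= self.size:
            # Evict the least frequently used key to make room.
            victim = min(self.store, key=lambda k: self.uses[k])
            del self.store[victim]
            del self.uses[victim]
        self.store[key] = candidates
        self.uses[key] += 1

    def check_lookup_cache(self, entities):
        misses = []
        for entity in entities:
            cached = self.store.get(entity.match)
            if cached is None:
                misses.append(entity)  # needs further processing
            else:
                self.uses[entity.match] += 1
                entity.candidates.update(cached)  # hit: update in place
        return misses


cache = LookupCacheSketch(lookup_cache_size=2)
cache.update_candidates_lookup_cache(Entity("a"), {"id-a": 0.9})
cache.update_candidates_lookup_cache(Entity("b"), {"id-b": 0.8})
cache.check_lookup_cache([Entity("a")])                           # "a" is now used twice
cache.update_candidates_lookup_cache(Entity("c"), {"id-c": 0.7})  # evicts "b"

batch = [Entity("a"), Entity("b"), Entity("c")]
misses = cache.check_lookup_cache(batch)
```

In this sketch only the entity for "b" comes back as a miss; a caller would run the expensive linking step on the misses and then store the results with `update_candidates_lookup_cache`.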

class kazu.utils.caching.Memoization[source]¶

Bases: Protocol[Ret]

__call__(*args, **kwargs)[source]¶

Call self as a function.

Parameters:

args (Any)
kwargs (Any)

Return type:

Ret

__init__(*args, **kwargs)[source]¶

class kazu.utils.caching.Ret¶

A TypeVar to represent a return value.

Used in Memoization to encode that a memoized function returns the same type as the original function it ‘memoized’.

alias of TypeVar(‘Ret’, covariant=True)
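The typing relationship between `Ret` and `Memoization` can be sketched with the standard library alone. The `Ret` and `Memoization` definitions below mirror what is documented above; `simple_memoize` and `square` are hypothetical names introduced only to show that the decorated function keeps the return type of the original.

```python
from typing import Any, Callable, Protocol, TypeVar

# Covariant, matching the alias documented above.
Ret = TypeVar("Ret", covariant=True)


class Memoization(Protocol[Ret]):
    """A callable returning the same type as the function it wraps."""

    def __call__(self, *args: Any, **kwargs: Any) -> Ret: ...


def simple_memoize(func: Callable[..., Ret]) -> Memoization[Ret]:
    """Hypothetical in-memory decorator, used only to illustrate the typing."""
    results: dict = {}

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        # Build a hashable key from the call arguments.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in results:
            results[key] = func(*args, **kwargs)
        return results[key]

    return wrapper


calls = []


@simple_memoize
def square(x: int) -> int:
    calls.append(x)  # records each real (uncached) invocation
    return x * x


first = square(3)
second = square(3)  # answered from the in-memory results
```

A type checker should treat `square` as a `Memoization[int]` here, so `first` and `second` are both inferred as `int`.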

kazu.utils.caching.kazu_disk_cache: CacheProtocol = <diskcache.core.Cache object>¶

We use the diskcache.Cache concept to cache expensive-to-produce resources to disk. Methods and functions can be decorated with kazu_disk_cache.memoize() to use this feature. Note that when it is used on a method of a class, the default behaviour of this function is to generate a key based on the constructor arguments of the class instance. Since these can be large (e.g. OntologyParser), we sometimes use the ignore argument to override this behaviour.

e.g.

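from kazu.utils.caching import kazu_disk_cache
# ignore={0} leaves positional argument 0 (self) out of the cache key
@kazu_disk_cache.memoize(ignore={0})
def method_of_class_with_lots_of_args(self): ...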
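To make the effect of `ignore` concrete, here is a minimal in-memory sketch of a memoize decorator that drops ignored positional indexes and keyword names when building the cache key. It is an illustration under assumptions, not the diskcache or kazu implementation: `memoize_sketch` and `BigParser` are hypothetical names, and results are kept in a plain dict rather than on disk.

```python
import functools


def memoize_sketch(ignore=()):
    """Hypothetical stand-in for a memoize decorator with an `ignore` option."""
    def decorator(func):
        store = {}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Drop ignored positional indexes and keyword names from the key.
            key_args = tuple(a for i, a in enumerate(args) if i not in ignore)
            key_kwargs = tuple(
                sorted((k, v) for k, v in kwargs.items() if k not in ignore)
            )
            key = (func.__qualname__, key_args, key_kwargs)
            if key not in store:
                store[key] = func(*args, **kwargs)
            return store[key]

        return wrapper
    return decorator


class BigParser:
    """Hypothetical class standing in for one with many constructor arguments."""

    def __init__(self, name):
        self.name = name
        self.build_count = 0

    @memoize_sketch(ignore={0})
    def build(self, version):
        self.build_count += 1  # counts real (uncached) executions
        return f"resource-v{version}"


first = BigParser("a")
second = BigParser("b")
result_first = first.build(1)
result_second = second.build(1)  # cache hit: `self` is not part of the key
```

One consequence visible in this sketch is that, once `self` is ignored, different instances share cache entries for the same remaining arguments, so the remaining arguments need to be enough to identify the result.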