kazu.steps.linking.dictionary

Classes

class kazu.steps.linking.dictionary.DictionaryEntityLinkingStep

Bases: Step

Uses kazu.utils.link_index.DictionaryIndex to match entities to ontologies.

Note that this step is not a kazu.steps.step.ParserDependentStep, because the logic of that class would duplicate the work of kazu.utils.link_index.DictionaryIndex.

__call__(docs)

Logic of the entity linker:

  1. First, obtain an entity list from all docs.

  2. Check the lookup LRUCache to see if an entity has been recently processed.

  3. If the cache misses, run a string similarity search using the configured kazu.utils.link_index.DictionaryIndex instances.
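
The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Kazu's implementation: ToyIndex and ToyLinker are hypothetical stand-ins, documents are modelled as lists of mention strings, the cache is keyed on the raw mention string, and difflib.SequenceMatcher stands in for whatever similarity measure the real index uses.

```python
from collections import OrderedDict
from difflib import SequenceMatcher


class ToyIndex:
    """Hypothetical stand-in for a dictionary index (not the Kazu API)."""

    def __init__(self, name, synonyms):
        self.name = name
        self.synonyms = synonyms  # maps synonym string -> ontology id

    def search(self, query, top_n):
        # Score every synonym against the query and keep the best top_n.
        scored = [
            (SequenceMatcher(None, query.lower(), synonym.lower()).ratio(), ontology_id)
            for synonym, ontology_id in self.synonyms.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(self.name, ontology_id, score) for score, ontology_id in scored[:top_n]]


class ToyLinker:
    """Hypothetical linker mirroring the constructor arguments documented here."""

    def __init__(self, indices, lookup_cache_size=5000, top_n=20):
        self.indices = indices
        self.lookup_cache_size = lookup_cache_size
        self.top_n = top_n
        self.cache = OrderedDict()  # least recently used entry sits at the front
        self.searches_run = 0

    def link(self, docs):
        # Step 1: gather the mentions from all docs into one list.
        mentions = [mention for doc in docs for mention in doc]
        results = {}
        for mention in mentions:
            if mention in self.cache:
                # Step 2: cache hit, so mark the entry as recently used.
                self.cache.move_to_end(mention)
                hits = self.cache[mention]
            else:
                # Step 3: cache miss, so query every configured index.
                self.searches_run += 1
                hits = [
                    hit
                    for index in self.indices
                    for hit in index.search(mention, self.top_n)
                ]
                self.cache[mention] = hits
                if len(self.cache) > self.lookup_cache_size:
                    self.cache.popitem(last=False)  # evict the least recently used
            results[mention] = hits
        return results


index = ToyIndex("DEMO", {"aspirin": "D1", "ibuprofen": "D2"})
linker = ToyLinker([index], lookup_cache_size=2, top_n=1)
linked = linker.link([["aspirin", "ibuprofin"], ["aspirin"]])
print(linked)
print("index searches run:", linker.searches_run)
```

In this sketch the second occurrence of "aspirin" is served from the cache, so only two index searches run for three mentions.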

Parameters:
  • docs (list[Document])

Return type:

tuple[list[Document], list[Document]]

__init__(indices, lookup_cache_size=5000, top_n=20, skip_ner_namespaces=None)

Parameters:
  • indices (list[DictionaryIndex]) – indices to query

  • lookup_cache_size (int) – the size of the Least Recently Used lookup cache to maintain

  • top_n (int) – keep the top_n results for the query (passed to kazu.utils.link_index.DictionaryIndex)

  • skip_ner_namespaces (set[str] | None) – set of NER-step namespaces; linking is skipped for entities generated by these namespaces
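
The effect of skip_ner_namespaces can be illustrated with a small filter. This is a sketch under assumptions rather than the library's code: ToyEntity and entities_to_link are hypothetical names, the namespace strings are made up, and the only behaviour taken from the description above is that entities from the listed namespaces are not linked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyEntity:
    """Hypothetical entity carrying the namespace of the NER step that produced it."""

    match: str
    namespace: str


def entities_to_link(entities, skip_ner_namespaces=None):
    # None is treated as "skip nothing", matching the documented default.
    skip = skip_ner_namespaces or set()
    return [entity for entity in entities if entity.namespace not in skip]


entities = [
    ToyEntity("aspirin", "ToyNerStepA"),
    ToyEntity("EGFR", "ToyNerStepB"),
]
print(entities_to_link(entities, skip_ner_namespaces={"ToyNerStepB"}))
```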