kazu.steps.joint_ner_and_linking.explosion

Classes

ExplosionStringMatchingStep

A wrapper for the explosion ontology-based entity matcher and linker.

class kazu.steps.joint_ner_and_linking.explosion.ExplosionStringMatchingStep

Bases: ParserDependentStep

A wrapper for the explosion ontology-based entity matcher and linker.

__call__(docs)

Process documents and respond with processed and failed documents.

Note that many steps are decorated with document_iterating_step() or document_batch_step(). These decorators change the ‘original’ __call__ signature to match the signature expected of a step, and they handle the exception and failed-document logic for you.

Parameters:

docs (list[Document])

Returns:

The first element is all the provided docs (now modified by the processing), the second is the docs that failed to (fully) process correctly.

Return type:

tuple[list[Document], list[Document]]
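
The decorator behaviour described in the note above can be illustrated with a simplified stand-in. This sketch does not use Kazu's own document_iterating_step(); the names Doc, iterating_step and UppercaseStep are hypothetical and exist only for illustration, and the real decorator may differ in detail. It shows only the general pattern: a method written for a single document is wrapped so that it accepts a list of documents and returns the (all docs, failed docs) pair.

```python
from dataclasses import dataclass, field
from functools import wraps
from typing import Callable


@dataclass
class Doc:
    """Hypothetical stand-in for a Kazu Document."""

    text: str
    metadata: dict = field(default_factory=dict)


def iterating_step(per_doc_func: Callable) -> Callable:
    """Hypothetical decorator: turn a per-document method into a step-style __call__."""

    @wraps(per_doc_func)
    def step_call(self, docs: list[Doc]) -> tuple[list[Doc], list[Doc]]:
        failed = []
        for doc in docs:
            try:
                per_doc_func(self, doc)
            except Exception as exc:
                # Record the failure on the document instead of aborting the batch.
                doc.metadata["error"] = repr(exc)
                failed.append(doc)
        # First element: every provided doc; second: only those that failed.
        return docs, failed

    return step_call


class UppercaseStep:
    """Toy step whose undecorated __call__ handles exactly one document."""

    @iterating_step
    def __call__(self, doc: Doc) -> None:
        if not doc.text:
            raise ValueError("empty document")
        doc.text = doc.text.upper()


docs = [Doc("aspirin"), Doc("")]
processed, failed = UppercaseStep()(docs)
```

Here processed contains both documents, while failed contains only the empty one, whose metadata carries the recorded error.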

__init__(parsers, path, include_sentence_offsets=True, ignore_cache=False)
Parameters:
  • parsers (Iterable[OntologyParser]) – the parsers used for the matching.

  • path (str | Path) – path to spaCy pipeline including Ontology Matcher.

  • include_sentence_offsets (bool) – whether to add sentence offsets to the metadata.

  • ignore_cache (bool) – ignore any cached version of the spaCy pipeline (if available) and rebuild it.

extract_entity_data_from_spans(spans)
Parameters:

spans (Iterable[Span])

Return type:

Iterator[tuple[int, int, str, dict[str, set[tuple[str, str, str]]]]]
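
The nested return type can be easier to read when unpacked. The following is a minimal sketch using made-up data of the documented shape; it does not call Kazu or spaCy. The generator fake_entity_data, the sample values, and the variable names start, end, text and mapping are all assumptions, since this page does not say what each tuple position or each string in the inner triples represents.

```python
from typing import Iterator

# Alias for the documented element type of the returned iterator.
EntityData = tuple[int, int, str, dict[str, set[tuple[str, str, str]]]]


def fake_entity_data() -> Iterator[EntityData]:
    """Hypothetical stand-in yielding tuples of the documented shape."""
    yield 0, 7, "aspirin", {"drug": {("PARSER_A", "ID:1", "exact")}}
    yield 12, 20, "headache", {
        "disease": {("PARSER_B", "ID:2", "exact"), ("PARSER_B", "ID:3", "exact")}
    }


summary = []
for start, end, text, mapping in fake_entity_data():
    # Count the string triples across every key of the dict.
    n_triples = sum(len(triples) for triples in mapping.values())
    summary.append((text, end - start, sorted(mapping), n_triples))
```

Each item unpacks into two integers, a string, and a dict whose values are sets of three-string tuples, which is all the documented type guarantees.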