


A simple spacy NER implementation.

class kazu.steps.ner.spacy_ner.SpacyNerStep

Bases: Step

A simple spaCy NER implementation.

Runs a spaCy pipeline over each document section and expects the resulting spaCy Doc to have a populated doc.ents field.


Process documents and return both the processed and the failed documents.

Note that many steps are decorated with document_iterating_step() or document_batch_step(). These decorators modify the ‘original’ __call__ function signature to match the signature expected for a step, and they handle the exception and failed-document logic for you.


Returns:

A tuple whose first element is all the provided docs (now modified by the processing) and whose second element is the docs that failed to (fully) process correctly.

Return type:

tuple[list[Document], list[Document]]
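The return contract described above can be sketched in plain Python. This is only an illustration under stated assumptions: the Document dataclass and the run_step function below are hypothetical stand-ins, not Kazu's real classes, and the capitalised-token rule is a toy substitute for a real spaCy pipeline.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a Kazu Document (not the real class)."""

    text: str
    entities: list[str] = field(default_factory=list)


def run_step(docs: list[Document]) -> tuple[list[Document], list[Document]]:
    """Process docs in place and return (all docs, docs that failed)."""
    failed: list[Document] = []
    for doc in docs:
        try:
            if not doc.text:
                raise ValueError("empty document")
            # Toy "NER": treat capitalised tokens as entities.
            doc.entities = [tok for tok in doc.text.split() if tok[:1].isupper()]
        except Exception:
            # A failed doc is recorded, but it still stays in the first list.
            failed.append(doc)
    return docs, failed


docs = [Document("Aspirin treats headaches"), Document("")]
processed, failed = run_step(docs)
```

Here every input document appears in the first element whether or not it succeeded, while the second element holds only the documents that raised during processing.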

__init__(path, add_sentence_spans=True)

Parameters:

  • path (str) – Path to the spaCy pipeline to use.

  • add_sentence_spans (bool) – If True, add sentence spans to the section.