kazu.steps.other.cleanup
Classes
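CleanupStep
DropMappingsByConfidenceMappingFilter
DropMappingsByParserNameRankAction: Removes instances of Mapping based upon some preferential order of parsers.
DropUnmappedEntityFilter
StripMappingURIsAction: Strip the IDs in kazu.data.Mapping to just the final part of the URI.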
- class kazu.steps.other.cleanup.CleanupStep
Bases: Step
- __call__(doc)
Process documents and respond with processed and failed documents.
Note that many steps will be decorated by document_iterating_step() or document_batch_step(), which will modify the ‘original’ __call__ function signature to match the expected signature for a step, as the decorators handle the exception/failed documents logic for you.
- __init__(cleanup_actions)
- Parameters:
cleanup_actions (list[CleanupAction])
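The processed/failed contract described for __call__, together with the cleanup_actions list passed to __init__, can be illustrated with a minimal sketch. Everything here (ToyDoc, ToyCleanupStep, the two example actions) is a hypothetical stand-in and not kazu's actual implementation; in particular, returning every document in the first list and only the failures in the second is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class ToyDoc:
    """Hypothetical stand-in for a kazu document."""
    idx: str
    entities: list[str] = field(default_factory=list)


# Hypothetical action type: a callable that mutates a document in place.
ToyAction = Callable[[ToyDoc], None]


class ToyCleanupStep:
    """Illustrative only; not kazu's CleanupStep."""

    def __init__(self, cleanup_actions: list[ToyAction]) -> None:
        self.cleanup_actions = cleanup_actions

    def __call__(self, docs: Iterable[ToyDoc]) -> tuple[list[ToyDoc], list[ToyDoc]]:
        all_docs = list(docs)
        failed: list[ToyDoc] = []
        for doc in all_docs:
            try:
                # Apply every configured action to the document, in order.
                for action in self.cleanup_actions:
                    action(doc)
            except Exception:
                # Assumption: a failing document is recorded, not re-raised.
                failed.append(doc)
        return all_docs, failed


def drop_empty_entities(doc: ToyDoc) -> None:
    """Example action: remove empty entity strings."""
    doc.entities = [e for e in doc.entities if e]


def fail_on_bad(doc: ToyDoc) -> None:
    """Example action that raises for one specific document."""
    if doc.idx == "bad":
        raise ValueError("cannot clean this document")


step = ToyCleanupStep([drop_empty_entities, fail_on_bad])
processed, failed = step([ToyDoc("ok", ["asthma", ""]), ToyDoc("bad")])
```

In this sketch the try/except lives inside __call__; the note above suggests that in kazu this bookkeeping is supplied by the decorators instead, so a decorated method only needs to handle the per-document work.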
- class kazu.steps.other.cleanup.DropMappingsByConfidenceMappingFilter
Bases: object
- __init__(string_match_ranks_to_drop, disambiguation_ranks_to_drop)
- Parameters:
string_match_ranks_to_drop (Iterable[StringMatchConfidence])
disambiguation_ranks_to_drop (Iterable[DisambiguationConfidence])
- class kazu.steps.other.cleanup.DropMappingsByParserNameRankAction
Bases: CleanupAction
Removes instances of Mapping based upon some preferential order of parsers.
Useful if you want to filter results based upon a predefined hierarchy of importance, for entity classes that map to multiple parsers. For instance, you may prefer Meddra entities over Mondo ones, but will accept Mondo ones if Meddra mappings aren’t available.
Caution
For this class to be configured correctly, make sure that all the parsers you intend to use with it have populated the metadata database first. See populate_databases().
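The Meddra-over-Mondo preference described above can be sketched as follows. This is a rough illustration under stated assumptions, not the class's real logic: ToyMapping and keep_best_ranked are hypothetical names, the ranking is a plain list with the most preferred parser first, and the handling of parsers missing from the ranking is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyMapping:
    """Hypothetical stand-in for kazu's Mapping."""
    parser_name: str
    idx: str


def keep_best_ranked(mappings: list[ToyMapping], parser_ranks: list[str]) -> list[ToyMapping]:
    """Keep only mappings from the most preferred parser that is present.

    parser_ranks lists parser names, most preferred first.
    """
    ranked = [m for m in mappings if m.parser_name in parser_ranks]
    if not ranked:
        # Assumption: if no ranked parser produced a mapping, leave the input as-is.
        return list(mappings)
    best = min(parser_ranks.index(m.parser_name) for m in ranked)
    # Assumption: mappings from lower-ranked or unranked parsers are dropped.
    return [m for m in ranked if parser_ranks.index(m.parser_name) == best]


ranks = ["MEDDRA", "MONDO"]
both = [ToyMapping("MONDO", "MONDO_0004979"), ToyMapping("MEDDRA", "10003553")]
mondo_only = [ToyMapping("MONDO", "MONDO_0004979")]

preferred = keep_best_ranked(both, ranks)        # only the MEDDRA mapping survives
fallback = keep_best_ranked(mondo_only, ranks)   # the MONDO mapping is accepted
```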
- class kazu.steps.other.cleanup.DropUnmappedEntityFilter
Bases: object
- __init__(from_ent_namespaces=None, min_confidence_level=MentionConfidence.PROBABLE)
- Parameters:
min_confidence_level (MentionConfidence | None)
- class kazu.steps.other.cleanup.StripMappingURIsAction
Bases: object
Strip the IDs in kazu.data.Mapping to just the final part of the URI.
For example, this will turn http://purl.obolibrary.org/obo/MONDO_0004979 into just MONDO_0004979.
If you don’t want URI stripping at all, leave this Action out of the CleanupStep in your pipeline.
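The stripping behaviour can be approximated with plain string handling. This is a minimal sketch, not the Action's actual code: strip_uri is a hypothetical helper, and treating only http/https IDs as URIs is an assumption.

```python
def strip_uri(idx: str) -> str:
    """Return the final path segment of an http(s) URI; leave other IDs unchanged."""
    if idx.startswith(("http://", "https://")):
        # Drop any trailing slash, then keep the text after the last "/".
        return idx.rstrip("/").rsplit("/", 1)[-1]
    return idx


stripped = strip_uri("http://purl.obolibrary.org/obo/MONDO_0004979")  # "MONDO_0004979"
unchanged = strip_uri("CHEMBL25")                                     # "CHEMBL25"
```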