Kazu documentation

kazu.utils.link_index

Classes

DictionaryIndex


class kazu.utils.link_index.DictionaryIndex

Bases: object

The dictionary index looks for LinkingCandidates via a char ngram search between the normalised version of the query string and the synonym_norm of all LinkingCandidates associated with the provided OntologyParser.
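
The character n-gram matching described above can be illustrated with a small standard-library sketch. This is not Kazu's implementation: the `normalise` helper, the trigram size, and the plain count-based cosine score are all assumptions made for illustration (the return type of `build_index_cache()` suggests the real index is built on a `TfidfVectorizer`).

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count the overlapping character n-grams of a string."""
    if len(text) < n:
        # Strings shorter than n contribute themselves as a single gram.
        return Counter([text]) if text else Counter()
    return Counter(text[i : i + n] for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def normalise(text: str) -> str:
    """Hypothetical stand-in for Kazu's string normalisation (an assumption)."""
    return " ".join(text.lower().split())


# Illustrative normalised synonyms; in Kazu these would come from an OntologyParser.
synonyms = ["breast cancer", "lung cancer", "diabetes mellitus"]
query = normalise("Breast  Cancers")

# Score the normalised query against every synonym and pick the closest one.
scores = {s: cosine(char_ngrams(query), char_ngrams(s)) for s in synonyms}
best = max(scores, key=scores.get)
```

Because the comparison works on character n-grams rather than whole tokens, a near match such as a plural form still scores highly, while a synonym that shares no n-grams with the query scores exactly 0.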

__init__(parser, boolean_scorers=None)
Parameters:
  • parser (OntologyParser)

  • boolean_scorers (list[BooleanStringSimilarityScorer] | None) – optional boolean checks applied to each returned result to increase precision, for instance checking that all integers are represented or that any noun modifiers are present

apply_boolean_scorers(reference_term, query_term)
Parameters:
  • reference_term (str)

  • query_term (str)

Return type:

bool
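
A minimal sketch of how boolean checks of this kind might be combined. The `Scorer` alias, `integers_match`, and `apply_scorers` are hypothetical names, not part of Kazu. The sketch also assumes that every check must pass and that an empty or missing scorer list accepts everything; the documentation above does not confirm either behaviour.

```python
import re
from typing import Callable, Optional

# Hypothetical stand-in for a BooleanStringSimilarityScorer: any callable
# that takes (reference_term, query_term) and returns a bool.
Scorer = Callable[[str, str], bool]


def integers_match(reference_term: str, query_term: str) -> bool:
    """True when both terms contain exactly the same set of integers."""
    return set(re.findall(r"\d+", reference_term)) == set(re.findall(r"\d+", query_term))


def apply_scorers(
    scorers: Optional[list[Scorer]], reference_term: str, query_term: str
) -> bool:
    """Assumed semantics: all checks must pass; no checks means accept."""
    if not scorers:
        return True
    return all(scorer(reference_term, query_term) for scorer in scorers)


# "interleukin 2" and "interleukin 12" are close as strings but differ in
# their integers, so the check rejects the pair.
rejected = apply_scorers([integers_match], "interleukin 2", "interleukin 12")
accepted = apply_scorers([integers_match], "interleukin 2", "il 2")
```

A check like this is what lets precision improve: a fuzzy string search alone would rank the two interleukin terms as near-identical.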

build_index_cache()

Build the cache for the index.

Return type:

tuple[TfidfVectorizer, ndarray]

search(query, top_n=15)

Search the index with a query string.

Note

This method only returns results with search scores above 0. As a result, it returns fewer than top_n results when fewer than top_n LinkingCandidates in the index score above 0 for the given query.

Parameters:
  • query (str) – query string to search

  • top_n (int) – max number of results

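Returns:

at most top_n pairs of a LinkingCandidate and its LinkingMetrics, limited to results with search scores above 0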

Return type:

Iterable[tuple[LinkingCandidate, LinkingMetrics]]
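
The effect of the note above, where `top_n` is an upper bound rather than a guaranteed count, can be sketched as follows. `top_n_positive` and the score dictionary are hypothetical; plain strings stand in for LinkingCandidates, and the best-first ordering is a choice made for this sketch rather than documented behaviour.

```python
import heapq


def top_n_positive(scores: dict[str, float], top_n: int = 15) -> list[tuple[str, float]]:
    """Return up to top_n (item, score) pairs, best first, dropping scores <= 0."""
    positive = ((item, score) for item, score in scores.items() if score > 0)
    return heapq.nlargest(top_n, positive, key=lambda pair: pair[1])


# Illustrative scores: only two of the three entries score above 0.
scores = {"breast cancer": 0.96, "lung cancer": 0.48, "diabetes mellitus": 0.0}

# top_n is 15, but only the two positive-scoring entries come back.
results = top_n_positive(scores, top_n=15)
```

Callers should therefore not assume that the result has length `top_n`, and should handle an empty result when nothing in the index overlaps with the query.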

Copyright © 2021, Korea University, AstraZeneca