kazu.krt.resource_discrepancy_editor.utils

Classes

ResourceDiscrepancyManger

This class manages SynonymDiscrepancy instances between human-generated resources and auto-generated resources.

SynonymDiscrepancy

This class represents a discrepancy between a human-generated OntologyStringResource and an auto-generated OntologyStringResource.

class kazu.krt.resource_discrepancy_editor.utils.ResourceDiscrepancyManger[source]

Bases: object

This class manages SynonymDiscrepancy instances between human-generated resources and auto-generated resources.

It provides methods to automatically resolve all discrepancies, commit changes to the resources, and get a summary DataFrame.

__init__(parser_name, manager)[source]

Initializes the ResourceDiscrepancyManger.

Parameters:
  • parser_name (str) – The name of the parser used to generate the resources.

  • manager (ResourceManager) – The ResourceManager object used to manage the resources.

apply_autofix_to_all()[source]

Attempts to automatically resolve all discrepancies.

If a discrepancy can be auto-resolved, it syncs the resources and removes the discrepancy from the internal unresolved discrepancies list.

Return type:

None
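The resolve-and-remove behaviour described above can be sketched in plain Python. This is only an illustration under stated assumptions: `apply_autofix_sketch`, `try_resolve`, and the `synced` dictionary are hypothetical stand-ins, not part of kazu, and strings stand in for discrepancy objects.

```python
from typing import Callable, Dict, List, Optional


def apply_autofix_sketch(
    unresolved: List[str],
    try_resolve: Callable[[str], Optional[str]],
    synced: Dict[str, str],
) -> None:
    """Resolve what can be resolved; leave the rest in `unresolved`.

    Hypothetical illustration only: this is not the kazu implementation.
    """
    still_unresolved = []
    for item in unresolved:
        fixed = try_resolve(item)
        if fixed is None:
            # Could not be auto-resolved: keep it for manual review.
            still_unresolved.append(item)
        else:
            # Stand-in for syncing the human and auto resources.
            synced[item] = fixed
    # Update the caller's list in place so it only holds unresolved items.
    unresolved[:] = still_unresolved
```

Building a new list and assigning it back avoids removing elements from a list while iterating over it, which would skip items.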

commit(original_human_resource, new_resource, index)[source]

Commits changes to the resources and removes the discrepancy from the internal list of unresolved discrepancies.

Parameters:
  • original_human_resource

  • new_resource

  • index

Return type:

None

summary_df()[source]

Returns a pandas.DataFrame summarizing the unresolved discrepancies.

Returns:

A DataFrame with columns for id, example text, and the number of unique synonyms in the human and auto resources.

Return type:

DataFrame

class kazu.krt.resource_discrepancy_editor.utils.SynonymDiscrepancy[source]

Bases: object

This class represents a discrepancy between a human-generated OntologyStringResource and an auto-generated OntologyStringResource.

It provides methods to automatically resolve the discrepancy, convert the resources to a DataFrame, and get an example string for display.

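__init__(human_resource, auto_resource)[source]

Parameters:
  • human_resource (OntologyStringResource) – The human-generated resource.

  • auto_resource (OntologyStringResource) – The auto-generated resource.

auto_resolve()[source]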

This method attempts to automatically resolve discrepancies between human and auto resources.

It first creates a set of tuples containing the mention confidence and case sensitivity for all Synonyms in the human resource. If there is more than one unique tuple in the set, it means there are discrepancies in the human resource itself, and the method returns None. If there is exactly one unique tuple, it means all synonyms in the human resource have the same mention confidence and case sensitivity. In this case, it updates all forms of the auto resource with this mention confidence and case sensitivity, and returns the updated auto resource.

Returns:

The updated auto resource if discrepancies can be auto-resolved, which can be used as a human override. None otherwise.

Return type:

OntologyStringResource | None
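The set-of-tuples check described above can be sketched as follows. This is a minimal illustration, not the kazu implementation: `Syn` and `auto_resolve_sketch` are hypothetical stand-ins, the confidence values are made-up strings, and treating an empty human resource as unresolvable is an assumption the description does not cover.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


# Hypothetical stand-in for a synonym; NOT the kazu Synonym class.
@dataclass(frozen=True)
class Syn:
    text: str
    mention_confidence: str
    case_sensitive: bool


def auto_resolve_sketch(human: List[Syn], auto: List[Syn]) -> Optional[List[Syn]]:
    """Return the auto synonyms carrying the human settings, or None."""
    # Distinct (mention confidence, case sensitivity) pairs in the human resource.
    settings = {(s.mention_confidence, s.case_sensitive) for s in human}
    if len(settings) != 1:
        # The human resource disagrees with itself (or is empty, by assumption).
        return None
    confidence, case_sensitive = next(iter(settings))
    # Apply the single agreed setting to every auto synonym.
    return [
        replace(s, mention_confidence=confidence, case_sensitive=case_sensitive)
        for s in auto
    ]
```

Using a set makes the "exactly one unique tuple" test a simple length check, regardless of how many synonyms the human resource holds.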

dataframe()[source]

Converts the human and auto resources to DataFrames, merges them, and returns the rows with any null values (i.e. discrepancies).

Returns:

A DataFrame representing the discrepancies between the human and auto resources.

Return type:

DataFrame
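The merge-then-filter idea can be illustrated with pandas alone. This sketch assumes an outer merge on a shared key column; the function name `discrepancy_rows` and the column names `text`, `confidence_human`, and `confidence_auto` are hypothetical and may not match what kazu produces.

```python
import pandas as pd


def discrepancy_rows(human: pd.DataFrame, auto: pd.DataFrame, key: str) -> pd.DataFrame:
    """Outer-merge two frames on `key` and keep rows with any null value.

    Hypothetical illustration only: this is not the kazu implementation.
    """
    merged = human.merge(auto, on=key, how="outer")
    # A null appears wherever one side has no matching row for the key.
    return merged[merged.isnull().any(axis=1)]


human_df = pd.DataFrame(
    {"text": ["egfr", "erbb1"], "confidence_human": ["HIGH", "HIGH"]}
)
auto_df = pd.DataFrame(
    {"text": ["egfr", "her1"], "confidence_auto": ["HIGH", "LOW"]}
)
diff = discrepancy_rows(human_df, auto_df, key="text")
```

Here `egfr` appears on both sides and is dropped, while `erbb1` and `her1` each exist on only one side and are kept as discrepancies.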

example_string()[source]

Returns an example string from the human resource’s original synonyms.

Return type:

str