kazu.ontology_preprocessing.autocuration

Functions

initial_lowercase_then_upper_to_case_sensitive(...)

If a synonym starts with a lowercase character followed by an uppercase character, then all synonyms should be case-sensitive.

is_upper_case_word_to_case_insensitive(resource)

Make a resource case-insensitive when all of its original synonyms consist solely of uppercase alphabetical characters.

multiword(resource)

If any synonym has more than one word, it's likely a noun phrase and we should mark all synonyms PROBABLE.

Classes

AutoCurationAction

AutoCurator

IsCommmonWord

LikelyAcronym

If every synonym is no longer than the specified length and is all upper case, give a confidence of POSSIBLE to all forms.

MaxLength

Drop resources that exceed a maximum string length.

MinLength

SymbolicToCaseSensitiveAction

class kazu.ontology_preprocessing.autocuration.AutoCurationAction[source]

Bases: Protocol

__call__(resource)[source]

Call self as a function.

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource

__init__(*args, **kwargs)[source]

class kazu.ontology_preprocessing.autocuration.AutoCurator[source]

Bases: object

__call__(resources)[source]

Call self as a function.

Parameters:

resources (set[OntologyStringResource])

Return type:

Iterable[OntologyStringResource]

__init__(actions)[source]

Parameters:

actions (list[AutoCurationAction])

Return type:

None
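The signatures above suggest that each resource is passed through the configured actions. The following is a minimal sketch of that pattern, assuming the actions are applied in list order to every resource. `FakeResource`, `SketchCurator`, and `mark_case_sensitive` are hypothetical stand-ins written for illustration; they are not kazu types and do not reproduce the library's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class FakeResource:
    """Hypothetical stand-in for OntologyStringResource."""

    text: str
    case_sensitive: bool = False


# Stand-in for the AutoCurationAction protocol: resource in, resource out.
Action = Callable[[FakeResource], FakeResource]


class SketchCurator:
    """Assumed behaviour: apply every action, in order, to every resource."""

    def __init__(self, actions: list[Action]) -> None:
        self.actions = actions

    def __call__(self, resources: Iterable[FakeResource]) -> Iterator[FakeResource]:
        for resource in resources:
            for action in self.actions:
                resource = action(resource)
            yield resource


def mark_case_sensitive(resource: FakeResource) -> FakeResource:
    # Hypothetical action: return a modified copy rather than mutating.
    return FakeResource(resource.text, case_sensitive=True)


curated = list(SketchCurator([mark_case_sensitive])({FakeResource("eGFR")}))
```

Because an action both takes and returns a resource, actions compose naturally: the output of one becomes the input of the next.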

class kazu.ontology_preprocessing.autocuration.IsCommmonWord[source]

Bases: AutoCurationAction

__call__(resource)[source]

Call self as a function.

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource

__init__(path)[source]

Parameters:

path (str)

Return type:

None

class kazu.ontology_preprocessing.autocuration.LikelyAcronym[source]

Bases: AutoCurationAction

If every synonym is no longer than the specified length and is all upper case, give a confidence of POSSIBLE to all forms.

__call__(resource)[source]

Call self as a function.

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource

__init__(max_len_to_consider=5)[source]

Parameters:

max_len_to_consider (int)

Return type:

None
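The acronym condition described for this class can be sketched over plain strings. This is an illustration only: `looks_like_acronym` is a hypothetical helper, it operates on a list of synonym strings rather than on an `OntologyStringResource`, and it assumes "all upper case" can be approximated with `str.isupper()`, which ignores digits and punctuation.

```python
def looks_like_acronym(synonyms: list[str], max_len_to_consider: int = 5) -> bool:
    """Return True when every synonym is short and upper case.

    Assumption: str.isupper() is an acceptable test for "all upper case".
    Note that an empty list also returns True, since all([]) is True.
    """
    return all(
        len(synonym) <= max_len_to_consider and synonym.isupper()
        for synonym in synonyms
    )
```

With the default limit of 5, `["EGFR", "HER1"]` satisfies the condition, while adding a long form such as `"epidermal growth factor receptor"` does not.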

class kazu.ontology_preprocessing.autocuration.MaxLength[source]

Bases: AutoCurationAction

Drop resources that exceed a maximum string length.

__call__(resource)[source]

Call self as a function.

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource

__init__(max_len=60)[source]

Parameters:

max_len (int)

Return type:

None
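The length check can be sketched as below. The documentation does not say whether the limit applies to any synonym or to all of them, nor how a dropped resource is represented, so this sketch only covers the comparison and assumes a single over-long synonym is enough. `exceeds_max_length` is a hypothetical helper over plain strings.

```python
def exceeds_max_length(synonyms: list[str], max_len: int = 60) -> bool:
    """Return True if any synonym is longer than max_len characters.

    Assumption: one over-long synonym is enough to flag the resource.
    """
    return any(len(synonym) > max_len for synonym in synonyms)
```

Under this reading, a string of exactly `max_len` characters is kept, since the description says "exceed".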

class kazu.ontology_preprocessing.autocuration.MinLength[source]

Bases: AutoCurationAction

__call__(resource)[source]

Call self as a function.

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource

__init__(min_len=2)[source]

Parameters:

min_len (int)

Return type:

None

class kazu.ontology_preprocessing.autocuration.SymbolicToCaseSensitiveAction[source]

Bases: AutoCurationAction

__call__(resource)[source]

Call self as a function.

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource

__init__(entity_class)[source]

Parameters:

entity_class (str)

Return type:

None

kazu.ontology_preprocessing.autocuration.initial_lowercase_then_upper_to_case_sensitive(resource)[source]

If a synonym starts with a lowercase character followed by an uppercase character, then all synonyms should be case-sensitive.

E.g. “eGFR” vs “EGFR”.

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource
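The pattern described above can be sketched over plain strings. Both helpers are hypothetical and take synonym strings rather than an `OntologyStringResource`; they only illustrate the test, not how kazu marks a resource as case-sensitive.

```python
def starts_lower_then_upper(text: str) -> bool:
    """True if the first character is lowercase and the second is uppercase."""
    return len(text) >= 2 and text[0].islower() and text[1].isupper()


def should_be_case_sensitive(synonyms: list[str]) -> bool:
    """True if at least one synonym matches the lower-then-upper pattern."""
    return any(starts_lower_then_upper(synonym) for synonym in synonyms)
```

Here "eGFR" matches the pattern and "EGFR" does not, which is why matching the two case-insensitively would conflate distinct terms.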

kazu.ontology_preprocessing.autocuration.is_upper_case_word_to_case_insensitive(resource)[source]

Make a resource case-insensitive when all of its original synonyms consist solely of uppercase alphabetical characters.

Some data sources use all-caps strings for nouns that can be considered case-insensitive (e.g. ChEMBL).

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource
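A minimal sketch of the condition, assuming "all uppercase alphabetical characters" means the string contains only letters and all of them are upper case. The helpers are hypothetical and work on plain strings; the example term is illustrative.

```python
def is_all_upper_alpha(text: str) -> bool:
    """True if text contains only letters and every letter is upper case."""
    return text.isalpha() and text.isupper()


def can_be_case_insensitive(synonyms: list[str]) -> bool:
    """True if there is at least one synonym and all are upper-case alphabetic."""
    return bool(synonyms) and all(is_all_upper_alpha(s) for s in synonyms)
```

Under this assumption a string containing digits or spaces, such as "IL6", fails the test because `str.isalpha()` rejects non-letter characters.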

kazu.ontology_preprocessing.autocuration.multiword(resource)[source]

If any synonym has more than one word, it’s likely a noun phrase and we should mark all synonyms PROBABLE.

Parameters:

resource (OntologyStringResource)

Return type:

OntologyStringResource
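The multiword test can be sketched as follows, assuming words are separated by whitespace; the tokenisation kazu actually uses may differ. Both helpers are hypothetical and operate on plain strings.

```python
def is_multiword(text: str) -> bool:
    """True if text contains more than one whitespace-separated token."""
    return len(text.split()) > 1


def any_multiword(synonyms: list[str]) -> bool:
    """True if at least one synonym is made up of several words."""
    return any(is_multiword(synonym) for synonym in synonyms)
```

A single multiword synonym is enough to satisfy the condition, matching the "if any synonym" wording of the description.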