kazu.web.server

Module Attributes

KAZU_WEBSERVER_SPINUP_TIMEOUT

A timeout limit for spinning up the kazu server, including pipeline load time.

Functions

get_id_log_prefix_if_available(request)

Utility function for generating the appropriate prefix for logs.

get_request_id(request)

Utility function for extracting the custom request ID header from an HTTPConnection object.

log_request_to_path_with_prefix(request[, ...])

Utility function to log the log prefix plus the endpoint the request was sent to.

openapi_no_auth()

Remove the parts of the OpenAPI schema that put auth buttons on the Swagger UI.

start(cfg)

Deploy the web app to Ray Serve.

stop()

Classes

DocumentCollection

SectionedWebDocument

SimpleWebDocument

SingleEntityDocumentConverter

Add an entity for the whole of every section in every document you provide.

class kazu.web.server.DocumentCollection[source]

Bases: BaseModel

class Config[source]

Bases: object

schema_extra = {'example': [{'text': 'A single string document that you want to recognise entities in. Using the default kazu pipeline, this will recognise things like asthma, acetaminophen, EGFR and many others.'}, {'sections': {'abstract': 'We carried out a study on trastuzumab.', 'fulltext': 'A much longer text that probably mentions all of HER2, breast and gastric cancer, and trastuzumab.', 'title': 'a study about HER2 in breast cancer'}}, {'text': 'Another simple doc, this one is about hypertension'}, {'sections': {'final section': 'A section about drugs: paracetamol, naproxin, ibuprofen.', 'first section': 'A section about non-small cell lung cancer', 'second section': 'A section with nothing interesting in it'}}]}
convert_to_kazu_documents()[source]
Return type:

list[Document]
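
As the schema example above suggests, a DocumentCollection is a list that can mix the two document shapes: objects with a single text field (SimpleWebDocument) and objects with a sections mapping (SectionedWebDocument). The following is a minimal, stdlib-only sketch of building such a request body. The helper document_kind is hypothetical and is not part of kazu; it only illustrates how the two shapes differ.

```python
import json

# A payload shaped like the schema example: a list mixing simple and
# sectioned documents.
payload = [
    {"text": "Another simple doc, this one is about hypertension"},
    {
        "sections": {
            "title": "a study about HER2 in breast cancer",
            "abstract": "We carried out a study on trastuzumab.",
        }
    },
]


def document_kind(item: dict) -> str:
    """Hypothetical helper (not in kazu): classify an item by its field."""
    if isinstance(item.get("text"), str):
        return "simple"
    sections = item.get("sections")
    if isinstance(sections, dict) and all(
        isinstance(k, str) and isinstance(v, str) for k, v in sections.items()
    ):
        return "sectioned"
    raise ValueError(f"unrecognised document shape: {item!r}")


kinds = [document_kind(item) for item in payload]
body = json.dumps(payload)  # JSON text suitable for a request body
```

Anything that is neither shape raises ValueError in this sketch; how the real server reports validation errors is not covered here.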

class kazu.web.server.SectionedWebDocument[source]

Bases: BaseModel

class Config[source]

Bases: object

schema_extra = {'example': {'sections': {'abstract': 'We carried out a study on trastuzumab.', 'fulltext': 'A much longer text that probably mentions all of HER2, breast and gastric cancer, and trastuzumab.', 'title': 'a study about HER2 in breast cancer'}}}
to_kazu_document()[source]
Return type:

Document

sections: dict[str, str]

class kazu.web.server.SimpleWebDocument[source]

Bases: BaseModel

class Config[source]

Bases: object

schema_extra = {'example': {'text': 'A single string document that you want to recognise entities in. Using the default kazu pipeline, this will recognise things like asthma, acetaminophen, EGFR and many others.'}}
to_kazu_document()[source]
Return type:

Document

text: str

class kazu.web.server.SingleEntityDocumentConverter[source]

Bases: object

Add an entity for the whole of every section in every document you provide.

Essentially a subclass of DocumentCollection, but pydantic makes this a pain to do with inheritance, so use composition instead of inheritance here.

__init__(entity_free_doc_collection, entity_class)[source]
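Parameters:
  • entity_free_doc_collection – the DocumentCollection to wrap; its documents carry no entities yet

  • entity_class – the entity class assigned to each entity that is added

Return type: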

None

convert_to_kazu_documents()[source]
Return type:

list[Document]
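
The composition approach described above can be sketched as follows. Every class here is a hypothetical stand-in written for illustration (they are not kazu's Document, Section or Entity types, and the field names are assumptions). The point is only the structure: the converter holds a collection rather than subclassing it, then adds one entity spanning each section.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:  # hypothetical stand-in, not kazu's Entity
    match: str
    entity_class: str
    start: int
    end: int


@dataclass
class Section:  # hypothetical stand-in, not kazu's Section
    name: str
    text: str
    entities: list[Entity] = field(default_factory=list)


@dataclass
class Doc:  # hypothetical stand-in, not kazu's Document
    sections: list[Section]


class Collection:
    """Stand-in for DocumentCollection: yields entity-free documents."""

    def __init__(self, docs: list[Doc]) -> None:
        self._docs = docs

    def convert(self) -> list[Doc]:
        return self._docs


class SingleEntityConverter:
    """Wraps a Collection (composition) instead of subclassing it."""

    def __init__(self, collection: Collection, entity_class: str) -> None:
        self.collection = collection
        self.entity_class = entity_class

    def convert(self) -> list[Doc]:
        docs = self.collection.convert()
        for doc in docs:
            for section in doc.sections:
                # One entity covering the whole of the section text.
                section.entities.append(
                    Entity(section.text, self.entity_class, 0, len(section.text))
                )
        return docs


docs = SingleEntityConverter(
    Collection([Doc([Section("title", "trastuzumab")])]), "drug"
).convert()
```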

kazu.web.server.get_id_log_prefix_if_available(request)[source]

Utility function for generating the appropriate prefix for logs.

Parameters:

request (HTTPConnection) – Starlette HTTPConnection object

Returns:

Prefix to prepend to log messages; it contains the request ID.

Return type:

str

kazu.web.server.get_request_id(request)[source]

Utility function for extracting the custom request ID header from an HTTPConnection object.

Parameters:

request (HTTPConnection) – Starlette HTTPConnection object

Returns:

ID string

Return type:

str | None

kazu.web.server.log_request_to_path_with_prefix(request, log_prefix=None)[source]

Utility function to log the log prefix plus the endpoint the request was sent to.

Parameters:
  • request (HTTPConnection) – Starlette HTTPConnection object

  • log_prefix (str | None) – the prefix to log. Pass it when it has already been computed, to avoid recomputing it. If omitted (or None), get_id_log_prefix_if_available() is called to compute it.

Return type:

None
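
How the three logging helpers above fit together can be pictured with the sketch below. It is not kazu's implementation. The header name, the prefix format, the log message wording and the example path are all assumptions made for illustration, and a SimpleNamespace stands in for Starlette's HTTPConnection (which does expose headers and url.path).

```python
import logging
from types import SimpleNamespace
from typing import Optional

logger = logging.getLogger("kazu_web_sketch")

REQUEST_ID_HEADER = "X-request-id"  # assumed header name


def get_request_id(request) -> Optional[str]:
    # Returns None when the client did not send the header.
    return request.headers.get(REQUEST_ID_HEADER)


def get_id_log_prefix_if_available(request) -> str:
    request_id = get_request_id(request)
    # Assumed prefix format; empty string when there is no ID.
    return f"ID: {request_id} " if request_id is not None else ""


def log_request_to_path_with_prefix(request, log_prefix: Optional[str] = None) -> None:
    # Only compute the prefix if the caller has not already done so.
    if log_prefix is None:
        log_prefix = get_id_log_prefix_if_available(request)
    logger.info("%sRequest to path: %s", log_prefix, request.url.path)


# Stand-in request object; the path is a made-up example.
request = SimpleNamespace(
    headers={REQUEST_ID_HEADER: "abc123"},
    url=SimpleNamespace(path="/example"),
)
prefix = get_id_log_prefix_if_available(request)
log_request_to_path_with_prefix(request, log_prefix=prefix)
```

Passing the prefix in, as in the last line, is the case the log_prefix parameter exists for: an endpoint that logs several messages can compute the prefix once and reuse it.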

kazu.web.server.openapi_no_auth()[source]

Remove the parts of the OpenAPI schema that put auth buttons on the Swagger UI.

Without this, the buttons still appear even when no authentication middleware is configured. They do nothing and do not need to be filled in, which may confuse users.

Return type:

dict[str, Any]
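
To make the idea concrete, here is a minimal sketch of stripping auth declarations from an OpenAPI schema held as a plain dict. It is an illustration under assumptions, not kazu's code: the function name strip_auth_from_schema and the sample schema are hypothetical. It relies only on the OpenAPI convention that security schemes live under components.securitySchemes and that operations (and the schema root) may carry a security list.

```python
import copy
from typing import Any


def strip_auth_from_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of an OpenAPI schema dict with auth declarations removed."""
    cleaned = copy.deepcopy(schema)
    for path_item in cleaned.get("paths", {}).values():
        for operation in path_item.values():
            # Path items can also hold non-operation values such as "summary".
            if isinstance(operation, dict):
                operation.pop("security", None)
    cleaned.get("components", {}).pop("securitySchemes", None)
    cleaned.pop("security", None)  # schema-wide security requirement, if any
    return cleaned


# Hypothetical sample schema, for illustration only.
sample = {
    "openapi": "3.0.2",
    "paths": {
        "/example": {
            "post": {"summary": "demo", "security": [{"HTTPBearer": []}]}
        }
    },
    "components": {
        "schemas": {},
        "securitySchemes": {"HTTPBearer": {"type": "http", "scheme": "bearer"}},
    },
}
no_auth = strip_auth_from_schema(sample)
```

The sketch works on a deep copy, so the original schema is left unchanged.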

kazu.web.server.start(cfg)[source]

Deploy the web app to Ray Serve.

Parameters:

cfg (DictConfig) – DictConfig from Hydra

Return type:

None

kazu.web.server.stop()[source]
Return type:

None

kazu.web.server.KAZU_WEBSERVER_SPINUP_TIMEOUT = '180'

A timeout limit for spinning up the kazu server, including pipeline load time.

Defaults to 180 seconds (3 minutes). If the RAY_SERVE_PROXY_READY_CHECK_TIMEOUT_S environment variable is set, its value is used instead.

If you have a custom pipeline that is very slow to spin up, you may need to increase this timeout. When the timeout is too short, startup fails with an error like:

ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
        class_name: HTTPProxyActor
        actor_id: d241a36c326e982cac79013101000000
        pid: 274
        name: SERVE_CONTROLLER_ACTOR:FmBUWo:SERVE_PROXY_ACTOR-71c5a59a2c003ba52ae3f190e537d552af676c9d895ad98c7942cfe0
        namespace: serve
        ip: 172.29.34.98
The actor is dead because it was killed by `ray.kill`

Note

Normally, environment variables are controlled through options in the Hydra config, but this one cannot be set that way because Ray reads it at import time.

Attempts to work around this with local imports interfere with the Ray Serve/FastAPI integration and break the server.
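
The default-with-override behaviour described above amounts to an environment lookup with a fallback. The sketch below shows that pattern. The function name resolve_spinup_timeout is hypothetical, and the value is kept as a string only because the documented default is the string '180'.

```python
import os

ENV_VAR = "RAY_SERVE_PROXY_READY_CHECK_TIMEOUT_S"
DEFAULT_TIMEOUT_S = "180"  # 3 minutes, matching the documented default


def resolve_spinup_timeout(environ=os.environ) -> str:
    """Hypothetical helper: the env var's value if set, else the default."""
    return environ.get(ENV_VAR, DEFAULT_TIMEOUT_S)


default_value = resolve_spinup_timeout({})
overridden = resolve_spinup_timeout({ENV_VAR: "600"})
```

Because Ray reads the variable at import time, the override has to be in the process environment before the server process starts, for example by exporting RAY_SERVE_PROXY_READY_CHECK_TIMEOUT_S in the shell that launches it.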