kazu.utils.build_and_test_model_packs
Module Attributes
DEFAULT_RAY_TIMEOUT – A default timeout in seconds for Ray to finish building the model packs within.
Functions
build_all_model_packs – Build multiple model packs.
Classes
BuildConfiguration – Dataclass that controls how a base model pack and config should be merged with a target model pack.
Exceptions
- class kazu.utils.build_and_test_model_packs.BuildConfiguration
Bases: object
Dataclass that controls how a base model pack and config should be merged with a target model pack.
- __init__(requires_base_config, resources, has_own_config, run_acceptance_tests=False, acceptance_test_json_path=None, run_consistency_checks=False, sanity_test_strings=<factory>)
- acceptance_test_json_path: str | None = None
If run_acceptance_tests is set, the path to the serialised Label Studio tasks.
- has_own_config: bool
Does this model pack have its own config dir? If used with requires_base_config, these files will override any config files from the base config.
- requires_resources: bool
Whether resources (e.g. model binaries) are required to build this model pack. This is set automatically based on the values of the other fields; it cannot be set when instantiating the class.
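The behaviour described for requires_resources (derived from other fields, not accepted by the constructor) matches the standard dataclass pattern of field(init=False) plus __post_init__. The following is a minimal sketch of that pattern only, not the kazu implementation: BuildConfigurationSketch is a hypothetical stand-in class, the list[str] type for resources is an assumption, and the rule used to derive requires_resources (true when any resources are listed) is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BuildConfigurationSketch:
    """Hypothetical stand-in mirroring the documented fields; not the kazu class."""

    requires_base_config: bool
    resources: list[str]  # assumed type; the page above does not state it
    has_own_config: bool
    run_acceptance_tests: bool = False
    acceptance_test_json_path: Optional[str] = None
    run_consistency_checks: bool = False
    sanity_test_strings: list[str] = field(default_factory=list)
    # init=False keeps this field out of the generated __init__.
    requires_resources: bool = field(init=False)

    def __post_init__(self) -> None:
        # Assumed rule: resources are required whenever any are listed.
        self.requires_resources = len(self.resources) > 0


cfg = BuildConfigurationSketch(
    requires_base_config=True,
    resources=["example_resource"],  # made-up value
    has_own_config=False,
)
print(cfg.requires_resources)
```

Passing requires_resources as a keyword argument to this sketch raises a TypeError, which is the dataclass-level equivalent of "not available to set when instantiating the class".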
- class kazu.utils.build_and_test_model_packs.ModelPackBuilder
Bases: object
- __init__(logging_config_path, target_model_pack_path, kazu_version, build_dir, maybe_base_configuration_path, skip_tests, zip_pack, *, _ray_trace_ctx=None)
A ModelPackBuilder is a helper class to assist in the building of a model pack.
Danger
WARNING! Since this class configures the kazu global cache, executing multiple builds within the same Python process could pollute the cache. This is because this object modifies the KAZU_MODEL_PACK environment variable, which should not normally happen. Rather than instantiating this object directly, use build_all_model_packs(), which will control this process for you.
- Parameters:
logging_config_path (Path | None) – passed to logging.config.fileConfig()
target_model_pack_path (Path) – path to model pack to process
kazu_version (str) – version of kazu used to generate model pack
build_dir (Path) – build the pack in this directory
maybe_base_configuration_path (Path | None) – if this pack requires the base configuration, specify path
skip_tests (bool) – don’t run any tests
zip_pack (bool) – zip the pack at the end (requires the ‘zip’ CLI tool)
- Return type:
None
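The Danger note above comes down to environment variables being process-wide state. As a rough illustration, using only the standard library, the sketch below sets KAZU_MODEL_PACK inside a child process and checks that the parent's environment is untouched. It does not show how build_all_model_packs() is implemented; the pack path is a made-up placeholder.

```python
import os
import subprocess
import sys

# Code run in the child process: it sets KAZU_MODEL_PACK for itself only,
# then prints the value it sees.
child_code = (
    "import os;"
    "os.environ['KAZU_MODEL_PACK'] = '/tmp/hypothetical_pack';"
    "print(os.environ['KAZU_MODEL_PACK'])"
)

before = os.environ.get("KAZU_MODEL_PACK")
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)
after = os.environ.get("KAZU_MODEL_PACK")

child_value = result.stdout.strip()
print(child_value)      # value seen inside the child
print(before == after)  # the parent's view did not change
```

Two builds in one process would share a single KAZU_MODEL_PACK value, whereas two builds in separate processes each get their own copy of the environment.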
- build_caches_and_run_sanity_checks(cfg, *, _ray_trace_ctx=None)
Execute all processes required to build model pack caches.
- Parameters:
cfg (DictConfig)
- Returns:
pipeline that was used to run sanity checks
- Return type:
- build_model_pack(*, _ray_trace_ctx=None)
Execute the build process.
- Returns:
path of new pack
- Return type:
- clear_cached_resources_from_model_pack_dir(*, _ray_trace_ctx=None)
Delete any cached data from the input path.
- Return type:
None
- load_build_configuration(*, _ray_trace_ctx=None)
Try to load a build configuration from the model pack root.
The merge configuration should be a JSON file called build_config.json.
- Raises:
ModelPackBuildError – if the merge config isn’t found at the expected path
- Return type:
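As a rough sketch of the lookup described above (a build_config.json at the model pack root, with an error if it is missing), the following uses only the standard library. load_build_config_sketch and BuildConfigNotFound are hypothetical names standing in for the kazu method and ModelPackBuildError. The JSON keys mirror the documented BuildConfiguration constructor arguments, but that the file maps onto them one-to-one is an assumption, and the values are made up.

```python
import json
import tempfile
from pathlib import Path


class BuildConfigNotFound(Exception):
    """Local stand-in for ModelPackBuildError, which lives in kazu."""


def load_build_config_sketch(model_pack_root: Path) -> dict:
    """Read build_config.json from the pack root, or raise if it is absent."""
    config_path = model_pack_root / "build_config.json"
    if not config_path.is_file():
        raise BuildConfigNotFound(f"no build config at {config_path}")
    with config_path.open(encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Assumed file content: keys taken from the documented __init__ signature.
    example = {
        "requires_base_config": True,
        "resources": [],
        "has_own_config": False,
    }
    (root / "build_config.json").write_text(json.dumps(example), encoding="utf-8")
    loaded = load_build_config_sketch(root)

print(loaded["requires_base_config"])
```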
- kazu.utils.build_and_test_model_packs.build_all_model_packs(maybe_base_configuration_path, model_pack_paths, zip_pack, output_dir, skip_tests, logging_config_path, max_parallel_build, debug=False, ray_timeout=10800.0)
Build multiple model packs.
- Parameters:
maybe_base_configuration_path (Path | None) – Path to the base configuration, if required
model_pack_paths (list[Path]) – list of paths to model pack resources
zip_pack (bool) – should the packs be zipped at the end?
output_dir (Path) – directory to build model packs in
skip_tests (bool) – don’t run any tests
logging_config_path (Path | None) – passed to logging.config.fileConfig
max_parallel_build (int | None) – build at most this many model packs simultaneously. If None, use all available CPUs
debug (bool) – Disables Ray parallelization, enabling the use of debugger tools
ray_timeout (float | None) – A timeout for Ray to complete model pack building within. Defaults to DEFAULT_RAY_TIMEOUT
- Return type:
None
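A call to build_all_model_packs() might be assembled along the following lines. This is a sketch under stated assumptions: every path is a hypothetical placeholder, and the kazu import sits inside run_build() so that the argument dictionary can be inspected without kazu installed. The parameter names come from the signature above.

```python
from pathlib import Path

# Same value as the documented DEFAULT_RAY_TIMEOUT: three hours, in seconds.
THREE_HOURS_IN_SECONDS = 3 * 60 * 60.0

# All paths below are hypothetical placeholders.
build_kwargs = {
    "maybe_base_configuration_path": Path("conf"),
    "model_pack_paths": [Path("model_packs/pack_a"), Path("model_packs/pack_b")],
    "zip_pack": False,
    "output_dir": Path("build_output"),
    "skip_tests": True,
    "logging_config_path": None,
    "max_parallel_build": 2,  # build at most two packs at once
    "debug": False,  # True disables Ray parallelization for debugging
    "ray_timeout": THREE_HOURS_IN_SECONDS,
}


def run_build() -> None:
    # Imported lazily so this sketch can be checked without kazu installed.
    from kazu.utils.build_and_test_model_packs import build_all_model_packs

    build_all_model_packs(**build_kwargs)


print(build_kwargs["ray_timeout"])
```

Calling run_build() would start the actual build. Setting debug to True is the documented way to step through a build with a debugger, since it disables Ray parallelization.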
- kazu.utils.build_and_test_model_packs.wait_for_model_pack_completion(futures, timeout=10800.0)
- kazu.utils.build_and_test_model_packs.DEFAULT_RAY_TIMEOUT = 10800.0
A default timeout in seconds for Ray to finish building the model packs within. This is equal to 3 hours.