chemicalchecker.tool.targetmate.tmsetup.ModelSetup

class ModelSetup(is_classifier, **kwargs)

Bases: TargetMateClassifierSetup, TargetMateRegressorSetup

Set up a TargetMate classifier

Parameters:
  • algo (str) – Base algorithm to use (see /model configuration files) (default=random_forest).

  • model_config (str) – Model configurations for the base classifier (default=vanilla).

  • weight_algo (str) – Model used to weigh the contribution of an individual classifier. Should be fast. For the moment, only vanilla classifiers are accepted (default=naive_bayes).

  • ccp_folds (int) – Number of cross-conformal prediction folds. The default generator used is Stratified K-Folds (default=10).

  • min_class_size (int) – Minimum class size acceptable to train the classifier (default=10).

  • min_class_size_active (int) – Minimum active class size acceptable to train the classifier. If not stated, min_class_size is used (default=None).

  • min_class_size_inactive (int) – Minimum inactive class size acceptable to train the classifier. If not stated, min_class_size is used (default=None).

  • active_value (int) – When reading data, the activity value considered to be active (default=1).

  • inactive_value (int) – When reading data, the activity value considered to be inactive. If none is specified, any value different from active_value is considered to be inactive (default=None).

  • inactives_per_active (int) – Number of inactives to sample for each active. If None, only experimental actives and inactives are considered (default=100).

  • metric (str) – Metric to use to select the pipeline (default="auroc").

  • universe_path (str) – Path to the universe. If not specified, the default one is used (default=None).

  • naive_sampling (bool) – Sample naively (randomly), without using the OneClassSVM (default=False).

  • biased_universe (float) – Proportion of closer molecules to sample as putative inactives (default=0).
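
The fallback and labeling rules described in the parameters above can be illustrated with a minimal sketch. This is not the library's code: resolve_min_sizes and binarize are hypothetical helpers written only for illustration, and dropping values that match neither label when inactive_value is given is an assumption, since the parameter description does not say what happens to them.

```python
from typing import List, Optional, Sequence, Tuple


def resolve_min_sizes(min_class_size: int = 10,
                      min_class_size_active: Optional[int] = None,
                      min_class_size_inactive: Optional[int] = None) -> Tuple[int, int]:
    """Return (min_active, min_inactive), falling back to min_class_size."""
    min_active = min_class_size if min_class_size_active is None else min_class_size_active
    min_inactive = min_class_size if min_class_size_inactive is None else min_class_size_inactive
    return min_active, min_inactive


def binarize(values: Sequence[int],
             active_value: int = 1,
             inactive_value: Optional[int] = None) -> List[int]:
    """Map raw activity values to 1 (active) or 0 (inactive)."""
    labels = []
    for value in values:
        if value == active_value:
            labels.append(1)
        elif inactive_value is None or value == inactive_value:
            # With no inactive_value, anything that is not active is inactive.
            labels.append(0)
        # Assumption: values matching neither label are skipped.
    return labels
```

For example, resolve_min_sizes(10, 5, None) keeps the explicit active threshold and falls back to min_class_size for the inactive one.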

Methods

compress_models

Store model in compressed format for persistence

cpu_count

create_models_path

directory_tree

func_hpc

Execute any method on the configured HPC.

load

Load previously stored TargetMate instance.

load_base_model

Load a base model

load_data

prepare_data

prepare_for_ml

read_data

repath_bases_by_fold

Redefine path of a TargetMate instance.

repath_predictions_by_fold

Redefine path of a TargetMate instance.

repath_predictions_by_fold_and_set

repath_predictions_by_set

Redefine path of a TargetMate instance.

reset_path_bases

reset_path_predictions

Reset predictions path

save

Save TargetMate instance

save_data

waiter

Wait for jobs to finish

wipe

Delete temporary data

compress_models()

Store model in compressed format for persistence

func_hpc(func_name, *args, **kwargs)

Execute any method on the configured HPC.

Parameters:
  • args (tuple) – the arguments for the method to execute.

  • kwargs (dict) – arguments for the HPC method.
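
Selecting a method by its name, as the func_name argument suggests, might look like the sketch below. It only shows a local name-based dispatch; the actual job submission to the HPC is not reproduced, and whether func_hpc resolves the name this way is an assumption. Demo and call_by_name are hypothetical names that do not exist in the library.

```python
from typing import Any


class Demo:
    """Hypothetical stand-in for an object whose methods are called by name."""

    def add(self, a: int, b: int) -> int:
        return a + b


def call_by_name(obj: Any, func_name: str, *args: Any) -> Any:
    """Look up func_name on obj and call it with the positional arguments."""
    method = getattr(obj, func_name, None)
    if not callable(method):
        raise AttributeError(f"{type(obj).__name__} has no method {func_name!r}")
    return method(*args)
```

Failing early on an unknown name keeps a misspelled func_name from surfacing later as a failed remote job.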

static load(models_path)

Load previously stored TargetMate instance.

load_base_model(destination_dir, append_pipe=False)

Load a base model

repath_bases_by_fold(fold_number, is_tmp=True, reset=True, only_train=False)

Redefine path of a TargetMate instance. Used by the Validation class.

repath_predictions_by_fold(fold_number, is_tmp=True, reset=True)

Redefine path of a TargetMate instance. Used by the Validation class.

repath_predictions_by_set(is_train, is_tmp=True, reset=True)

Redefine path of a TargetMate instance. Used by the Validation class.

reset_path_predictions(is_tmp=True)

Reset predictions path

save()

Save TargetMate instance

waiter(jobs, secs=3)

Wait for jobs to finish
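
A waiting loop of this kind can be sketched as a simple poll with a sleep between rounds. This is an illustration only: how the library represents jobs and checks their status is not documented here, so each job is modeled as a hypothetical callable that returns True once it has finished, and wait_for and make_job are made-up names.

```python
import time
from typing import Callable, Iterable


def wait_for(jobs: Iterable[Callable[[], bool]], secs: float = 3) -> int:
    """Poll every job until all report completion; return the number of polls."""
    pending = list(jobs)
    polls = 0
    while pending:
        # Keep only the jobs that are still running.
        pending = [job for job in pending if not job()]
        polls += 1
        if pending:
            time.sleep(secs)
    return polls


def make_job(polls_needed: int) -> Callable[[], bool]:
    """Build a fake job that reports completion on its polls_needed-th poll."""
    state = {"left": polls_needed}

    def is_done() -> bool:
        state["left"] -= 1
        return state["left"] <= 0

    return is_done
```

Calling wait_for([make_job(1), make_job(3)], secs=0) polls until the slower of the two fake jobs reports completion.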

wipe()

Delete temporary data