chemicalchecker.tool.targetmate.universes.univs.Universe
- class Universe(cc_root=None, molrepo=None, k=None, model_path=None, tmp_path='/tmp/tm/tmp_universe', min_actives_oneclass=10, max_actives_oneclass=1000, representative_mols_per_cluster=10, trials=1000000, only_bioactive=False)
Bases: object
Initialize the Universe class.
- Parameters:
cc_root (str) – Chemical Checker root directory (default=None).
molrepo (str) – Molrepo to use; ChEMBL is used if not specified (default=None).
k (int) – Number of partitions for the k-Means clustering (default=sqrt(N/2)).
model_path (str) – Folder where the universe should be stored (default=., the current directory).
tmp_path (str) – Temporary directory (default=/tmp/tm/tmp_universe).
min_actives_oneclass (int) – Minimum number of actives to use in the OneClassSVM (default=10).
max_actives_oneclass (int) – Maximum number of actives to use in the OneClassSVM (default=1000).
representative_mols_per_cluster (int) – Number of molecules to sample from each cluster (default=10).
trials (int) – Number of sampling trials to attempt before giving up (default=1000000).
only_bioactive (bool) – Only include known bioactive compounds in the chemical space, i.e. those compounds found in the Chemical Checker (default=False).
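The default for k above is the common sqrt(N/2) rule of thumb for choosing a cluster count. A minimal sketch of that heuristic follows; the function name default_num_clusters is hypothetical, and truncating to an integer with a floor of 1 is an assumption, since the documentation does not say how the class rounds the value.

```python
import math


def default_num_clusters(n_molecules: int) -> int:
    """Return the sqrt(N/2) rule-of-thumb cluster count for N molecules.

    Hypothetical helper: truncation and the minimum of 1 are assumptions,
    not documented behaviour of the Universe class.
    """
    if n_molecules < 1:
        raise ValueError("n_molecules must be a positive integer")
    # sqrt(N/2), truncated to an integer and never smaller than one cluster
    return max(1, int(math.sqrt(n_molecules / 2)))
```

For instance, a universe of 20,000 molecules gives sqrt(10,000), i.e. 100 clusters under this rule.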
Methods
calculate_arena
cluster
clusters_dict
fetch_molecules
fit
fit_oneclass_svm
load_universe
predict
representative_smiles
save
smiles
- predict(actives, inactives, inactives_per_active=100, min_actives=10, naive=False, biased_universe=0, maximum_potential_actives=5, random_state=None)
- Parameters:
actives (list or set) – Should include (smiles, id, inchikey).
inactives (list or set) – Should include (smiles, id, inchikey).
inactives_per_active (int) – Number of inactives to sample from the universe for each active. Can be None (default=100).
min_actives (int) – Minimum number of actives (default=10).
naive (bool) – Sample naively (randomly), without using the OneClassSVM (default=False).
biased_universe (float) – Proportion of closer molecules to sample as putative inactives (default=0).
maximum_potential_actives (int) – Maximum number of representative molecules allowed within the active hyperplane before the cluster is discarded; used for the biased universe (default=5).
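To make the naive=True case more concrete, the following is a rough sketch of random sampling of putative inactives at a fixed ratio per active. It is not the library's implementation: the function sample_naive_inactives, the universe argument, matching molecules by InChIKey, and returning an empty list when inactives_per_active is None are all assumptions made for illustration.

```python
import random


def sample_naive_inactives(universe, actives, inactives,
                           inactives_per_active=100, random_state=None):
    """Randomly draw putative inactives from a pool of molecules.

    Every molecule is a (smiles, id, inchikey) tuple, as in the parameter
    descriptions above. Hypothetical helper, not part of the library.
    """
    # Assumption: None means no extra inactives are sampled.
    if inactives_per_active is None:
        return []
    rng = random.Random(random_state)
    # Never propose a molecule that is already labelled (matched by InChIKey).
    known = {mol[2] for mol in actives} | {mol[2] for mol in inactives}
    candidates = [mol for mol in universe if mol[2] not in known]
    # Target is a fixed number of inactives per active, capped by availability.
    n_wanted = inactives_per_active * len(actives)
    return rng.sample(candidates, min(n_wanted, len(candidates)))
```

Passing the same random_state twice yields the same draw, which mirrors the role a random_state argument usually plays. The non-naive mode, which according to the descriptions above relies on the OneClassSVM, is not covered by this sketch.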