chemicalchecker.util.sampler.triplets.TripletSampler
- class TripletSampler(cc, sign0, max_sampled_keys=10000, save=True)
Bases: object
TripletSampler class.
Initialize a TripletSampler instance.
Methods
Choose from a list of candidates
Map triplets from full to reference indices
Sample triplets from multiple exemplary datasets of the CC.
Sample triplets from multiple exemplary datasets of the CC.
sample_triplets_from_dataset
Save triplets
- sample(datasets=None, num_triplets=1000000, p_pos=0.001, p_neg=0.1, min_pos=10, max_pos=100, max_neg=1000, max_rounds=3, **kwargs)
Sample triplets from multiple exemplary datasets of the CC.
- Parameters:
datasets (list) – Datasets to be used for the triplet sampling. If none are specified, all exemplary datasets are used (default=None).
num_triplets (int) – Number of triplets to sample (default=1000000).
p_pos (float) – P-value for positive cases (default=0.001).
p_neg (float) – P-value for negative cases. To provide ‘hard’ cases, a relatively low p-value is recommended (default=0.1).
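min_pos (int) – Minimum number of neighbors considered to be positives (default=10).
max_pos (int) – Maximum number of neighbors considered to be positives (default=100).
max_neg (int) – Maximum number of neighbors considered to be negatives (default=1000).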
max_rounds (int) – Triplets may be sampled redundantly. Maximum number of sampling rounds to attempt before stopping (default=3).
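The role of max_rounds can be illustrated with a small, self-contained sketch. This is not the Chemical Checker implementation: the function name sample_triplets_sketch, the rank-based n_pos and n_neg cutoffs (standing in for the p-value thresholds), and the toy neighbors dictionary are all assumptions made for illustration. It only shows the general idea of drawing (anchor, positive, negative) triplets, discarding redundant draws, and retrying for a bounded number of rounds.

```python
import random


def sample_triplets_sketch(neighbors, num_triplets, n_pos, n_neg,
                           max_rounds=3, seed=0):
    """Toy triplet sampler (hypothetical, not the library's code).

    neighbors[a] lists the other items ranked from most to least
    similar to anchor a. Positives come from the first n_pos ranks,
    negatives from ranks n_pos up to (not including) n_neg.
    """
    rng = random.Random(seed)
    anchors = list(neighbors)
    triplets = set()  # a set silently drops redundant draws
    for _ in range(max_rounds):
        missing = num_triplets - len(triplets)
        if missing <= 0:
            break  # enough unique triplets collected
        for _ in range(missing):
            anchor = rng.choice(anchors)
            ranked = neighbors[anchor]
            positive = rng.choice(ranked[:n_pos])
            negative = rng.choice(ranked[n_pos:n_neg])
            triplets.add((anchor, positive, negative))
    return sorted(triplets)


# Toy data: six items on a line, neighbors ranked by distance.
neighbors = {
    i: sorted((j for j in range(6) if j != i),
              key=lambda j: (abs(i - j), j))
    for i in range(6)
}
triplets = sample_triplets_sketch(neighbors, num_triplets=20,
                                  n_pos=2, n_neg=5)
```

Because each round only tops up the triplets still missing, the result can contain fewer than num_triplets entries when redundant draws persist through the last round. In this sketch, keeping n_neg small restricts negatives to items that are still fairly close to the anchor, which loosely mirrors the ‘hard’ cases mentioned for p_neg.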