chemicalchecker.util.splitter.ae_siam_traintest.AE_SiameseTraintest
- class AE_SiameseTraintest(hdf5_file, split, replace_nan=None)
Bases: object
AE_SiameseTraintest class.
Initialize an AE_SiameseTraintest instance.
We assume the file contains different splits, e.g. “x_train”, “y_train”, “x_test”, …
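The naming assumption above can be illustrated with a small, library-independent sketch. It assumes that every dataset key has the form "<prefix>_<split>" (as in "x_train") and derives the split names from the key suffixes. The helper names `split_names_from_keys` and `check_split` are hypothetical and are not part of chemicalchecker; the real class may discover splits differently.

```python
def split_names_from_keys(keys):
    """Collect distinct suffixes of keys shaped like '<prefix>_<split>'.

    Hypothetical helper: assumes the split name is everything after the
    first underscore, e.g. 'x_train' -> 'train'.
    """
    names = []
    for key in keys:
        _prefix, sep, suffix = key.partition("_")
        if sep and suffix and suffix not in names:
            names.append(suffix)
    return names


def check_split(split, keys):
    """Raise if the requested split is not among the splits found in keys."""
    available = split_names_from_keys(keys)
    if split not in available:
        raise ValueError("Split '%s' not found in %s!" % (split, available))
    return available


# Keys as they might appear in an HDF5 file following the convention above.
keys = ["x_train", "y_train", "x_test", "y_test"]
available = check_split("train", keys)
```

With the example keys, `available` holds the two split names in order of first appearance, and requesting a split such as "validation" would raise an error.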
Methods
Close the HDF5.
Return the generator function that we can query for batches.
Get random indices for different splits.
Return the name of the splits.
Get a batch of X.
Get a batch of X.
Return the shapes of X.
Get a batch of X and Y.
Return the shapes of X and Y.
Open the HDF5.
Create the HDF5 file with validation splits from an input file.
Attributes
available_splits = self.get_split_names()
if split not in available_splits:
    raise Exception("Split '%s' not found in %s!" % (split, str(available_splits)))
- static generator_fn(file_name, split, batch_size=None, only_x=False, sample_weights=False, shuffle=True, return_on_epoch=False)
Return the generator function that we can query for batches.
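As a rough illustration of the "generator function that we can query for batches" pattern, the sketch below builds such a function over in-memory lists instead of an HDF5 file. It is not the library's implementation: `make_batch_generator` and its `seed` argument are hypothetical, and the `sample_weights` and `return_on_epoch` options are left out because their exact behaviour is not described here.

```python
import random


def make_batch_generator(x, y=None, batch_size=None, only_x=False,
                         shuffle=True, seed=0):
    """Return a zero-argument function that creates an endless batch generator.

    Illustrative only: works on Python sequences, not on an HDF5 split.
    """
    n = len(x)
    # Assumption: batch_size=None means one batch covering the whole split.
    size = n if batch_size is None else batch_size
    rng = random.Random(seed)

    def generator():
        while True:  # loop over epochs forever
            order = list(range(n))
            if shuffle:
                rng.shuffle(order)  # new sample order for each epoch
            for start in range(0, n, size):
                idx = order[start:start + size]
                x_batch = [x[i] for i in idx]
                if only_x or y is None:
                    yield x_batch
                else:
                    yield x_batch, [y[i] for i in idx]

    return generator


# Usage: call the returned function, then query the generator for batches.
batches = make_batch_generator(list(range(10)), list(range(10, 20)),
                               batch_size=4, shuffle=False)()
first = next(batches)
```

Returning a function rather than a generator object lets the caller restart iteration at will, which is the usual reason for this design in training input pipelines.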
- static split_h5_blocks(in_file, out_file, split_names=['train', 'test', 'validation'], split_fractions=[0.8, 0.1, 0.1], block_size=1000, input_datasets=None)
Create the HDF5 file with validation splits from an input file.
- Parameters:
in_file (str) – path of the h5 file to read from.
out_file (str) – path of the h5 file to write.
split_names (list(str)) – names for the split of data.
split_fractions (list(float)) – fraction of data in each split.
block_size (int) – size of the block to be used.
input_datasets (list) – only split the given datasets and ignore others.
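How the parameters above interact can be sketched without any HDF5 I/O. The example assumes that rows are processed in blocks of `block_size`, that each block is shuffled, and that the shuffled block is cut at the cumulative `split_fractions`. This is one plausible reading of the parameters, not necessarily what `split_h5_blocks` does internally; `assign_blocks` and its `seed` argument are hypothetical.

```python
import random


def assign_blocks(n_rows, split_names, split_fractions, block_size=1000, seed=0):
    """Assign row indices to named splits, one block at a time (illustrative)."""
    if len(split_names) != len(split_fractions):
        raise ValueError("split_names and split_fractions must have equal length")
    if abs(sum(split_fractions) - 1.0) > 1e-9:
        raise ValueError("split_fractions must sum to 1")
    rng = random.Random(seed)
    assignment = {name: [] for name in split_names}
    for start in range(0, n_rows, block_size):
        stop = min(start + block_size, n_rows)  # the last block may be shorter
        indices = list(range(start, stop))
        rng.shuffle(indices)
        cut = 0
        cumulative = 0.0
        last = len(split_names) - 1
        for i, (name, fraction) in enumerate(zip(split_names, split_fractions)):
            cumulative += fraction
            # The last split takes the remainder so no row is lost to rounding.
            end = len(indices) if i == last else round(cumulative * len(indices))
            assignment[name].extend(indices[cut:end])
            cut = end
    return assignment


parts = assign_blocks(2500, ["train", "test", "validation"], [0.8, 0.1, 0.1])
sizes = {name: len(rows) for name, rows in parts.items()}
```

Working block by block keeps only `block_size` rows in play at a time, which matters when the input file is too large to load at once.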
- sw_name_right
available_splits = self.get_split_names()
if split not in available_splits:
    raise Exception("Split '%s' not found in %s!" % (split, str(available_splits)))