chemicalchecker.database.molrepo.Molrepo

class Molrepo(**kwargs)[source]

Bases: Base

Molrepo table class.

This table offers a mapping between InChIKeys and different external compound ids (e.g. chembl, bindingdb).

Fields:

  • id(str): primary key, src_id + “_” + molrepo_name.

  • molrepo_name(str): the molrepo name.

  • src_id(str): the download id as in the source file.

  • smiles(str): simplified molecular-input line-entry system (SMILES).

  • inchikey(str): hashed version of the full InChI (SHA-256 algorithm).

  • inchi(str): International Chemical Identifier (InChI).

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
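The constructor description above matches the default constructor that SQLAlchemy declarative models receive. As a rough illustration of that behavior, and of the id format given in the Fields list, here is a minimal sketch in plain Python with no SQLAlchemy and no database. MolrepoSketch and make_id are hypothetical names introduced for this example and are not part of chemicalchecker; the aspirin values are illustrative only.

```python
class MolrepoSketch:
    """Hypothetical stand-in for a mapped class; not the real Molrepo."""

    # Class-level attributes play the role of the mapped columns.
    id = None
    molrepo_name = None
    src_id = None
    smiles = None
    inchikey = None
    inchi = None

    def __init__(self, **kwargs):
        cls = type(self)
        for key, value in kwargs.items():
            # Only names that exist as attributes of the class are accepted.
            if not hasattr(cls, key):
                raise TypeError(
                    f"{key!r} is an invalid keyword argument for {cls.__name__}"
                )
            setattr(self, key, value)


def make_id(src_id, molrepo_name):
    """Compose the primary key as src_id + "_" + molrepo_name."""
    return src_id + "_" + molrepo_name


# Illustrative values (aspirin); the exact src_id used by a real source
# file is an assumption.
row = MolrepoSketch(
    id=make_id("CHEMBL25", "chembl"),
    molrepo_name="chembl",
    src_id="CHEMBL25",
    smiles="CC(=O)Oc1ccccc1C(=O)O",
    inchikey="BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
)
```

Passing a keyword that is not a class attribute, such as MolrepoSketch(foo=1), raises TypeError in this sketch, which mirrors the "only keys that are present as attributes" rule.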

Methods

add

Add a new row to the table.

count

Get the number of Molrepo entries associated with the given source name.

from_csv

Add entries from CSV file.

from_molrepo_name

Fill Molrepo table from a molrepo name.

get

Get molrepos associated with the given name.

get_by_molrepo_name

Get Molrepo entries associated with the given name.

get_fields_by_molrepo_name

Get specified column fields.

get_universe_molrepos

Get Molrepo names that are considered universe.

molrepo_hpc

Run HPC jobs importing all molrepos.

to_csv

Write each molecule's InChIKey, source_id, InChI and SMILES to a CSV file.

Attributes

datasources

description

essential

metadata

molrepo_name

registry

universe

__repr__()[source]

String representation.

static add(kwargs)[source]

Add a new row to the table.

Parameters:

kwargs (dict) – The data in dictionary format.

static count(molrepo_name=None)[source]

Get the number of Molrepo entries associated with the given source name.

Parameters:

molrepo_name (str) – The source name from Datasource.molrepo_name

static from_csv(filename)[source]

Add entries from CSV file.

Parameters:

filename (str) – Path to a CSV file.

static from_molrepo_name(molrepo_name)[source]

Fill Molrepo table from a molrepo name.

Parameters:

molrepo_name (str) – A molrepo name.

static get(name=None)[source]

Get molrepos associated with the given name.

Parameters:

name (str) – The molrepo name, e.g. “chebi”.

static get_by_molrepo_name(molrepo_name, only_raw=False)[source]

Get Molrepo entries associated with the given name.

Parameters:
  • molrepo_name (str) – The molrepo_name to search for.

  • only_raw (bool) – Return only the raw values rather than whole objects (default: False)

static get_fields_by_molrepo_name(molrepo_name, fields=None)[source]

Get specified column fields.

Get the specified column fields for a molrepo_name in raw format (tuples).

Parameters:
  • molrepo_name (str) – The molrepo_name to search for.

  • fields (list) – List of field names. If None, all fields.
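To make "raw format (tuples)" concrete, the following is a minimal sketch of the same idea using the standard-library sqlite3 module and an in-memory table. It is not the chemicalchecker implementation: the function fields_by_molrepo_name, the table name molrepo, and the demo rows are assumptions made for illustration, and only the column names come from the Fields list.

```python
import sqlite3

# Column names taken from the Fields list; the table name is an assumption.
COLUMNS = ("id", "molrepo_name", "src_id", "smiles", "inchikey", "inchi")


def fields_by_molrepo_name(conn, molrepo_name, fields=None):
    """Return the requested columns as a list of tuples (all columns if None)."""
    selected = list(fields) if fields else list(COLUMNS)
    unknown = [name for name in selected if name not in COLUMNS]
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    # Column names are validated above; the molrepo name is bound as a parameter.
    sql = (
        f"SELECT {', '.join(selected)} FROM molrepo "
        "WHERE molrepo_name = ? ORDER BY id"
    )
    return conn.execute(sql, (molrepo_name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE molrepo (id TEXT PRIMARY KEY, molrepo_name TEXT, "
    "src_id TEXT, smiles TEXT, inchikey TEXT, inchi TEXT)"
)
conn.executemany(
    "INSERT INTO molrepo VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("A1_demo", "demo", "A1", "CCO",
         "LFQSCWFLJHTTHZ-UHFFFAOYSA-N", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),
        ("A2_demo", "demo", "A2", "O",
         "XLYOFNOQVPJJNP-UHFFFAOYSA-N", "InChI=1S/H2O/h1H2"),
        ("B1_other", "other", "B1", "C",
         "VNWKTOKETHGBQD-UHFFFAOYSA-N", "InChI=1S/CH4/h1H4"),
    ],
)

# Each result row is a plain tuple in the order of the requested fields.
pairs = fields_by_molrepo_name(conn, "demo", ["inchikey", "src_id"])
```

Requesting only the needed columns avoids building a full object per row, which is presumably the motivation for a raw accessor on large molrepos.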

static get_universe_molrepos()[source]

Get Molrepo names that are considered universe.

static molrepo_hpc(tmpdir, only_essential=False, **kwargs)[source]

Run HPC jobs importing all molrepos.

Parameters:
  • tmpdir (str) – Folder (usually in scratch) where the job directory is generated.

  • only_essential (bool) – Import only the essential molrepos (default: False)

static to_csv(staticmethod, filename)[source]

Write each molecule's InChIKey, source_id, InChI and SMILES to a CSV file.

Parameters:

filename (str) – Path to a CSV file.
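As an illustration of the column order named above, here is a minimal round-trip sketch with the standard-library csv module. The header names, the comma delimiter, and the presence of a header row are assumptions; the file layout that to_csv and from_csv actually use is not stated here. write_rows and read_rows are hypothetical helpers, not chemicalchecker functions.

```python
import csv
import io

# Column order follows the method description; header names and the comma
# delimiter are assumptions made for this sketch.
HEADER = ("inchikey", "src_id", "inchi", "smiles")


def write_rows(handle, rows):
    """Write a header line followed by one line per molecule."""
    writer = csv.writer(handle)
    writer.writerow(HEADER)
    writer.writerows(rows)


def read_rows(handle):
    """Read the rows back as tuples, skipping the header line."""
    reader = csv.reader(handle)
    next(reader)
    return [tuple(row) for row in reader]


rows = [
    ("LFQSCWFLJHTTHZ-UHFFFAOYSA-N", "A1",
     "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3", "CCO"),
    ("XLYOFNOQVPJJNP-UHFFFAOYSA-N", "A2", "InChI=1S/H2O/h1H2", "O"),
]

# An in-memory buffer stands in for the file at `filename`.
buffer = io.StringIO(newline="")
write_rows(buffer, rows)
buffer.seek(0)
restored = read_rows(buffer)
```

InChI strings can contain commas, so a CSV writer that quotes fields (as csv.writer does by default when needed) is safer than joining values by hand.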