Molecular Property Diagnostic Suite (MPDS):
A Component of Open Source Chemoinformatics Initiative
The Council of Scientific and Industrial Research (CSIR), Govt. of India, has undertaken the Open Source initiative to establish a novel open source platform for both computational and experimental technologies to carry out drug discovery for infectious/neglected diseases. An important challenge is to ensure that the resulting drug molecules remain affordable to the people of the developing world. The Molecular Property Diagnostic Suite (MPDS) is a component of the Open Source Chemoinformatics Initiative, conceptualized to assess and estimate the multifarious aspects of the drug-likeness of any given molecule, in order to diagnose its potential application as a drug. The MPDS compound library has been updated up to 31 December 2016.
| Category | Modules | Description |
|---|---|---|
| Data Library | Module 1: Literature | Contains Mtb proteins and its genetic information; FDA approved drug information and polypharmacological information. |
| Data Library | Module 2: Target Library | Contains crystal structures and homology models for Mtb proteins. |
| Data Library | Module 3: Compound Library | Contains a single window interface for searching a compound in MPDS compound database. |
| Data Processing | Module 4: File Format Conversion | Conversions of files from one chemical format to another chemical format, 2D to 3D file conversion using Open Babel. |
| Data Processing | Module 5: Descriptor Calculation | Calculation of descriptors and fingerprints using PaDEL and CDK tools. |
| Data Analysis | Module 6: QSAR | Generation of QSAR models using the data mining tools, McQSAR and SVMlight. |
| Data Analysis | Module 7: Docking | Ligand Optimization; Conformer Generation and Protein-Ligand docking. |
| Data Analysis | Module 8: Screening | Prioritization of compounds for drug-like features using DruLiTo tool; Biopharmaceutical Classification System (BCS); Identification of toxicophoric groups in a compound. |
| Data Analysis | Module 9: Visualization | Visualizing protein-ligand interactions using Jmol and Ligplot. |
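As an illustration of the drug-likeness prioritization listed under Module 8, the sketch below counts violations of Lipinski's rule of five from precomputed descriptor values. It is a minimal example and not DruLiTo's implementation; the `Descriptors` container, the function names, the one-violation tolerance and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept a compound if it exceeds at most `max_violations` limits.

    The default tolerance of one violation is an assumption of this sketch.
    """
    return lipinski_violations(d) <= max_violations


if __name__ == "__main__":
    # Illustrative values only; these are not real compounds.
    library = [
        Descriptors("compound_A", 320.4, 2.1, 2, 5),
        Descriptors("compound_B", 612.7, 6.3, 1, 8),
    ]
    for d in library:
        print(d.name, lipinski_violations(d), is_drug_like(d))
```

In practice the descriptor values would come from a descriptor calculation step such as Module 5, and a screening tool may combine several such rule sets rather than a single one.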
Work Plan
- Database Creation: Storage and linking of molecules and their fragments in in-house databases, in convertible file formats. Upgrading of existing databases (such as A2IDB, PLID and CAD of the CMM-IICT group). Development of an interface using the GALAXY platform to link the databases and software to the common OSDD chemoinformatics portal.
(Groups Involved: IICT, IMTECH, JNU)
- Descriptor calculations: Development of new in-house descriptors for drug metabolism, toxicity, drug-likeness, reactivity, etc. Writing code to interface the available public domain packages with the in-house descriptors. Calculation of descriptors for the in-house database of molecules and fragments. Development of empirical models and QSA(P/T/E)R relationships for metabolism, toxicity and solvation.
(Groups Involved: IICT, JALEEL, CLRI, NCL)
- Chemoinformatics: Generation of fingerprints to represent the structure, interactions and properties of a molecule. Use of pattern recognition, machine learning tools and cluster analysis to compare a molecule/fragment with those in public domain databases such as PubChem (> 37 million entries), ZINC (> 13 million compounds) and FDA approved drugs (4000).
(Groups Involved: IICT, IMTECH, NCL)
- Filters: Development of target-specific and general filters based on molecular properties to estimate the druggability of the candidate molecule. Use of these filters to design a quick and precise virtual screening protocol.
(Groups Involved: IICT, NCL, BBAU, NIPER)
- Predictive Computing: Use of compute-intensive techniques such as QM, QM/MM and MD simulations to address problems of fundamental interest in Chemistry and Biology.
(Groups Involved: IICT, NIPER, BBAU, CLRI)
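The fingerprint-based similarity analysis described under Chemoinformatics can be sketched with the Tanimoto coefficient, a widely used similarity measure for binary fingerprints. This is a minimal illustration, not the project's code: fingerprints are assumed to be stored as sets of "on" bit indices, and the function names and the toy library are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(fp_a: FrozenSet[int], fp_b: FrozenSet[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    shared = len(fp_a & fp_b)
    total = len(fp_a) + len(fp_b) - shared
    # Two empty fingerprints are scored 0.0 here; this is a convention
    # chosen for the sketch, not a universal rule.
    return shared / total if total else 0.0


def most_similar(
    query: FrozenSet[int],
    library: Dict[str, FrozenSet[int]],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Rank library entries by decreasing Tanimoto similarity to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Toy fingerprints; real ones would come from a fingerprint generator.
    library = {
        "mol_1": frozenset({1, 2, 3}),
        "mol_2": frozenset({2, 3, 4}),
        "mol_3": frozenset({9}),
    }
    print(most_similar(frozenset({1, 2, 3}), library, top_n=2))
```

For databases with millions of entries, a linear scan like this would normally be replaced by packed bit vectors and an indexed or pre-filtered search, but the scoring itself stays the same.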
Deliverables
- Galaxy platform: Developing an interface through 'GALAXY' to link the databases, software and tools onto the common 'OSDD chemoinformatics portal'.
- Descriptor calculations: Tools will be developed to calculate a large variety of molecular descriptors, such as constitutional, topological, thermodynamic, electrostatic, shape-based, QM-based and DFT-based descriptors. Apart from these conventional ones, descriptors that quantify protein-ligand interactions, such as the free energy of binding calculated with MMPBSA/GBSA methods, will also be calculated. Tools will be developed for exploring probable modifications to a molecule (e.g., adding or removing a certain fragment at a certain site) that give rise to better activity, better specificity and reduced toxicity. QSAR and QSTR models will be developed based on the descriptors calculated above.
- Structure representation and storage: Representation of molecular structures in SMILES, 2D and 3D formats, alongside the chemical fingerprints present in the structures. Subsequent storage of these representations in a database along with their known experimental properties, biological activities and spectral properties.
- Similarity/Diversity based approaches: Development of code and tools to assess the similarity and diversity among molecules. Subsequent categorization of the molecules, in quantitative terms, into appropriate clusters according to their properties. Here, one of the principal objectives is data reduction without losing diversity. We plan to employ molecular fingerprints, which allow similarity and diversity to be quantified. On this basis, large-scale analysis of data is possible.
- Fragment based approaches: Generation of fragment libraries specific to the active sites of various potential drug targets such as kinases, GPCRs, etc. Development of tools for exploring probable modifications to molecules that are already in clinical trials, for example adding or removing a certain fragment at a certain site in the molecule, to ensure better activity, better specificity and reduced toxicity.
- Target specific filters: Identification of target-specific filters for screening large libraries of small molecules, drug-like molecules and natural products, in order to identify potential lead compounds for particular target proteins using docking, MD simulation and QM/MM based parameters.
- Drug metabolism and toxicity: Bond dissociation energy can be used as a descriptor to quantify the metabolism of a given molecule by the Cytochrome P450s. Cytochrome P450s are the enzymes that play a major role in the metabolism of drug molecules through various chemical reactions such as epoxidation, sulphoxidation, etc. All the breakable bonds and reaction centers in a molecule can be identified, and calculating the corresponding bond dissociation energies gives an idea of the possible mechanisms of metabolism of the molecule and of the possible intermediates formed in each mechanism. This knowledge may give insight into the toxicity of a particular molecule.
- Predictive computing: Application of QM/MM and Free Energy Perturbation (FEP) methods to
predict the biological reactions mediated by various enzymes and
metallo-enzymes. Prediction of structures of various drug targets using MD
simulations, Monte Carlo sampling etc.
- In silico screening: Screening of all molecules made by OSDD, CSIR or any other team
targeted towards M. tuberculosis.
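The goal of "data reduction without losing diversity" stated under the similarity/diversity deliverable can be illustrated with MaxMin selection, a common greedy method for picking a diverse subset. This is a sketch under stated assumptions rather than the project's method: fingerprints are assumed to be sets of "on" bit indices, distance is taken as one minus the Tanimoto coefficient, and the function names and toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List


def tanimoto_distance(fp_a: FrozenSet[int], fp_b: FrozenSet[int]) -> float:
    """Distance defined as 1 - Tanimoto similarity (0 = identical on-bits)."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical here
    return 1.0 - len(fp_a & fp_b) / union


def maxmin_pick(fps: Dict[str, FrozenSet[int]], k: int, seed: str) -> List[str]:
    """Greedily pick up to k molecules, each as far as possible from those picked.

    Starts from `seed`, then repeatedly adds the candidate whose distance to
    its nearest already-picked molecule is largest.
    """
    picked = [seed]
    # Distance from every remaining candidate to its nearest picked molecule.
    nearest = {
        name: tanimoto_distance(fp, fps[seed])
        for name, fp in fps.items()
        if name != seed
    }
    while nearest and len(picked) < k:
        best = max(nearest, key=nearest.get)
        picked.append(best)
        del nearest[best]
        for name in nearest:
            d = tanimoto_distance(fps[name], fps[best])
            nearest[name] = min(nearest[name], d)
    return picked


if __name__ == "__main__":
    # Toy fingerprints for illustration only.
    fps = {
        "m1": frozenset({1, 2, 3}),
        "m2": frozenset({1, 2, 4}),
        "m3": frozenset({7, 8, 9}),
        "m4": frozenset({3, 7, 8}),
    }
    print(maxmin_pick(fps, k=2, seed="m1"))
```

The choice of seed and the handling of ties affect which molecules are selected, so a production workflow would typically fix both explicitly or compare several runs.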