Using MSDchem to Search the PDB Ligand Dictionary
Abstract
The PDB ligand dictionary is the chemical reference database of all the small building-block molecules (e.g., amino acids, nucleic acids, and bound ligands) in the Protein Data Bank (PDB), each referenced by a distinct three-letter code identifier. Since PDB files contain only three-dimensional coordinate data, the dictionary serves as the reference resource for the actual chemical properties of small molecules, shared consistently across all PDB entries. The ligand dictionary is maintained at all sites of the Worldwide Protein Data Bank (wwPDB): the Research Collaboratory for Structural Bioinformatics (RCSB) in the U.S., the Macromolecular Structure Database (MSD) in Europe, and the Protein Data Bank in Japan (PDBj). It is exchanged among these sites on a regular basis. The MSD group at the European Bioinformatics Institute (EBI) extends the dictionary into the MSDchem ligand database, which utilizes chemoinformatics packages and incorporates additional curation work. MSDchem is publicly available on the Web through the MSDchem search system, whose functionality is described in this unit.
Keywords: ligands; organic chemicals; chemical structure; chemical properties; protein structure databases; macromolecular complexes; amino acids; nucleic acids
Table of Contents
- Basic Protocol 1: Searching for Ligands Using the Three‐Letter PDB Code or Molecular Name
- Basic Protocol 2: Searching for Ligands Using a Formula or Fragment Expression
- Basic Protocol 3: Performing a Chemical Subgraph Search
- Basic Protocol 4: Exporting the Ligand Dictionary
- Commentary
- Literature Cited
- Figures
Figures
- Figure 14.3.1 The MSDchem search home page. The figure illustrates how to find the ligand with a three‐letter code of ATP.
- Figure 14.3.2 The MSDchem result page (top), listing the ligand with the three‐letter code of ATP that matches the search criteria, and the ligand details page (bottom) with information about the ligand properties. Links to ligand content and related data, visualization and export functionality, and the PDB nomenclature chemical diagram are provided.
- Figure 14.3.3 MSDchem ligand data at the atomic level that can be accessed from a ligand details page.
- Figure 14.3.4 Three‐dimensional visualizations of a ligand using the Jmol applet for idealized versus representative coordinates from MSDchem.
- Figure 14.3.5 Exporting ligand data with representative heavy‐atom and idealized hydrogen coordinates in the SDF/MDL chemical file format using MSDchem.
- Figure 14.3.6 The formula expression editor screen used in an example to obtain MSDchem ligands with one to four oxygen atoms, at least three nitrogen atoms, no fluorine, and no sulfur.
- Figure 14.3.7 The fragment expression editor screen used in an example to obtain MSDchem ligands with two or more benzimidazole groups and without any piperazine groups.
- Figure 14.3.8 MSDchem ligands that satisfy particular formula range and fragment expression constraints.
- Figure 14.3.9 List of PDB entries referring to MSD atlas pages that include ligands that satisfy particular formula range and fragment expression constraints.
- Figure 14.3.10 Screen used for loading the molecular structure diagram of DM1 in the JME editor and modifying it by removing its noncharacteristic atoms or groups, in order to prepare subgraph search criteria that will match molecules with the same main structure as DM1.
- Figure 14.3.11 Eight of the ten daunomycin‐like ligands that contain the reduced chemical graph of DM1 as a subgraph, retrieved using the MSDchem “has substructure” search functionality.
- Figure 14.3.12 The 40 PDB entries that include the 10 daunomycin‐like ligands and access to their binding site details from MSDsite.
- Figure 14.3.13 Three more hits for DM1 daunomycin‐like ligands revealed using the MSDchem fingerprint similarity searching.
- Figure 14.3.14 The MSDchem ligand index page for the letter A with a list of the 525 ligands that have a three‐letter code starting with the character A. There are links to access and download data for each one of them as well as for the whole ligand collection.
- Figure 14.3.15 The CML file for ligand DM1 as exported through the ligand index page for the letter D.
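The formula range constraint illustrated in Figure 14.3.6 (one to four oxygen atoms, at least three nitrogen atoms, no fluorine, and no sulfur) can be sketched in a few lines of Python. This is only an illustration of the idea, not the MSDchem implementation: the function names `parse_formula` and `matches_ranges`, the `FIGURE_CONSTRAINTS` table, and the space-separated formula strings are assumptions made for this example.

```python
import re

def parse_formula(formula):
    """Count atoms per element in a formula string such as 'C10 H16 N5 O13 P3'.

    Hypothetical helper: assumes element symbols followed by optional counts.
    """
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def matches_ranges(formula, ranges):
    """Return True if every element count lies within its (low, high) range.

    A high value of None means 'no upper limit'; elements absent from the
    formula count as zero.
    """
    counts = parse_formula(formula)
    for element, (low, high) in ranges.items():
        n = counts.get(element, 0)
        if n < low or (high is not None and n > high):
            return False
    return True

# The constraints from the Figure 14.3.6 example, expressed as ranges.
FIGURE_CONSTRAINTS = {
    "O": (1, 4),     # one to four oxygen atoms
    "N": (3, None),  # at least three nitrogen atoms
    "F": (0, 0),     # no fluorine
    "S": (0, 0),     # no sulfur
}

if __name__ == "__main__":
    # Illustrative formulas only; they are not taken from MSDchem output.
    for name, formula in [("guanine", "C5 H5 N5 O"), ("ATP", "C10 H16 N5 O13 P3")]:
        print(name, matches_ranges(formula, FIGURE_CONSTRAINTS))
```

Expressing "no fluorine" as the range (0, 0) lets exclusions and count ranges share one code path, which mirrors how the caption presents them as a single formula expression.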
Literature Cited
Berman, H., Nakamura, H., and Henrick, K. 2005. The Protein Data Bank (PDB) and the WorldWide PDB http://www.wwpdb.org. In Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics. Section 4.6. (M. Dunn, L. Jorde, P. Little, and S. Subramaniam, eds.) http://www.mrw.interscience.wiley.com/ggpb/articles/g406303/frame.html. John Wiley & Sons, Hoboken, N.J.
Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.F. Jr., Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., and Tasumi, M. 1977. Protein Data Bank: A computer‐based archival file for macromolecular structures. J. Mol. Biol. 112:535‐542.
Boutselakis, H., Dimitropoulos, D., Henrick, K., Ionides, J., John, M., Keller, P.A., McNeil, P., Pineda, J., and Suarez‐Uruena, A. 2004. The European Bioinformatics Institute macromolecular structure relational database technology. In Database Annotation in Molecular Biology. pp. 223‐240. John Wiley & Sons, Hoboken, N.J.
Gasteiger, J., Rudolph, C., and Sadowski, J. 1990. Automatic generation of 3D‐atomic coordinates for organic molecules. Tetrahedron Comp. Method. 3:537‐547.
Golovin, A., Oldfield, T.J., Tate, J.G., Velankar, S., Barton, G.J., Boutselakis, H., Dimitropoulos, D., Fillon, J., Hussain, A., Ionides, J.M.C., John, M., Keller, P.A., Krissinel, E., McNeil, P., Naim, A., Newman, R., Pajon, A., Pineda, J., Rachedi, A., Copeland, J., Sitnov, A., Sobhany, S., Suarez‐Uruena, A., Swaminathan, J., Tagari, M., Tromm, S., Vranken, W., and Henrick, K. 2004. E‐MSD: An integrated data resource for bioinformatics. Nucl. Acids Res. 32:D211‐D216.
Golovin, A., Dimitropoulos, D., Oldfield, T., Rachedi, A., and Henrick, K. 2005. MSDsite: A database search and retrieval system for the analysis and viewing of bound ligands and active sites. Proteins 58:190‐199.
Ihlenfeldt, W.D., Takahasi, Y., Abe, H., and Sasaki, S. 1992. CACTVS: A chemistry algorithm development environment. In Daijuukagakutouronkai Dainijuukai Kouzoukasseisoukan Shinpojiumu Kouenyoushishuu (K. Machida and T. Nishioka, eds.) pp. 102‐105. Kyoto University Press, Kyoto, Japan.
Krissinel, E.B., Winn, M.D., Ballard, C.C., Ashton, A.W., Patel, P., Potterton, E.A., McNicholas, S.J., Cowtan, K.D., and Emsley, P. 2004. The new CCP4 Coordinate Library as a toolkit for the design of coordinate‐related applications in protein crystallography. Acta Crystallogr. D Biol. Crystallogr. 60:2250‐2255.
Weininger, D. 1988. SMILES 1. Introduction and encoding rules. J. Chem. Inf. Comput. Sci. 28:31.
Westbrook, J.D., Henrick, K., Ulrich, E., and Berman, H.M. 2005. Classification and use of macromolecular data. Appendix 3.6.2. The Protein Databank exchange dictionary. In International Tables for Crystallography, Vol. G: Definition and Exchange of Crystallographic Data (S. Hall and B. McMahon, eds.) pp. 195‐197. Springer, Dordrecht, The Netherlands.
Key References
Berman et al., 2005. See above.
A description of the wwPDB consortium, its organization, and goals.
Dutta, S., Burkhardt, K., Bluhm, W.F., and Helen, B. 2006. Using the tools and resources of the RCSB Protein Data Bank. In Current Protocols in Bioinformatics (A.D. Baxevanis, R.D.M. Page, G.A. Petsko, L.D. Stein, and G.D. Stormo, eds.) pp. 1.9.1‐1.9.40. John Wiley & Sons, Hoboken, N.J.
Explains various concepts about the PDB, the wwPDB, and tools that are provided by the RCSB partner, as well as the corresponding Ligand Depot service databases and suite of Web tools.
Golovin et al., 2004. See above.
A consistent overview of the activities and policies of the MSD group at EBI and of the concepts of the MSD.
Westbrook et al., 2005. See above.
A description of the process of the wwPDB exchange, which is the basis of the MSDchem database.
Internet Resources
http://www.ebi.ac.uk/msd-srv/msdchem
The MSDchem search home page.
http://www.ebi.ac.uk/msd/index.html
Contains information about the MSD group and the MSD suite of tools and services.
http://www.ebi.ac.uk/msd-srv/msdlite
The MSDlite search system provides overview atlas pages for PDB entries, using the MSD database.
http://www.ebi.ac.uk/msd-srv/msdsite
The MSDsite Web service that provides details about ligand occurrences and binding sites of small molecules in PDB entries.
http://www.ebi.ac.uk/msd-srv/docs/dbdoc
Contains information about the MSDSD public search relational database and how to download and use it.
http://www.ebi.ac.uk/msd-srv/docs/moldoc/help.html
The molecule subgraph containment package used by the MSDchem search system.
http://deposit.pdb.org/public-component-erf.cif
The Chemical Component Information dictionary that is exchanged in wwPDB.
http://www2.chemie.uni-erlangen.de/software/cactvs
The CACTVS chemistry algorithm development environment, the main software package used by the MSDchem database and Web service.
http://www2.chemie.uni-erlangen.de/software/corina
The CORINA Web service for fast and efficient generation of high‐quality 3‐D molecular models, used to generate idealized coordinates for ligands.
http://www.molinspiration.com/jme
The home page of the JME Molecular Editor Java applet used by the MSDchem Web service.
http://jmol.sourceforge.net
The home page of Jmol, the free, open‐source 3‐D molecule viewer used by the MSDchem Web service.
http://www.mdli.com
Information about the definition of the popular MDL CTfile formats.
http://www.acdlabs.com
The ACD‐labs chemical software package used at the time of curation of new ligands.
http://users.unimi.it/~ddl/vega/index_noanim.htm
The VEGA molecular modeling software package used in the back‐end of the MSDchem database.