Revista de Salud Animal

Print version ISSN 0253-570X

Rev Salud Anim. v.30 n.3 La Habana Sep.-Dec. 2008

 

Original article

 

 

OPTIMIZATION OF DOCKING SCORING FUNCTIONS FOR ANTIVIRAL DRUG DESIGN AGAINST THE GENUS CORONAVIRUS

 

OPTIMIZACIÓN DE LAS FUNCIONES DE PUNTUACIÓN DE DOCKING PARA EL DISEÑO DE FÁRMACOS ANTIVIRALES CONTRA EL GÉNERO CORONAVIRUS

 

 

L.J. Pérez and Heidy Díaz de Arce

Centro Nacional de Sanidad Agropecuaria (CENSA), Apartado 10, San José de las Lajas, La Habana, Cuba. E-mail: lesterjosue@censa.edu.cu

 

 


ABSTRACT

The development of a wide-spectrum antiviral drug against pathogenic coronaviruses is an attractive prospect and may provide an effective first line of defense against emerging CoV-related diseases. The most attractive target for the design of anticoronaviral inhibitors is the main protease, or 3CLpro. Ligand docking and screening algorithms are now frequently used in the drug-design process. The aim of this work, based on a validation study of scoring functions, was to find a consensus function or to select the scoring function that best predicts the orientation of the ligand in the binding-site pocket, so that virtual screening of ligand databases can be performed at a lower computational cost in the search for a lead compound for an antiviral drug against coronaviruses. For this purpose, the scoring functions implemented within the DOCK 4.0 program suite and the three-dimensional structure of the main protease 2ALV were used. This protein structure was selected because it was crystallized with a non-peptidic ligand. The coronavirus main protease pocket, including the catalytic dyad Cys145-His41, was obtained. The mean RMSD values closest to 4.0 Å were obtained using the contact score function with the rigid ligand docking method. This score function was selected as the best ranking tool for virtual screening searches for potential inhibitors of the important group of pathogens belonging to the genus Coronavirus.

Key words: docking algorithms; scoring functions; rigid docking; flexible docking; main protease; coronavirus


RESUMEN

El desarrollo de fármacos antivirales de amplio espectro que se puedan utilizar contra diferentes patógenos emergentes del género coronavirus como primera línea de defensa es hoy una perspectiva atractiva. La proteasa principal de los coronavirus 3CLpro es la diana de mayor atractivo para el diseño de inhibidores anticoronavirus. Los algoritmos de screening y docking son actualmente muy utilizados en los procesos de diseño racional de fármacos. El objetivo de este trabajo es, basados en un estudio de validación de las funciones de score, encontrar una función consenso o seleccionar la función de score que mejor prediga la orientación del ligando en el bolsillo catalítico de la 3CLpro, con el fin de mejorar las búsquedas por screening virtual en las bases de datos de ligandos, compuestos que puedan servir de líderes como candidatos a usarse como fármacos inhibidores anticoronavirus, con un menor costo computacional. Con este propósito se utilizaron las funciones de score implementadas en el paquete de programas del DOCK 4.0 y la estructura cristalizada de la 3CLpro 2ALV, esta última se seleccionó por estar cristalizada con un ligando de naturaleza no peptídica. Se diseñó el bolsillo catalítico de la proteasa principal de coronavirus que incluyó los residuos que conforman la díada catalítica Cys145 His41. La función con la cual se obtuvieron los mejores valores de RMSD, más cercanos a 4.0 Å, fue la función de contacto, siguiendo un algoritmo de orientación de ligando rígido; esta función se seleccionó como la mejor herramienta en la puntuación de búsquedas basadas en screening virtual de candidatos potenciales como inhibidores del importante grupo de patógenos del género coronavirus.

Palabras clave: algoritmos de docking; funciones de puntuación; docking rígido; docking flexible; proteasa principal; coronavirus


 

 

INTRODUCTION

Coronaviruses (CoVs), a genus containing about 26 species known to date, cause highly prevalent diseases in humans and in animals of veterinary importance (1,2). The members of this genus are subdivided into three groups based on genetic and serological markers and may nowadays be considered "emerging pathogens" (1,3). Coronaviruses are enveloped, plus-strand RNA viruses with the largest RNA genome known (on the order of 30 kb); the 5' two-thirds of the genome encode a polyprotein that contains all the proteins necessary for RNA replication, and the 3' one-third encodes the structural proteins (1). In common with other RNA viruses that employ an RNA-dependent RNA polymerase for genome replication, coronaviruses undergo high rates of mutation and recombination (4,5). Despite efforts to develop effective vaccines against relevant members of the genus, such as transmissible gastroenteritis virus (TGEV), infectious bronchitis virus (IBV) and severe acute respiratory syndrome coronavirus (SARS-CoV), live attenuated, inactivated and subunit vaccines have all failed, because CoV disease occurs at mucosal surfaces, where protection requires the stimulation of local immunity. Furthermore, because of the high rates of mutation and recombination of CoVs, vaccination often provides no cross-protection (6,7,8). In view of these issues, the development of a wide-spectrum antiviral drug against pathogenic coronaviruses is a more reasonable and attractive prospect and may provide an effective first line of defense against emerging CoV-related diseases (2). The most attractive target for the design of anticoronaviral inhibitors is the main protease, or 3CLpro, because of its closely conserved three-dimensional structure among the members of the genus, its highly conserved substrate-binding site, and its pivotal role in viral gene expression, replication and proteolytic processing (2). Ligand docking and screening algorithms are now frequently used in the drug-design process.
The purpose of docking algorithms is now expanding beyond the original goal of fitting a given ligand into a specific protein structure. Newer applications include database screening, lead generation and de novo drug design. A docking procedure consists of three interrelated components: identification of the binding site, a search algorithm to effectively sample the search space (the set of possible ligand positions and conformations on the protein surface) and a scoring function. The number of solved structures of ligand-protein complexes now available for some targets allows docking algorithms to be tested and validated by comparing the complexes they predict with the ligand-protein complexes extracted from databases (9). The aim of this work, based on a validation study of scoring functions, was to find a consensus function or to select the scoring function that best predicts the orientation of the ligand in the binding-site pocket, so that virtual screening of ligand databases can be performed at a lower computational cost in the search for a lead compound for an antiviral drug against coronaviruses. For this purpose, the scoring functions implemented within the DOCK 4.0 program suite and the three-dimensional structure of the main protease 2ALV were used. This protein structure was selected because it was crystallized with a non-peptidic ligand.

 

MATERIALS AND METHODS

Pocket generation

The binding-site pocket of the coronavirus main protease was generated from the 2ALV structure available in the Protein Data Bank (PDB), solved at 1.8 Å resolution in complex with the CY6 ligand (10,11). The heteroatoms present in the crystal structure are not part of the native protein structure; they occur only in the crystal, where they contribute to its stability. In addition, the B monomer has the same composition and conformation as the A monomer, and 3CLpro is enzymatically active both as a monomer and as a dimer, a feature of biological importance; since the analysis of the possible interactions between the A monomer and the ligand also covers the B monomer, the heteroatoms and the B monomer were removed using the Chimera Version 1.6 program. The ligand and the protein were saved in separate files. The maximum interaction distance between ligand and protein is 5.0 Å, as given by van der Waals forces (2.5 Å each for ligand and pocket). In order to take into account the atoms that interact directly and indirectly with the ligand, the pocket was obtained with GET_NEAR_RES Version 2.2.2, including the residues within 6.0 Å of the ligand.
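The distance-based residue selection described above can be illustrated with a minimal Python sketch. It is not the GET_NEAR_RES program; the function name residues_near_ligand, the residue label ALA999 and all coordinates are hypothetical and are not taken from 2ALV. The only value taken from the text is the 6.0 Å cutoff.

```python
import math

def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=6.0):
    """Return the sorted labels of residues that have at least one atom
    within `cutoff` angstroms of any ligand atom."""
    near = set()
    for residue, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            near.add(residue)
    return sorted(near)

# Toy coordinates (hypothetical, for illustration only).
protein_atoms = [
    ("HIS41", (1.0, 0.0, 0.0)),
    ("CYS145", (0.0, 4.0, 0.0)),
    ("GLU166", (0.0, 0.0, 5.9)),
    ("ALA999", (10.0, 10.0, 10.0)),  # far from the ligand, must be excluded
]
ligand_atoms = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]

pocket = residues_near_ligand(protein_atoms, ligand_atoms)
print(pocket)
```

A residue is kept as soon as one of its atoms falls inside the cutoff, which is one common convention; selecting by residue centroid would be an alternative and would give a smaller pocket.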

Grid evaluation

The grid point calculation was carried out with the Grid program included in the DOCK package (4.0.1). The parameters were adjusted following the instructions of the DOCK 4.0 manual (12). The critical parameters are summarized in Table 1. The box distance parameter was set at 5.0 Å and the grid spacing parameter at 0.2 Å (shown in bold type). All other parameters were left at their default values.
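To give a sense of the computational cost implied by these two values, the following Python sketch estimates the number of grid points for a box built around a ligand. It assumes that the box distance acts as a margin added on every side of the ligand's bounding box; the function name grid_box and the coordinates are hypothetical, and this is not the Grid program itself.

```python
import math

def grid_box(ligand_coords, margin=5.0, spacing=0.2):
    """Axis-aligned box around the ligand, enlarged by `margin` angstroms
    on every side, with the number of grid points along each axis."""
    lows = [min(c[axis] for c in ligand_coords) - margin for axis in range(3)]
    highs = [max(c[axis] for c in ligand_coords) + margin for axis in range(3)]
    # Points per axis, counting both ends of each box edge.
    counts = [round((hi - lo) / spacing) + 1 for lo, hi in zip(lows, highs)]
    return lows, highs, counts

# Toy ligand extent (hypothetical coordinates).
ligand_coords = [(0.0, 0.0, 0.0), (4.0, 2.0, 1.0)]
lows, highs, counts = grid_box(ligand_coords)
total_points = math.prod(counts)
print(counts, total_points)
```

Because the point count grows with the cube of the inverse spacing, halving the spacing multiplies the grid size by roughly eight, which is why the spacing is one of the critical parameters.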

Docking process

The orientation of the ligand on the binding-site surface of the coronavirus main protease was carried out using the DOCK 4.0 program suite. The three scoring functions included in the program were evaluated, adjusting the parameters for the rigid and flexible ligand methods. Both methods were run following the instructions of the DOCK 4.0 manual; briefly, the critical parameters were obtained through pre-screening assays, and the values giving the best orientation results were selected (12). For the rigid ligand method, 100 orientations selected from 500 generated orientations were ranked; these orientations were generated by the automated search algorithms used to refine the affinity estimates of predicted ligand-protein complexes, with a bump filter maximum of 2 and a contact clash penalty of 50. The flexible ligand method was evaluated with the same critical parameters as the rigid ligand method, plus the additional parameters required for this type of orientation, which are listed in Table 2; all other parameters were left at their default values.

Evaluation of the relationship between docking score values and RMSD values

The values from the docking process using the three different scoring functions of DOCK 4.0 were tabulated and analyzed using the Statgraphics Version 5.0 program.

Statistical analysis

The statistical analysis of all RMSD values from the different scoring functions and both ligand orientation methods was performed with the SAS version 7.0 program, using the Kruskal-Wallis test with a significance level of p<0.05. The values obtained were analyzed individually and grouped by scoring function within each method and between the two methods.
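As an illustration of the test named above, the following Python sketch computes the Kruskal-Wallis H statistic with average ranks and a tie correction. It is not the SAS procedure used in the study. The RMSD values and group labels are hypothetical toy numbers, and the p-value shortcut p = exp(-H/2) is the chi-square approximation with 2 degrees of freedom, so it applies only when exactly three groups are compared.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction

# Hypothetical RMSD values (angstroms) for three scoring functions.
rmsd_by_function = {
    "function_a": [3.8, 4.1, 4.3],
    "function_b": [5.0, 5.6, 6.1],
    "function_c": [7.2, 7.9, 8.4],
}
h_stat = kruskal_wallis_h(list(rmsd_by_function.values()))
p_value = math.exp(-h_stat / 2)  # chi-square survival function, valid for df = 2 only
print(h_stat, p_value)
```

With groups this small the chi-square approximation is rough, so a statistical package would normally be preferred for the final analysis.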

 

RESULTS AND DISCUSSION

Pocket binding site of coronavirus main protease

The coronavirus main protease pocket was defined using the GET_NEAR_RES Version 2.2.2 program. The pocket includes the catalytic dyad Cys145-His41 and the complementary catalytic residue Asp187, which takes the place of the water molecule within the pocket and forms a catalytic triad (13). Other residues that help stabilize the three-dimensional conformation of the pocket, as well as loops associated with the flexibility of the substrate-binding pocket, such as Phe140, His163, Glu166 and His172, were also included, each lying less than 6.0 Å from the substrate. All the residues obtained are shown in Table 3 and their 3D conformation is shown in Figure 1; these residues define the 3CLpro pocket (14).

Grid evaluation

Potential energy grids are used by various docking programs. The basic idea is to store information about the receptor's energetic contributions on grid points so that it only needs to be read during ligand scoring. In the most basic form, grid points store two types of potentials: electrostatic and van der Waals interactions (Box) (15). Figure 2 shows a representative grid for electrostatic potentials and steric interactions. Grid values are shown in Table 4. Grid-based tools that allow the identification of cavities on proteins are available. Among all the accessible cavities on a protein, one is the active site. Several functions have been proposed for molecular mechanics approaches such as DOCK.
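The idea of storing the receptor contribution once and only reading it during scoring can be sketched in Python as follows. This is a deliberately simplified toy, assuming a single electrostatic-like term (sum of q/r, with no physical constants and no van der Waals term) and nearest-grid-point lookup; it does not reproduce the DOCK energy function, and the names build_grid and score_ligand are hypothetical.

```python
import math

def build_grid(receptor, origin, counts, spacing, min_r=0.5):
    """Precompute, at every grid point, the sum of q / r over receptor atoms.
    `min_r` caps the potential when a grid point sits on top of an atom."""
    grid = {}
    for i in range(counts[0]):
        for j in range(counts[1]):
            for k in range(counts[2]):
                point = tuple(o + n * spacing for o, n in zip(origin, (i, j, k)))
                grid[(i, j, k)] = sum(
                    q / max(math.dist(point, xyz), min_r) for xyz, q in receptor
                )
    return grid

def score_ligand(grid, ligand, origin, spacing):
    """Score a ligand pose by reading the stored potential at the grid point
    nearest to each ligand atom; the receptor is never revisited."""
    total = 0.0
    for xyz, q in ligand:
        index = tuple(round((c - o) / spacing) for c, o in zip(xyz, origin))
        total += q * grid[index]
    return total

# Toy system (hypothetical): one receptor atom and one ligand atom.
receptor = [((0.0, 0.0, 0.0), -1.0)]
origin, counts, spacing = (1.0, 0.0, 0.0), (3, 1, 1), 1.0
grid = build_grid(receptor, origin, counts, spacing)
pose_score = score_ligand(grid, [((2.0, 0.0, 0.0), 1.0)], origin, spacing)
print(pose_score)
```

The expensive loop over receptor atoms runs once in build_grid, so scoring each of the many candidate orientations costs only one lookup per ligand atom.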

Validation of scoring functions for coronavirus main protease

The evaluation and ranking of predicted ligand conformations are crucial aspects of structure-based virtual screening (16). The most common means of estimating a binding affinity is to use a scoring function, which partitions the free energy into recognizable components. The number and type of terms vary between scoring functions, but in general there are terms for hydrogen bonding, van der Waals, electrostatic and hydrophobic interactions, and entropy penalties (13). The success of a docking algorithm in predicting a ligand binding pose is normally measured in terms of the root-mean-square deviation (RMSD) between the experimentally observed heavy-atom positions of the ligand and those predicted by the algorithm (17). All the scoring functions included in the DOCK 4.0 program suite were compared for the coronavirus main protease using mean RMSD values, for both the rigid ligand docking method and the flexible ligand method, comparing the different functions within each method and between the two methods (Figures 3 and 4).
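For reference, the RMSD between an observed and a predicted pose can be computed as in the minimal Python sketch below. It assumes that both coordinate lists contain the same heavy atoms in the same order and that no superposition is applied, since a docked pose is compared in the receptor's frame of reference. The function name and the coordinates are hypothetical.

```python
import math

def rmsd(reference, predicted):
    """Root-mean-square deviation between two equally ordered lists of
    (x, y, z) heavy-atom coordinates, without any superposition."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for ref_atom, pred_atom in zip(reference, predicted)
        for a, b in zip(ref_atom, pred_atom)
    )
    return math.sqrt(squared / len(reference))

# Toy pose (hypothetical): every atom is displaced by 2 angstroms along z.
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
pose_rmsd = rmsd(crystal_pose, docked_pose)
print(pose_rmsd)
```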

The mean RMSD values closest to 4.0 Å were obtained using the contact score function with the rigid ligand docking method. This scoring function might therefore be used for virtual screening of ligand databases in the search for potential inhibitors of the coronavirus main protease. The 10 best ligand orientations within the binding-site pocket surface of the coronavirus main protease are shown in Figure 5. Figure 5 shows that the large cavities of the binding-site pocket allow different conformations of the ligand to interact with different protease residues included in the selected pocket; this property will make it easier to find various candidates with antiviral activity.

Analysis of the relationship between score values and ligand orientation

The scoring function is expressed in terms of free energy, to provide an estimate of the binding affinity between receptor and ligand molecules (9). The score values obtained with the DOCK 4.0 scoring functions for the orientation of the ligand within the binding site of the coronavirus main protease were plotted against the individual RMSD values of each orientation. This was done for the three scoring functions, using both the rigid and the flexible ligand methods (Figure 6). In no case could a direct mathematical relationship between score values and RMSD values be found. Such imperfections in scoring functions continue to be the major limiting factor. The scoring functions normally used in docking programs make a number of simplifications and assumptions to allow a more computationally efficient evaluation of ligand affinity (18). Thus, the implementation of a consensus function is needed to carry out virtual screening of ligand databases for the coronavirus main protease.
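One simple way to quantify whether scores and RMSD values are linearly related is the Pearson correlation coefficient, sketched below in Python. This is offered only as an illustration of the kind of check involved; the source does not state which measure was used, and the score and RMSD values here are hypothetical toy numbers.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical docking scores and the RMSD of the corresponding orientations.
scores = [-40.0, -35.0, -30.0, -25.0, -20.0]
rmsds = [3.0, 1.0, 5.0, 2.0, 4.0]
r = pearson(scores, rmsds)
print(r)
```

A coefficient close to zero, as in this toy case, means that a better score does not reliably indicate an orientation closer to the crystallographic one.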

Consensus scoring combines the information obtained from different scores to compensate for the errors of individual scoring functions, thereby improving the probability of finding the correct solution (19,20). This approach involves obtaining an output list of dockings with a given search engine and primary score function, re-scoring the final list with various secondary score functions, and finally taking the intersection of the set of re-scored lists. The relationship between the performance of a given score function in these two roles remains to be established (21). In our case, no suitable secondary score function was found; for this reason, the contact scoring function, which gave the best orientation results, was selected for subsequent studies.
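The intersection step of consensus scoring described above can be sketched in Python as follows. The ligand identifiers, the ranking labels and the function name consensus_hits are hypothetical; each list is assumed to be already sorted from best to worst by its scoring function.

```python
def consensus_hits(rankings, top_n):
    """Return the ligands that appear in the top `top_n` of every ranking,
    ordered as in the first (primary) ranking."""
    if not rankings:
        raise ValueError("at least one ranking is required")
    ranked_lists = list(rankings.values())
    common = set.intersection(*(set(r[:top_n]) for r in ranked_lists))
    return [ligand for ligand in ranked_lists[0][:top_n] if ligand in common]

# Hypothetical ranked ligand lists, best first.
rankings = {
    "primary": ["L3", "L1", "L7", "L2", "L5"],
    "secondary_1": ["L1", "L3", "L2", "L9", "L7"],
    "secondary_2": ["L7", "L3", "L8", "L1", "L4"],
}
strict = consensus_hits(rankings, top_n=3)
relaxed = consensus_hits(rankings, top_n=5)
print(strict, relaxed)
```

The choice of top_n controls the trade-off: a small value keeps only ligands on which all functions agree strongly, while a larger value retains more candidates at the cost of more false positives.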

 

CONCLUSION

The docking process involves the prediction of ligand conformation and orientation within a targeted binding site. In this work, the scoring functions of the DOCK 4.0 program suite were validated. The contact score function was selected as the best ranking tool for virtual screening searches for potential inhibitors of the important group of pathogens belonging to the genus Coronavirus.

 

REFERENCES

1. Weiss SR and Martin SN. Coronavirus pathogenesis and emerging pathogen severe acute respiratory syndrome coronavirus. Microbiol Mol Biol Rev. 2005; 69(4):635-64.

2. Yang H, Xie W, Xue X, Yang K, Wenxue JM, Zhao LO, et al. Design of Wide-Spectrum Inhibitors Targeting Coronavirus Main Proteases. PLoS Biol. 2005;3(10):324-36.

3. Stark CJ and Atreya CD. Molecular advances in the cell biology of SARS-CoV and current diseases prevention strategies. Virol. J. 2005;2:35-43.

4. Domingo E and Holland JJ. RNA virus mutations and fitness for survival. Annu Rev Microbiol. 1997;51:151-78.

5. Rest JS and Mindell DP. SARS associated coronavirus has a recombinant polymerase and coronaviruses have a history of host-shifting. Infect Genet Evol. 2003;3:219-25.

6. Saif LJ, van Cott JL, Brim TA. Immunity to transmissible gastroenteritis virus and porcine respiratory coronavirus infections in swine. Vet. Immunol. Immunopathol. 1994;43:89-97.

7. Cavanagh D. Severe acute respiratory syndrome vaccine development: experiences of vaccination against avian infectious bronchitis coronavirus. Avian Pathology. 2003;32(6):567-82.

8. Saif LJ. Animal coronavirus vaccines: lessons for SARS. Dev Biol. 2004; 119:129-40.

9. McConkey BJ, Sobolev V, Edelman M. The performance of current methods in ligands-protein docking. Current Science. 2002;83(7):845-56.

10. Ghosh AK, Xi K, Ratia K, Santarsiero BD, Fu W, Harcourt BH, et al. X-ray structural analysis of SARS coronavirus 3CL proteinase in complex with designed anti-viral inhibitors. 2006; [cited 2007 10 Dec] available from: http://www.rcsb.org/pdb/explore.do?structureId=2ALV.

11. Ghosh AK, Xi K, Ratia K, Santarsiero BD, Fu W, Harcourt BH, et al. Design and synthesis of peptidomimetic severe acute respiratory syndrome chymotrypsin-like protease inhibitors. J Med Chem. 2005;48:6767-71.

12. http://www.dock.compbio.ucsf.edu/Old_Versions/Dock4.0_manual.pdf

13. Tan J, Verschueren K, Anand K, Shen J, Yang M, Xu Y, et al. pH-dependent Conformational Flexibility of the SARS-CoV Main Protease (Mpro) Dimer: Molecular Dynamics Simulation and Multiple X-Ray Structure Analyses. J Mol Biol. 2005;354:35-40.

14. Chou CY, Chang HC, Hsu WC, Lin TZ, Lin CH, Chang GG. Quaternary structure of the severe acute respiratory syndrome (SARS) coronavirus main protease. Biochemistry. 2004;43:14958-70.

15. Kitchen D, Decornez H, Furr J, Bajorath J. Docking and scoring in virtual screening for drug discovery: Methods and applications. Nat Rev Drug Discov. 2004;3:935-49.

16. Simonson T, Archontis G, Karplus M. Free energy simulations come of age: protein-ligand recognition. Acc Chem Res. 2002;35:430-37.

17. Sousa SF, Fernández PA, Ramos JM. Protein-Ligand Docking: Current Status and Future Challenges. Proteins: Structure, Function, and Bioinformatics. 2006;65:15-26.

18. Taylor RD, Jewsbury PJ, Essex JW. A review of protein-small molecule docking methods. J Comput Aided Mol Des. 2002;16:151-66.

19. Charifson PS, Corkery JJ, Murcko MA, Walters WP. Consensus scoring: a method for obtaining improved hit rates from docking databases of three-dimensional structures into proteins. J Med Chem. 1999;42:5100-09.

20. Halperin I, Ma B, Wolfson H, Nussinov R. Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins. 2002;47:409-43.

21. Mohan V, Gibbs AC, Cummings MD, Jaeger EP, DesJarlais RL. Docking: Successes and Challenges. Current Pharmaceutical Design. 2005;11:323-33.

 

 

(Received 10-1-2008; Accepted 22-6-2008)

Creative Commons License: All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons License.