- Corresponding Author:
- N. B. Ramachandra
Department of Biotechnology, PES Institute of Technology, BSK III Stage, Bengaluru-560 085, India
E-mail: [email protected]
|Date of Submission||04 July 2014|
|Date of Revision||15 February 2015|
|Date of Acceptance||22 November 2015|
|Date of Web Publication|
This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under the identical terms.
Congenital heart disease is the most common type of birth defect. Single nucleotide polymorphisms in GATA4 are associated with various congenital heart disease phenotypes. In the present study, we analysed the nonsynonymous single nucleotide polymorphisms of GATA4 that are involved in congenital heart disease by predicting the changes in protein structure. A total of 49 nonsynonymous single nucleotide polymorphisms of GATA4 were analysed, comprising those screened from congenital heart disease patients of Mysore and those reported globally. To understand the role of these nonsynonymous single nucleotide polymorphisms, we mutated the sequence and translated it into amino acids. The secondary structure of each mutated protein was then predicted, and the tertiary structure was predicted using homology modeling. The quality of the protein structures was evaluated quantitatively with the Volume Area Dihedral Angle Reporter server. The results revealed secondary and tertiary structural changes along with changes in free energy of folding, volume and accessible surface area. Thus, the structural changes in the mutated proteins impaired the normal function of GATA4.
Congenital heart disease, single nucleotide polymorphism, GATA4, homology modeling, free energy folding
Congenital heart disease (CHD) affects 1% of all live births and is the leading noninfectious cause of death in the 1st year of life. With progress in molecular genetics and developmental biology, genes associated with heart disease have been identified. Single nucleotide polymorphisms (SNPs) in a variety of single genes are associated with congenital heart defects and genetic syndromes. Data from mouse and human studies demonstrated that the zinc finger containing transcription factor GATA4 plays a critical role in heart development. GATA4 interacts directly with NKX2.5 via the homeodomain of NKX2.5 and the zinc finger domain of GATA4. Further, it regulates genes involved in embryogenesis and in myocardial differentiation and function. Mutations in GATA4 have been identified in multiplex families with septal defects, such as atrial septal defect (ASD) and ventricular septal defect (VSD).
Molecular genetic testing for GATA4 mutations is becoming more widespread and is applied clinically. Such testing is the basis for improved diagnosis and therapy of human CHD. An analysis of DNA and protein sequences is needed to discriminate between synonymous SNPs (sSNPs), which constitute the majority of genetic variation, and nonsynonymous SNPs (nsSNPs) located in coding regions, which result in amino acid variation in the protein products of genes.
Earlier studies from our laboratory revealed a prevalence of CHDs ranging from 6.6 to 13.06 per 1000 live births in India. Among the different CHDs, VSDs were the most common phenotype observed, with a high rate of consanguinity in the patient families. Studies from our laboratory also reported numerical and structural chromosomal anomalies. However, molecular studies on CHDs are limited. In light of this, we screened SNPs of GATA4 in 100 CHD patients and 50 controls in a south Indian population and found 11 nsSNPs, namely c.1129A>G (S377G), c.1180C>A (P394T), c.1081A>G (M361V), c.1138G>A (V380M), c.1273G>A (D425N), c.716A>G (N239S), c.278G>C (G93A), c.1207C>A (L403M), c.1232C>T (A411V), c.1295T>C (L432S) and a novel SNP, c.1180C>G (P394A).
Further, in silico analysis of these SNPs to explore the molecular mechanisms of CHD is very limited. In view of this, we attempted to analyse the nsSNPs of GATA4 that are involved in CHD using in silico tools to predict structural changes in the protein and to evaluate the protein quantitatively and qualitatively.
Materials and Methods
The present study was conducted at Genomics Laboratory, Department of Studies in Zoology, University of Mysore, Karnataka, India between September 2007 and March 2013.
Secondary structure analysis
Forty-nine nsSNPs of GATA4 cited in the literature were used in this study, including the 11 nsSNPs reported from our laboratory[4,10-21]. All 49 sequences were mutated in silico to generate the nsSNPs. These sequences were translated to identify amino acid polymorphisms. The protein sequences were compared through multiple sequence alignment to identify the differences. The secondary structures for all 49 nsSNPs were predicted with the Chou-Fasman secondary structure algorithm using Accelrys Gene software.
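The in silico mutation and translation step can be sketched as follows. This is only a minimal illustration and not the Accelrys Gene workflow used in the study: the function names (`translate`, `apply_substitution`, `protein_change`) and the 12-base toy coding sequence are hypothetical, and only simple substitutions written as `c.<position><ref>><alt>` are handled.

```python
import re
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding DNA sequence; a trailing partial codon is ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def apply_substitution(cds, variant):
    """Return (mutated sequence, 1-based cDNA position) for e.g. 'c.4A>G'."""
    match = re.fullmatch(r"c\.(\d+)([ACGT])>([ACGT])", variant)
    if match is None:
        raise ValueError(f"unsupported variant notation: {variant}")
    pos, ref, alt = int(match.group(1)), match.group(2), match.group(3)
    if not 1 <= pos <= len(cds):
        raise ValueError(f"position {pos} is outside the sequence")
    if cds[pos - 1] != ref:
        raise ValueError(f"reference base at {pos} is {cds[pos - 1]}, not {ref}")
    return cds[:pos - 1] + alt + cds[pos:], pos


def protein_change(cds, variant):
    """Describe the amino acid change, e.g. 'S2G' (an unchanged residue is synonymous)."""
    mutated, pos = apply_substitution(cds, variant)
    codon_index = (pos - 1) // 3  # 0-based index of the affected codon
    before = translate(cds)[codon_index]
    after = translate(mutated)[codon_index]
    return f"{before}{codon_index + 1}{after}"


TOY_CDS = "ATGAGCAACGGC"  # hypothetical sequence encoding M-S-N-G
print(protein_change(TOY_CDS, "c.4A>G"))  # S2G
print(protein_change(TOY_CDS, "c.8A>G"))  # N3S
```

The same codon arithmetic applies to the variants listed in this study: for example, position 1129 is the first base of codon 377, which is consistent with c.1129A>G being reported as S377G.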
Comparative modeling of GATA4 nsSNPs
The effect of each mutation on protein structure was determined using the SWISS-MODEL workspace (http://swissmodel.expasy.org/workspace). Two template structures were used to generate 3-D structures: the N-terminal zinc finger of murine GATA-1 protein (PDB Id: 1GNF_A) for residues 213-251 and opposite GATA DNA binding protein (PDB Id: 3DFX_B) for residues 265-320. The 19 nsSNPs located in these regions were analysed. The root-mean-square deviation (RMSD) between each modeled structure and its template ranged from 0.01 to 0.015.
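For readers unfamiliar with the measure, RMSD is the square root of the mean squared distance between paired atoms of two structures. The sketch below assumes the two structures are already superimposed and that atoms are paired by index; the function name `rmsd` and the coordinates are made up for illustration and do not come from the models in this study.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that atoms are
    paired by index; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates: every atom of the model is shifted by 0.1 along z.
template = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
model = [(0.0, 0.0, 0.1), (1.5, 0.0, 0.1), (1.5, 1.5, 0.1)]
print(round(rmsd(template, model), 3))  # 0.1
```

A value close to zero, as reported here, is expected when every model is built on the same template and differs from it only at one side chain.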
Evaluation and validation of three-dimensional structure
Evaluation and validation of the 3-D structures were done using the VADAR server (Alberta, Canada). The overall quality of each modeled protein was assessed with a Ramachandran plot. The structural superimposition of control and mutant proteins was performed using PyMOL.
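A Ramachandran plot is built from the backbone torsion angles phi and psi of each residue. The sketch below shows how one such torsion angle can be computed from four atomic positions; it is a generic geometric illustration, not the VADAR implementation, and the function name `dihedral_degrees` and the test coordinates are hypothetical. For phi the four atoms would be C(i-1), N(i), CA(i), C(i), and for psi they would be N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Torsion angle in degrees (range -180 to 180) defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up points: the last point is rotated a quarter turn about the central bond.
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))  # 90.0
```

Classifying each (phi, psi) pair into fully allowed, additionally allowed, generously allowed or outside regions requires region boundaries that are not reproduced here.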
Only 16 nsSNPs of GATA4 showed a change in secondary structure (Table 1). These changes with reference to the control are represented in fig. 1. The proteins derived from the GATA4 nsSNPs c.755G>C, c.1037C>T, c.1129A>G and c.1130G>A showed changes in turns; c.155C>T, c.487C>T and c.1220C>A showed changes in sheets; and c.278G>C showed changes only in helix. The nsSNP c.687G>T was responsible for a change to either sheet or helix. The nsSNPs c.779G>A and c.855T>C may change to helix or turn. Changes in helix or sheet were observed for c.17C>T, c.82C>T, c.905A>G and c.1207C>A. The mutation c.1295T>C showed changes in helix, sheet or turn.
Two X-ray crystallographic structures showed homology to GATA4 protein. The N-terminal zinc finger of GATA4 had homology with the N-terminal zinc finger of murine GATA-1 protein (PDB Id: 1GNF_A) and with opposite GATA DNA binding protein (PDB Id: 3DFX_B). To validate the modeling, a Ramachandran plot was generated using the VADAR server (Alberta, Canada) (fig. 2).
Tertiary structural changes are shown in figs. 3 and 4. Since the tertiary structures were modeled using homology modeling, only side chain differences compared to the control protein were observed. We observed differences in side chain atoms such as delta carbon (CD), epsilon nitrogen (NE), zeta carbon (CZ), gamma carbon (CG), gamma sulphur (SG), gamma oxygen (OG1) and delta sulphur (SD). The quantitative and qualitative analysis of GATA4-mutated proteins showed differences in free energy of folding, volume and accessible surface area (ASA) compared with the control protein (Table 2).
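The side chain comparison between a control and a mutant residue amounts to a set difference over atom names. The sketch below assumes standard PDB heavy-atom nomenclature and covers only a handful of residue types; the dictionary `SIDE_CHAIN_ATOMS` and the function `side_chain_difference` are hypothetical helpers for illustration, not part of the PyMOL analysis used in the study.

```python
# Heavy side chain atom names (standard PDB nomenclature) for a few residue types.
SIDE_CHAIN_ATOMS = {
    "GLY": set(),
    "SER": {"CB", "OG"},
    "CYS": {"CB", "SG"},
    "ASP": {"CB", "CG", "OD1", "OD2"},
    "GLU": {"CB", "CG", "CD", "OE1", "OE2"},
    "ASN": {"CB", "CG", "OD1", "ND2"},
}


def side_chain_difference(control, mutant):
    """Return (atoms only in control, atoms only in mutant), each sorted by name."""
    control_atoms = SIDE_CHAIN_ATOMS[control]
    mutant_atoms = SIDE_CHAIN_ATOMS[mutant]
    return sorted(control_atoms - mutant_atoms), sorted(mutant_atoms - control_atoms)


print(side_chain_difference("GLU", "ASP"))  # (['CD', 'OE1', 'OE2'], ['OD1', 'OD2'])
print(side_chain_difference("ASN", "SER"))  # (['CG', 'ND2', 'OD1'], ['OG'])
```

These two outputs agree with the atom differences listed for E216D and N239S in the legend of Figure 3.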
|SNP||Amino acid||Secondary structure of|
Table 1: Nonsynonymous SNPs of GATA4 and their impact on secondary structure of protein.
Human SNPs represent the most frequent type of DNA variation. The main goal of SNP research is to understand the genetics of human phenotypic variation and especially the genetic basis of complex human diseases. The nsSNPs comprise a group of SNPs that, together with SNPs in regulatory regions, are believed to have the highest impact on phenotype. The nsSNPs, also known as single amino acid polymorphisms (SAPs), cause amino acid changes in proteins and have the potential to affect both protein structure and function. Some of the mutations at SAP sites are not associated with any changes in phenotype and are considered functionally neutral, but others have deleterious effects on protein function and are responsible for many human genetic diseases. Analysing new incoming SNP data by mapping them at the sequence and structural levels would address problems concerning population, medical and evolutionary genetics.
Reamon-Buettner and Borlak reported that the zinc finger transcription factor GATA4 is a master regulator of heart development. A zinc finger mutation identified in this gene affects a zinc coordinating cysteine in the C-terminal finger and is strongly associated with VSDs. GATA4 zinc finger mutations are shown to affect DNA binding, contacts with the zinc ion and protein secondary structure. The impaired GATA4 interaction with the third helix of the homeodomain of NKX2.5 results in septation defects of the human heart.
The structural changes in the analysed GATA4 mutated proteins might be responsible for nonfunctional GATA4 proteins. However, we found only side chain differences of the amino acids at the tertiary structure level of the protein. The theoretical accuracy of predicting the tertiary structure of a protein from its sequence is 90%. The local conformation of a protein varies under native conditions. These limitations are also imposed by the inability of secondary structure prediction to account for tertiary structure. Each nsSNP produces only a single amino acid difference, and the modeling tool uses the same template for all nsSNPs. Hence, all models have the same structure except for the side chain difference at the mutated residue.
The side chain is specific to each amino acid of a protein. The side chain can make an amino acid a weak acid or a weak base, and a hydrophile if the side chain is polar or a hydrophobe if it is nonpolar. The distribution of hydrophilic and hydrophobic amino acids determines the tertiary structure of the protein. The physical location of these amino acids in proteins influences their quaternary structure. These properties are important in protein structure and protein–protein interactions.
Figure 2: Ramachandran plot.
The Ramachandran plot was analysed using the Volume Area Dihedral Angle Reporter server (Alberta, Canada) for the evaluation and validation of the 3-D reference structures: (a) N-terminal zinc finger of murine GATA-1 protein (PDB Id: 1gnfA): fully allowed region (41 residues, 78.85%), additionally allowed region (10 residues, 19.23%), outside region (1 residue GLY24, 1.92%); (b) opposite GATA DNA binding protein (PDB Id: 3dfxB): fully allowed region (20 residues, 54.05%), additionally allowed region (14 residues, 37.84%), generously allowed region (2 residues GLY2, GLY37, 5.41%), outside region (1 residue ASN27, 2.70%).
Figure 3: 3-D models of mutated proteins derived from 9 nsSNPs of zinc finger region of GATA4.
Each model shows the control (left side of each image) superimposed with the mutated protein (right side of each image). Models of both control and mutant were built using the solution structure of the N-terminal zinc finger of murine GATA-1 (PDB Id: 1gnfA). Ball and stick representation in both models shows the side chain structural differences between control and mutated proteins. (a) E216D: In control- Delta Carbon (CD), Epsilon Oxygen (OE1, OE2) and in mutant OD1, OD2. (b) G214S: In control- no changes and in mutant Gamma Carbon (CG), Gamma Oxygen (OG). (c) M223T: In control- CB, CG, Delta Sulphur (SD), CE and in mutant CG2, OG1. (d) R229S: In control- CG, CD, Epsilon Nitrogen (NE), CE2, Zeta Carbon (CZ), NH1, NH2 and in mutant OG. (e) G234S: In control- no changes and in mutant CB, OG. (f) N239D: In control- ND2, OD1 and in mutant OD1, OD2. (g) N239S: In control- CG, ND2, OD1 and in mutant OG. (h) Y244C: In control- CG, CD1, CD2, CE1, CZ, OH and in mutant Gamma Sulphur (SG). (i) N248S: In control- CG, ND2, OD1 and in mutant OG.
Figure 4: 3-D models of mutated proteins derived from 10 nsSNPs of GATA4.
Each model shows the control (left side of each image) superimposed with the mutated protein (right side of each image). Models of both control and mutant were built using the structure of the opposite GATA DNA binding protein (PDB Id: 3dfxB). Ball and stick representation in both models shows the side chain structural differences between control and mutated proteins. (a) T277I: In control- CG2, OG1 and in mutant CG1, CG2, CD1. (b) R283H: In control- CD, NE, CZ, NH1, NH2 and in mutant CD2, ND1, CE1, NE2. (c) Q316E: In control- NE2, OE1 and in mutant OE1, OE2. (d) N285K: In control- ND2, OD1 and in mutant CD, CE, NZ. (e) N273S: In control- CG, ND2, OD1 and in mutant OG. (f) C292R: In control- SG and in mutant CG, CD, NE, CZ, NH1, NH2. (g) A294V: In control- no changes and in mutant CG1, CG2. (h) G296C: In control- no changes and in mutant CB, SG. (i) G296S: In control- no changes and in mutant CB, OG. (j) H302R: In control- CD2, ND1, CE1, NE2 and in mutant CD, NE, CZ, NH1, NH2.
|SAP||ASA (Å²)||Volume of the protein (Å³)||Free energy of folding (kcal/mol)|
|Control (213–251)=|
|ASA (Å²)||Volume of the protein (Å³)||Free energy of folding (kcal/mol)|
|Control (265–320)=|
Table 2: Summary of free energy of folding, volume and accessible surface area.
Accessible surface area (ASA) is the exposed surface area of the protein (or residue) that a water molecule could access or touch. Side chain ASA values are also calculated for polar (N, O, S) atoms, charged (N, O) atoms and nonpolar (C) atoms to permit the calculation of polar, charged and nonpolar surface area. These ASA values can be quite useful in structure assessment and thermodynamic calculations. ASA is highly dependent on the choice of atomic or van der Waals radii. Protein structures are stabilized by hydrophobic and van der Waals forces, and by hydrogen bonds. Hydrophobic energy is gained by the reduction of the surface in contact with water. In the present study, we observed differences in free energy of folding and volume of the protein. There is a linear relationship between the solvation free energy of folding and protein size, and misfolded structures show higher solvation free energies.
Usually, nsSNPs do not inactivate protein function completely; instead, they change protein activity at some level, either directly or indirectly through interactions with other proteins in the pathway. Such information has to be considered together. Therefore, side chain differences of the amino acids in the 3-D structure of GATA4 mutated proteins may be responsible for nonfunctioning of GATA4 proteins. This study will facilitate further study of structural changes and help distinguish CHD-causing nsSNPs from neutral SNPs. This information can be explored further in pharmacogenetic studies and in biomedical applications.
We are grateful to all the patients' families who participated in this investigation; to the doctors and PG students for their kind support; to the Professor and Chairman, DOS in Zoology, University of Mysore, Mysore, and the Unit on Evolution and Genetics for the laboratory facilities; and to Prof. H. A. Ranganath for his encouragement.
Financial support and sponsorship
We thank the Council of Scientific and Industrial Research (CSIR), New Delhi, Government of India [No.27 (0156)/06/EMR-II dated 19.10.2006] for the financial support.
Conflicts of interest
There are no conflicts of interest.
- Hoffman JI, Kaplan S. The incidence of congenital heart disease. J Am Coll Cardiol 2002;39:1890-900.
- Huang JB, Liu YL, Sun PW, Lv XD, Du M, Fan XM. Molecular mechanisms of congenital heart disease. Cardiovasc Pathol 2010;19:e183-93.
- Sander TL, Klinkner DB, Tomita-Mitchell A, Mitchell ME. Molecular and cellular basis of congenital heart disease. Pediatr Clin North Am 2006;53:989-1009.
- Garg V, Kathiriya IS, Barnes R, Schluterman MK, King IN, Butler CA, et al. GATA4 mutations cause human congenital heart defects and reveal an interaction with TBX5. Nature 2003;424:443-7.
- Ramensky V, Bork P, Sunyaev S. Human non-synonymous SNPs: Server and survey. Nucleic Acids Res 2002;30:3894-900.
- Smitha R, Karat SC, Narayanappa D, Krishnamurthy B, Prasanth SN, Ramachandra NB. Prevalence of congenital heart diseases in Mysore. Int J Hum Genet 2006;12:12-7.
- Ramegowda S, Ramachandra NB. Parental consanguinity increases congenital heart diseases in South India. Ann Hum Biol 2006;33:519-28.
- Smitha R, Harshavardhan MG, Hyderi A, Savitha MR, Krishnamurthy B, Karat SC, et al. Chromosomal anomalies and congenital heart disease in Mysore, South India. Int J Hum Genet 2010;10:131-9.
- Dinesh SM, Lingaiah K, Savitha MR, Krishnamurthy B, Narayanappa D, Ramachandra NB. GATA4 specific nonsynonymous single-nucleotide polymorphisms in congenital heart disease patients of Mysore, India. Genet Test Mol Biomarkers 2011;15:715-20.
- Hirayama-Yamada K, Kamisago M, Akimoto K, Aotsuka H, Nakamura Y, Tomita H, et al. Phenotypes with GATA4 or NKX2.5 mutations in familial atrial septal defect. Am J Med Genet A 2005;135:47-52.
- Reamon-Buettner SM, Borlak J. GATA4 zinc finger mutations as a molecular rationale for septation defects of the human heart. J Med Genet 2005;42:e32.
- Sarkozy A, Conti E, Neri C, D’Agostino R, Digilio MC, Esposito G, et al. Spectrum of atrial septal defects associated with mutations of NKX2.5 and GATA4 transcription factors. J Med Genet 2005;42:e16.
- Nemer G, Fadlalah F, Usta J, Nemer M, Dbaibo G, Obeid M, et al. A novel mutation in the GATA4 gene in patients with tetralogy of fallot. Hum Mutat 2006;27:293-4.
- Zhang L, Tümer Z, Jacobsen JR, Andersen PS, Tommerup N, Larsen LA. Screening of 99 Danish patients with congenital heart disease for GATA4 mutations. Genet Test 2006;10:277-80.
- Rajagopal SK, Ma Q, Obler D, Shen J, Manichaikul A, Tomita-Mitchell A, et al. Spectrum of heart disease associated with murine and human GATA4 mutation. J Mol Cell Cardiol 2007;43:677-85.
- Reamon-Buettner SM, Cho SH, Borlak J. Mutations in the 3’-untranslated region of GATA4 as molecular hotspots for congenital heart disease (CHD). BMC Med Genet 2007;8:38.
- Schluterman MK, Krysiak AE, Kathiriya IS, Abate N, Chandalia M, Srivastava D, et al. Screening and biochemical analysis of GATA4 sequence variations identified in patients with congenital heart disease. Am J Med Genet A 2007;143A:817-23.
- Posch MG, Perrot A, Schmitt K, Mittelhaus S, Esenwein EM, Stiller B, et al. Mutations in GATA4, NKX2.5, CRELD1, and BMP4 are infrequently found in patients with congenital cardiac septal defects. Am J Med Genet A 2008;146A:251-3.
- Zhang W, Li X, Shen A, Jiao W, Guan X, Li Z. GATA4 mutations in 486 Chinese patients with congenital heart disease. Eur J Med Genet 2008;51:527-35.
- Hamanoue H, Rahayuningsih SE, Hirahara Y, Itoh J, Yokoyama U, Mizuguchi T, et al. Genetic screening of 104 patients with congenitally malformed hearts revealed a fresh mutation of GATA4 in those with atrial septal defects. Cardiol Young 2009;19:482-5.
- Zhang WM, Li XF, Ma ZY, Zhang J, Zhou SH, Li T, et al. GATA4 and NKX2.5 gene analysis in Chinese Uygur patients with congenital heart disease. Chin Med J (Engl) 2009;122:416-9.
- Al-Ali H, Khachfe HM. The N-terminal domain of apolipoprotein B-100: Structural characterization by homology modeling. BMC Biochem 2007;8:12.
- Kopp J, Schwede T. The SWISS-MODEL Repository of annotated three-dimensional protein structure homology models. Nucleic Acids Res 2004;32:D230-4.
- Guex N, Peitsch MC. SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modeling. Electrophoresis 1997;18:2714-23.
- Willard L, Ranjan A, Zhang H, Monzavi H, Boyko RF, Sykes BD, et al. VADAR: A web server for quantitative evaluation of protein structure quality. Nucleic Acids Res 2003;31:3316-9.
- Bramucci E, Paiardini A, Bossa F, Pascarella S. PyMod: Sequence similarity searches, multiple sequence-structure alignments, and homology modeling within PyMOL. BMC Bioinformatics 2012;13 Suppl 4:S2.
- Våge J, Lingaas F. Single nucleotide polymorphisms (SNPs) in coding regions of canine dopamine- and serotonin-related genes. BMC Genet 2008;9:10.
- Hu J, Yan C. Identification of deleterious non-synonymous single nucleotide polymorphisms using sequence-derived information. BMC Bioinformatics 2008;9:297.
- Sunyaev S, Lathe W 3rd, Bork P. Integration of genome data and protein structures: Prediction of protein folds, protein interactions and “molecular phenotypes” of single nucleotide polymorphisms. Curr Opin Struct Biol 2001;11:125-30.
- Saunders R, Deane CM. Protein structure prediction begins well but ends badly. Proteins 2010;78:1282-90.
- Creighton TE. Proteins: Structures and Molecular Properties. Ch. 1. San Francisco: WH Freeman; 1993.
- Miller S, Lesk AM, Janin J, Chothia C. The accessible surface area and stability of oligomeric proteins. Nature 1987;328:834-6.
- Chiche L, Gregoret LM, Cohen FE, Kollman PA. Protein model structure evaluation using the solvation free energy of folding. Proc Natl Acad Sci U S A 1990;87:3240-3.
- Uzun A, Leslin CM, Abyzov A, Ilyin V. Structure SNP (StSNP): A web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways. Nucleic Acids Res 2007;35:W384-92.