BMC Genomics, 2016
Mohammed H. Al Qahtani, Mourad Assidi, Riadh Benmarzoug, Sabrine Belmabrouk, Najla Kharrat, Ahmed Rebai, Rania Abdelhedi
LicenseType: CC BY
Background: The identification of charge clusters (runs of charged residues) in proteins and their mapping within the protein structure sequence is an important step toward a comprehensive analysis of how these particular motifs mediate, via electrostatic interactions, various molecular processes such as protein sorting, translocation, docking, orientation and binding to DNA and to other proteins. Few algorithms that specifically identify these charge clusters have been designed and described in the literature. In this study, 197 distinctive human viral proteomes were screened for the occurrence of charge clusters (CC) using a new computational approach.
Results: Three hundred and seventy-three CCs were identified within the 2549 viral protein sequences screened. The number of CC-free protein sequences was 2176 (85.3 %), while 150 and 180 proteins contained positive charge clusters (PCC) and negative charge clusters (NCC), respectively. NCCs (211 detected) were more prevalent than PCCs (162). PCC-containing proteins were significantly longer than those having NCCs (p = 2 × 10⁻¹⁶). The virus families in which PCCs and NCCs were most prevalent were Herpesviridae, followed by Papillomaviridae. However, the single-strand RNA group had on average three times more NCCs than PCCs. According to the functional domain classification, a significant difference in distribution was observed between PCCs and NCCs (p = 2 × 10⁻⁸), with NCCs occurring more frequently in the C-terminal region while PCCs more often fell within functional domains. Only 29 protein sequences contained both NCCs and PCCs. Moreover, 101 NCCs were conserved in 84 proteins, while only 62 PCCs were conserved in 60 protein sequences. To understand the mechanism by which membrane translocation functionalities are embedded in viral proteins, we screened our PCCs for sequences corresponding to cell-penetrating peptides (CPPs) using two online databases: CellPPd and CPPpred. We found that all our PCCs, with lengths ranging from 7 to 30 amino acids, were predicted to be CPPs. Experimental validation is required to improve our understanding of the role of these PCCs in the viral infection process.
Conclusions: Screening distinctive charge clusters in viral proteomes suggested a functional role for these protein regions and might provide clues to improve the current understanding of viral diseases in order to tailor better preventive and therapeutic approaches.
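As a rough illustration of what screening a sequence for charge clusters involves, the Python sketch below slides a fixed window along a protein sequence and reports merged spans in which most residues carry the same charge. It is not the authors' algorithm, which the abstract does not describe. The residue sets (K/R positive, D/E negative), the window size of 10, the threshold of 7, the function name `find_charge_windows` and the example sequence are all assumptions chosen for illustration.

```python
# Illustrative sketch only; not the method used in the study.
POSITIVE = set("KR")  # assumption: lysine and arginine count as positive
NEGATIVE = set("DE")  # assumption: aspartate and glutamate count as negative


def find_charge_windows(seq, charged, window=10, min_count=7):
    """Return merged (start, end) spans, end exclusive, in which at least
    `min_count` of `window` consecutive residues belong to `charged`."""
    spans = []
    for i in range(len(seq) - window + 1):
        hits = sum(1 for aa in seq[i:i + window] if aa in charged)
        if hits >= min_count:
            if spans and i <= spans[-1][1]:
                # Overlapping or adjacent window: extend the previous span.
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans


# Hypothetical sequence with one basic and one acidic stretch.
example = "MAAAA" + "KRKRKRKRKR" + "GGGGG" + "DEDEDEDEDE" + "AAAA"
print(find_charge_windows(example, POSITIVE))  # [(2, 18)]
print(find_charge_windows(example, NEGATIVE))  # [(17, 33)]
```

The reported spans are wider than the charged stretches themselves because every window that meets the threshold is merged in, including windows that only partly overlap the stretch. A real screen would need a statistically justified threshold rather than fixed values.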
BMC Genomics, 2015
Mourad Assidi, Mohamed H Al-Qahtani, Jonathan L King, Bobby L Larue, Seung Bum Seo, Xiangpei Zeng, Bruce Budowle, Antti Sajantila
LicenseType: Unknown
Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS can also reduce labor and cost on a per-nucleotide basis and indeed on a per-sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM.
Results: Twenty-four samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. Between 31 and 98 mtGenome SNP variants were observed per sample for the 24 samples analyzed. The 1237 SNP variants in total were concordant between the PGM and MiSeq results. The quality scores for haplogroup assignment for all 24 samples ranged from 88.8% to 100%.
Conclusions: In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but were generally infrequent and did not affect the reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce the cost per sample analyzed. Overall, the results of this study, based on orthogonal concordance testing and phylogenetic scrutiny, support the conclusion that highly accurate whole mtGenome sequence data can be obtained using the PGM platform.
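The strand-bias summary in the abstract above (the share of positions whose forward/reverse coverage ratio exceeds 0.5) can be sketched in a few lines of Python. The abstract does not state how the ratio was defined, so the sketch assumes the lower strand depth divided by the higher one, which gives 1.0 for perfectly balanced coverage and values near 0 for strong bias. The function names and the depth values are invented for illustration.

```python
# Illustrative sketch only; the ratio definition is an assumption.
def strand_ratio(forward, reverse):
    """Ratio of the lower to the higher strand depth at one position."""
    highest = max(forward, reverse)
    return min(forward, reverse) / highest if highest else 0.0


def fraction_balanced(coverage, threshold=0.5):
    """Fraction of positions whose strand ratio exceeds `threshold`.

    `coverage` is a list of (forward_depth, reverse_depth) pairs."""
    ratios = [strand_ratio(f, r) for f, r in coverage]
    return sum(ratio > threshold for ratio in ratios) / len(ratios)


# Hypothetical per-position depths, not data from the study.
depths = [(100, 80), (50, 200), (30, 30)]
print(fraction_balanced(depths))
```

Here the second position (50 versus 200 reads) falls below the threshold, so two of the three positions count as balanced.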
BMC Genomics, 2016
Muhammad Abu-Elmagd, Mourad Assidi, Mohammed H. Al Qahtani, Faten Hachani Ben Ali, Touhami Mahjoub, Faouzi Janhai, Assila Ben Salem, Fatma Megdich, Malek Souayeh, Sondes Hizem, Olfa Kacem, Mounir Ajina
LicenseType: CC BY
Background: Polycystic ovary syndrome (PCOS) is characterized by the growth of a number of small cysts on the ovaries, which leads to sex hormone imbalance. Women affected by this syndrome suffer from irregular menstrual cycles, reduced fertility, excessive hair growth, obesity, acne and, most importantly, cardiac function problems. The vascular endothelial growth factor (VEGF) plays a pivotal role in tissue vascularization in general and in the pathogenesis of many diseases. PCOS was found to be associated with high expression levels of VEGF. In women who undergo assisted reproductive technology (ART) procedures, VEGF was found to be a key mediator of other factors controlling ovarian angiogenesis. Here, we set out to examine the association of VEGFA gene polymorphisms with PCOS and its components in a population of Tunisian women, to enhance our understanding of the genetic background leading to angiogenesis and vascularization abnormalities in PCOS.
Methods: The association of the VEGFA gene with PCOS and its components was examined in a cohort of 268 women from Tunisia comprising 118 PCOS patients and 150 controls. VEGFA gene variations were assessed through the analysis of the following SNPs: rs699947 (A/C), rs833061 (C/T), rs1570360 (G/A), rs833068 (G/A), rs3025020 (C/T), and rs3025039 (C/T). The linkage disequilibrium (LD) between SNPs was assessed using HAPLOVIEW software, while the combination of SNPs into haplotypes in the population and the reconstruction of the cladogram were carried out with the PHASE and ARLEQUIN programs, respectively. Genetic association and genotype-phenotype correlations were calculated by logistic regression and non-parametric tests (Kruskal–Wallis and Mann–Whitney tests), respectively, using the StatView program.
Results: We observed 10 haplotypes in our studied cohort, of which H1 (ACGG), H2 (ACAG), H7 (CTGG) and H8 (CTGA) were the most frequent. We observed an association of the CT genotype of SNP rs3025039 with the PCOS phenotype (P = 0.03; OR = 2.05, 95 % CI [1.07–3.90]) and a trend toward correlation of the haplotype pair H2/H2 with plasma prolactin levels (P = 0.077; 193.5 ± 94.3 vs 45.7 ± 7.2). These data are consistent with the literature and highlight once again the role of vascularization in the pathogenesis of PCOS.
Conclusions: The LD pattern at the VEGF locus was similar between the Tunisian population and the CEU population. More haplotypes were observed in the Tunisian population than in CEU (22 haplotypes vs 16 haplotypes), suggesting a higher recombination rate in Tunisians. The study showed that there was no advantage in using haplotypes compared with SNPs taken alone.
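For readers unfamiliar with how an odds ratio and its 95 % confidence interval, such as the one reported above, relate to genotype counts, the Python sketch below computes both from a 2×2 table. The study itself used logistic regression; the sketch substitutes the simpler Woolf (log odds) approximation, and the counts are invented for illustration, not taken from the paper. The function name `odds_ratio_ci` is likewise hypothetical.

```python
# Illustrative sketch only; Woolf's method, not the study's logistic regression.
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Odds ratio with an approximate 95 % CI from a 2x2 table.

    All four cell counts must be non-zero."""
    odds_ratio = (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)
    # Standard error of log(OR) under Woolf's approximation.
    se = math.sqrt(
        1 / case_exposed + 1 / case_unexposed + 1 / ctrl_exposed + 1 / ctrl_unexposed
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: carriers and non-carriers among cases, then controls.
estimate, lower, upper = odds_ratio_ci(30, 70, 20, 80)
print(f"OR = {estimate:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes 1, as in the reported 1.07–3.90, corresponds to an association that is significant at roughly the 5 % level; the invented counts here give an interval that includes 1.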
BMC Genomics, 2016
Mourad Assidi, Abdelbaset Buhmeida, Muhammad Abu-Elmagd, Harrell Gill-King, Angie D. Ambers, Jonathan L. King, Jennifer D. Churchill, Bruce Budowle, Monika Stoljarova
LicenseType: CC BY
Background: Although the primary objective of forensic DNA analyses of unidentified human remains is positive identification, cases involving historical or archaeological skeletal remains often lack reference samples for comparison. Massively parallel sequencing (MPS) offers an opportunity to provide biometric data in such cases, and these cases provide valuable data on the feasibility of applying MPS to the characterization of modern forensic casework samples. In this study, MPS was used to characterize 140-year-old human skeletal remains discovered at a historical site in Deadwood, South Dakota, United States. The remains were in an unmarked grave and there were no records or other metadata available regarding the identity of the individual. Due to the high throughput of MPS, a variety of biometric markers could be typed using a single sample.
Results: Using MPS and suitable forensic genetic markers, more relevant information could be obtained from a sample of limited quantity and quality. Results were obtained for 25/26 Y-STRs, 34/34 Y SNPs, 166/166 ancestry-informative SNPs, 24/24 phenotype-informative SNPs, 102/102 human identity SNPs, 27/29 autosomal STRs (plus amelogenin), and 4/8 X-STRs (as well as ten regions of mtDNA). The Y-chromosome (Y-STR, Y-SNP) and mtDNA profiles of the unidentified skeletal remains are consistent with the R1b and H1 haplogroups, respectively. Both are the most common haplogroups in Western Europe. Ancestry-informative SNP analysis also supported European ancestry. The genetic results are consistent with anthropological findings that the remains belong to a male of European ancestry (Caucasian). Phenotype-informative SNP data provided strong support that the individual had light red hair and brown eyes.
Conclusions: This study is among the first to genetically characterize historical human remains with forensic genetic marker kits specifically designed for MPS. The outcome demonstrates that substantially more genetic information can be obtained from the same initial quantity of DNA as that used in current capillary electrophoresis (CE)-based analyses.