Comprehensive Evolutionary Study of Disease-Causing Amino Acid Substitutions Using Computational Analysis

Research Presented at the Intel International Science and Engineering Fair (ISEF), Phoenix, AZ, 2005

Winner, First Place & Best in Category, Medicine & Health


Large-scale sequencing has deposited an enormous amount of new data in public databases, making comparative evolutionary studies of species at the genetic level possible. Disease-causing sequences in humans have been incidentally reported to exist as wild type (normal) sequences in other organisms, but no published study has measured the extent to which this occurs. This study compared known human disease-causing amino acid substitutions to wild type sequences across all available vertebrate organisms. No published software existed to perform this task, so custom programs were developed to carry out the high-throughput computational analysis. Of 1,081 genes identified, 257 (24%) were found to have one or more occurrences of a human disease-causing sequence present as wild type (normal) sequence in other organisms. The gene with the most mutations (codon locations) was the gene causing Retinitis Pigmentosa, with 243 codons, followed by Von Willebrand Disease (140 codons) and Cystic Fibrosis (74 codons). The codon locations identified in this study are functionally relevant: each marks a place where a change has occurred between species, underscoring a biological difference. Major questions remain about how a particular allele goes from being neutral or beneficial in one organism to disease-causing in another. This study documented the prevalence of human Mendelian disease sequences present as normal sequences in other vertebrates. This information can lead to finding suitable new animal models of human disease. These findings also permit further study of the identified substitutions, a necessary step in the development of effective disease treatment.
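The core comparison described above can be illustrated with a minimal sketch. This is not the study's custom software, which is not described in detail here. It assumes the human protein and its orthologs have already been aligned to equal length with "-" marking gaps, and every name in it (Substitution, alignment_column, species_with_disease_residue, the toy sequences and species labels) is hypothetical.

```python
from typing import Dict, List, NamedTuple


class Substitution(NamedTuple):
    """A human disease-causing amino acid substitution (hypothetical record)."""
    position: int   # 1-based residue position in the ungapped human protein
    wild_type: str  # normal human residue, one-letter code
    disease: str    # disease-causing residue, one-letter code


def alignment_column(aligned_human: str, position: int) -> int:
    """Map a 1-based ungapped human position to a 0-based alignment column."""
    seen = 0
    for col, residue in enumerate(aligned_human):
        if residue != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError(f"position {position} is beyond the human sequence")


def species_with_disease_residue(
    sub: Substitution,
    aligned_human: str,
    aligned_orthologs: Dict[str, str],
) -> List[str]:
    """Return species whose wild type residue equals the human disease residue."""
    col = alignment_column(aligned_human, sub.position)
    # Sanity check: the alignment must agree with the reported human wild type.
    if aligned_human[col] != sub.wild_type:
        raise ValueError("human wild type residue does not match the alignment")
    return sorted(
        species
        for species, seq in aligned_orthologs.items()
        if seq[col] == sub.disease
    )


# Toy data, invented purely for illustration (not real sequences).
toy_human = "MKT-AYL"
toy_orthologs = {
    "species_a": "MKTGAYL",  # carries the human wild type residue (A)
    "species_b": "MKT-VYL",  # carries the human disease residue (V)
    "species_c": "MRT-VYL",  # also carries V at the aligned position
}
toy_sub = Substitution(position=4, wild_type="A", disease="V")

matches = species_with_disease_residue(toy_sub, toy_human, toy_orthologs)
print(matches)
```

Running the same check over every known substitution in a gene, and then over every gene, would yield per-gene counts of codon locations of the kind reported above. The alignment step itself is left out of the sketch and would need a separate tool.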