We produced a reference genome assembly (Panu_3.0; GenBank accession GCA_000264685.2) for an olive baboon (Papio anubis), the baboon species most commonly used in biomedical research (fig. S1 and tables S1 and S2) (7). To investigate genetic differentiation within the genus, we analyzed whole-genome sequences from 16 additional individuals: two to four individuals representing each of the six Papio species, and one gelada (Theropithecus gelada), a member of a closely related genus that serves as an outgroup (fig. S2 and table S3). This diversity panel yielded >54.6 million single-nucleotide variants (SNVs), of which >42.4 million are variable within Papio (fig. S3 and table S4). To develop a second, independent perspective on genome differentiation, we identified novel Alu insertions, a type of genetic variation that arises from a fundamentally different mutational mechanism. Unexpectedly, we found a very high number of recent Alu insertions in baboons (and in rhesus macaques) relative to the human and other primate genomes (Fig. 2 and table S5). There are 192,889 full-length AluY elements in the P. anubis genome. The rate of accumulation of lineage-specific AluY insertions has therefore been more than four times higher (Fig. 2) in baboons and rhesus macaques than in hominoids (humans, chimpanzees, or orangutans) and three times higher than in the African green monkey (genus Chlorocebus), another Old World monkey (OWM) (22).
Our phylogenetic analyses provide several new insights into baboon population and genome history. Maximum likelihood (ML) analyses of concatenated SNVs show that individual baboons group correctly with their conspecifics and separate the six extant species into distinct northern and southern clades (Fig. 3 and fig. S4A). In contrast, Bayesian analysis of the same SNV data suggests that P. kindae is sister to the northern clade rather than to P. cynocephalus and P. ursinus (fig. S4B). The existence of multiple hybrid zones and the documented discrepancies between relationships based on mtDNA and on phenotypes (Fig. 1) (12, 15) argue that incomplete lineage sorting (ILS) and admixture among lineages have influenced the genetic relationships among these species. When we used a polymorphism-aware phylogenetic approach, PoMo (23, 24), we again obtained a basal north-south divergence, with P. kindae placed in the southern group. However, the relationships among the three southern species differ from the ML result (Fig. 3). PoMo also infers longer terminal branches for P. ursinus and P. papio than for the other lineages. Simulations (fig. S5 and table S6) show that admixture among divergent lineages can affect the inferred trees and that lineages that have experienced admixture exhibit artifactually shorter branch lengths because alleles are shared between lineages. This suggests that the other four lineages may have been more affected by admixture than P. ursinus and P. papio, which is consistent with the fact that these two species are found at the southern and northwestern extremes of the baboon distribution, respectively (Fig. 1).
To test explicitly for admixture among the six extant baboon species, we performed an analysis using f-statistics, followed by modeling with coalescent hidden Markov methods (Fig. 4A, table S7, and fig. S6). The best-fitting model (see Materials and Methods) indicates that the history of P. kindae includes an ancient admixture event involving a lineage related to extant P. ursinus (52% contribution to extant P. kindae) and an unsampled (possibly extinct) lineage belonging to the northern clade (48% contribution). The f-statistics also suggest that extant P. papio is closely related to P. anubis but received ~10% genetic input from an ancestral northern lineage that is as yet unsampled and possibly extinct.
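As a rough illustration of what f-statistics measure, the following sketch computes naive f3 and f4 statistics from per-site allele frequencies. It is not the pipeline used in this study: the function names and the toy frequency vectors are hypothetical, and the estimators omit the finite-sample bias correction and block-jackknife standard errors that a real analysis would need. A clearly negative f3(target; A, B) is the classic signal that the target population is admixed between lineages related to A and B.

```python
from statistics import mean


def f3(target, source_a, source_b):
    """Naive f3(target; A, B): mean over sites of (c - a) * (c - b)."""
    return mean((c - a) * (c - b)
                for c, a, b in zip(target, source_a, source_b))


def f4(pop_a, pop_b, pop_c, pop_d):
    """Naive f4(A, B; C, D): mean over sites of (a - b) * (c - d)."""
    return mean((a - b) * (c - d)
                for a, b, c, d in zip(pop_a, pop_b, pop_c, pop_d))


# Hypothetical allele frequencies at four sites (not data from the paper).
southern_like = [0.0, 1.0, 0.0, 1.0]
northern_like = [1.0, 0.0, 1.0, 0.0]
# A population drawing half of its ancestry from each source.
admixed = [0.5, 0.5, 0.5, 0.5]

f3_value = f3(admixed, southern_like, northern_like)  # -0.25: admixture signal
f4_value = f4(southern_like, southern_like, northern_like, admixed)  # 0: A and B identical
```

In practice, significance is judged from a Z score obtained by resampling genomic blocks, because neighboring sites are not independent.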
Our results also shed new light on the historical dynamics of hybridization between P. anubis (a northern clade species) and P. cynocephalus (a southern clade species), which has previously been reported in southern Kenya near Amboseli National Park (17). Behavioral observations and microsatellite-based analyses support recent introgression from P. anubis into P. cynocephalus since the 1980s (25, 26). Our analysis of haplotype block sharing across the genome indicates that a P. anubis individual from the Aberdare region of Kenya, more than 200 km north of Amboseli, is also admixed with P. cynocephalus, carrying ~546 Mb of nuclear DNA derived from P. cynocephalus (fig. S7). If we assume that this DNA derives from a single admixture event, that event is estimated to have occurred about 21 generations (~220 years) ago. However, more complex explanations are also possible. The second individual from the Aberdare P. anubis population also carries P. cynocephalus haplotypes, but these shared genomic segments are shorter and hence probably reflect older introgression. Consistent with other studies (27), our results suggest that there have been multiple episodes of gene flow involving these two species over a considerable period of time and that the effects of hybridization extend well beyond the current hybrid zone. This complexity may well be representative of the complexity of other known baboon hybrid zones (10, 12, 15, 18, 19, 28).
Motivated by the results from f-statistics and haplotype sharing, we performed two additional genome-wide tests across Papio diversity to examine the hypothesis of ancient admixture using independent methods. Alu insertion polymorphisms are valuable phylogenetic characters because the polarity of specific mutational changes can be established unequivocally for a given genomic segment (fig. S8) (29). The haplotype that carries a novel Alu insertion is derived from the orthologous haplotype that lacks that Alu repeat, and reversals are rare. Dollo parsimony analyses of baboons using novel Alu insertions revealed a north-south split. However, the descendant lineages are poorly resolved and show apparent homoplasy (fig. S8). In a phylogeny constructed from characters with well-defined polarity, homoplasy would not be expected unless a species radiation involved substantial ILS and/or gene flow between divergent lineages (30).
Next, we analyzed differences in evolutionary history among different segments across the baboon genome. The reference genome was divided into 808 discrete non-genic (putatively neutral) regions. Using BUCKy (31) and the SNV genotypes from the diversity panel, we performed Bayesian concordance analysis (BCA). Individual animals again grouped by species, as expected. The basal north-south divergence is supported, but the concordance factors (CFs) for relationships within each of these two geographic clades are low (Fig. 3). P. hamadryas is most frequently sister to an anubis-papio clade, but the other two possible topologies [(papio-(ham-anubis)) and (anubis-(ham-papio))] do not appear at equal frequency (fig. S9), as would be expected under ILS alone. Similarly, P. kindae is most frequently sister to a cynocephalus-ursinus clade, which matches the ML results but not the f-statistics or PoMo results. Again, the two minor BCA topologies do not occur in equal proportions (fig. S9). Together, the Alu insertion and BCA results support the conclusion that reticulation, and not just ILS without reticulation, has influenced baboon genomic divergence (Table 1).
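The reasoning about unequal minor topologies can be made concrete with a small test. Under ILS alone, the two minor topologies for a three-taxon group are expected to be equally frequent, so a marked imbalance points toward gene flow. The sketch below is a simplification rather than the BUCKy analysis itself: it assumes each region can be assigned to a single topology and that regions are independent, and the function name and counts are hypothetical.

```python
from math import comb


def minor_topology_asymmetry_p(count_minor_1, count_minor_2):
    """Two-sided exact binomial test of a 50:50 split between the two
    minor topologies, the expectation under ILS without gene flow."""
    n = count_minor_1 + count_minor_2
    k = max(count_minor_1, count_minor_2)
    # Upper tail P(X >= k) for X ~ Binomial(n, 0.5), using exact integers.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail and cap at 1.
    return min(1.0, 2 * upper_tail)


# Hypothetical counts of regions supporting each minor topology.
p_balanced = minor_topology_asymmetry_p(10, 10)  # 1.0: no evidence of asymmetry
p_skewed = minor_topology_asymmetry_p(20, 0)     # 2**-19: strong asymmetry
```

A small p-value here only rejects symmetric ILS; it does not by itself identify which lineages exchanged genes or when.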
Table 1. Summary of the data types and analytical approaches used to investigate the phylogeny of baboon species.
The timing of lineage divergence and admixture events was estimated using a coalescent hidden Markov model (CoalHMM; figs. S10 to S15 and tables S8 and S9) (32, 33). Using an estimated mutation rate of 0.9 × 10⁻⁸ per base pair per generation and a generation time of 11 years [see Materials and Methods and (11, 34)], we obtained the results presented in Fig. 4A. To reconstruct demographic history, we generated pairwise sequentially Markovian coalescent (PSMC) plots (35), assuming the generation time and mutation rate given above (Fig. 4B). With the exception of P. papio, which has a truncated plot, the remaining five species show very similar effective population sizes (Ne) from 4 Ma until ~1.4 Ma, supporting the conclusion that all baboon species share the same demographic history (that is, they were effectively a single lineage) before ~1.4 Ma. All Ne plots show an upward trend after ~1.5 Ma, but species-specific increases occur at different rates, possibly corresponding to population growth and dispersal once ecological conditions allowed demographic expansion (14). Given the paleontological evidence for a southern origin of this genus (36), we speculate that the apparently more pronounced decline in Ne for northern clade species relative to southern lineages, about 700,000 to 800,000 years ago, may reflect dispersal-related bottlenecks as the geographic range of baboons expanded northward. Similarly, CoalHMM suggests that the north-south admixture that produced extant P. kindae occurred about 100,000 years ago, and the PSMC results suggest an increase in Ne for P. kindae at about that time.
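The dates above depend directly on the assumed mutation rate and generation time, so it can help to see the scaling arithmetic written out. The sketch below uses the two values stated in the text and standard population-genetic conversions; the function names are hypothetical, and the divergence conversion ignores ancestral polymorphism, which CoalHMM models explicitly.

```python
MUTATION_RATE = 0.9e-8       # per base pair per generation (value from the text)
GENERATION_TIME_YEARS = 11   # years per generation (value from the text)


def years_to_generations(years):
    """Convert calendar years to generations."""
    return years / GENERATION_TIME_YEARS


def divergence_to_years(pairwise_divergence):
    """Split time in years from per-site divergence d = 2 * mu * T."""
    generations = pairwise_divergence / (2 * MUTATION_RATE)
    return generations * GENERATION_TIME_YEARS


def theta_to_ne(theta):
    """Effective population size from per-site diversity theta = 4 * Ne * mu."""
    return theta / (4 * MUTATION_RATE)


# The ~1.4 Ma point at which the PSMC curves begin to separate corresponds
# to roughly 127,000 generations under these assumptions.
generations_at_split = years_to_generations(1.4e6)
```

Because both parameters enter linearly, a different mutation rate or generation time rescales every date and Ne estimate by the same factor without changing the shape of the curves.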
To examine the possible functional consequences of baboon admixture, we investigated 2201 suitable genic regions (local genomic segments that each contain one annotated protein-coding gene and show sufficient phylogenetic signal to support one particular phylogenetic tree over all alternative trees). We identified individual loci that show phylogenetic relationships (gene trees) that are either concordant or discordant with the consensus species-level phylogeny, which separates the three northern species from the three southern species. Cluster 1 contains 1143 genic regions with phylogenies that closely match this result (fig. S16). Cluster 2 consists of 629 genic regions in which P. cynocephalus carries haplotypes that are not closely related to other southern clade haplotypes (fig. S17). The genes in these regions are enriched for the gene ontology (GO) terms "learning and memory" (P = 0.012), "cognition" (P = 0.012), "head development" (P = 0.014), and "brain development" (P = 0.017), as well as several GO categories related to reproduction (table S10). Cluster 3 includes 429 genic regions that show phylogenetic relationships among southern clade species consistent with the phylogeny in Fig. 3A. However, the cluster 3 haplotypes from the northern clade species P. anubis are more closely related to southern clade haplotypes, whereas haplotypes from the northern clade species P. papio generally form the sister to all other baboon haplotypes (fig. S18). The genes found in cluster 3 regions are enriched for GO terms related to the ontogenetic development of various organ systems (kidney, heart, circulatory, and endocrine systems, all significantly enriched with P < 0.03) (table S10). We note that the two species showing the clearest discrepancies between gene trees and the species-level phylogeny (that is, the species carrying haplotypes that apparently cross species boundaries) are P. anubis and P. cynocephalus, a northern clade and a southern clade species, respectively, which actively hybridize in southern Kenya (17) and show evidence of nuclear swamping (12).
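One common way to score GO term enrichment in a gene cluster is a one-sided hypergeometric test, sketched below. The text does not state which enrichment test was used, so this is an assumption for illustration only: the function name and the annotation counts are hypothetical, and only the region counts (2201 in total, split into clusters of 1143, 629, and 429) come from the text. No multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_cluster, n_overlap):
    """P(X >= n_overlap) when n_cluster genes are drawn without replacement
    from n_background genes, of which n_annotated carry the GO term."""
    largest_overlap = min(n_annotated, n_cluster)
    total_draws = comb(n_background, n_cluster)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_cluster - k)
        for k in range(n_overlap, largest_overlap + 1)
    )
    return tail / total_draws


# Region counts reported in the text; the three clusters partition the set.
CLUSTER_SIZES = {"cluster_1": 1143, "cluster_2": 629, "cluster_3": 429}
N_REGIONS = sum(CLUSTER_SIZES.values())  # 2201

# Hypothetical annotation counts for one GO term (not values from table S10).
p_value = hypergeom_enrichment_p(
    n_background=N_REGIONS,
    n_annotated=40,
    n_cluster=CLUSTER_SIZES["cluster_2"],
    n_overlap=20,
)
```

With many GO terms tested at once, the resulting p-values would normally be adjusted, for example with a false discovery rate procedure.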
Acknowledgments: We acknowledge the contributions of the sequence production staff at the Human Genome Sequencing Center: K. A. Abraham, H. A. Akbar, S. A. Ali, U. A. Anosike, P. A. Aqrawi, F. A. Arias, T. A. Attaway, R. A. Awwad, C. B. Babu, D. B. Bandaranaike, P. B. Battles, A. B. Bell, B. B. Beltran, D. B. Berhane-Mersha, C. B. Bess, C. B. Bickham, T. B. Bolden, K. Cardenas, K. C. Carter, M. Cavazos, A. Chandrabose, S. Chao, D. C. Chau, A. C. Chávez, R. Chu, K. C. Clerc-Blankenburg, A. Cockrell, M. C. Coyle, A. Cree, M. D. Dao, M. L. Davila, L. D. Davy-Carroll, S. D. Denson, S. Dugan, V. Ebong, S. Elkadiri, S. F. Fernandez, P. F. Fernando, N. Flagg, L. F. Forbes, G. Fowler, C. F. Francis, L. F. Francisco, Q. F. Fu, R. Gabisi, R. G. Garcia, T. Garner, T. G. Garrett, S. G. Gross, S. G. Gubbala, K. Hawkins, B. Hernandez, K. H. Hirani, M. H. Hogues, B. H. Hollins, L. J. Jackson, M. J. Javaid, J. C. Jayaseelan, A. J. Johnson, B. J. Johnson, J. J. Jones, V. J. Joshi, D. Kalra, J. K. Kalu, N. K. Khan, L. K Isamo, L. L. Lago, Y. Lai, F. L. Lara, T.-K. Le, F. L. Legall-Iii, S. L. Lemon, L. Lewis, J. L. Liu, Y.-S. Liu, D. L. Liyanage, P. London, J. L. Lopez, L. L. Lorensuhewa, E. Martinez, R. M. Mata, T. M. Mathew, T. Matskevitch, C. M. Mercado, I. M. Mercado, K. M. Morales, M. M. Morgan, M. M. Munidasa, L. N. Nazareth, I. N. Newsham, D. N. Ngo, L. N. Nguyen, P. Nguyen, T. N. Nguyen, N. N. Nguyen, M. Nwaokelemeh, M. O. Obregon, G. O. Okwuonu, F. O. Ongeri, C. O. Onwere, I. O. Osifeso, A. P. Parra, S. P. Patil, A. P. Perez, Y. P. Pérez, C. P. Pham, E. Primus, L.-L. Pu, M. P. Puazo, J. Q. Quiroz, S. Richards, J. R. Rouhana, M. R. Ruiz, S.-J. Ruiz, N. S. Saada, J. S. Santibanez, M. S. Scheel, S. Scherer, B. S. Schneider, D. S. Simmons, I. S. Sisson, E. S. Skinner, N. Tabassum, L.-Y. Tang, A. Taylor, R. T. Thornton, J. T. Tisius, G. T. Toledanes, Z. T. Trejos, K. U. Usmani, R. V. Varghese, S. V. Vattathil, V. V. Vee, D. W. Walker, G. W. Weissenberger, C. W. White, K. Wilczek-Boney, A. W. Williams, K. Wilson, I. Woghiren, J. W. Woodworth, R. W. Wright, Y.-Q. Wu, Y. Xin, Y. Zhang, Y. Z. Zhu, and X. Zou. Biomaterials for DNA sequencing of the reference P. anubis baboon and of several members of the diversity panel were provided by the Southwest National Primate Research Center, San Antonio, TX, with support from an NIH Office of Research Infrastructure Programs grant (P51-OD011133). The research reported here complies with governmental and IACUC regulations and guidelines. J.R. is also affiliated with the Wisconsin National Primate Research Center, Madison, WI. C.K. is also affiliated with the Institut für Populationsgenetik, Vetmeduni Vienna, Austria, and D.S. is now affiliated with Eötvös Loránd University in Budapest and Max Perutz Laboratories in Vienna. Funding: The sequencing and analysis activities at the Human Genome Sequencing Center, Baylor College of Medicine, were supported by NIH (NHGRI) grants U54-HG003273 and U54-HG006484 to R.A.G. and by the GAC 1 S10 RR026605 grant to J. G. Reid. This research was also supported by NIH grant R01-GM59290 to M.A.B.; Austrian Science Fund grants (FWF-P24551 and FWF-W1225) and a Vienna Science and Technology Fund grant (WWTF-MA16-061) to C.K.; Wellcome Trust (WT108749/Z/15/Z) and EMBL grants to B.A., F.J.M., and M.M.; grants VEGA 1/0719/14 and APVV-14-0253 to T. Vinar (consortium member); a MINECO/ERDF grant, NIH grant U01-MH106874, a Howard Hughes International Career Award, and an Obra Social "La Caixa" award to T.M.-B.; NSF grants BNS83-03506 to J.P.-C., NSF1029302 to J.P.-C., J.R., and C.J.J., and BNS96-15150 to J.P.-C., C.J.J., and T.D.; and National Geographic Society and Leakey Foundation grants to J.P.-C. and C.J.J. E.E.E. is an investigator of the Howard Hughes Medical Institute. This work was supported, in part, by U.S. NIH grant HG002385 to E.E.E. Competing interests: The authors declare that they have no competing interests.
Data and materials availability: Raw read data, sample metadata, and other data from this genome assembly project are available under BioProject PRJNA260523 at www.ncbi.nlm.nih.gov. Additional information about the RNA sequencing data is available from the Nonhuman Primate Reference Transcriptome Resource (http://nhprpr.org/). SNV and indel variation can be viewed as a UCSC browser track (https://hgsc.bcm.edu/non-human-primates/baboon-genome-project). Additional data related to this paper may be requested from the authors.
Full membership of the Baboon Genome Analysis Consortium: Bronwen Aken1, Nicoletta Archidiacono2, Georgios Athanasiadis3, Mark A. Batzer4, Thomas O. Beckstrom4, Christina Bergey5,6, Konstantinos Billis1, Andrew Burrell5, Oronzo Capozzi2, Claudia R. Catacchio2, Jade Cheng3, Laura A. Cox7,8, Huyen H. Dinh9, Todd Disotell5, HarshaVardhan Doddapaneni9, Evan E. Eichler10,11, James Else12, Richard A. Gibbs9,13, Matthew W. Hahn14, Yi Han9, R. Alan Harris9,13, John Huddleston10, Shalini N. Jhangiani9, Clifford J. Jolly5, Vallmer E. Jordan4, Anis Karimpour-Fard15, Miriam K. Konkel32, Gisela H. Kopp16,17, Viktoriya Korchina9, Carolin Kosiol18, Maximillian Kothe19, Christie L. Kovar9, Lukas Kuderna20, Sandra L. Lee9, Kalle Leppälä3, Xiaoming Liu21, Yue Liu9, Thomas Mailund3, Tomas Marques-Bonet20,22,23,33, Alessia Marra-Campanale2, Fergal J. Martin1, Christopher E. Mason24, Marc de Manuel Montero20, Matthieu Muffato1, Kasper Munch3, Shwetha Murali9, Donna M. Muzny9,13, Angela Noll19, Kymberleigh A. Pagel25, Antonio Palazzo2, Jera Pecotte7, Vikas Pejaver25, Jane Phillips-Conroy26, Lenore Pipes24, Veronica Searles Quick15, Predrag Radivojac25, Archana Raja10, Brian J. Raney27, Muthuswamy Raveendran9, Karen Rice7, Mariano Rocchi2, Jeffrey Rogers9,13, Christian Roos19, Mikkel Heide Schierup3, Dominik Schrempf28, James M. Sikela15, Roscoe Stanyon29, Cody J. Steely4, Gregg W. C. Thomas14, Jenny Tung30, Mario Ventura2, Tauras P. Vilgalys30, Tomás Vinar31, Jerilyn A. Walker4, Lutz Walter19, Kim C. Worley9,13, and Dietmar Zinner16.
1European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, United Kingdom. 2Department of Biology, University of Bari, Bari, Italy. 3Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark. 4Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA. 5Department of Anthropology, New York University, New York, NY, USA. 6Department of Biological Sciences, University of Notre Dame, South Bend, IN, USA. 7Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA. 8Department of Genetics, Texas Biomedical Research Institute, San Antonio, TX, USA. 9Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA. 10Department of Genome Sciences, University of Washington, Seattle, WA, USA. 11Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA. 12Department of Pathology and Laboratory Medicine and Yerkes Primate Research Center, Emory University, Atlanta, GA, USA. 13Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA. 14Department of Biology, Indiana University, Bloomington, IN, USA. 15Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Denver, CO, USA. 16Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany. 17Department of Biology, University of Konstanz, Konstanz, Germany. 18Centre for Biological Diversity, School of Biology, University of St. Andrews, St. Andrews, United Kingdom. 19Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany. 20Institute of Evolutionary Biology (UPF-CSIC), PRBB, Barcelona, Spain. 21School of Public Health, University of Texas Health Science Center, Houston, TX, USA.
22Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain. 23CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain. 24Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY, USA, and Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA. 25Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA. 26Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, USA, and Department of Anthropology, University of Washington, Seattle, WA, USA. 27Genomics Institute, University of California, Santa Cruz, CA, USA. 28Institut für Populationsgenetik, Veterinärmedizinische Universität Wien, Vienna, Austria. 29Department of Biology, University of Florence, Florence, Italy. 30Department of Evolutionary Anthropology, Duke University, Durham, NC, USA. 31Faculty of Mathematics, Physics and Informatics, Comenius University, Bratislava, Slovakia. 32Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA. 33Catalan Institute of Palaeontology Miquel Crusafont, Autonomous University of Barcelona, Barcelona, Spain.
Contributions of the consortium members: Designed the study and supervised the analysis: J. Rogers*, K. C. Worley, and R. A. Gibbs. Managed or supervised sequence production: D. M. Muzny*, C. L. Kovar, H. H. Dinh, and Y. Han. Managed or supervised sequencing library preparation: H. Doddapaneni*, S. Lee, and D. M. Muzny. Produced the assembly: K. C. Worley*, Y. Liu, S. Murali, and R. A. Harris. Project and data management: D. M. Muzny*, M. Raveendran, R. A. Harris, K. C. Worley, S. N. Jhangiani, V. Korchina, and C. Kovar. Genome annotation: B. Aken*, F. J. Martin, M. Muffato, K. Billis, and X. Liu. Alu repeat analysis: M. A. Batzer*, J. A. Walker, M. K. Konkel, V. E. Jordan, C. J. Steely, and T. O. Beckstrom. SNV and indel analysis: R. A. Harris and M. Raveendran. Admixture and phylogenetic analysis: T. Mailund, M. H. Schierup, K. Leppälä, J. Cheng, K. Munch, and G. Athanasiadis. Phylogenetic and population analysis: C. Bergey, A. Burrell, A. Noll, D. Schrempf, C. Kosiol, G. H. Kopp, G. Athanasiadis, K. Munch, J. Phillips-Conroy, M. Kothe, T. Disotell, J. Tung, J. Rogers, C. J. Jolly, D. Zinner, and C. Roos. Cytogenetics and assembly validation: M. Rocchi*, R. Stanyon, E. E. Eichler, N. Archidiacono, A. Palazzo, and O. Capozzi. Gene family analysis: M. W. Hahn*, J. Sikela*, G. W. C. Thomas, V. Searles Quick, A. Karimpour-Fard, and L. Walter. Methylation analysis: J. Tung* and T. P. Vilgalys. Positive selection analysis: C. Kosiol*, T. Vinar*, and B. J. Raney. Post-translational modifications: P. Radivojac*, K. A. Pagel, and V. Pejaver. Segmental duplication analysis: E. E. Eichler*, M. Ventura, A. Raja, C. Catacchio, A. Marra-Campanale, and J. Huddleston. Copy number variation: T. Marques-Bonet*, L. Kuderna, and M. d. M. Montero. Transcriptome analysis: C. E. Mason* and L. Pipes. Provided essential biomaterials: K. Rice, J. Pecotte, J. Phillips-Conroy, C. J. Jolly, J. Rogers, J. Else, and L. A. Cox. Provided text and/or data: D. Zinner, C. Roos, T. Mailund, K. Leppälä, E. Eichler, G. Athanasiadis, J. Cheng, K. Munch, C. Kosiol, C. Bergey, A. Burrell, M. K. Konkel, J. A. Walker, M. Batzer, and J. Tung. Wrote the paper: J. Rogers*, C. J. Jolly, J. Tung, M. Hahn, D. Zinner, C. Roos, T. Marques-Bonet, and K. C. Worley. *Group leader.