COMPUTATIONAL EVOLUTIONARY GENOMICS

Group leader: Jaime Huerta-Cepas - Researcher INIA

913364556 / 913364556 (Lab B30A)


Lab Web Page: http://compgenomics.org

Research description:

The Computational Evolutionary Genomics group at CBGP develops comparative (meta-)genomic methods to decipher what makes each organism and microbial ecosystem unique. In particular, we use phylogenomic techniques to study processes such as gene sub- and neo-functionalization, duplication and horizontal transfer, as well as domain conservation and orthology detection. At the metagenomic scale, we are interested in the functional characterization of microbial communities as a whole, aiming at the identification of functional modules associated with environmental or host conditions. For this, we combine theoretical knowledge in evolutionary biology, sequencing data, and high-performance computing resources.

Research lines:

Comparative metagenomics

We analyze shotgun metagenomics data (soil, ocean, gut, etc.) to identify functional modules within microbial communities that might differentiate sample or environmental conditions. We are particularly interested in exploring the unknown fraction of those data (i.e. sequences with no homologs), currently accounting for 20-50% of the sequenced genes and transcripts. Our ultimate goals are i) understanding the interactions of microbial communities with their environments, ii) identifying functional modules that can serve as predictors of specific environmental conditions (Fig. 1), and iii) discovering novel gene functions with potential applications in biotechnology (e.g. novel enzymes).
 

Fig. 1. Correlation between nitrogen concentration and relative abundance of a novel gene family found in ocean metagenomic samples.
 

Phylogenetic diversity within microbial communities

Metagenomics data are incomplete, noisy and challenging for classic evolutionary analysis. We aim to gain better insight into microbial (prokaryotic and eukaryotic) biodiversity, and to implement bioinformatic methods for identifying pathogenic organisms relevant to both agricultural environments and human health. To do so, we implement and apply phylogenetic methods for taxonomic identification of metagenomic species (Fig. 2), integration of pan-genomic data, and strain resolution.
 

Fig. 2. Phylogenetic tree based on marker genes from ~7000 prokaryotic organisms, including known and unknown metagenomic species. Different colors identify distinct clades. The phylogenetic distribution of several ecological traits is shown in the outer circles, some of them correlating with specific taxonomic clades.
 

Evolution at the gene family level

We are interested in different aspects of gene family evolution, such as dating the emergence of specific functions, studying gene duplication, identifying horizontal gene transfers, or characterizing gene fusion events. We specialize in large-scale phylogenomic analysis, where hundreds of genomes can be compared at once. We apply those techniques to gain insights into the evolution of gene function and its practical application in establishing genotype-phenotype associations in plant breeding programs.
 

Fig. 3. Examples of different evolutionary analyses at the gene family level. A) Gene phylogeny showing conserved regions in the protein alignment. B) Adaptation test detecting branch and site selection pressure in gene family evolution. C) Species tree where rates of gene duplication (blue bubbles) are shown for each internal lineage. D) Orthology prediction using phylogenetic and domain analysis.
 

Phylogenomic Methods and Tools

We develop functional prediction methods, metagenomic analysis frameworks, orthology resources and genomic databases. These tools grow out of our own needs, but we also work on providing open-source implementations for the research community.
 

Representative Publications

Bahram, M; Hildebrand, F; Forslund, SK; Anderson, JL; Soudzilovskaia, NA; Bodegom, PM; Bengtsson-Palme, J; Anslan, S; Coelho, LP; Harend, H; Huerta-Cepas, J; Medema, MH; Maltz, MR; Mundra, S; Olsson, PA; Pent, M; Põlme, S; Sunagawa, S; Ryberg, M; Tedersoo, L; Bork, P. 2018. "Structure and function of the global topsoil microbiome". Nature. DOI: 10.1038/s41586-018-0386-6.

Forslund, K; Pereira, C; Capella-Gutierrez, S; Da Silva, AS; Altenhoff, A; Huerta-Cepas, J; Muffato, M; Patricio, M; Vandepoele, K; Ebersberger, I; Blake, J; Fernández Breis, JT; Boeckmann, B; Gabaldón, T; Sonnhammer, E; Dessimoz, C; Lewis, S. 2018. "Gearing up to handle the mosaic nature of life in the quest for orthologs". Bioinformatics. DOI: 10.1093/bioinformatics/btx542.

Mende, DR; Letunic, I; Huerta-Cepas, J; Li, SS; Forslund, K; Sunagawa, S; Bork, P. 2017. "ProGenomes: A resource for consistent functional and taxonomic annotations of prokaryotic genomes". Nucleic Acids Research. DOI: 10.1093/nar/gkw989.

Jouhten, P; Huerta-Cepas, J; Bork, P; Patil, KR. 2017. "Metabolic anchor reactions for robust biorefining". Metabolic Engineering. DOI: 10.1016/j.ymben.2017.02.010.

Huerta-Cepas, J; Forslund, K; Coelho, LP; Szklarczyk, D; Jensen, LJ; Von Mering, C; Bork, P. 2017. "Fast genome-wide functional annotation through orthology assignment by eggNOG-mapper". Molecular Biology and Evolution. DOI: 10.1093/molbev/msx148.

Czech, L; Huerta-Cepas, J; Stamatakis, A. 2017. "A critical review on the use of support values in tree viewers and bioinformatics toolkits". Molecular Biology and Evolution. DOI: 10.1093/molbev/msx055.

Costea, PI; Coelho, LP; Sunagawa, S; Munch, R; Huerta-Cepas, J; Forslund, K; Hildebrand, F; Kushugulova, A; Zeller, G; Bork, P. 2017. "Subspecies in the global human gut microbiome". Molecular Systems Biology. DOI: 10.15252/msb.20177589.

Li, SS; Zhu, A; Benes, V; Costea, PI; Hercog, R; Hildebrand, F; Huerta-Cepas, J; Nieuwdorp, M; Salojärvi, J; Voigt, AY; Zeller, G; Sunagawa, S; De Vos, WM; Bork, P. 2016. "Durable coexistence of donor and recipient strains after fecal microbiota transplantation". Science. DOI: 10.1126/science.aad8852.

Kultima, JR; Coelho, LP; Forslund, K; Huerta-Cepas, J; Li, SS; Driessen, M; Voigt, AY; Zeller, G; Sunagawa, S; Bork, P. 2016. "MOCAT2: A metagenomic assembly, annotation and profiling framework". Bioinformatics. DOI: 10.1093/bioinformatics/btw183.

Huerta-Cepas, J; Szklarczyk, D; Forslund, K; Cook, H; Heller, D; Walter, MC; Rattei, T; Mende, DR; Sunagawa, S; Kuhn, M; Jensen, LJ; Von Mering, C; Bork, P. 2016. "eggNOG 4.5: A hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences". Nucleic Acids Research. DOI: 10.1093/nar/gkv1248.

Huerta-Cepas, J; Serra, F; Bork, P. 2016. "ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data". Molecular Biology and Evolution. DOI: 10.1093/molbev/msw046.

Altenhoff, AM; Boeckmann, B; Capella-Gutierrez, S; Dalquen, DA; DeLuca, T; Forslund, K; Huerta-Cepas, J; Linard, B; Pereira, C; Pryszcz, LP; Schreiber, F; Da Silva, AS; Szklarczyk, D; Train, CM; Bork, P; Lecompte, O; Von Mering, C; Xenarios, I; Sjölander, K; Jensen, LJ; Martin, MJ; Muffato, M; Gabaldón, T; Lewis, SE; Thomas, PD; Sonnhammer, E; Dessimoz, C. 2016. "Standardized benchmarking in the quest for orthologs". Nature Methods. DOI: 10.1038/nmeth.3830.

Szklarczyk, D; Franceschini, A; Wyder, S; Forslund, K; Heller, D; Huerta-Cepas, J; Simonovic, M; Roth, A; Santos, A; Tsafou, KP; Kuhn, M; Bork, P; Jensen, LJ; Von Mering, C. 2015. "STRING v10: Protein-protein interaction networks, integrated over the tree of life". Nucleic Acids Research. DOI: 10.1093/nar/gku1003.

Minguez, P; Letunic, I; Parca, L; Garcia-Alonso, L; Dopazo, J; Huerta-Cepas, J; Bork, P. 2015. "PTMcode v2: A resource for functional associations of post-translational modifications within and between proteins". Nucleic Acids Research. DOI: 10.1093/nar/gku1081.

Djuika, CF; Huerta-Cepas, J; Przyborski, JM; Deil, S; Sanchez, CP; Doerks, T; Bork, P; Lanzer, M; Deponte, M. 2015. "Prokaryotic ancestry and gene fusion of a dual localized peroxiredoxin in malaria parasites". Microbial Cell. DOI: 10.15698/mic2015.01.182.

Boeckmann, B; Marcet-Houben, M; Rees, JA; Forslund, K; Huerta-Cepas, J; Muffato, M; Yilmaz, P; Xenarios, I; Bork, P; Lewis, SE; Gabaldón, T. 2015. "Quest for Orthologs Entails Quest for Tree of Life: In Search of the Gene Stream". Genome Biology and Evolution. DOI: 10.1093/gbe/evv121.

Powell, S; Forslund, K; Szklarczyk, D; Trachana, K; Roth, A; Huerta-Cepas, J; Gabaldón, T; Rattei, T; Creevey, C; Kuhn, M; Jensen, LJ; Von Mering, C; Bork, P. 2014. "EggNOG v4.0: Nested orthology inference across 3686 organisms". Nucleic Acids Research. DOI: 10.1093/nar/gkt1253.

Morente, V; Pérez-Sen, R; Ortega, F; Huerta-Cepas, J; Delicado, EG; Miras-Portugal, MT. 2014. "Neuroprotection elicited by P2Y13 receptors against genotoxic stress by inducing DUSP2 expression and MAPK signaling recovery". Biochimica et Biophysica Acta - Molecular Cell Research. DOI: 10.1016/j.bbamcr.2014.05.004.

Jarvis, ED; Mirarab, S; Aberer, AJ; Li, B; Houde, P; Li, C; Ho, SYW; Faircloth, BC; Nabholz, B; Howard, JT; Suh, A; Weber, CC; Da Fonseca, RR; Li, J; Zhang, F; Li, H; Zhou, L; Narula, N; Liu, L; Ganapathy, G; Boussau, B; Bayzid, MS; Zavidovych, V; Subramanian, S; Gabaldón, T; Capella-Gutiérrez, S; Huerta-Cepas, J; Rekepalli, B; Munch, K; Schierup, M; Lindow, B; Warren, WC; Ray, D; Green, RE; Bruford, MW; Zhan, X; Dixon, A; Li, S; Li, N; Huang, Y; Derryberry, EP; Bertelsen, MF; Sheldon, FH; Brumfield, RT; Mello, CV; Lovell, PV; Wirthlin, M; Schneider, MPC; Prosdocimi, F; Samaniego, JA; Velazquez, AMV; Alfaro-Núñez, A; Campos, PF; Petersen, B; Sicheritz-Ponten, T; Pas, A; Bailey, T; Scofield, P; Bunce, M; Lambert, DM; Zhou, Q; Perelman, P; Driskell, AC; Shapiro, B; Xiong, Z; Zeng, Y; Liu, S; Li, Z; Liu, B; Wu, K; Xiao, J; Yinqi, X; Zheng, Q; Zhang, Y; Yang, H; Wang, J; Smeds, L; Rheindt, FE; Braun, M; Fjeldsa, J; Orlando, L; Barker, FK; Jønsson, KA; Johnson, W; Koepfli, KP; O'Brien, S; Haussler, D; Ryder, OA; Rahbek, C; Willerslev, E; Graves, GR; Glenn, TC; McCormack, J; Burt, D; Ellegren, H; Alström, P; Edwards, SV; Stamatakis, A; Mindell, DP; Cracraft, J, et al. 2014. "Whole-genome analyses resolve early branches in the tree of life of modern birds". Science. DOI: 10.1126/science.1253451.

Huerta-Cepas, J; Marcet-Houben, M; Gabaldón, T. 2014. "A nested phylogenetic reconstruction approach provides scalable resolution in the eukaryotic Tree Of Life". PeerJ Preprints. DOI: 10.7287/peerj.preprints.223v1.

Huerta-Cepas, J; Capella-Gutiérrez, S; Pryszcz, LP; Marcet-Houben, M; Gabaldón, T. 2014. "PhylomeDB v4: Zooming into the plurality of evolutionary histories of a genome". Nucleic Acids Research. DOI: 10.1093/nar/gkt1177.

Bock, T; Chen, WH; Ori, A; Malik, N; Silva-Martin, N; Huerta-Cepas, J; Powell, ST; Kastritis, PL; Smyshlyaev, G; Vonkova, I; Kirkpatrick, J; Doerks, T; Nesme, L; Baßler, J; Kos, M; Hurt, E; Carlomagno, T; Gavin, AC; Barabas, O; Müller, CW; Noort, VV; Beck, M; Bork, P. 2014. "An integrated approach for genome annotation of the eukaryotic thermophile Chaetomium thermophilum". Nucleic Acids Research. DOI: 10.1093/nar/gku1147.

Jiménez-Guri, E; Huerta-Cepas, J; Cozzuto, L; Wotton, KR; Kang, H; Himmelbauer, H; Roma, G; Gabaldón, T; Jaeger, J. 2013. "Comparative transcriptomics of early dipteran development". BMC Genomics. DOI: 10.1186/1471-2164-14-123.

Huerta-Cepas, J; Dopazo, J; Huynen, MA; Gabaldón, T. 2011. "Evidence for short-time divergence and long-time conservation of tissue-specific expression after gene duplication". Briefings in Bioinformatics. DOI: 10.1093/bib/bbr022.

Huerta-Cepas, J; Gabaldón, T. 2011. "Assigning duplication events to relative temporal scales in genome-wide studies". Bioinformatics. DOI: 10.1093/bioinformatics/btq609.

Huerta-Cepas, J; Marcet-Houben, M; Pignatelli, M; Moya, A; Gabaldón, T. 2010. "The pea aphid phylome: A complete catalogue of evolutionary histories and arthropod orthology and paralogy relationships for Acyrthosiphon pisum genes". Insect Molecular Biology. DOI: 10.1111/j.1365-2583.2009.00947.x.

Huerta-Cepas, J; Dopazo, H; Dopazo, J; Gabaldón, T. 2007. "The human phylome". Genome Biology. DOI: 10.1186/gb-2007-8-6-r109.

 

Centre for Plant Biotechnology and Genomics UPM – INIA Parque Científico y Tecnológico de la U.P.M. Campus de Montegancedo
Autopista M-40, Km 38 - 28223 Pozuelo de Alarcón (Madrid) Tel.: +34 91 4524900 ext. 1806 / +34 91 3364539 Fax: +34 91 7157721.