BIOLOGICAL INFORMATICS

Group leader: Mark Wilkinson - Senior Researcher, Isaac Peral Programme

+34 913364592 / 914524900 Ext: 25550 (Bioinformatics Lab)

Personnel:

 

Visit our lab homepage for more details


 

Mad about Metadata
Meddling in Microbiomes
Peeking into Polyadenylation
 

FAIR Data - Findable, Accessible, Interoperable and Reusable

Our lab is a lead participant in the FAIR Data initiative. In addition to being lead authors of the FAIR Principles, we are lead authors of the first end-to-end implementation of those principles over an agriculturally relevant data source, and of a set of objective Metrics for measuring the FAIRness of a resource. We are also the lead laboratory creating software capable of autonomously executing FAIR evaluations and scoring resources based on the level of FAIRness they have achieved. In addition, we are exploring how these principles can be used to make science more transparent. When data and knowledge are FAIR, they become easier to find, and therefore easier to validate against prior biological knowledge and data. We examine how FAIR publication of scientific assertions might be automatically compared with similar assertions in the scholarly literature, providing a means both to explore the likelihood that a given assertion is true and to provide a richer collection of citations, ensuring that all relevant scholars are properly credited.
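As a rough illustration of what an automated FAIR evaluation might look like, the sketch below runs a few toy checks over a metadata record and reports a pass/fail result per principle plus an overall score. The record fields, the check functions, and the scoring scheme are all hypothetical assumptions made for this example; they are not the published Metrics or the lab's evaluation software.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical metadata record: the field names are assumptions for this sketch.
Record = Dict[str, str]

def has_resolvable_identifier(rec: Record) -> bool:
    """Findable: the identifier looks like a DOI or an HTTP(S) URI."""
    return rec.get("identifier", "").startswith(("https://", "http://", "doi:"))

def has_open_protocol(rec: Record) -> bool:
    """Accessible: the data can be retrieved over HTTP(S)."""
    return rec.get("access_url", "").startswith(("https://", "http://"))

def uses_formal_format(rec: Record) -> bool:
    """Interoperable: the data are offered in an RDF serialization."""
    rdf_formats = {"text/turtle", "application/ld+json", "application/rdf+xml"}
    return rec.get("format", "") in rdf_formats

def has_license(rec: Record) -> bool:
    """Reusable: a license is declared."""
    return bool(rec.get("license"))

# One toy check per principle; a real evaluator would run many more.
CHECKS: List[Tuple[str, Callable[[Record], bool]]] = [
    ("F", has_resolvable_identifier),
    ("A", has_open_protocol),
    ("I", uses_formal_format),
    ("R", has_license),
]

def evaluate(rec: Record) -> Dict[str, bool]:
    """Run every check and return the result keyed by principle letter."""
    return {principle: check(rec) for principle, check in CHECKS}

def score(results: Dict[str, bool]) -> float:
    """Fraction of checks passed, between 0.0 and 1.0."""
    return sum(results.values()) / len(results)

if __name__ == "__main__":
    example = {
        "identifier": "https://example.org/dataset/1",
        "access_url": "https://example.org/dataset/1/download",
        "format": "text/csv",
        "license": "CC-BY-4.0",
    }
    results = evaluate(example)
    print(results, score(results))
```

Keeping each check as a small independent function means new tests can be added to `CHECKS` without touching the scoring logic.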

On top of the natural complexity of biological data, the tools we use to analyze those data must also be FAIR. Our two technologies - SADI and SHARE - are the 'bridge' between FAIR and traditional bioinformatics analysis tools and pipelines.

 

 

 

 

Our SADI (Semantic Automated Discovery and Integration) project extends these core principles into the domain of analytical algorithms and tools. SADI requires that every analytical tool describe (using semantic technologies) the kinds of biological entities it is capable of analysing, and what inter-entity relationships it discovers as a result of its analysis. This allows machines to automatically match any given dataset with a set of tools capable of analysing that dataset to generate a biological relationship of interest to the researcher. Our SHARE (Semantic Health and Research Environment) project proposes that scientific hypotheses can be formally modeled using the OWL language. SHARE then automatically "pipelines" analytical tools together to generate a result dataset, attempting to find data that support the modeled hypothesis. SADI and SHARE are the components we are offering as partial solutions to the FARM Data Train and Personal Health Train initiatives, which are led by our collaborators in the Netherlands.
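The matching and pipelining ideas described above can be sketched in a few lines. This is a simplified illustration only: real SADI services describe their inputs and outputs with OWL classes and exchange RDF, whereas this sketch uses plain strings, and the service names, entity types, and relationships in the registry are invented for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Service:
    name: str
    input_type: str   # kind of entity the tool can analyse
    relation: str     # relationship the tool discovers about its input
    output_type: str  # kind of entity at the other end of that relationship

# Hypothetical registry; every entry is an assumption for illustration.
REGISTRY = [
    Service("gene2protein", "Gene", "encodes", "Protein"),
    Service("protein2domain", "Protein", "hasDomain", "Domain"),
    Service("protein2pathway", "Protein", "participatesIn", "Pathway"),
]

def match(input_type: str, relation: str) -> List[Service]:
    """Find services that accept this entity type and yield this relationship."""
    return [s for s in REGISTRY
            if s.input_type == input_type and s.relation == relation]

def plan(start_type: str, goal_relation: str) -> Optional[List[Service]]:
    """Breadth-first search for the shortest chain of services that starts
    from data of start_type and ends by producing goal_relation."""
    queue = deque([(start_type, [])])
    seen = {start_type}
    while queue:
        current, path = queue.popleft()
        for s in REGISTRY:
            if s.input_type != current:
                continue
            if s.relation == goal_relation:
                return path + [s]
            if s.output_type not in seen:
                seen.add(s.output_type)
                queue.append((s.output_type, path + [s]))
    return None  # no chain of registered services reaches the goal

if __name__ == "__main__":
    chain = plan("Gene", "participatesIn")
    print([s.name for s in chain] if chain else "no pipeline found")
```

Here `match` corresponds to the direct dataset-to-tool matching, while `plan` is a toy stand-in for automatic pipelining: it chains services whenever the output type of one is the input type of the next.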

 

 

 

MDPbiome - Artificial Intelligence applied to Microbiome Engineering

Our recently published algorithm MDPbiome is a methodology for guiding the evolution of a microbiome through successive perturbations, where the algorithm calculates the most likely response of the microbiome to each intervention. The latest studies on the dynamics of the microbiome highlight that it is currently not possible to predict the effect of a specific external perturbation on a complex microbial community. MDPbiome contributes to addressing this challenge by modeling the effect of perturbations in a microbiome over time as a Markov Decision Process (MDP). Given an initial microbial composition, in any ecological niche or cavity, MDPbiome suggests the sequence of external disturbances that will guide the microbiome towards an objective state, such as a healthier or more performant composition, while avoiding undesirable states, such as those associated with a pathology. The study demonstrates the flexibility of MDPbiome when applied to varying sets of longitudinal microbiome data where metadata on disturbances were known (knowledge that is not usually collected and/or published). Measures are also provided to evaluate the reliability and universality of the recommendations proposed by MDPbiome in each case. The potential of MDPbiome will grow in the coming years, as the availability of longitudinal microbiome datasets, and of the rich metadata associated with them, increases. Microbial communities associated with plants are also amenable to this approach: MDPbiome recommendations could improve plant health or nutrition, for example by optimizing soil fertility or proposing low-impact policies for developing sustainable agriculture.
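To make the Markov Decision Process framing concrete, the following minimal sketch estimates transition probabilities from observed (state, perturbation, next state) triples and then uses standard value iteration to recommend a perturbation for each state. It is not the published MDPbiome implementation: the state names, perturbations, observations, rewards, and discount factor are all toy assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Toy longitudinal observations: (state, perturbation, next state).
OBSERVATIONS: List[Tuple[str, str, str]] = [
    ("dysbiotic", "probiotic", "intermediate"),
    ("dysbiotic", "probiotic", "intermediate"),
    ("dysbiotic", "probiotic", "dysbiotic"),
    ("dysbiotic", "none", "dysbiotic"),
    ("intermediate", "probiotic", "healthy"),
    ("intermediate", "none", "dysbiotic"),
    ("intermediate", "none", "intermediate"),
    ("healthy", "none", "healthy"),
    ("healthy", "probiotic", "healthy"),
]

# Assumed reward for arriving in each state (the objective state scores highest).
REWARD: Dict[str, float] = {"dysbiotic": -1.0, "intermediate": 0.0, "healthy": 1.0}

def estimate_transitions(obs):
    """Maximum-likelihood estimate of P(next state | state, perturbation)."""
    counts = defaultdict(lambda: defaultdict(int))
    for s, a, s2 in obs:
        counts[(s, a)][s2] += 1
    return {sa: {s2: n / sum(nxt.values()) for s2, n in nxt.items()}
            for sa, nxt in counts.items()}

def value_iteration(P, reward, gamma=0.9, tol=1e-6):
    """Return state values and the best perturbation for each state."""
    states = sorted(reward)
    V = {s: 0.0 for s in states}

    def q(s, a):
        # Expected discounted return of applying perturbation a in state s.
        return sum(p * (reward[s2] + gamma * V[s2]) for s2, p in P[(s, a)].items())

    def actions(s):
        # Only perturbations that were actually observed in state s.
        return [a for (s0, a) in P if s0 == s]

    while True:
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in actions(s))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions(s), key=lambda a: q(s, a)) for s in states}
    return V, policy

if __name__ == "__main__":
    P = estimate_transitions(OBSERVATIONS)
    V, policy = value_iteration(P, REWARD)
    print(policy)
```

Because the transition model is estimated purely from counts, the recommendations are only as good as the perturbation metadata behind them, which is why richer longitudinal datasets matter for this kind of approach.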

 

 

 

Fungal Polyadenylation - evolution and relationship to virulence

Our recently published paper describes the association between alternative polyadenylation sites and fungal pathogenicity for the important rice and wheat pathogen Magnaporthe oryzae. The study was carried out in collaboration with the laboratory of Dr. Ane Sesma at the CBGP-UPM. The results show that variation of the 3' ends of mRNAs of the rice blast fungus Magnaporthe oryzae can alter the biology of the fungus. Alternative polyadenylation regulates the 3' UTR lengths of cellular RNAs, and consequently the presence of regulatory elements that can control rice infection pathways in the case of the phytopathogenic fungus Magnaporthe oryzae. Using a genome-wide sequencing approach to execute a global characterization of 3' mRNA ends, we have identified polyadenylation signals for the majority of expressed genes, including a large number of transcripts that could be alternatively polyadenylated. In particular, we examine how the non-coding 3' end of the messenger RNA of the 14-3-3 protein regulates M. oryzae virulence. We also describe how alternative polyadenylation can control turnover and translation rates of messenger RNAs involved in development and environmental adaptation in M. oryzae. Our characterization of the canonical polyadenylation signal in Magnaporthe provides useful information for enhancing genome annotations and for cross-species comparisons of polyadenylation sites (PAS) and PAS usage within the fungal kingdom and the tree of life.
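The link between poly(A) site choice, 3' UTR length, and the presence of regulatory elements can be shown with a small worked example. This is a sketch under simplifying assumptions: coordinates are transcript-relative on the plus strand, and the positions and element names are made up for illustration rather than taken from the M. oryzae data.

```python
from typing import Dict, List

def utr_isoforms(stop_codon_end: int,
                 polya_sites: List[int],
                 element_positions: Dict[str, int]) -> List[dict]:
    """For each poly(A) site downstream of the stop codon, report the
    resulting 3' UTR length and which regulatory elements it retains."""
    isoforms = []
    for site in sorted(polya_sites):
        if site <= stop_codon_end:
            continue  # a site inside the coding region gives no 3' UTR
        retained = sorted(name for name, pos in element_positions.items()
                          if stop_codon_end < pos <= site)
        isoforms.append({
            "polya_site": site,
            "utr_length": site - stop_codon_end,
            "elements": retained,
        })
    return isoforms

if __name__ == "__main__":
    # Hypothetical transcript: stop codon ends at 1000, two usable poly(A) sites.
    for iso in utr_isoforms(1000, [1150, 1600],
                            {"motif_A": 1100, "motif_B": 1400}):
        print(iso)
```

In this toy transcript the proximal site yields a short 3' UTR that loses `motif_B`, while the distal site keeps both elements, which is the kind of difference that can change how a transcript is regulated.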

 

Representative Publications

Marconi, M., Sesma, A., Rodríguez-Romero, J.L., Rosano González, M.L., Wilkinson, M.D. 2018. Genome-wide polyadenylation site mapping datasets in the rice blast fungus Magnaporthe oryzae. Scientific Data 5, 180271. DOI: 10.1038/sdata.2018.271

García-Jiménez, B; de la Rosa, T; Wilkinson, MD. 2018. "MDPbiome: microbiome engineering through prescriptive perturbations". Bioinformatics. DOI: 10.1093/bioinformatics/bty562

Rodríguez-Romero, J; Marconi, M; Ortega-Campayo, V; Demuez, M; Wilkinson, MD; Sesma, A. 2018. "Virulence- and signaling-associated genes display a preference for long 3′UTRs during rice infection and metabolic stress in the rice blast fungus". New Phytologist. DOI: 10.1111/nph.15405

Wilkinson, MD; Sansone, S-A; Schultes, E; Doorn, P; Bonino da Silva Santos, LO; Dumontier, M. 2018. "A design framework and exemplar metrics for FAIRness". Scientific Data. DOI: 10.1038/sdata.2018.118

Townend, GS; Ehrhart, F; Kranen, HJ; Wilkinson, M; Jacobsen, A; Roos, M; Willighagen, EL; Enckevort, D; Evelo, CT; Curfs, LMG. "MECP2 variation in Rett syndrome ‐ an overview of current coverage of genetic and phenotype data within existing databases". Human Mutation. DOI: 10.1002/humu.23542

Roos, M; López Martin, E; Wilkinson, MD. 2017. "Preparing Data at the Source to Foster Interoperability across Rare Disease Resources", p. 165-179. In M. Posada de la Paz, D. Taruscio, and S. C. Groft (eds.), Rare Diseases Epidemiology: Update and Overview. Springer International Publishing, Cham. DOI: 10.1007/978-3-319-67144-4_9

Illana, A; Marconi, M; Rodríguez-Romero, J; Xu, P; Dalmay, T; Wilkinson, MD; Ayllón, MÁ; Sesma, A. 2017. "Molecular characterization of a novel ssRNA ourmia-like virus from the rice blast fungus Magnaporthe oryzae". Archives of Virology. DOI: 10.1007/s00705-016-3144-9

Wilkinson, MD; Verborgh, R; Bonino da Silva Santos, LO; Clark, T; Swertz, MA; Kelpin, FDL; Gray, AJG; Schultes, EA; van Mulligen, EM; Ciccarese, P; Kuzniar, A; Gavai, A; Thompson, M; Kaliyaperumal, R; Bolleman, JT; Dumontier, M. 2017. "Interoperability and FAIRness through a novel combination of Web technologies". PeerJ Computer Science. DOI: 10.7717/peerj-cs.110

Mons, B; Neylon, C; Velterop, J; Dumontier, M; da Silva Santos, LOB; Wilkinson, MD. 2017. "Cloudy, increasingly FAIR; revisiting the FAIR Data guiding principles for the European Open Science Cloud". Information Services & Use. DOI: 10.3233/isu-170824

Rodríguez Iglesias, A; Rodríguez González, A; Irvine, AG; Sesma, A; Urban, M; Hammond-Kosack, KE; Wilkinson, MD. 2016. "Publishing FAIR Data: an exemplar methodology utilizing PHI-base". Frontiers in Plant Science. DOI: 10.3389/fpls.2016.00641

Wilkinson, MD; Dumontier, M; Aalbersberg, IJ; Appleton, G; Axton, M; Baak, A; Blomberg, N; Boiten, J-W; da Silva Santos, LB; Bourne, PE; Bouwman, J; Brookes, AJ; Clark, T; Crosas, M; Dillo, I; Dumon, O; Edmunds, S; Evelo, CT; Finkers, R; Gonzalez-Beltran, A; Gray, AJG; Groth, P; Goble, C; Grethe, JS; Heringa, J; ’t Hoen, PAC; Hooft, R; Kuhn, T; Kok, R; Kok, J; Lusher, SJ; Martone, ME; Mons, A; Packer, AL; Persson, B; Rocca-Serra, P; Roos, M; van Schaik, R; Sansone, S-A; Schultes, E; Sengstag, T; Slater, T; Strawn, G; Swertz, MA; Thompson, M; van der Lei, J; van Mulligen, E; Velterop, J; Waagmeester, A; Wittenburg, P; Wolstencroft, K; Zhao, J; Mons, B. 2016. "The FAIR Guiding Principles for scientific data management and stewardship". Scientific Data. DOI: 10.1038/sdata.2016.18

Aranguren, ME; Wilkinson, MD. 2015. "Enhanced reproducibility of SADI web service workflows with Galaxy and Docker". GigaScience. DOI: 10.1186/s13742-015-0092-3

Nakada, T; Boyd, JH; Russell, JA; Aguirre-Hernández, R; Wilkinson, MD; Thair, SA; Nakada, E; McConechy, MK; Fjell, CD; Walley, KR. 2015. "VPS13D gene variant is associated with altered IL-6 production and mortality in septic shock". Journal of Innate Immunity. DOI: 10.1159/000381265

Pawluczyk, M; Weiss, J; Links, MG; Egaña Aranguren, M; Wilkinson, MD; Egea-Cortines, M. 2015. "Quantitative evaluation of bias in PCR amplification and next-generation sequencing derived from metabarcoding samples". Analytical and Bioanalytical Chemistry. DOI: 10.1007/s00216-014-8435-y

Marconi, M; Rodriguez-Romero, J; Sesma, A; Wilkinson, MD. 2014. "Bioinformatics tools for Next-Generation RNA sequencing analysis", p. 371-391. In A. Sesma and T. von der Haar (eds.), Fungal RNA Biology. Springer International Publishing Switzerland. DOI: 10.1007/978-3-319-05687-6_15

Katayama T; Wilkinson M; Aoki-Kinoshita K; Kawashima S; Yamamoto Y; Yamaguchi A; Okamoto S; Kawano S; Kim J-D; Wang Y; Wu H; Kano Y; Ono H; Bono H; Kocbek S; Aerts J; Akune Y; Antezana E; Arakawa K; Aranda B; Baran J; Bolleman J; Bonnal R; Buttigieg P; Campbell M; Chen Y; Chiba H; Cock P; Cohen K; Constantin A. 2014. "BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains". J. Biomed. Semantics 5:5.

Dumontier, M; Baker, C; Baran, J; Callahan, A; Chepelev, L; Cruz-Toledo, J; Del Rio, N; Duck, G; Furlong, L; Keath, N; Klassen, D; McCusker, J; Queralt-Rosinach, N; Samwald, M; Villanueva-Rosales, N; Wilkinson, M; Hoehndorf, R. 2014. "The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery". Journal of Biomedical Semantics 5:14.

Samadian S; McManus B; Wilkinson M. 2014. "Automatic detection and resolution of measurement-unit conflicts in aggregated data". BMC Med. Genomics 7:S12.

Egana Aranguren, M; Rodriguez Gonzalez, A; Wilkinson, MD. 2014. "Executing SADI services in Galaxy". Journal of Biomedical Semantics. DOI: 10.1186/2041-1480-5-42

Luciano, JS; Cumming, GP; Kahana, E; Wilkinson, MD; Brooks, EH; Jarman, H; McGuinness, DL; Levine, MS. 2014. "Health Web Science". Foundations and Trends® in Web Science. DOI: 10.1561/1800000019

Rodríguez González, A; Callahan, A; Cruz-Toledo, J; Garcia, A; Egaña Aranguren, M; Dumontier, M; Wilkinson, M. 2014. "Automatically exposing OpenLifeData via SADI semantic Web Services". Journal of Biomedical Semantics. DOI: 10.1186/2041-1480-5-46

Egaña Aranguren, M; Fernández-Breis, JT; Antezana, E; Mungall, C; Rodríguez González, A; Wilkinson, MD. 2013. "OPPL-Galaxy, a Galaxy tool for enhancing ontology exploitation as part of bioinformatics workflows". Journal of Biomedical Semantics. DOI: 10.1186/2041-1480-4-2

Katayama, T; Wilkinson, MD; Micklem, G; Kawashima, S; Yamaguchi, A; Nakao, M; Yamamoto, Y; Okamoto, S; Oouchida, K; Chun, HW; Aerts, J; Afzal, H; Antezana, E; Arakawa, K; Aranda, B; Belleau, F; Bolleman, J; Bonnal, RJ; Chapman, B; Cock, PJ; Eriksson, T; Gordon, PM; Goto, N; Hayashi, K; Horn, H; Ishiwata, R; Kaminuma, E; Kasprzyk, A; Kawaji, H; Kido, N; Kim, YJ; Kinjo, AR; Konishi, F; Kwon, KH; Labarga, A; Lamprecht, AL; Lin, Y; Lindenbaum, P; McCarthy, L; Morita, H; Murakami, K; Nagao, K; Nishida, K; Nishimura, K; Nishizawa, T; Ogishima, S; Ono, K; Oshita, K; Park, KJ; Prins, P; Saito, TL; Samwald, M; Satagopam, VP; Shigemoto, Y; Smith, R; Splendiani, A; Sugawara, H; Taylor, J; Vos, RA; Withers, D; Yamasaki, C; Zmasek, CM; Kawamoto, S; Okubo, K; Asai, K; Takagi, T. 2013. "The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies". Journal of Biomedical Semantics. DOI: 10.1186/2041-1480-4-6

Luciano, JS; Cumming, GP; Wilkinson, MD; Kahana, E. 2013. "The emergent discipline of health web science". Journal of Medical Internet Research. DOI: 10.2196/jmir.2499

McCarthy, L; Vandervalk, B; Wilkinson, M. 2012. "SPARQL Assist language-neutral query composer" BMC bioinformatics, vol. 13 Suppl 1, no. Suppl 1, p. S2.

Samadian, S; McManus, B; Wilkinson, M.D. 2012. "Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web" Journal of biomedical semantics, vol. 3, no. 1, p. 6, Jul.

Rodríguez-González, A; Torres-Niño, J; Mayer, M. A; Alor-Hernandez, G; Wilkinson, M.D. 2012. "Analysis of a multilevel diagnosis decision support system and its implications: a case study" Computational and Mathematical Methods in Medicine, vol. 2012, pp. 1-9.

Wood, I; Vandervalk, B; McCarthy, L; Wilkinson, M. 2012. "OWL-DL Domain-Models as Abstract Workflows" in Leveraging Applications of Formal Methods, Verification and Validation. Applications and Case Studies, T. Margaria and B. Steffen, Eds. Berlin/Heidelberg: Springer, pp. 56-66.

Centre for Plant Biotechnology and Genomics UPM – INIA Parque Científico y Tecnológico de la U.P.M. Campus de Montegancedo
Autopista M-40, Km 38 - 28223 Pozuelo de Alarcón (Madrid) Tel.: +34 91 4524900 ext. 1806 / +34 91 3364539 Fax: +34 91 7157721.
