BIOLOGICAL INFORMATICS
Group leader: Mark Wilkinson - Investigador senior programa Isaac Peral
910679196 (Office B31)
910679117 (Lab B30)
Personnel:
- Alarcón Moreno, Pablo - Postdoctoral Fellow
- Camara Ballesteros, Alberto - PhD Student
- Cuevas Zuviría, Bruno - Postdoctoral Fellow
- Curiel Manzanas, Sara - PhD Student
- Tokareva, Victoria - Visiting Scientist
Visit our lab homepage for more details
FAIR Data
In the roughly ten years since their publication, the FAIR Data Principles have changed the global landscape for scholarly data sharing and publication. We were lead authors of the FAIR Principles and of the first end-to-end implementation of those principles over an agriculturally relevant data source. We were also lead authors of the first set of objective, automatable Metrics for measuring the FAIRness of a resource, and the lead laboratory behind the first software capable of autonomously executing FAIR evaluations and scoring digital objects on the level of “FAIRness” they have achieved. Our laboratory is now entirely focused on facets of FAIRness, from FAIR-enabled federation, to FAIR assessment and guidance, to novel approaches to data modelling and interface building.

OSTrails – Open Science Plan Track Assess (Grant ID: 101130187)

OSTrails (Open Science Trails) is a Horizon Europe-funded initiative aimed at fostering FAIR-oriented research through integrated planning, monitoring, and evaluation tools.
- Plan: Machine-actionable Data Management Plans (maDMPs)
Transforms static DMPs into dynamic, interoperable “living” documents (machine readable and integrated throughout the data lifecycle) to enhance quality and usability.
- Track: Scientific Knowledge Graphs (SKGs)
Creates an open, interoperable ecosystem of SKGs that capture connections among datasets, publications, and workflows, serving as evidence of FAIR uptake across research communities.
- Assess: FAIR Evaluation
Delivers a toolkit of FAIR assessment modules and metrics (maFAIRTests) with embedded user guidance. These can be invoked at any research stage, making evaluation more actionable and transparent.
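As a rough illustration of the Assess idea, the Python sketch below runs a few checks against a metadata record, reports a score, and attaches guidance to each failed check. The checks, field names, and the `assess` function are assumptions made up for this example; they are not the OSTrails tests or the published FAIR Metrics.

```python
from typing import Callable, Dict, List, Tuple

Metadata = Dict[str, str]
# Each test is (name, check function, guidance shown when the check fails).
Test = Tuple[str, Callable[[Metadata], bool], str]

# Illustrative checks only: hypothetical, not the OSTrails maFAIRTests.
TESTS: List[Test] = [
    ("has_identifier", lambda m: bool(m.get("identifier")), "Add a persistent identifier."),
    ("has_license", lambda m: bool(m.get("license")), "State a licence for reuse."),
    ("has_title", lambda m: bool(m.get("title")), "Add a human-readable title."),
]


def assess(metadata: Metadata) -> dict:
    """Run every test and return a score plus per-test guidance."""
    results = []
    for name, check, guidance in TESTS:
        passed = check(metadata)
        results.append(
            {"test": name, "passed": passed, "guidance": None if passed else guidance}
        )
    # Score is the fraction of tests passed.
    score = sum(1 for r in results if r["passed"]) / len(results)
    return {"score": score, "results": results}


# Made-up record: it has an identifier and a title but no licence.
report = assess({"identifier": "https://example.org/dataset/1", "title": "Example dataset"})
failed = [r["test"] for r in report["results"] if not r["passed"]]
```

Returning the guidance alongside each failed test, rather than only a number, is what lets a user act on the result.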
ERDERA – European Rare Diseases Research Alliance

ERDERA (European Rare Diseases Research Alliance) is a Horizon Europe-funded project designed to strengthen the European Rare Disease Research Ecosystem by enhancing collaboration among national and European infrastructures. It builds on the foundation laid by the European Joint Programme on Rare Diseases (EJP RD) and related initiatives to advance research, foster patient involvement, and enable cross-sector collaboration.
- Translate: From Discovery to Medical Interventions
Ensures research outputs lead to actionable medical interventions and improved understanding of rare diseases.
- Access: FAIR Data & Advanced Analytics Ecosystem
Aims to advance the integration, discovery, and reuse of rare disease data in accordance with FAIR principles, and promotes advanced data analysis, AI, and statistical tools to foster data use for scientific and regulatory evaluation and healthcare delivery. Builds on infrastructures such as ERDRI, the EJP RD Virtual Platform, and RD-Connect.
- Build: Capacity, Knowledge & Support
Provides guidance and technical support for institutions and projects joining the rare disease research ecosystem. Delivers training, resources, and technical support to researchers and data managers, promoting the adoption of FAIR-enabling tools and workflows throughout the rare disease research lifecycle.
- Empower: Inclusive Engagement of Rare Disease Communities
Establishes frameworks for patient involvement as equal partners in research. Supports patient training, engagement, and participation in governance to ensure equitable representation and meaningful contribution.
- Integrate: A Multi-Stakeholder Ecosystem
Brings together key stakeholders (researchers, clinicians, data stewards, and policy makers) to accelerate data-driven innovation and improve outcomes for rare disease patients. Supports the expansion of rare disease networks and the adoption of good practices through National Mirror Groups and coordinated governance.
FLAIR-GG: FAIRification, Linking And Integrated Reuse of Global ex situ plant Germplasm resources (TED2021-130788B-I00)

We created FLAIR-GG (FAIRification, Linking And Integrated Reuse of Global ex situ plant Germplasm resources) to address the fragmentation and poor accessibility of wild and crop relative germplasm data. It has four core components: local FAIR Data Points at each seedbank (allowing for machine-traversable metadata publication), a dynamic network index (that keeps track of all members of the network, enabling data visiting), the FLAIR-GG Virtual Platform (a portal for federated search and query dispatch), and an optional transformation pipeline (that converts the heterogeneous datasets from different germplasm banks into RDF). This infrastructure allows users to discover, compare, and integrate germplasm metadata across multiple germplasm banks in Spain (with plans for global expansion), and to integrate data from third-party public sources such as meteorological and soil agencies. This enhances conservation strategy design and supports European targets like those of EURISCO, while transforming seedbanks into germplasm data centers that are prepared to support scientific research and start collaborations.
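The interplay between a network index and federated query dispatch can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the FLAIR-GG implementation: the class names `SeedbankNode` and `NetworkIndex`, the function `federated_search`, and the sample records are all hypothetical, and a real deployment would query remote endpoints over the network rather than Python objects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A record is a plain metadata dictionary; the field names are illustrative.
Record = Dict[str, str]


@dataclass
class SeedbankNode:
    """Hypothetical stand-in for one seedbank's local metadata endpoint."""
    name: str
    records: List[Record]

    def search(self, predicate: Callable[[Record], bool]) -> List[Record]:
        # The query runs where the data lives; only matching records leave the node.
        return [r for r in self.records if predicate(r)]


@dataclass
class NetworkIndex:
    """Hypothetical registry that tracks which nodes belong to the network."""
    nodes: Dict[str, SeedbankNode] = field(default_factory=dict)

    def register(self, node: SeedbankNode) -> None:
        self.nodes[node.name] = node


def federated_search(index: NetworkIndex, predicate: Callable[[Record], bool]) -> List[Record]:
    """Dispatch one query to every registered node and merge the answers."""
    merged: List[Record] = []
    for name in sorted(index.nodes):
        for record in index.nodes[name].search(predicate):
            # Tag each hit with its origin so results stay traceable.
            merged.append({**record, "source": name})
    return merged


# Made-up sample data for two hypothetical seedbanks.
index = NetworkIndex()
index.register(SeedbankNode("bank_a", [
    {"taxon": "Brassica rapa", "accession": "A-001"},
    {"taxon": "Hordeum vulgare", "accession": "A-002"},
]))
index.register(SeedbankNode("bank_b", [
    {"taxon": "Brassica rapa", "accession": "B-117"},
]))

hits = federated_search(index, lambda r: r["taxon"] == "Brassica rapa")
```

Because the portal only consults the index at query time, a newly registered node becomes searchable without any change to the search code.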
MULVIRISK: The Role of Multiple Infections in Plant Virus Infection Risk (PID2021-124671OB-I00)
The MULVIRISK project aims to explore the impact of multiple infections on the emergence of new diseases within plant ecosystems. Pathogen emergence poses a significant threat to global food security, driven by factors such as climate change, biodiversity loss, and increased habitat connectivity. These ecological shifts alter the evolutionary dynamics of pathogen-host interactions, potentially leading to the development of new diseases. In this context, the interactions between pathogens through co-infection and co-occurrence are crucial yet historically understudied factors.
To comprehensively understand co-infection, a holistic approach is required—one that integrates data from diverse sources. Consequently, the MULVIRISK project encompasses both the analysis and simulation of co-infection and co-occurrence data and models. Additionally, it involves the development of a new database focused on plant-pathogen interactions, designed to adhere to FAIR principles to ensure the data is Findable, Accessible, Interoperable, and Reusable.
Institutional Infrastructure
We collaborate with the CBGP Data Management and Administrative Teams on the development of FAIR-centric, ontology-based databases for managing institutional data, including personnel profiles, research projects, and publications, thereby improving the interoperability and accessibility of the centre’s research outputs. We have created a novel ontology-driven approach to database design and user-interface creation. It takes ideas such as the dynamic creation of models in, for example, Ruby on Rails, and pushes all aspects of the functionality into an OWL-based knowledgebase, so that the software can be modified entirely through edits to the knowledgebase.
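The general idea of deriving models from a knowledgebase instead of hand-written code can be sketched as follows. This is only an analogy under stated assumptions: a plain Python dictionary stands in for the OWL knowledgebase, and the names `KNOWLEDGEBASE`, `build_model`, and `build_models` are hypothetical, not part of the system described above.

```python
# Stand-in for the knowledgebase: class names mapped to their properties.
# Assumption: a real system would read these from an OWL ontology.
KNOWLEDGEBASE = {
    "Person": {"name": str, "role": str},
    "Project": {"title": str, "grant_id": str},
}


def build_model(class_name, properties):
    """Create a record class whose fields come from one knowledgebase entry."""

    def __init__(self, **values):
        unknown = set(values) - set(properties)
        if unknown:
            raise ValueError(f"unknown fields for {class_name}: {sorted(unknown)}")
        for field_name, field_type in properties.items():
            value = values.get(field_name)
            if value is not None and not isinstance(value, field_type):
                raise TypeError(f"{field_name} must be {field_type.__name__}")
            setattr(self, field_name, value)

    # type() builds the class at runtime; nothing about "Project" is hard-coded.
    return type(class_name, (object,), {"__init__": __init__, "fields": tuple(properties)})


def build_models(knowledgebase):
    """Build one model class per knowledgebase entry."""
    return {name: build_model(name, props) for name, props in knowledgebase.items()}


models = build_models(KNOWLEDGEBASE)
project = models["Project"](title="FLAIR-GG", grant_id="TED2021-130788B-I00")

# Editing the knowledgebase, not the code, adds a field to the model.
KNOWLEDGEBASE["Project"]["start_year"] = int
models = build_models(KNOWLEDGEBASE)
```

A user interface built the same way would iterate over `fields` to generate its forms, so a knowledgebase edit changes both the data model and the interface.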
Representative Publications
van Karnebeek, C.D.M., Müller, A.R., Benkemoun, L., Boussaad, I., Cornel, M.C., IntHout, J., de Kort, M., de Oliveira Martins, S., Prigione, A., Rigter, T., Roes, K.C.B., Sanchez, A., Schipper, R., Wilkinson, M.D., ’t Hoen, P.A.C. 2025. SIMPATHIC: Accelerating drug repurposing for rare diseases by exploiting SIMilarities in clinical and molecular PATHology. Molecular Genetics and Metabolism 144, 109073. DOI: 10.1016/j.ymgme.2025.109073
Cámara Ballesteros, A., Aguayo Jara, E., Verykaki, E.S., Pastor del Olmo, G., Moreno Vázquez, S., Torres, E., Wilkinson, M.D. 2024. The FLAIR-GG federated network of FAIR germplasm data resources. Scientific Data 11, 1386. DOI: 10.1038/s41597-024-04243-7
Hayn, D., Sandner, E., Vengadeswaran, A., Taru, E.-A., Wilkinson, M., Hanauer, M., Kreiner, K., Schreier, G. 2024. Privacy-Preserving Linkage of Distributed Pseudonymised Datasets in a Virtual European Rare Disease Platform. Digital Health and Informatics Innovations for Sustainable Health Care Systems 1442–1446. DOI: 10.3233/SHTI240683
Poza-Viejo, L., Payá-Milans, M., Wilkinson, M.D., Piñeiro, M., Jarillo, J.A., Crevillén, P. 2024. Brassica rapa CURLY LEAF is a major H3K27 methyltransferase regulating flowering time. Planta 260, 27. DOI: 10.1007/s00425-024-04454-7
Jeanson, F., Gibson, S.J., Alper, P., Bernier, A., Woolley, J.P., Mietchen, D., Strug, A., Becker, R., Kamerling, P., Sanchez Gonzalez, M. del C., Mah, N., Novakowski, A., Wilkinson, M.D., Benhamed, O.M., Landi, A., Krog, G.P., Müller, H., Riaz, U., Veal, C., Holub, P., van Enckevort, E., Brookes, A.J. 2024. Getting your DUCs in a row - standardising the representation of Digital Use Conditions. Scientific Data 11, 464. DOI: 10.1038/s41597-024-03280-6
Sanchez Gonzalez, M. del C., Kamerling, P., Iermito, M., Casati, S., Riaz, U., Veal, C.D., Maini, M., Jeanson, F., Benhamed, O.M., van Enckevort, E., Landi, A., Mimouni, Y., Le Cornec, C., Coviello, D.A., Franchin, T., Fusco, F., Ramírez García, J.A., van der Zanden, L.F.M., Bernier, A., Wilkinson, M.D., Mueller, H., Gibson, S.J., Brookes, A.J. 2024. Common conditions of use elements. Atomic concepts for consistent and effective information governance. Scientific Data 11, 465. DOI: 10.1038/s41597-024-03279-z
Wright, A., Wilkinson, M.D., Mungall, C., Cain, S., Richards, S., Sternberg, P., Provin, E., Jacobs, J.L., Geib, S., Raciti, D., Yook, K., Stein, L., Molik, D.C. 2024. FAIR Header Reference genome: a TRUSTworthy standard. Briefings in Bioinformatics 25, bbae122. DOI: 10.1093/bib/bbae122
Atalaia, A., Wandrei, D., Lalout, N., Thompson, R., Tassoni, A., ’t Hoen, P.A.C., Athanasiou, D., Baker, S.-A., Sakellariou, P., Paliouras, G., D’Angelo, C., Horvath, R., Mancuso, M., van der Beek, N., Kornblum, C., Kirschner, J., Pareyson, D., Bassez, G., Blacas, L., Jacoupy, M., Eng, C., Lamy, F., Plançon, J.-P., Haberlova, J., Brusse, E., Hoeijmakers, J.G.J., de Visser, M., Claeys, K.G., Paradas, C., Toscano, A., Silani, V., Gyenge, M., Reviers, E., Hamroun, D., Vroom, E., Wilkinson, M.D., Lochmuller, H., Evangelista, T. 2024. EURO-NMD registry: federated FAIR infrastructure, innovative technologies and concepts of a patient-centred registry for rare neuromuscular disorders. Orphanet Journal of Rare Diseases 19, 66. DOI: 10.1186/s13023-024-03059-3
Bernabé, C.H., Thielemans, L., Kaliyaperumal, R., Carta, C., Zhang, S., van Gelder, C.W.G., Benis, N., da Silva Santos, L.O.B., Cornet, R., Vieira, B. dos S., Lalout, N., Henriques, I., Ballesteros, A.C., Burger, K., Kersloot, M.G., Ehrhart, F., van Enckevort, E., Evelo, C.T., Gray, A.J.G., Hanauer, M., Hettne, K., de Ligt, J., Pereira, A., Queralt-Rosinach, N., Schultes, E., Taruscio, D., Waagmeester, A., Wilkinson, M.D., Willighagen, E.L., Jansen, M., Mons, B., Roos, M., Jacobsen, A. 2023. Building expertise on FAIR through evolving Bring Your Own Data (BYOD) workshops: describing the data, software, and management-focused approaches and their evolution. Data Intelligence 1–23. DOI: 10.1162/dint_a_00236
Alarcon, P., Braun, I., Hartley, E., Olson, D., Benis, N., Cornet, R., Wilkinson, M., Walls, R.L. 2023. Leveraging Biolink as a “Rosetta Stone” Between C-Path and EJP-RD Semantic Models Provides Emergent Interoperability. Journal of the Society for Clinical Data Management 2. DOI: 10.47912/jscdm.130
Mandakovic, D., Aguado-Norese, C., García-Jiménez, B., Hodar, C., Maldonado, J.E., Gaete, A., Latorre, M., Wilkinson, M.D., Gutiérrez, R.A., Cavieres, L.A., Medina, J., Cambiazo, V., Gonzalez, M. 2023. Testing the stress gradient hypothesis in soil bacterial communities associated with vegetation belts in the Andean Atacama Desert. Environmental Microbiome 18, 24. DOI: 10.1186/s40793-023-00486-w
dos Santos Vieira, B., Bernabé, C.H., Zhang, S., Abaza, H., Benis, N., Cámara, A., Cornet, R., Le Cornec, C.M.A., ’t Hoen, P.A.C., Schaefer, F., van der Velde, K.J., Swertz, M.A., Wilkinson, M.D., Jacobsen, A., Roos, M. 2022. Towards FAIRification of sensitive and fragmented rare disease patient data: challenges and solutions in European reference network registries. Orphanet Journal of Rare Diseases 17, 436. DOI: 10.1186/s13023-022-02558-5
Benhamed, O.M., Burger, K., Kaliyaperumal, R., da Silva Santos, L.O.B., Suchánek, M., Slifka, J., Wilkinson, M.D. 2022. The FAIR Data Point: Interfaces and Tooling. Data Intelligence 1–18. DOI: 10.1162/dint_a_00161
da Silva Santos, L.O.B., Burger, K., Kaliyaperumal, R., Wilkinson, M.D. 2022. FAIR Data Point: A FAIR-oriented approach for metadata publication. Data Intelligence 1–21. DOI: 10.1162/dint_a_00160
Ortiz-García, P., Pérez-Alonso, M.-M., González Ortega-Villaizán, A., Sánchez-Parra, B., Ludwig-Müller, J., Wilkinson, M.D., Pollmann, S. 2022. The Indole-3-Acetamide-Induced Arabidopsis Transcription Factor MYB74 Decreases Plant Growth and Contributes to the Control of Osmotic Stress Responses. Frontiers in Plant Science 13. DOI: 10.3389/fpls.2022.928386
Kaliyaperumal, R., Wilkinson, M.D., Moreno, P.A., Benis, N., Cornet, R., dos Santos Vieira, B., Dumontier, M., Bernabé, C.H., Jacobsen, A., Le Cornec, C.M.A., Godoy, M.P., Queralt-Rosinach, N., Schultze Kool, L.J., Swertz, M.A., van Damme, P., van der Velde, K.J., Lalout, N., Zhang, S., Roos, M. 2022. Semantic modelling of common data elements for rare disease registries, and a prototype workflow for their deployment over registry data. Journal of Biomedical Semantics 13, 9. DOI: 10.1186/s13326-022-00264-6
Poza-Viejo, L., Payá-Milans, M., Martín-Uriz, P.S., Castro-Labrador, L., Lara-Astiaso, D., Wilkinson, M.D., Piñeiro, M., Jarillo, J.A., Crevillén, P. 2022. Conserved and distinct roles of H3K27me3 demethylases regulating flowering time in Brassica rapa. Plant, Cell & Environment. DOI: 10.1111/pce.14258
García-Jiménez, B., Muñoz, J., Cabello, S., Medina, J., Wilkinson, M.D. 2020. Predicting microbiomes through a deep latent space. Bioinformatics. DOI: 10.1093/bioinformatics/btaa971
Pérez-Alonso, M.-M., Ortiz-García, P., Moya-Cuevas, J., Lehmann, T., Sánchez-Parra, B., Björk, R.G., Karim, S., Amirjani, M.R., Aronsson, H., Wilkinson, M.D., Pollmann, S. 2020. Endogenous indole-3-acetamide levels contribute to the crosstalk between auxin and ABA, and trigger plant stress responses in Arabidopsis thaliana. Journal of Experimental Botany eraa485. DOI: 10.1093/jxb/eraa485
Prieto, M., Deus, H., de Waard, A., Schultes, E., García-Jiménez, B., Wilkinson, M.D. 2020. Data-driven classification of the certainty of scholarly assertions. PeerJ 8, e8871. DOI: 10.7717/peerj.8871
Payá-Milans, M., Poza-Viejo, L., Martín-Uriz, P.S., Lara-Astiaso, D., Wilkinson, M.D., Crevillén, P. 2019. Genome-wide analysis of the H3K27me3 epigenome and transcriptome in Brassica rapa. GigaScience 8. DOI: 10.1093/gigascience/giz147
Wilkinson, M.D., Dumontier, M., Sansone, S.-A., da Silva Santos, L.O.B., Prieto, M., Batista, D., McQuilton, P., Kuhn, T., Rocca-Serra, P., Crosas, M., Schultes, E. 2019. Evaluating FAIR maturity through a scalable, automated, community-governed framework. Scientific Data 6, 1–12. DOI: 10.1038/s41597-019-0184-5
García-Jiménez, B., Wilkinson, M.D. 2019. Robust and automatic definition of microbiome states. PeerJ 7, e6657. DOI: 10.7717/peerj.6657

