As part of the Severo Ochoa objectives for CBGP, we proposed to increase the quality and reusability of our data outputs, as well as the number of high-quality publications from CBGP, by pursuing pure-data publications such as this one.
Scientific research is experiencing a methodological crisis, in which the results presented in a large number of papers fail to meet a core requirement of the scientific method: reproducibility. In the “hard sciences”, such as the biomedical and systems-biology domains, at least part of this problem is due to (a) the lack of contextual and methodological annotation around the data gathering and processing steps, which leads to (b) an inability of peer review to identify potential errors in these early stages of the experiment, which then mask downstream misinterpretations at the analytical stage. To address this, a variety of journals now accept “pure data” publications: peer-reviewed publications describing only the data gathering and cleansing steps, accompanied by a richly described deposit of the raw data in a public repository. The purpose of these publications is to (a) ensure that these early steps in the scholarly process are reviewed just as rigorously, and (b) ensure that the data deposit can be reused by other researchers with full transparency, detailed protocols, and no ambiguity regarding the meaning or interpretation of the data elements. Journals like Nature Publishing Group’s Scientific Data (a Q1 journal) provide "descriptions of data sets relevant to the natural sciences, which are provided as machine-readable data, complemented with a human oriented narrative".
As part of the Severo Ochoa objectives for CBGP, we proposed to increase the quality and reusability of our data outputs, as well as the number of high-quality publications from CBGP, by pursuing pure-data publications such as the one described in this news article, "Genome-wide polyadenylation site mapping datasets in the rice blast fungus Magnaporthe oryzae". In this paper we present the datasets derived from genome-wide polyadenylation profiling of the rice pathogen M. oryzae using a novel approach, covering both the wild-type fungus and a relevant mutant under four different growth conditions, with three replicates. The data have been published together with an extensive, standardized, and highly structured metadata descriptor in a globally accepted format (ISA-Tab), with experimental descriptions based on widely used biomedical ontologies. The data themselves are published both in raw formats and as FAIR data.
We encourage other CBGP researchers to examine this publication, so that they understand the requirements for publishing a pure-data article. We also note that, in the next 2-3 months, CBGP will have a Web portal that helps researchers manage both their data and analytical tools, making it easier to create these kinds of publications.
Original Paper:
Marconi, M., Sesma, A., Rodríguez-Romero, J.L., Rosano González, M.L., Wilkinson, M.D. 2018. Genome-wide polyadenylation site mapping datasets in the rice blast fungus Magnaporthe oryzae. Scientific Data 5, 180271. DOI: 10.1038/sdata.2018.271