Data producers deserve citation credit, says Nature Genetics

Datasets released to public databases in advance of (or with) research publications should be given digital object identifiers so that databases and journals can give quantitative citation credit to the data producers and curators, according to the October Editorial of Nature Genetics (41, 1045; 2009).

After reviewing the arguments for assigning citable credit to data, particularly data released publicly before formal publication in a journal, as is increasingly common in some fields (and required by some funders), the Editorial asks: "What form should citable data identifiers take? They must work with existing unique resource identifier conventions and with the existing well-funded stable repositories used by research communities. However, these identifiers are not just for locating data but are for stably identifying the data units and versions with particular data producers, curators, funders and affiliations in a citable form. Because publications are currently the main source of scientific credit and because publishers have already developed citable digital object identifiers (DOI), it would seem to be their opportunity to grasp or to fumble. We propose citing DOIs that tag a combination of repository, database, accession, version, contributor and funder.

"Of course, precise citation of all research output represents the bare minimum of respect for colleagues and competitors. This journal also endorses communication between data producers and data users. Whereas it is impossible for journals to restrict the use of data already in the public domain, we can show evidence of communication between producers and users to referees. Many funders of large resource projects now require a data release policy and plan for global analysis by the data producers. These parts of the successfully refereed grant should be published as a ‘marker paper’ or deposited in a citable preprint archive such as Nature Precedings. At very least, the details of the producers’ work and intents should be available to users in a citable form in the database holding the data. Data users can submit an email demonstrating that they have contacted the data producers with their plan for use of the data and showing that they have read the producers’ data release policy, conditions and plan for analysis."

Please see also the continuing Nature Network online discussions about pre-publication and post-publication data release. We welcome your views there.


