3 December 2012

Data citation Q&A with CSIRO

Although an old hand at many data management related topics, I'm a newbie when it comes to thinking about how to plan and implement data citation infrastructure and support. At the recent eResearch Australasia conference it became obvious that CSIRO was full of people I could learn a lot from! Sue Cook from CSIRO's Information Management and Technology Research Data Service (CSIRO IM&T RDS Support) talked about citation practices in her session on data self-deposit. Sue's colleague Ann Stevenson also contributed to the ANDS-led panel on a similar topic. I highly recommend that you have a look at both these presentations if you are at all interested not just in the tech, but in how a culture of data deposit and citation can be fostered in your organisation.

After the conference, I asked the IM&T RDS Support team if they would be happy to contribute some thoughts to this blog.

Q: CSIRO has been assigning Digital Object Identifiers (DOIs) for a while now. Can you describe how the minting of a DOI happens at CSIRO? Do you have any rules in place around what does and does not get a DOI?
A: We use the ANDS Cite My Data service and automate the assigning of DOIs. As the CSIRO Data Access Portal enables different access levels for deposited data collections, only collections that are published to the Public have a DOI generated. All collections with Public data get a DOI.
Q: You have done a lot of thinking about maintaining DOIs over time, and procedures for deciding when changes to a dataset represent a new version requiring a new DOI. Can you share some of your thinking about that?
A: Any changes to the data itself will automatically generate a new DOI (with the previous version still being accessible, along with a note letting people know that there is a later version). Any changes to the Attribution fields (Creator, Contributors, Title, Publication Year) in the metadata need to be checked by a Data Administrator, who decides whether the DOI is to be retained or a new DOI generated (depending on whether the change is a typo correction or a complete change to these fields).
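To make the rules in the two answers above concrete for myself, I sketched them as a small decision function. This is only my reading of the policy as described here, not CSIRO's actual implementation: the names (`DoiAction`, `decide_doi_action`, the field identifiers) are hypothetical, and the final case (metadata changes outside the Attribution fields keep the existing DOI) is my own assumption, since the answer does not cover it.

```python
from enum import Enum


class DoiAction(Enum):
    """Possible outcomes when a collection is published or changed (hypothetical names)."""
    NONE = "no DOI (collection is not public)"
    MINT_NEW = "mint a new DOI"
    ADMIN_REVIEW = "refer to a Data Administrator"
    KEEP = "keep the existing DOI"


# Attribution fields named in the interview; the identifiers are illustrative.
ATTRIBUTION_FIELDS = {"creator", "contributors", "title", "publication_year"}


def decide_doi_action(is_public, has_doi, data_changed, changed_metadata_fields):
    """Return the DOI action implied by the policy described in the Q&A."""
    # Only collections published to the Public get a DOI.
    if not is_public:
        return DoiAction.NONE
    # A public collection without a DOI gets one minted.
    if not has_doi:
        return DoiAction.MINT_NEW
    # Any change to the data itself automatically generates a new DOI.
    if data_changed:
        return DoiAction.MINT_NEW
    # Changes to Attribution fields are checked by a Data Administrator,
    # who decides whether to retain the DOI or generate a new one.
    if ATTRIBUTION_FIELDS & set(changed_metadata_fields):
        return DoiAction.ADMIN_REVIEW
    # ASSUMPTION: other metadata changes leave the DOI untouched.
    return DoiAction.KEEP


if __name__ == "__main__":
    # A title edit on a public collection goes to a Data Administrator.
    print(decide_doi_action(True, True, False, ["title"]).value)
```

Note that a data change takes precedence over a metadata change in this sketch, because the answer describes new data as triggering a new DOI automatically.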
Q: At the conference, you said that researchers responded positively to the idea of getting a DOI and being cited, and that this was a 'carrot' for self-deposit of data. I wondered if you could elaborate on that, and say a bit more about why you think researchers have responded so well.
A: DOIs are familiar to researchers as they are issued for journal publications, and are generally available from the “advanced online publication” stage. We think therefore that the possibility of getting a DOI for a data collection helps them see the data collection as a valid, published, citable scientific output akin to their or others' journal articles. We are also citing the as-yet small body of evidence suggesting a link between sharing and citing data and increased citations to articles.
Q: CSIRO researchers operate in a very different environment from university researchers. Do you think those differences will be significant in terms of how data citation feeds into things like rewards and promotions for researchers?
A: We are not sure that CSIRO researchers do operate in a very different environment from university researchers. Although our reporting metrics are directed by different authorities, they are very similar to those that the university sector uses. Our researchers still operate within many of the same disciplines as university researchers, and there is significant collaboration between CSIRO and the university sector. Data citation and sharing practices seem to come from the discipline more than from the institution, so until there is broad external practice and acceptance there may not be significant impact on rewards and promotions. This is why we promote the link between sharing data and enhancing citation rates.
Q: Is there anything else you'd like to highlight about data citation at CSIRO?
A: Like other institutions, we are in the early days of promoting this practice. We are keen to share our ideas, as well as grab ideas from others.

Thanks to CSIRO for sharing their experiences so far - I look forward to carrying on this conversation over the coming months as Griffith starts to address some of these questions.


11 November 2012

Data Citation: Stories from the Trenches

A panel session at last week's eResearch Australasia conference provided an overview of where data citation currently stands within Australian institutions and where things will be heading in the next 6-12 months.

Cynthia Love, Director, Public Sector Data and National Collections at the Australian National Data Service, kicked off the session by highlighting why data citation was worthy of our attention (verifiability, visibility and rewards). Cynthia emphasised the growing number of high profile institutions world-wide that have become members of the DataCite consortium.

As a DataCite member, ANDS is well-placed to ensure that developments in Australia are aligned with global initiatives, and is working closely with a number of organisations that have an interest in data citation, including the Australian Antarctic Division and CSIRO.

Dave Connell, the Scientific Data Coordinator of the AAD's Data Centre, noted that data citation activities at AAD were very much user-driven. Dave said that researchers are asking for data to be made more citable (preferably with a DOI) for a range of reasons: journals are asking for publication details on acceptance of papers; scientists have a need to reference their own data (which in the case of Antarctic data is expensive to collect and irreplaceable); and increasingly, they see the benefit of having data re-used by other scientists. Dave was encouraged by the international and national support coming from DataCite and ANDS.

Ann Stevenson is an Information Specialist in CSIRO's Data Access Portal team. As CSIRO have been minting DOIs for datasets and talking to researchers about citation for some time, Ann was able to give some timely practical advice about embedding data citation within a large research organisation. Of particular interest to me was the extent to which CSIRO have clarified a number of policy issues, such as when a DOI will be minted and how to maintain DOIs over time as collections are updated and superseded. They have also done great work on outreach, with strategies including phone calls to authors with recently approved but yet-to-be-published papers, tapping into professional writing sessions held within CSIRO, and distributing a leaflet to staff.

Karen Visser, Program Leader for Skills, Resources & Policy, wrapped up the session with an overview of ANDS's work in this area. ANDS hopes to make data citation metrics easier by working with Thomson Reuters to ensure Australian content is included in the Data Citation Index; they will also work with Elsevier on similar developments in the Scopus citation product suite as opportunity arises. ANDS hopes to increase impact by enabling Research Data Australia records to be shared via social media, and they are continuing to provide advice and to facilitate webinars, events and discussions.

Overall I found the session very useful and look forward to replaying it when the recording is available. In the meantime, slideshows are available on the eResearch Australasia website.

26 October 2012

About the project: citation, impact and advocacy

Griffith University has commenced a new project supported by the Australian National Data Service (ANDS). The project's formal name - Data Citation Infrastructure Establishment Program: Griffith University - is a bit of a mouthful, so most of the people involved have been calling this 'the data citation project'. As the title of this blog reflects though, we are very interested in citation as well as other ways of assessing impact that may give us a more holistic understanding of the impact of our researchers' data collections out in the wider world.

This project builds on previous work at Griffith University, including:

This investment in capturing collections, creating high quality metadata records and assigning DOIs has standardised things that (in theory!) should make some of our project easier, like the ability to measure data citations in publications through the use of bibliometrics. Like many others, we are hoping that the new Web of Knowledge Data Citation Index may provide a starting point for overcoming some of the difficulties that have been faced in the past with tracking citations of research data.

Unfortunately the timeframe for the project is not long enough to really get a sense of benefits that might be accruing through the scholarly publishing process, which is why we also want to explore less traditional ways of measuring impact. Many of Griffith's research focus areas have stakeholders and audiences outside of academia, so the emergence of altmetrics tools like ImpactStory is something we are keen to delve into.

The project also aims to raise awareness of citation and impact with data collection owners and to work with them to develop strategies that will maximise the re-use of their collection over time. This is an important part of the project, but potentially also the most challenging, because this is not about the technology or methods of measuring impacts, but about culture change within the research community. This part of the project will involve a group of librarians from our Academic Services Unit. These librarians have existing relationships and an understanding of how publication, citation and research impact currently work within their disciplines. Their contribution will be essential if we are to get the message out to researchers in the most effective ways, and a number of willing volunteers have already put their hands up to find out more about the project in the coming weeks.

An important aspect of this project is to document what we are learning and to share our experiences. Natasha and I are attending eResearch Australasia next week, where we hope to catch up with other institutions who are minting DOIs and promoting the benefits of data citation to their researchers. I am especially looking forward to the session Data Citation: Stories from the Trenches, in the hope that I can avoid the pitfalls that may be right around the corner...