
Archive for the ‘Open Access’ Category

Biomedical Big Data Science Notices of Intent to Publish

Saturday, February 15th, 2014

Registration Now Open for Research Data Management Symposium (Space is Limited)

Friday, February 7th, 2014

Doing It Your Way:  Approaches to Research Data Management for Libraries

When:   April 28-29, 2014

Where:  The Rockefeller University, New York, NY

MAR is offering a 2-day symposium to help your library find its unique approach to research data management.  We have a great cast of speakers to talk about issues you may be facing in your library.

  • Registration opens today for NN/LM MAR Network members (from DE, NJ, NY or PA)
  • For everyone else, registration opens February 21, 2014
  • If registration fills up, registrants will be placed on a waiting list and notified if a space becomes available

Details / Registration:

MORE Details: Join Us for Our Data Management Symposium

Friday, January 24th, 2014

Doing It Your Way:  Approaches to Research Data Management for Libraries

When:   April 28-29, 2014

Where:  The Rockefeller University, New York, NY

NN/LM MAR is planning a 2-day symposium to help your library take a  “Frank Sinatra” approach to research data management.

Day One Keynote Speakers

  • Paul Harris / Director, Office of Research Informatics, Vanderbilt University
  • Jared Lyle / Inter-university Consortium for Political and Social Research (ICPSR), University of Michigan
  • Keith Webster / Dean of Libraries, Carnegie Mellon University

Day One Breakout Session Speakers

  • Heather Coates / Digital Scholarship & Data Management Librarian, IUPUI University Library
  • Barrie Hayes / Bioinformatics & Translational Science Librarian, Health Sciences Library, University of North Carolina
  • Lisa Johnston / Research Data Management/Curation Lead, Science/Engineering Library, University of Minnesota
  • Wendy Kozlowski / Scientific Data Curation Specialist, Olin Library, Cornell University
  • Stephen Morales / Director, Digital Preservation Network, University of Virginia
  • Alisa Surkis / Translational Science Librarian, Health Sciences Libraries, New York University
  • Ryan Womack / Data Librarian, Alexander Library, Rutgers, The State University of New Jersey

Day Two Workshop Speakers

  • Andrea Horne Denton / Research and Data Services Manager, University of Virginia
  • Sherry Lake / Senior Data Consultant, University of Virginia

We will provide more details about the symposium in coming weeks.  So start spreading the news!

Publisher Goes After Authors of Its Own Journals

Friday, January 17th, 2014

An article in the new issue of The Economist reported on Elsevier’s efforts to stop authors from posting papers they have written on their own web pages.

Public Access to Scientific Research Advances in Omnibus

Friday, January 17th, 2014


Omnibus Appropriations Bill Codifies White House Directive

Washington, DC – Progress toward making taxpayer-funded scientific research freely accessible in a digital environment was reached today with congressional passage of the FY 2014 Omnibus Appropriations Act.  The bill requires federal agencies under the Labor, Health and Human Services, and Education portion of the Omnibus bill with research budgets of $100 million or more to provide the public with online access to articles reporting on federally funded research no later than 12 months after publication in a peer-reviewed journal.

“This is an important step toward making federally funded scientific research available for everyone to use online at no cost,” said Heather Joseph, Executive Director of the Scholarly Publishing and Academic Resources Coalition (SPARC).  “We are indebted to the members of Congress who champion open access issues and worked tirelessly to ensure that this language was included in the Omnibus.  Without the strong leadership of the White House, Senator Harkin, Senator Cornyn, and others, this would not have been possible.”

The additional agencies covered would ensure that approximately $31 billion of the total $60 billion annual US investment in taxpayer funded research is now openly accessible.

SPARC strongly supports the language in the Omnibus bill, which affirms the strong precedent set by the landmark NIH Public Access Policy, and more recently by the White House Office of Science and Technology Policy (OSTP) Directive on Public Access.  At the same time, SPARC is pressing for additional provisions to strengthen the language – many of which are contained in the Fair Access to Science and Technology Research (FASTR) Act – including requiring that articles are:

  • Available no later than six months after publication;
  • Available through a central repository similar to the National Institutes of Health’s (NIH) highly successful PubMed Central, a 2008 model that opened the gateway to the human genome project and more recently the brain mapping initiative.  These landmark programs demonstrate quite clearly how opening up access to taxpayer funded research can accelerate the pace of scientific discovery, lead to both innovative new treatments and technologies, and generate new jobs in key sectors of the economy; and
  • Provided in formats and under terms that ensure researchers have the ability to freely apply cutting-edge analysis tools and technologies to the full collection of digital articles resulting from public funding.

“SPARC is working toward codifying the principles in FASTR and is working with the Administration to use PubMed Central as the implementation model for the President’s directive,” said Joseph.  “Only with a central repository and the ability to fully mine and reuse data will we have the access we need to really spur innovation and job creation in broad sections of the economy.”


Every year, the federal government uses taxpayer dollars to fund tens of billions of dollars of scientific research that results in thousands upon thousands of articles published in scientific journals.  The government funds this research with the understanding that it will advance science, spur the economy, accelerate innovation, and improve the lives of our citizens.  Yet most taxpayers – including academics, students, and patients – are shut out of accessing and using the results of the research that their tax dollars fund, because it is only available through expensive and often hard-to-access scientific journals.

By any measure, 2013 was a watershed year for the Open Access movement:  in February, the White House issued the landmark Directive; a major bill, FASTR, was introduced in Congress; a growing number of higher education institutions – including the University of California System, Harvard University, MIT, the University of Kansas, and Oberlin College – actively worked to maximize access to and sharing of research results; and, for the first time, state legislatures around the nation began debating open access policies supported by SPARC.

Details of the Omnibus Language

The Omnibus language (H.R. 3547) codifies a section of the White House Directive requirements into law for the Department of Labor, Health and Human Services, the Centers for Disease Control (CDC), the Agency for Healthcare Research and Quality (AHRQ), and the Department of Education, among other smaller agencies.

Additional report language was included throughout the bill directing agencies and OSTP to keep moving on the Directive policies, including the US Department of Agriculture, Department of the Interior, Department of Commerce, and the National Science Foundation.

President Obama is expected to sign the bill in the coming days.

CLIR Report on Research Data Management

Friday, January 3rd, 2014

CLIR has released a new report on research data management.  Its series of articles discusses various aspects of data management and related activities at universities:

NIH Big Data to Knowledge

Friday, December 20th, 2013

BD2K RFA that we should bring to the attention of our communities

The NIH Big Data to Knowledge (BD2K) initiative announces the release of an RFA to support a U24 resource award for “Development of an NIH BD2K Data Discovery Index Coordination Consortium”.

The purpose of this Funding Opportunity Announcement (FOA) is to create a consortium to begin development of an NIH Data Discovery Index (DDI) to allow discovery, access, and citation of biomedical data. As part of the NIH Big Data to Knowledge (BD2K) initiative, the DDI seeks to fulfill the recommendation from the Data and Informatics Working Group (DIWG) report to the Advisory Council of the Director to “Promote Data Sharing Through Central and Federated Catalogues.”

The awardee in response to this FOA will constitute a DDI Coordination Consortium (DDICC, U24) to conduct outreach, fund small pilot projects, manage communication with stakeholders, constitute and coordinate Task Forces to study relevant questions related to access, discoverability, citation for all biomedical data and assure community engagement in the development, testing and validation of an NIH DDI. Part of this effort will be to assemble a user interface (website) through which the results of development and testing of models for an NIH DDI may be communicated. It is anticipated that a successful DDICC will work with the NIH to overcome obstacles in the way of better use and application of biomedical big data by developing a working concept for a DDI.

Draft Declaration of Data Citation Principles – For Comment

Friday, December 6th, 2013

The Data Citation Synthesis Group has released a draft Declaration of Data Citation Principles and invites comment.

This has been a very interesting and positive collaborative process and has involved a number of groups and committed individuals. Encouraging the practice of data citation, it seems to me, is one of the key steps towards giving research data its proper place in the literature.

As the preamble to the draft principles states:

Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research. Data citation, like the citation of other evidence and sources, is good research practice.

In support of this assertion, and to encourage good practice, we offer a set of guiding principles for data citation.

Please do comment on these principles. We hope that with community feedback and support, a finalised set of principles can be widely endorsed and adopted.

Discussion on a variety of lists is welcome, of course. However, *if you want the Synthesis Group to take full account of your views, please be sure to post your comments on the discussion forum.*

Some notes and observations on the background to these principles

I would like to add here some notes and observations on the genesis of these principles. As has been widely observed, there have been a number of groups and interested parties involved in exploring the principles of data citation for a number of years. Mentioning only some of the sources and events that affected my own thinking on the matter, there was the 2007 Micah Altman and Gary King article, in DLib, which offered ‘A Proposed Standard for the Scholarly Citation of Quantitative Data’ and Toby Green’s OECD White Paper ‘We need publishing standards for datasets and data tables’ in 2009. Micah Altman and Mercè Crosas organised a workshop at Harvard in May 2011 on Data Citation Principles.

Later the same year, the UK Digital Curation Centre published a guide to citing data.

The CODATA-ICSTI Task Group on Data Citation Standards and Practices, co-chaired by Christine Borgman (who replaced Bonnie Carroll as co-chair in January of this year), Jan Brase and Sara Callaghan, has been in existence since 2010.

In collaboration with the US National CODATA Committee and the Board on Research Data and Information, a major workshop was organised in August 2011, which was reported in ‘For Attribution: Developing Data Attribution and Citation Practices and Standards’.

The CODATA-ICSTI Task Group then started work on a report covering data citation principles, eventually entitled ‘Out of Cite, Out of Mind’; drafts were circulated for comment in April 2013 and the final report was released in September 2013.

Following the first ‘Beyond the PDF’ Meeting in Jan 2011, participants produced the Force11 Manifesto ‘Improving Future Research Communication and e-Scholarship’, which places considerable weight on the availability of research data and the citation of those data in the literature. At ‘Beyond the PDF II’ in Amsterdam, March 2013, a group comprising Mercè Crosas, Todd Carpenter, David Shotton and Christine Borgman produced ‘The Amsterdam Manifesto on Data Citation Principles’. In the very same week, in Gothenburg, an RDA Birds of a Feather group was discussing the more specific problem of how to support, technologically, the reliable and efficient citation of dynamically changing or growing datasets and subsets thereof. And the broader issues of the place of data and research publication were being considered in the ICSU World Data Service Working Group on Data Publication. This group has, in turn, formed the basis for an RDA Interest Group.

From June 2013, as the Force11 Group was preparing its website and activities to take forward the work on the Amsterdam Manifesto, calls came in from a number of sources for these various groups and initiatives to coordinate and collaborate. This was admirably well-received and from July the ‘Data Citation Synthesis Group’ had come into being with an agreed mission statement:

The data citation synthesis group is a cross-team committee leveraging the perspectives from the various existing initiatives working on data citation to produce a consolidated set of data citation principles (based on the Amsterdam Manifesto, the CODATA and other sets of principles provided by others) in order to encourage broad adoption of a consistent policy for data citation across disciplines and venues. The synthesis group will review existing efforts and make a set of recommendations that will be put up for endorsement by the organizations represented by this synthesis group.

The synthesis group will produce a set of principles, illustrated with working examples, and a plan for dissemination and distribution. This group will not be producing detailed specifications for implementation, nor focus on technologies or tools.

As has been noted elsewhere, the group comprised 40 individuals and brought together a large number of organisations and initiatives.

What followed over the summer was a set of weekly calls to discuss and align the principles. I must say, I thought these were admirably organised and benefitted considerably from participants’ efforts to prepare documents comparing the various groups’ statements. The face-to-face meeting of the group, in which a lot of detailed discussion to finalise the draft was undertaken, was hosted (with a funding contribution from CODATA) at the US National Academies of Science between the 2nd RDA Plenary and the DataCite Summer Meeting (which CODATA also co-sponsored). It has been intellectually stimulating and a real pleasure to contribute to these discussions and to witness so many informed and engaged people bashing out these issues.

The principles developed by the Synthesis Group are now open for comment and I urge as many people, researchers, editors and publishers as possible who believe that data has a place in scholarly communications to comment on them and, in due course, to endorse them and put them into practice.

Are we finally at the cusp of real change in practice? Will we now start seeing the practice of citing data sources become more and more widespread?

It’s too soon to say for sure, but I hope these principles, and the work on which they build, have got us to a stage where we can start really believing the change is well underway.

From CODATA Blog via IFTTT

Tweeting Biomedicine: An Analysis of Tweets and Citations in the Biomedical Literature

Friday, November 29th, 2013 (subscription required)

Data collected by social media platforms have been introduced as new sources for indicators to help measure the impact of scholarly research in ways that are complementary to traditional citation analysis.  Data generated from social media activities can be used to reflect broad types of impact.  This article aims to provide systematic evidence about how often Twitter is used to disseminate information about journal articles in the biomedical sciences.  The analysis is based on 1.4 million documents covered by both PubMed and Web of Science and published between 2010 and 2012.

The number of tweets containing links to these documents was analyzed and compared to citations to evaluate the degree to which certain journals, disciplines, and specialties were represented on Twitter and how far tweets correlate with citation impact.  With less than 10% of PubMed articles mentioned on Twitter, its uptake is low in general but differs between journals and specialties.  Correlations between tweets and citations are low, implying that impact metrics based on tweets are different from those based on citations.  A framework using the coverage of articles and the correlation between Twitter mentions and citations is proposed to facilitate the evaluation of novel social-media-based metrics.
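The two ingredients of the proposed framework, coverage and correlation, can be illustrated with a minimal sketch. This is not the authors' method: it assumes coverage means the share of articles with at least one tweet and that the correlation is a Spearman rank correlation, neither of which the abstract specifies. The function names and the sample counts are hypothetical.

```python
from math import sqrt


def twitter_coverage(tweet_counts):
    """Share of articles that received at least one tweet (assumed definition)."""
    if not tweet_counts:
        return 0.0
    return sum(1 for t in tweet_counts if t > 0) / len(tweet_counts)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined when one variable is constant
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical per-article counts, invented purely for illustration.
    tweets = [0, 0, 3, 1, 0, 12]
    citations = [4, 1, 7, 2, 0, 9]
    print("coverage:", twitter_coverage(tweets))
    print("spearman:", spearman(tweets, citations))
```

A rank-based measure is used here because tweet and citation counts are typically highly skewed, with many zeros and ties; a different correlation measure could be swapped in without changing the coverage calculation.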

Managing Scientific Data as Public Assets: Data Sharing Practices and Policies among Full-Time Government Employees

Friday, November 29th, 2013 (requires subscription)

This paper examines how scientists working in government agencies in the U.S. are reacting to the “ethos of sharing” government-generated data.  For scientists to leverage the value of existing government data sets, critical data sets must be identified and made as widely available as possible.  However, government data sets can only be leveraged when policy makers first assess the value of data, in much the same way they decide the value of grants for research outside government.

We argue that legislators should also remove structural barriers to interoperability by funding technical infrastructure according to issue clusters rather than administrative programs.  As developers attempt to make government data more accessible through portals, they should consider a range of other nontechnical constraints attached to the data.  We find that agencies react to the large number of constraints by mostly posting their data only on their own websites rather than in data portals that can facilitate sharing.  Despite the nontechnical constraints, we find that scientists working in government agencies exercise some autonomy in data decisions, such as data documentation, which determine whether or not the data can be widely shared.  Fortunately, scientists indicate a willingness to share the data they collect or maintain.  However, we argue further that a complete measure of access should also consider the normative decisions to collect (or not) particular data.