National Network of Libraries of Medicine

MAR Data Science

News Highlights from the NNLM Middle Atlantic Region

Big Data for Hospital Librarians – Are We There, Yet?

Tue, 2018-05-01 08:00

In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? Where should librarians get involved, if you think they should? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.

Written by Elanor Pickens, Medical Librarian, Portsmouth Regional Hospital, Portsmouth, NH

“Big data” is a term that has recently been appearing in the literature of academic librarianship. As a hospital librarian, however, may I consider this justification enough to explore its applicability to my current position? The definition of big data is multifaceted and covered elsewhere, so I refer readers to Gandomi and Haider (2015). Here I simply seek to understand my potential role in this field.

Martin (2016) proposes a framework of five basic categories, within which librarians may find at least one opportunity to support data science activities. One role for the hospital librarian lies within the “Literacy” domain, through teaching. This may include staying informed about the various programming languages and software packages (R, Python, SAS, etc.), and their strengths and weaknesses, so that we can assist in the research process by educating researchers about data management options. Additionally, we might lead researchers to resources that can help them visualize their data, both to enhance their own comprehension of relationships and to enable them to better present their findings to others. If we also understand the form of data that they are collecting (structured, semi-structured, or unstructured), we may be able to help them discover studies that have used similar data, so that they can anticipate barriers to organization and analysis that they might encounter.

One concern that researchers may have with respect to big data analysis is that a question needs to be relatively clear before any data are examined (Brennan, 2015). Although slight revisions may be necessary, and new questions may arise that could be examined further, there should not be a significant change in the direction of the original question. There are too many data and methods of analysis to proceed in big data research without a clear understanding of where it leads. Librarians are proficient at refining questions to get at the core of a research query, as part of their reference interview skillset. We also have at least some basic knowledge of the different types of data analysis methods that are described in the literature, although much of this exposure does not include research conducted with big data. Iwashyna and Liu (2014) state that, in contrast to traditional epidemiology, which formulates one hypothesis and selects in advance the data that must be carefully collected to support or refute it, the multitude and variety of big data can be custom fit to epidemiological studies to identify patterns. Similarly, the authors note that a number of analytic methods can be adapted and combined when using big data, whereas traditional epidemiology is usually restricted to one analytic method per study. For hospital librarians, an awareness of research question modifications and an understanding of methods of data analysis may not necessarily yield additional support to experienced researchers, but it may still help guide those individuals who are subject specialists in their field but new to the research process.

Groeneveld and Rumsfeld (2016) note that big data certainly have predictive power in clinical decision-making; however, big data cannot determine which associations are due to random events, nor can they identify causal associations. In addition, the authors point to a lack of large-scale analytical methodology for scientific comparison studies comparable to the level currently available in big data analysis. It is necessary for researchers, especially those seeking publication, to consider the reproducibility of their studies when they are using such highly adaptive and dynamic models of analysis. In this case, there may be little more for hospital librarians to do than to continue to assist researchers in discovering the current best practices for big data publication with respect to transparency.

For the solo hospital librarian, who is often juggling multiple tasks, big data may simply not be a sphere in which to operate. Despite having gained a useful surface understanding of big data concepts, I am still unable to determine how I might apply this knowledge in my current position, or whether I would even have the time to do so. And without a clearly defined path, management support for professional development opportunities is essentially non-existent (Burton & Lyon, 2017). One area that has piqued my curiosity is how big data may inform my organization’s operations through the revision of protocols, thereby improving clinical practice. I have never had much opportunity before to consider how data collected by our EHR truly inform patient care, and especially how they might impact revenue. For example, patients who come in for vaccinations may also receive additional preventive care (Kaelber, 2016). But in order to actually delve into the big data arena, it may be up to each individual librarian either to maintain a basic awareness or to seek out opportunities that may or may not be supported at the organizational level.


Brennan, P. (2015). Big Data in Nursing Research. NINR Big Data Boot Camp Part 4: Big Data in Nursing Research.

Burton, M., & Lyon, L. (2017). Data science in libraries. Bulletin of the Association for Information Science and Technology, 43(4), 33-35.

Gandomi, A., & Haider, M. (2015). Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 35(2), 137-144.

Groeneveld, P. W., & Rumsfeld, J. S. (2016). Can big data fulfill its promise? Circulation: Cardiovascular Quality and Outcomes, 9(6), 679-682. PMCID: PMC5396388.

Iwashyna, T. J., & Liu, V. (2014). What’s so different about big data? A primer for clinicians trained to think epidemiologically. Annals of the American Thoracic Society, 11(7), 1130-1135. PMCID: PMC4214055.

Kaelber, D. (2016). Using Clinical Data to Improve Clinical Patient Outcomes. NNLM Forum (online).

Martin, E. R. (2016). The Role of Librarians in Data Science: A Call to Action. Journal of eScience Librarianship, 4(2), 7.

Categories: Data Science

Reflections on Big Data in Healthcare: Exploring Emerging Roles

Tue, 2018-04-24 08:00

Written by Darlene Kelly, Library Director, Charles R. Drew University of Medicine and Science, Los Angeles, CA

Over the past few years, health sciences librarians (HSL) have become engaged in the discovery of Big Data in Healthcare. HSL have a history of being early adopters of technology, and we continue to demonstrate our ability to be transformative in the field of information science. I believe that the focus on data science initiatives by Dr. Patricia Brennan, Director of the National Library of Medicine (NLM), has further propelled librarians to contribute to NLM’s Strategic Initiatives on Big Data, and we are making strides. I strongly believe HSL are taking the necessary steps to become expert partners in the field of data science. I assert that four components are essential in this process: NLM training and support; Medical Library Association (MLA) education; the creation of collaborations; and self-discovery through lifelong learning.

The National Library of Medicine continues to be a catalyst in the discovery of Big Data initiatives. Epstein (2017) reported on NLM’s “increased focus on data science” (p. 308). Accordingly, NLM has created funding opportunities such as the NLM Administrative Supplements for Informationist Services in NIH-funded Research Projects (Admin Supp), which help identify collaborations with researchers who have NIH funding and encourage librarians to participate in data science. In addition, the National Network of Libraries of Medicine (NNLM) has implemented training courses on data science, including this course on Big Data in Healthcare.

The Medical Library Association (MLA) is addressing data science initiatives by modifying its competencies to include skills related to “information management and the curation of clinical and health information data” (Epstein, 2017, p. 309). MLA offers continuing education opportunities for librarians in the form of classes, webinars, annual meeting presentations, and, most recently, the MLA Research Training Institute. Thus, librarians have multiple opportunities to learn more about the field of data science.

Additionally, collaboration is key to becoming involved in data science. Through this class, I have learned about many of the opportunities that are available in the field of data science. Since taking this course, I have become more confident in identifying possible areas for collaboration. I was especially interested in the work by Read et al. (2015) that has been implemented at the New York University Health Sciences Library.

Librarians are curious informationists, and I think our involvement in data science is a natural progression. Often it is through self-discovery that we become involved or are able to identify where we can contribute. One example of self-discovery is continuing to learn about data science through other courses, such as those offered on Coursera. Involvement also starts with having a conversation with select stakeholders to identify potential interest in data science.

In conclusion, librarians can become experts in the field of data science. Several mechanisms are already in place to assist us with learning these skill sets. Both NLM and MLA offer a vast number of opportunities to learn more about data science and to participate in data science initiatives. As we create partnerships and collaborations, we will become more knowledgeable about the needs of our researchers. In addition, librarians must remember that we are lifelong learners and that we welcome a challenge.


Epstein, B. A. (2017). Health sciences libraries in the United States: new directions. Health Information & Libraries Journal, 34(4), 307-311.

Read, K. B., Surkis, A., Larson, C., McCrillis, A., Graff, A., Nicholson, J., & Xu, J. (2015). Starting the data conversation: informing data services at an academic health sciences library. Journal of the Medical Library Association: JMLA, 103(3), 131.


Categories: Data Science