National Network of Libraries of Medicine

MAR Data Science

News Highlights from the NNLM Middle Atlantic Region

Big Data in Healthcare: Exploring Emerging Roles

Tue, 2018-01-09 07:00

The National Network of Libraries of Medicine (NNLM) invites you to participate in Big Data in Healthcare: Exploring Emerging Roles. This course will be primarily held via the Moodle platform with optional WebEx discussions. This course is designed to help health sciences librarians understand the issues of big data in clinical outcomes and what roles health sciences librarians can take on in this service area. Register today!

Dates: February 5 – March 30, 2018

The class size for this course is limited to 60 students. We will begin a waitlist if more people are interested in participating.

Course instructors for the winter session are Ann Glusker, Pacific Northwest Region; Derek Johnson, Greater Midwest Region; Alicia Lillich, MidContinental Region; Ann Madhavan, Pacific Northwest Region; Tony Nguyen, Southeastern/Atlantic Region; and Elaina Vitale, Mid-Atlantic Region.

Please contact Tony Nguyen with questions.

About the Class

The Big Data in Healthcare: Exploring Emerging Roles course will help health sciences librarians better understand the issues of big data in clinical outcomes and what roles health sciences librarians can take on in this service area. Course content comes from information shared by the presenters at the March 7, 2016 NNLM Using Data to Improve Clinical Patient Outcomes Forum, top selections from the articles of the NNLM MCR Data Curation/Management Journal Club and the NNLM PSR Data Curation/Management Journal Club, NINR’s Nursing Research Boot Camp, recommended readings from previous cohorts, and Big Data University’s Big Data Fundamentals online course.

Participants will have the opportunity to share what they learned with the instructor from each section of the course content either through WebEx discussions or Moodle Discussions within each Module. These submissions can be used to help support the student’s views expressed in the final essay assignment.


Students who successfully complete the course will:

  • Explain the role big data plays in clinical patient outcomes.
  • Explain current/potential roles in which librarians are supporting big data initiatives.
  • Illustrate the fundamentals of big data from a systems perspective.
  • Articulate their views/options on the role health sciences sector librarians play in supporting big data initiatives.

NOTE: Participants will articulate their views on why health sciences librarians should or should not become involved in supporting big data initiatives by sharing a 500-800 word essay. Students are encouraged to be brave and bold in their views so as to elicit discussions about the roles librarians should play in this emerging field. Participants are encouraged to allow their views to be published on an NNLM online blog/newsletter as part of a dialog with the wider health sciences librarian community engaging in this topic. Your course instructors will reach out to you following the completion of the course.

In addition to gaining information, taking part in the big data in clinical care dialog, and earning 9 continuing education credits from the Medical Library Association, students may earn an IBM Open Badge from Big Data University.

This is a semi-self-paced course (“semi” meaning there are completion deadlines). While offered primarily asynchronously, your course instructors plan to offer opportunities in which participants can join a WebEx discussion to discuss some of the content.

Course Expectations

To complete this course for nine MLA contact hours, participants are expected to:

  • Spend 1-2 hours completing the work within each module.
  • Commit to complete all activities and articulate your views within each module.
  • Complete course requirements by the deadline established in each module.
  • Coordinate with a course instructor to publish your observations/final assignments on an NNLM blog/newsletter.
  • Provide course feedback on the Online Course Evaluation Form.

This course uses a simple pass/fail grading system. When your submission meets the assignment’s expectations, you will receive full credit for the contact hours for that Module. For submissions that are unclear or incomplete, you may be asked for more information until your instructor approves the submission.

  • For discussion posts, your activity will be marked as complete after you’ve submitted a discussion AND your instructor assigns a point to mark it as complete.
  • If you participate in WebEx Journal Club Discussions (when available), your instructor will assign points in the Discussions for that module.
  • Students have the option to accept fewer contact hours. However, you will need to inform your course instructors ahead of time.
Categories: Data Science

Reflections on Librarianship and Big Data

Wed, 2017-11-01 08:00

In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? Where should librarians get involved, if you think they should? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.

Written By Margaret (Peg) Burnette, Assistant Professor & Biomedical Sciences Librarian, University of Illinois at Urbana-Champaign

The world of librarianship is changing at what seems to be an ever-increasing rate. The librarian’s role has evolved from information organization and access to the provision of specialized services related to information and data quality, management, analysis, and application. Big data is here to stay and permeates both our professional and personal lives. In the era of digital content and libraries without walls, librarians grapple with new challenges in order to remain productive and relevant. And while users may no longer need help finding information, many likely need help with evaluation and management of increasingly large amounts of information and data.

In many ways, the demands of big data are the same as for small data. These demands afford opportunities for librarians that naturally complement librarians’ expertise. Traditional organization and classification skills are still needed to help researchers find, wrangle, and share research and data products of all kinds. More specialized skills, such as statistical or analytical expertise, subject or technical expertise, or advanced computer skills (coding, etc.), enhance the ability to provide highly sought after services that complement the research and education enterprise.

Despite these opportunities, librarians often lack the skills necessary to support research data in a holistic way. Libraries need to plan carefully to match services with librarian competencies and implement strategies to fill gaps. The research and data lifecycles may provide useful frameworks for determining and developing services. For example, an institution might decide to focus on the identification, procurement, and application of existing data. Another might focus on infrastructure for data storage solutions, which can be a huge challenge for researchers, particularly for big data initiatives. Data analysis and data visualization are additional support areas that researchers clamor for. SPSS and R are familiar tools, but few librarians have the skills necessary to provide robust support. The immersion that is necessary for mastery of tools like these is simply not realistic for librarians who often wear multiple hats.

A second framework that librarians might consider is big data’s five “Vs”. The Volume of data being produced can benefit from librarian expertise in the areas of organization, security, and storage options. Libraries that are not equipped to offer storage solutions can nonetheless provide information about options and respective implications. Velocity affords opportunities for librarian expertise in the areas of organization, access, and retrieval. For example, librarians can leverage expertise in controlled vocabularies and metadata for data mining projects. Additionally, librarians can apply organizational acumen to help wrangle the Variety of data, both structured and unstructured. Veracity of information is a mainstay of librarianship and data quality is no different. And finally, librarian contributions to data management, curation, and sharing strategies can contribute significantly to the Value of that data.

Ultimately, with all of these opportunities, it is vital to consider data services within the larger institutional context. Some of the services that libraries consider may be provided by other entities such as offices of research or IT units. Coordination is vital to ensure seamless and integrated service streams, shared and complementary responsibilities, and unified goals.

Categories: Data Science

Perspectives of Librarian Involvement in the Use of Big Data and Data Science

Wed, 2017-10-25 08:00

In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? Where should librarians get involved, if you think they should? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.

Written By Mary Pat Harnegie, MLIS, AHIP, Medical Librarian, Cleveland Clinic Alumni Library and Manager, South Pointe Hospital Library

[Image: ostrich and man with their heads in the sand]

This picture captures how I might want to feel about Big Data and the role of the librarian. After seeing the complexity of the Big Data processes and the unorganized systems that contribute to its disorder, I feel overwhelmed by the expansiveness of what needs to be done to make it usable. If I put my head in the sand, the problem(s) go away…Right?! Wrong!!

Sometimes, order comes out of organizing parts of disorder. So if you have a big picture of chaos, one way to attack the disorder is to pick a part that one can bring into order. When my family is faced with a seemingly insurmountable problem, I tell them that solving the problem is like eating an elephant. You can’t eat an elephant all in one sitting; you have to deal with it in bite-size chunks. The same thing can be applied to a problem: break down your problem into bite-size chunks, identify facets of the problem, develop a solution to one facet, and execute it. Look at the next facet of the problem and solve it. After a period of time, you have your elephant-sized problem solved because you dealt with it incrementally.

The class participants observed many examples of what big data is and its amazing applications in business and commerce. Several applications of Big Data and its use in medicine were exhibited in the videos of Kaelber, Longhurst and Meo. I found Dr. Longhurst’s examples of Big Data implications and adopted practices interesting. When given the choice between the supported research option and a “this is the way we have always done it” option in the EHR, his colleagues would often choose the second option. But when the EHR was defaulted to the supported research option, with the alternative option available as a “fill-in-the-blank”, researchers took the path of least resistance and checked the defaulted option. It seems that a lot of the success he described was in giving colleagues an easy-to-use default of the supported recommended action. This was also the case in Dr. Kaelber’s examples.

Many of the readings used in the course discussed the nature of unstructured data and its uselessness. The librarian has a place in the Big Data universe as a provider of organizational skills. We have experience in building ontologies like MeSH, where a controlled vocabulary can facilitate a uniform vocabulary through the use of related terms and automated relationships that can help build order in a data schema as well as be used as a format for machine learning. In our readings, we see that the massive amount of data will have to be parsed against standards of uniformity to be reliable and usable. This organizational skill can contribute to Big Data utilization in this way.
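As a minimal sketch of how a controlled vocabulary can bring uniformity to free-text data, the Python below maps variant terms to a single preferred heading and sets aside anything it cannot map. The small term table and the function names `normalize` and `index_records` are hypothetical illustrations, not drawn from MeSH data files or from the course materials.

```python
# Hypothetical mini-vocabulary: variant (entry) terms mapped to one preferred heading.
# A real vocabulary such as MeSH is far larger and is distributed in its own formats.
ENTRY_TERMS = {
    "heart attack": "Myocardial Infarction",
    "mi": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
    "high blood pressure": "Hypertension",
    "hypertension": "Hypertension",
}


def normalize(term):
    """Return the preferred heading for a free-text term, or None if unmapped."""
    # Lowercase and collapse whitespace so trivial variations still match.
    key = " ".join(term.lower().split())
    return ENTRY_TERMS.get(key)


def index_records(records):
    """Group record ids under preferred headings; keep unmapped terms for review."""
    index, unmapped = {}, []
    for rec_id, term in records:
        heading = normalize(term)
        if heading is None:
            unmapped.append((rec_id, term))
        else:
            index.setdefault(heading, []).append(rec_id)
    return index, unmapped


if __name__ == "__main__":
    sample = [(1, "MI"), (2, "heart attack"), (3, "High blood pressure"), (4, "chest pain")]
    index, unmapped = index_records(sample)
    print(index)
    print(unmapped)
```

Terms that fall outside the table are returned separately rather than discarded, which mirrors the point that someone with vocabulary expertise still has to decide where they belong.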

Librarians have database design and development skills that can be applied to the organization and data mining processes for Big Data processing. These skills can be adapted and refined for data management processes also. The use of clinical decision-making features, similar to the Green Button, will require organization, architecture design, and prioritization skills that librarians have developed as tools of their trade.

The enormity of the processes that need to happen is the reason for the picture of the ostrich and the man with their heads in the sand. But in ignoring the elephant in the room, librarians will not serve their ultimate constituent, the patient, well. The Big Data elephant presents a large and complex set of problems that must be organized for the data to be effective in patient care. Our skill sets can make us team players in the organization, analysis, and dissemination of great health care information and practices.

Categories: Data Science

Finding a Foothold for Hospital Librarians in Big Data

Thu, 2017-10-19 08:00

In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? Where should librarians get involved, if you think they should? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.

Written by Emily Schon, MLIS, AHIP, Librarian, Boston Children’s Hospital

“Big Data” seems to be a term used everywhere – from giant purchasing sites like Amazon and streaming services like Netflix, to government agencies and universities. It certainly seems useful to look at giant amounts of data, analyze it, and see how it can project outcomes or improve users’ experiences. Thanks to this growing trend, hospitals are making great strides toward utilizing Big Data. Many are now collecting and storing enormous amounts of data about their patients, which data scientists and other individuals around the hospital can utilize to improve and support clinical care.

As long-standing brokers of information, hospital librarians would seem to have a natural role in this new era of big data. Librarians possess many of the skills (e.g. data organization, management, etc.) that are and will be increasingly important in this realm. Yet, as things stand now, hospital librarians have neither the time nor the resources to add such a “big” responsibility to an already lengthy list of duties. Additionally, many hospitals do not include librarians in big data initiatives, such as EMR/EHR, where their skills could be most utilized to help change clinical decision making and ultimately clinical practice.

But that doesn’t mean this will always be the case. As big data becomes increasingly critical to hospital business – from clinical research to hospital operations – library departments could very well reorganize in order to prioritize the management of big data. For instance, dedicated librarians with skills and experience in data science could fill this role in hospitals. As hospitals’ big data efforts continue to grow, interdepartmental efforts may become more cohesive and integrated, and librarians will gain access to important parts along the whole big data process.  And it goes without saying that hospital librarians would need to be compensated at a level comparable to data scientists in order to attract top talent once they reach this point.

In the meantime, hospital librarians can take small measures to support data scientists and other researchers in their big endeavors hospital-wide. It would be worthwhile for hospital medical librarians to help researchers understand and prepare for sharing mandates, which would include finding repositories for data and providing guidance on where and how to share data in a reproducible/preservable manner. Librarians can do this through individual meetings and small classes that fit in with other daily operations, or by creating or adding to resource guides or pages on library websites. Librarians can also create general overview guides on what big data is, along with best practices, definitions, links to tools commonly used in big data, and suggested readings.

Librarians who have more time can become better versed in statistical analysis tools (SAS, SPSS, R, Python, etc.) to provide instruction or assist researchers working on datasets on a consultation basis, similar to how they may assist with literature searches. They can also develop relationships with other departments, such as research computing groups within a hospital, to collaborate and find other fits for helping researchers in this manner.

Given the limited time and resources of many hospital librarians, and the often compartmentalized nature of hospitals, it is up to the hospital medical librarian to find and create a “role” within the world of hospital big data if one is desired. Librarians can draw upon their skillsets already in place, such as their superb organization and management skills, teaching, searching, and preservation. Since big data is a vast, quickly growing, and important field, it seems a natural fit for a librarian. But perhaps, for now, the role of the hospital librarian should only be a small role – one to start and find a foothold, and later look to grow.

Categories: Data Science

Request for Information: Next-Generation Data Science Challenges in Health and Biomedicine

Tue, 2017-10-03 15:47

On behalf of the National Institutes of Health (NIH), the National Library of Medicine (NLM) seeks community input on new data science research initiatives that could address key challenges currently faced by researchers, clinicians, administrators, and others, in all areas of biomedical, social/behavioral and health-related research. The field of data science is broad in scope, encompassing approaches for the generation, characterization, management, storage, analysis, visualization, integration and use of large, heterogeneous data sets that have relevance to health and biomedicine. Data science undergirds the broad and interdependent objectives of the NIH Strategic Plan.

Information about data science research directions that could lead to breakthroughs in any or all NIH interest areas is welcomed, whether applicable across wide swaths of health and biomedicine, or focused on particular research domains.

Information Requested:

NLM requests information on the three focal areas listed below:

  1. Promising directions for new data science research in the context of health and biomedicine. Input might address such topics as Data Driven Discovery and Data Driven Health Improvement.
  2. Promising directions for new initiatives relating to open science and research reproducibility. Input might address such topics as Advanced Data Management and Intelligent and Learning Systems for Health.
  3. Promising directions for workforce development and new partnerships. Input might address such topics as Workforce Development and Diversity and New Stakeholder Partnerships.

Within these general topic areas, or others related to data science in health and biomedicine, NLM invites researchers, clinicians, organizations, industry representatives and other interested parties to provide input on:

  • Research areas that could benefit most from advanced data science methods and approaches;
  • Data science methods that need updating, or gap areas where new approaches are needed;
  • Priorities for new data science research;
  • Appropriate partnerships and settings for expanded data science research.

See the full notice of request for more background information and details on how to submit a response.


Please direct all inquiries to Valerie Florance, PhD
National Library of Medicine (NLM)
Telephone: 301-496-4621

Categories: Data Science