National Network of Libraries of Medicine

SEA Data Science

News for Network Members in Alabama, District of Columbia, Florida, Georgia, Maryland, Mississippi, North Carolina, Puerto Rico, South Carolina, Tennessee, U.S. Virgin Islands, Virginia and West Virginia

Moodle Class Announcement: Big Data in Healthcare: Exploring Emerging Roles

Wed, 2018-01-03 14:46

The National Network of Libraries of Medicine (NNLM) invites you to participate in Big Data in Healthcare: Exploring Emerging Roles. This course will be held primarily via the Moodle platform, with optional WebEx discussions. It is designed to help health sciences librarians understand the issues of big data in clinical outcomes and what roles health sciences librarians can take on in this service area.

Dates: February 5 – March 30, 2018

Register: To register for this class, please visit:

The class size for this course is limited to 60 students. We will begin a waitlist if more people are interested in participating.

Course instructors for the winter session are Ann Glusker, Pacific Northwest Region; Derek Johnson, Greater Midwest Region; Alicia Lillich, MidContinental Region; Ann Madhavan, Pacific Northwest Region; Tony Nguyen, Southeastern/Atlantic Region; and Elaina Vitale, Mid-Atlantic Region.

Please contact Tony Nguyen with questions.

Description: The Big Data in Healthcare: Exploring Emerging Roles course will help health sciences librarians better understand the issues of big data in clinical outcomes and what roles health sciences librarians can take on in this service area. Course content comes from information shared by the presenters at the March 7, 2016 NNLM Using Data to Improve Clinical Patient Outcomes Forum, top selections from the NNLM MCR Data Curation/Management Journal Club and NNLM PSR Data Curation/Management Journal Club’s articles, NINR’s Nursing Research Boot Camp, recommended readings from previous cohorts, and Big Data University’s Big Data Fundamentals online course.

Participants will have the opportunity to share what they learned with the instructor from each section of the course content either through WebEx discussions or Moodle Discussions within each Module. These submissions can be used to help support the student’s views expressed in the final essay assignment.

Objectives: Students who successfully complete the course will:

  • Explain the role big data plays in clinical patient outcomes.
  • Explain current/potential roles in which librarians are supporting big data initiatives.
  • Illustrate the fundamentals of big data from a systems perspective.
  • Articulate their views/options on the role health sciences sector librarians play in supporting big data initiatives.

NOTE: Participants will articulate their views on why health sciences librarians should or should not become involved in supporting big data initiatives by sharing a 500-800 word essay. Students are encouraged to be brave and bold in their views so as to elicit discussions about the roles librarians should play in this emerging field. Participants are encouraged to allow their views to be published on an NNLM online blog/newsletter as part of a dialog with the wider health sciences librarian community engaging in this topic. Your course instructors will reach out to you following the completion of the course.

In addition to gaining information, taking part in the big data in clinical care dialog, and earning 9 continuing education credits from the Medical Library Association, students may earn an IBM Open Badge from Big Data University.

This is a semi-self-paced course (“semi” meaning there are completion deadlines). While offered primarily asynchronously, your course instructors plan to offer opportunities in which participants can join a WebEx discussion to discuss some of the content.

Course Expectations: To complete this course for nine MLA contact hours, participants are expected to:

  • Spend 1-2 hours completing the work within each module.
  • Commit to completing all activities and articulating your views within each module.
  • Complete course requirements by the deadline established in each module.
  • Coordinate with a course instructor to publish your observations/final assignments on an NNLM blog/newsletter.
  • Provide course feedback on the Online Course Evaluation Form.

Grading: This course uses a simple pass/fail grading system. When your submission meets the assignment’s expectations, you will receive full credit for the contact hours for that Module. For submissions that are unclear or incomplete, you may be asked for more information until your instructor approves the submission.

  • For discussion posts, your activity will be marked as complete after you’ve submitted a discussion AND your instructor has assigned a point to mark it as complete.
  • If you participate in WebEx Journal Club Discussions (when available), your instructor will assign points in the Discussions for that module.
  • Students have the option to accept fewer contact hours. However, you will need to inform your course instructors ahead of time.
Categories: Data Science

Call for Participation: NNLM SEA Data Management Program Advisory Committee

Wed, 2017-11-15 12:00

The National Network of Libraries of Medicine (NNLM), Southeastern/Atlantic Region (SEA) is extending an invitation for network members to join and participate in the Data Management Program Advisory Committee (PAC).

The Data Management PAC will work cooperatively with Tony Nguyen, Technology and Communications Coordinator, in planning and carrying out committee work. Members are volunteers who share expert knowledge of the topic.

The responsibilities of PACs include:

  • Advise NNLM staff on the need for and relative priority of education within the program area.
  • Assist with program evaluation.
  • Ensure that programming is aligned with local needs.
  • Evaluate technology- and data-related award applications.

The PAC will meet a few times a year via web conferencing software. NNLM SEA will select up to 7 members to participate in this PAC.

If you would like to nominate yourself or a colleague as a member, please visit: The deadline to apply is December 1, 2017.

Categories: Data Science

Reflections on: Big Data in Healthcare: Exploring Emerging Roles

Tue, 2017-11-07 09:13

In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? Where should librarians get involved, if you think they should? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.

Written by: Meaghan Muir, MLIS, Manager, Library Services, Boston Children’s Hospital

“Big Data in Healthcare: Exploring Emerging Roles” has been a valuable introduction to discovering how physicians, nurses, researchers, and librarians are using big data and data science. It has been interesting to explore the different ways in which big data is being used, especially in our day-to-day lives, such as how Netflix and online retailers are using big data to interact with their customers. Data of all kinds is being created every second of the day, and the exponential growth is overwhelming and difficult to comprehend.

Data science is multidisciplinary, and there absolutely is a role for health sciences libraries. However, we cannot assume that all health sciences libraries, and especially all health sciences librarians, can readily become involved. There are clear opportunities, but there are also significant barriers to offering library-based support of data science activities. Hospital libraries may have unique challenges and opportunities. Some challenges discussed in this course that are specific to hospital libraries/librarians include:

  • Lack of competencies to use data science tools.
  • No dedicated library staff/position for data science.
  • Lack of knowledge about researchers’ work and the data life cycle.
  • Getting buy-in from stakeholders/partners.
  • Lack of experience; many have never worked on a big data project.
  • Lack of time and resources to implement data science support services.

The good news for hospital librarians is that there are plenty of opportunities and various ways to engage with clinicians and researchers working with big data. Librarians already possess skills to assist clinicians and researchers. We are accustomed to educating user populations on how to use resources such as databases and other library-related programs. One possible role for health sciences librarians is taking literature searches a step further: searching not only the published literature but also the associated data set directly (if applicable). Librarians are also well-versed in advising on open access/information sharing policies, which can be translated to helping researchers comply with data sharing policies. This includes talking to researchers about mandates to share their data and helping them prepare it in a shareable form, as well as educating others on existing hospital-specific data management policies. Focusing on specific populations that are engaging in big data projects is another opportunity. For example, nurses will often turn to a hospital library as their sole resource because they might not be connected to an academic library. Working with nurses who are involved or getting involved in big data endeavors is an obvious partnership, since the library is already their go-to for help with various projects. Libraries can help people who are new to big data by teaching them how big data is generated and collected. It’s also a natural fit for librarians to help others learn how to organize information of all types, including big data.

Getting started is somewhat daunting. The JMLA article (Read KB, Surkis A, Larson C, McCrillis A, Graff A, Nicholson J, Xu J. Starting the data conversation: informing data services at an academic health sciences library. J Med Libr Assoc. 2015 Jul;103(3):131-5) offers one way to approach this. Simply put, librarians can start a conversation with groups within the hospital that might be potential partners. Ideally, a conversation would be started with a clinical research group and a basic science research group, as the JMLA article discussed. This conversation would assess current practices and potential needs, and introduce to the stakeholders what a librarian might bring to the table. It helps to keep in mind what Dr. Brenner said about not needing to be data scientists to do data science. It is unlikely that the typical hospital library will have a data science librarian on staff (at this time), but as described above, there are many ways in which health sciences librarians can complement the activities of clinicians and researchers engaging in data science efforts. It is rather encouraging to see that the opportunities discussed far outnumber the challenges.

Categories: Data Science

Reflection: Should Health Science Librarians Be Involved in Big Data?

Thu, 2017-11-02 07:58

In the NNLM Big Data in Healthcare: Exploring Emerging Roles course, we asked participants, as they progressed through the course, to consider the following questions: Do you think health sciences librarians should get involved with big data in healthcare? Where should librarians get involved, if you think they should? If you think they should not, explain why. You may also combine a “should/should not” approach if you would like to argue both sides. NNLM will feature responses from different participants over the coming weeks.

Should Health Science Librarians Be Involved in Big Data?

Written by Adelia Grabowsky, MLIS, Health Sciences Librarian, Ralph Brown Draughon Library, Auburn University

I think that health science librarians are able to support big data in the same way that they are involved in supporting any type of data. Chandrasekaran (2013) illustrates the variety and complexity of skills required to work with data. He includes additional requirements for big data, including the necessity of working with specialized software like Hadoop, which permits collection and analysis of data sets spread out across multiple computers (Chandrasekaran, 2013). Most librarians do not have all or even most of the skills enumerated on Chandrasekaran’s (2013) map. However, during a talk at a National Institute of Nursing Research (NINR) Big Data Boot Camp, Brennan (2015) suggested that not every nurse needs to be, or has the time to be, a data scientist. Instead, she believes that all nurses should have an understanding of data science, with a small number of nurses developing the skills and knowledge to actively engage in big data studies (Brennan, 2015). I think this premise also holds true for librarian support for big data. It is important that all librarians have a basic understanding of the research data life cycle and of the vocabulary of data. However, more extensive involvement may depend on the fit of data needs to more traditional librarian roles and/or the skills and interests of the specific librarian.

Federer (2016) presents a research data life cycle which begins with data-specific planning for research projects and proceeds to data collection or acquisition, data analysis or interpretation, data preservation and curation, and finally, sharing of data. Many librarians already support these stages of the data life cycle in some way, with the exception of data analysis or interpretation. Although librarians have not traditionally been involved with data collection, they have often been involved with data acquisition by assisting in finding free data sets or acquiring fee-based ones. Librarians have also traditionally been part of the process of making results of research more “findable” by attaching metadata. As funding agencies have begun to require planning, which includes how data will be stored and shared, librarians have used those same skills to assist in the planning process, increase findability by attaching metadata to data sets, and find suitable spaces (either in-house or subject- or agency-based) in which to store and preserve data. All of these activities should translate to work with big data. The exception to library support of the research data life cycle is data analysis/visualization. For most librarians, this area will require an upgrading of skills in order to provide support. I think the decision to provide support for data analysis will depend on an individual librarian’s interest and the time they have to devote to new support activities. One example of a likely requirement in this area is a knowledge of programming languages like R or Python (Federer, 2016). For librarians who are interested in providing support for data analysis, there are many training opportunities, ranging from learning R through an institutional subscription to specialized short courses like the Data Science and Visualization Institute for Librarians (NCSU Libraries, n.d.).

One thing to remember is that the use of big data in healthcare is still in its infancy, with continuing discussions about how and when data should be used (Cohen et al., 2015; Iwashyna & Liu, 2014; Krumholz, 2014) and about how current patient privacy protections impact the effective use of big data (Longhurst, Harrington, & Shah, 2014). As the use of big data grows and evolves, decisions made today about librarian support may not be as applicable in the future. Instead, librarians must stay informed about changes that are occurring and remain flexible in offering support and in their willingness to update skills if needed.


Brennan, P. (2015). NINR Big Data Boot Camp part 4: Big data in nursing research. Retrieved from

Chandrasekaran, S. (2013). Becoming a data scientist – Curriculum via metromap. Retrieved from

Cohen, B., Vawdrey, D. K., Liu, J., Furuya, E. Y., Mis, F. W., & Larson, E. (2015). Challenges associated with using large data sets for quality assessment and research in clinical settings. Policy, Politics, & Nursing Practice, 16(3–4), 117–124.

Federer, L. (2016). Research data management in the age of big data: Roles and opportunities for librarians. Information Services and Use, 36(1–2), 35–43.

Iwashyna, T. J., & Liu, V. (2014). What’s so different about big data?: A primer for clinicians trained to think epidemiologically. Annals of the American Thoracic Society, 11(7), 1130–1135.

Krumholz, H. M. (2014). Big data and new knowledge in medicine: The thinking, training, and tools needed for a learning health system. Health Affairs, 33(7), 1163–1170.

Longhurst, C. A., Harrington, R. A., & Shah, N. H. (2014). A “green button” for using aggregate patient data at the point of care. Health Affairs, 33(7), 1229–1235.

NCSU Libraries. (n.d.). Data Science and Visualization Institute for Librarians. Retrieved from

Categories: Data Science

Call for Feedback: NNLM Data Science and Data Management Training Needs Assessment

Tue, 2017-10-24 13:33

Are you interested in learning about biomedical and health research data management? Is there a specific area of data science/data management that you would like more information on?

The National Network of Libraries of Medicine (NNLM) Research Data Management Working Group requests feedback on training needs throughout the country in data science and data management. The field of data science is broad in scope, encompassing a wide variety of areas including the generation, characterization, management, storage, analysis, visualization, integration, and use of large data sets relevant to biomedical and health research. Participation in this training needs assessment will give NNLM direction for future educational opportunities.

To participate in this assessment, please visit: This survey will close November 30, 2017.


Categories: Data Science

Call for Reviewers: Biomedical and Health Research Data Management Training for Librarians

Thu, 2017-10-05 09:23

Are you an information professional experienced in research data management? Are you eager to share your knowledge with others and help expand the community of data librarians? The National Network of Libraries of Medicine Training Office has several opportunities for you to contribute to shaping a new training experience specifically for librarians.

This training is an 8-week online class with engaging lessons and practical activities, starting in January 2018. Students will complete a capstone project at the end of the course and the experience will culminate in a Capstone Summit at NIH on April 10-11, 2018.

Modules for the course may include, but are not limited to, the following core research data management (RDM) areas:

  1. Data Lifecycle and RDM Overview
  2. Data Documentation
  3. Data Wrangling
  4. Data Standards, Taxonomies, and Ontologies
  5. Data Security, Storage, and Preservation
  6. Data Sharing and Publishing
  7. Data Management Plans
  8. RDM at Your Institution

We are looking for experienced data librarians to participate in this project as module reviewers, co-teachers, and/or mentors. You may (and are encouraged to) apply for more than one role, and for more than one module.

  • Reviewers: Critique module content, test exercises, make suggestions, add resources. Deliverable: written report of findings. (Due Nov 30) Paid $250.
  • Co-Teachers: Assigned to one or more modules. Work with course facilitator to create a case study related to module topic (due Nov 15). Provide feedback on student assignments and answer questions for your module(s) in a timely manner during the course (Jan-March 2018).
    Deliverables: Case study by deadline, written report of suggestions for class improvement (due April 2, 2018). Paid $750.
  • Mentors: Participate in class discussions, sharing expertise as needed, during the course (January – March 2018). Provide at least 2 mentoring sessions to each assigned student (4-5) for completing the Capstone project, attend and participate in the Capstone Summit.
    Deliverables: written report of experience as mentor, suggestions for program improvement and sustainability of project. Paid $1250, and travel support to Capstone Summit up to $1250.

All reviewers, co-teachers, and mentors will be required to submit a W-9. Those receiving $1000 or more will also be required to complete a contract with the University of Utah.

Please submit your application via online form by October 20, 2017:

Application Includes:

  • Name
  • Current Role/Title
  • Place of Employment
  • Please briefly describe your area(s) of interest, research, or primary expertise in data management.
  • Please summarize your qualifications to serve as a content reviewer, co-teacher, and/or mentor for this research data management class.
  • Indicate the modules for which you would like to serve as a content reviewer and/or co-teacher.
  • Would you like to serve as a mentor for 4-5 students in completing the Capstone Project?
  • Curriculum vitae (attachment)

For questions, please contact: Shirley Zhao, Training Development Specialist:


Categories: Data Science