National Network of Libraries of Medicine

SEA Data Science

News for Network Members in Alabama, District of Columbia, Florida, Georgia, Maryland, Mississippi, North Carolina, Puerto Rico, South Carolina, Tennessee, U.S. Virgin Islands, Virginia and West Virginia

Moodle Class Announcement: Big Data in Healthcare: Exploring Emerging Roles – July 9 – August 31, 2018

Tue, 2018-06-12 15:45

The National Network of Libraries of Medicine (NNLM) invites you to participate in Big Data in Healthcare: Exploring Emerging Roles. The course will be held primarily on the Moodle platform, with optional WebEx discussions. It is designed to help health sciences librarians understand the issues of big data in clinical outcomes and the roles they can take on in this service area.

Dates: July 9 – August 31, 2018

Register: To register for this course, please visit the class details page.

The class size for this course is limited to 40 students. We will begin a waitlist if more people are interested in participating.

Course instructors for the winter session are Ann Glusker, Pacific Northwest Region; Derek Johnson, Greater Midwest Region; Alicia Lillich, MidContinental Region; Ann Madhavan, Pacific Northwest Region; Aimee Gogan, Southeastern/Atlantic Region; and Elaina Vitale, Mid-Atlantic Region.

Please contact Aimee Gogan with questions.

Description: Class Overview

Big Data in Healthcare: Exploring Emerging Roles

The Big Data in Healthcare: Exploring Emerging Roles course will help health sciences librarians better understand the issues of big data in clinical outcomes and what roles health sciences librarians can take on in this service area. Course content comes from information shared by the presenters at the March 7, 2016 NNLM Using Data to Improve Clinical Patient Outcomes Forum, top selections from the NNLM MCR Data Curation/Management Journal Club and NNLM PSR Data Curation/Management Journal Club’s articles, NINR’s Nursing Research Boot Camp, recommended readings from previous cohorts, and Big Data University’s Big Data Fundamentals online course.

Participants will have the opportunity to share what they learned with the instructor from each section of the course content either through WebEx discussions or Moodle Discussions within each Module. These submissions can be used to help support the student’s views expressed in the final essay assignment.

Objectives: Students who successfully complete the course will:

  • Explain the role big data plays in clinical patient outcomes.
  • Explain current/potential roles in which librarians are supporting big data initiatives
  • Illustrate the fundamentals of big data from a systems perspective
  • Articulate their views/opinions on the role health sciences sector librarians play in supporting big data initiatives

NOTE: Participants will articulate their views on why health sciences librarians should or should not become involved in supporting big data initiatives by sharing a 500-800 word essay. Students are encouraged to be brave and bold in their views so as to elicit discussions about the roles librarians should play in this emerging field. Participants are encouraged to allow their views to be published on a NNLM online blog/newsletter as part of a dialog with the wider health sciences librarian community engaging in this topic. Your course instructors will reach out to you following the completion of the course.

In addition to gaining information, taking part in the big data in clinical care dialog, and earning 9 continuing education credits from the Medical Library Association, students may earn an IBM Open Badge from Big Data University.

This is a semi-self-paced course (“semi” meaning there are completion deadlines).

Course Expectations: To complete this course for nine MLA contact hours, participants are expected to:

  • Spend 1-2 hours completing the work within each module.
  • Commit to complete all activities and articulate your views within each module.
  • Complete course requirements by the deadline established in each module.
  • Coordinate with a course instructor to publish your observations/final assignments on a NNLM blog/newsletter
  • Provide course feedback on the Online Course Evaluation Form

Grading: This course uses a simple pass/fail grading system. When your submission meets the assignment’s expectations, you will receive full credit for the contact hours for that Module. If a submission is unclear or incomplete, you may be asked for more information until your instructor approves it.

  • For discussion posts, your activity will be marked as complete after you’ve submitted a discussion AND your instructor has assigned a point to mark it as complete.
  • If you participate in WebEx Journal Club Discussions (when available), your instructor will assign points in the Discussions for that module.
  • Students have the option to accept fewer contact hours. However, you will need to inform your course instructors ahead of time.

This is a Medical Library Association approved course that will earn students 9 contact hours.

 


Categories: Data Science

Big Data in Healthcare – Opportunities for Librarians

Wed, 2018-05-09 14:12

by Douglas J. Joubert, Informationist, NIH Library, Washington, DC

Over the last seven weeks of the Big Data in Healthcare course, we learned about big data and data science within the context of five distinct disciplines. This essay will provide an overview of big data and data science within each of the five disciplines, with a focus on how librarians can support researchers working in these fields.

Although not focused exclusively on Big Data, a recent report has strongly advocated for an increased role for librarians in the field of data science (Burton, Lyon, Erdmann, & Tijerina, 2018). This report outlines a multi-faceted framework for understanding the internal (within the discipline) and external (within the broader science disciplines) drivers that are changing the way in which we think about data.

Data science is one of those terms that can take on different meanings, depending on the practice area. One of the more popular representations of data science is that of Drew Conway. Conway represents data science as the intersection of three primary domains [Figure 1]. It is not vital that librarians be experts in each of the three domains that comprise this Venn diagram, nor is it even possible. What is important, and what serves as the primary thesis of this essay, is that librarians be grounded in how researchers in each of these areas produce, organize, and analyze data.

Figure 1: The Data Science Venn diagram[1].

This course introduced us to a number of different perspectives on the topic of big data. The first view was provided by a data informationist (Lisa Federer) who works for a large biomedical research center. She defined big data as having a number of distinct qualities. The first of these qualities is the amount of data being produced, commonly referred to as its volume (Federer, 2017). The second quality is the variety of the data, specifically, pulling data from many different sources, in many different formats (Federer, 2017). The third feature of big data is the rate at which the data is being produced, or its velocity (Federer, 2017). The last is data veracity, which refers to how much trust we place in the source of the data and in the data quality (Federer, 2017). Additional definitions were provided by two social scientists, a practicing clinician, and a nursing researcher.

The nursing perspective provided some additional insights that are worth exploring. First is the unique role that nurses play in the delivery of health care, and how this role influences big data research (Brennan, 2015). Second, Dr. Brennan emphasized that terms like Big Data to Knowledge (BD2K), big data, and precision medicine mean different things to different people (Brennan, 2015). The role for nursing is to make these terms meaningful to patients and their families. Last, she emphasized that these tools need to be understood from the nursing experience, which takes a more humanistic approach than the traditional medical model of health care delivery. Nurses are focused on getting the goals of precision medicine into the “hands of the people” (Brennan, 2015). All of these different perspectives are needed to fully understand the role of big data and how big data is changing the way that we conduct research, deliver health care, and make informed decisions.

Using three elements from Martin’s User-Centered Data Management Framework for Librarians, I will advocate for the increased role of librarians in both data science and big data initiatives. These elements are: (1) Service, (2) Best Practices for Data Management, and (3) Literacy (Martin, 2016).

Libraries have a long and rich history of providing services to different user groups. Adding data services as a component of more traditional library services allows libraries to respond to an increased demand for specialized levels of support for data science. Potential roles for librarians could fall into the following categories: (1) data extraction, (2) data wrangling, (3) data analysis, or (4) data visualization (Hamalainen, 2016). Some of these skills, like data extraction or data analysis, can be performed without much additional training. Data wrangling and data visualization are not out of reach for most librarians, if they get supplemental training. These four areas also require the least overhead when compared with, for example, hosting a data repository.

Also, many data service questions are very similar to the types of reference questions that librarians have traditionally answered. For example:

  • Knowing where to find authoritative and curated datasets
  • Knowing the best methods for searching datasets
  • Knowing how to choose the best software solutions
  • Knowing about current metadata schemas for data

Each week in this class presented us with a different challenge for managing data, and innovative solutions for dealing with these challenges. We also learned that these challenges are being addressed by local and national initiatives. At the federal level, a 2013 report released by the Office of Science and Technology Policy outlined a number of important policy principles (Holdren, 2013). Many of these principles align with the work of libraries, and present us with numerous opportunities. The first is helping researchers comply with changing grant requirements. The second is working with researchers to maximize transparency and accountability in collecting and storing data. The last is connecting researchers with tools like the Open Science Framework to support data sharing and increase reproducibility.

As someone who has spent a great deal of his professional life teaching library users, this topic resonates the most with me. Also, I feel that librarians make some of the best teachers. Teaching about data literacy, data analysis, and data management offers incredible potential for librarians. It has been my experience that starting small is the best entry point into teaching these topics, for example, working with a colleague to develop a data literacy class, or volunteering to serve as a teaching assistant or back-up for a more seasoned teacher. Teaching a class in R or Python is an admirable goal. However, it might not be the best place to start, nor is it necessarily the right solution for your library. Finally, look for both formal and informal professional development opportunities. This MOOC (Big Data in Healthcare[2]) and Best Practices for Biomedical Research Data Management[3] are just two recent examples of librarian-led data management classes. However, Meet Up groups[4] and connections developed through social media are also wonderful ways to learn and network.

References

Brennan, P. (2015). Big Data in Nursing. Bethesda: NINR Big Data Bootcamp.
Burton, M., Lyon, L., Erdmann, C., & Tijerina, B. (2018). Shifting to Data Savvy: The Future of Data Science in Libraries.
Federer, L. (2017). Data Science 101. NNLM Beyond the SEA Webinar Series.
Hamalainen, H. W. (2016). Geoscience Librarianship 101: Making Sense out of “GeoReference.” Baltimore.
Holdren, J. P. (2013). Increasing Access to the Results of Federally Funded Scientific Research. Retrieved from https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
Martin, E. (2016). The Role of Librarians in Data Science: A Call to Action. Journal of eScience Librarianship, e1092. http://doi.org/10.7191/jeslib.2015.1092

 

[1] http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
[2] https://nnlm.gov/moodle/enrol/index.php?id=703
[3] https://learn.canvas.net/courses/1854
[4] http://www.datacommunitydc.org/calendar/  or https://www.meetup.com/find/s

Categories: Data Science

NNLM Research Data Management Webinar Series: Research Data Management Services: Beyond Analysis and Coding – June 14, 2:00 PM ET

Fri, 2018-05-04 10:10

Date/Time: Thursday, June 14, 2018, 2:00 PM ET/11:00 AM PT

Presenter: Margaret Henderson, Health Sciences Librarian, San Diego State University Library, San Diego, CA

Contact: For additional information or questions, please contact Tony Nguyen.

Abstract: There is more to RDM services than the technical skills necessary for data management. Soft skills and non-technical skills are very important when setting up RDM services, and continue to be important to the sustainability of services. Reference skills, relationship building, negotiation, listening, facilitating access to de-centralized resources, policy knowledge and assessment, are all important to the success of a service. Margaret Henderson will discuss these skills and show you how to start RDM services, even if you don’t feel confident about your statistical skills or knowledge of R.

Presenter Bio: Margaret Henderson was recently appointed Health Sciences Librarian at San Diego State University Library. She is liaison to the College of Health and Human Services and is also working with other Librarians at SDSU to set up RDM services. Previously, she spent three and a half years setting up RDM services at Virginia Commonwealth University Libraries. Margaret has been a biomedical librarian for over 30 years and is a Distinguished member of the Academy of Healthcare Information Professionals. She has presented and written on many library topics over the years, and wrote the book, Data Management: A Practical Guide for Librarians (2016, Rowman & Littlefield).

Registration: Please visit our class page to sign up!

Categories: Data Science

Reflections on Big Data in Healthcare: Exploring Emerging Roles

Thu, 2018-05-03 08:15

Written by: Paul Levett, Reference and Instructional Librarian, Himmelfarb Health Sciences Library, George Washington University, Washington, DC

Do you think health sciences librarians should get involved with big data in healthcare?

Of the four V’s (velocity, volume, variety, value) described in Cognitive Class (n.d.), value is where medical librarians enter the discussion about Big Data, because we add value to unstructured data: we bring order to chaos! Traditionally librarians have done this by creating metadata about learning objects, e.g. cataloging, finding aids, and infographics. However, data mining, cleaning, analysis, and visualization require computer programming, mathematics, and statistics skills that are not part of library school MLS programs.

Burton & Lyon (2017) point to a technical skills gap that prevents librarians from contributing to big data initiatives. They promote the NCSU Data Science and Visualization Institute and Library Carpentry workshops to provide knowledge and opportunity to practice. But the NCSU Data Science and Visualization Institute lasts just one week, nowhere near enough time to develop and practice computer programming language, math, and statistics skills. Library Carpentry workshops typically are one-off instructional sessions that offer even less time, although I appreciate that the course material is available online at http://librarycarpentry.github.io/.  

On the question of whether librarians should be doing data science, you can argue that data science skills do touch on all the domains identified by Drummond et al (2015, Fig. 3 p.15) in the national librarian education needs assessment. Were I invited to suggest a program for developing the necessary skills to work in Big Data in Health Care Information Systems, I would suggest a program like the MSc Data Analytics program in the University of Sheffield Department of Computer Science, which provides opportunities to study R and Python programming and statistical analysis and to apply those skills to a real-world project over a one-year timeframe. Students on this program apply advanced mathematics skills, which is why the program requires an undergraduate degree in mathematics, economics, accounting, physics, chemistry or engineering.

This suggests a need for the creation of a data scientist specialty role, but I am not convinced the Library actually is the best home for that role. Recently Simmons College (2017) surveyed 1117 graduates of their MLS program about core librarian professional skills and knowledge. Nobody rated data science as a core or a specialized skill; 14 mentioned statistics/working with data, and only 6 mentioned data science/curation/management. As recently as last November, in the IMLS (2017) meeting on positioning MLS programs for the 21st century, there was lots of discussion about increasing the diversity of the profession but only one mention of data curation.

Tsakalos (2017) described data wrangling as “the process of importing, cleaning, and transforming raw data into actionable information for analysis. It is a time-consuming process that is estimated to take about 60-80% of analysts’ time.” I feel the current push for librarians to develop data wrangling skills is perilously close to an admission from data analysts that they want to offload what appears to be an onerous burden. This role would better fit someone working in University Departments of Computer Science, Mathematics, Statistics, or Epidemiology and Biostatistics. It’s critical for librarians to set the expectation that the library is not a raw data processing warehouse but a knowledge repository.
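For readers unfamiliar with the term, the three steps in the Tsakalos definition (importing, cleaning, transforming) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the column names, and the `wrangle` function are all hypothetical and are not drawn from the course or from any real dataset.

```python
import csv
import io

# Hypothetical raw export with stray whitespace, inconsistent casing,
# and a missing measurement (all values invented for illustration).
RAW = """patient_id,sex,weight_kg
 001 ,F, 61.5
002,m,
003,M,82
"""


def wrangle(raw_text):
    """Import, clean, and transform raw CSV text into analysis-ready rows."""
    rows = []
    # Import: parse the raw text into one dictionary per record.
    for record in csv.DictReader(io.StringIO(raw_text)):
        weight = record["weight_kg"].strip()
        rows.append({
            # Clean: remove stray whitespace and normalize casing.
            "patient_id": record["patient_id"].strip(),
            "sex": record["sex"].strip().upper(),
            # Transform: convert text to a number; keep a missing
            # measurement as None rather than guessing a value.
            "weight_kg": float(weight) if weight else None,
        })
    return rows


clean = wrangle(RAW)
```

Even on three rows, every field needs its own cleaning rule, which gives some sense of why the work is estimated to consume so much of an analyst's time.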

Where should librarians get involved?

There may be a role for librarians in passing on to Hospital IT departments information about updates and changes to important biomarkers, where those need to be manually set as parameters by programmers building clinical decision support on top of EHR systems. However, as this enters the realm of medico-legal responsibility, the onus should be on EHR software developers to perform this necessary ongoing maintenance role.

Krumholz (2014) described how observational non-experimental studies generate data to support causal inferences and he points to comparative effectiveness studies as a potentially useful application of cluster analysis on large clinical data sets. A systematic review should be a pre-requisite for any health policy comparative effectiveness study, and this is where I as a librarian could best employ my literature search skills.

Librarians could be trained and certified to deliver REDCap training, since the data capture form design issues are similar to those in Microsoft Access. Librarians would benefit by developing a deeper understanding of study design issues such as timing follow-up, patient data protection principles, and setting automated reminder parameters, while the enterprise would benefit from additional trainers to further spread the use of the REDCap clinical trial data collection tool.

Sources

Burton, M., and Lyon, L. (2017). Data Science in Libraries. Research Data and Preservation (RDAP) Review. Bulletin of the Association for Information Science and Technology, 43(4) 33-35.

Cognitive Class/Fireside Analytics (n.d.). Big Data 101. Retrieved from  https://cognitiveclass.ai/courses/what-is-big-data/

Drummond, C., Clareson, T., Gemmill Arp, L., and Skinner, K. (2015). Libraries, Archives, and Museums (LAM) Education Needs Assessments: Bridging the Gaps. Retrieved from https://educopia.org/sites/educopia.org/files/publications/MtL_LAM_EducationNeedsAssessments_20151104_0.pdf

U.S. Institute of Museum and Library Services (IMLS) (2017). Positioning MLS programs for the 21st century. Retrieved from https://www.imls.gov/news-events/events/positioning-library-and-information-science-graduate-programs-21st-century

Krumholz, H. M. (2014). Big data and new knowledge in Medicine. Health Affairs, 33(7): 1163-1170

Simmons College (2017). Librarian professional skills and knowledge survey April 2017. Retrieved from http://slis.simmons.edu/blogs/unbound/2017/05/17/core-skills-lis/

Tsakalos, V. (2017). Data wrangling. Retrieved from https://www.r-bloggers.com/data-wrangling-cleansing-regular-expressions-33/

Categories: Data Science

Reflections on Big Data in Healthcare: Exploring Emerging Roles

Thu, 2018-04-05 09:57

Written by: Monica Riley, Serials Librarian, Morehouse School of Medicine, Atlanta, GA

As librarians and information managers, it is our duty to stay abreast of emerging trends and technologies and how they might impact our users. Health Sciences librarians can and should get involved with big data in healthcare, but to what extent I’m not certain. There are many factors that may present challenges in providing data services including staffing, knowledge and expertise, budget, and technology deficiencies. Health Sciences librarians need to think strategically and collaboratively about what type of data services they might be able to provide, and how best to execute them. Recognizing that all aspects of data management require a specific skill set, areas for training and continuing education should be identified, and these newly learned skills put into practice on a consistent basis.

As the trend of big data and research data management continues to grow, organizations should consider incorporating data services into their strategic plans and developing priority areas with specific timelines. While I don’t think it’s necessary or always feasible for librarians to become data scientists, at minimum we need to be prepared to answer questions related to big data and point to resources. One thing libraries can do immediately is assemble a big data/data science working group, with a cross-section of staff who can assess the needs at their institution, and contribute unique perspectives and ideas on how best to address those needs. Through these discussions the working group can develop an action plan and establish big data initiatives. Something as simple as creating an online resource guide or other research guide with general information, tools, and resources for big data is a good way to test the waters without getting into the complexities of big data. This could be either a collaborative or individual effort.

Thinking about a user-centered approach as Elaine R. Martin discussed in her article “The Role of Librarians in Data Science: A Call to Action”, identifying user needs is the first step in determining whether or not you can successfully provide data services. Some questions that would need to be addressed are, what type of data services are your users asking for? And who on staff is readily available with the knowledge and expertise to handle those requests? This will most likely involve partnering with other departments, as well as getting valuable feedback from users.

As health sciences librarians attempt to make sense of and define their specific role in big data, for the immediate future I feel our talents would be best suited to a supportive role. Providing consultations and assistance with research data management, bringing awareness to big data resources, and possibly facilitating training on big data tools are just a few ways we can contribute. How we gather, analyze, store, and preserve information is constantly evolving, and so should our roles as librarians. Although big data can be very complex, and the idea of assuming these responsibilities can be extremely daunting, I think it’s important for us to remain steadily involved in these discussions. As others may not recognize the skills and value that librarians bring to the table, we need to advocate for ourselves and create opportunities to become a part of this big data movement.

Categories: Data Science