
Data Science

Call for pilot course participants: FAIR data for non-data librarians

SEA Data Science - Wed, 2020-12-16 12:59

NNLM SEA is proud to share an opportunity for joining a pilot course developed by a 2020-2021 Funded Project team from Virginia Commonwealth University, the University of Virginia, and George Mason University. Please see below for full details and contact information.

Are you a health sciences librarian in a public-facing role (for example, a subject specialist or hospital librarian)? Is helping patrons with their research data of interest to you, but not in your current skill set? Do you wish you had more knowledge of data-related issues and also more practical experience working with spreadsheet data?

We are currently accepting applications to pilot an online program called “Building FAIR Habits: Pilot FAIR Data Workshop for Data Novices”. This training opportunity seeks to meet data novice librarians at their comfort levels, in order to increase awareness of concepts like open science and FAIR data. We believe that librarians shouldn’t have to be data experts to introduce patrons to basic data management and open science concepts. We are looking for non-data librarians who want to learn how to talk to patrons about basic ways to preserve and share data. Since this is a pilot and due to the time commitment, each participant will be provided with a small stipend.

Workshop lessons will use examples from library spreadsheets to illustrate how FAIR practices apply to everyday data. Additional lessons will use the familiar library data as a bridge for understanding health assessment and health research data. The program will take place online from February through April 2021; exact dates are to be determined.

Brief Description

The program consists of:

1) workshops led by fellow librarians in data-centric roles;

2) online exercises and homework to explore and practice data management concepts;

3) discussions with trainers and fellow participants; and

4) a capstone project by participants to create an outreach or education object of their own (such as a brochure, short video, or LibGuide).

We expect it to take a minimum of 10 hours of lecture and discussion time, plus more time on exercises, homework, and capstone projects.

To Apply:

To apply, please complete this brief form. Preference will be given to librarians at NNLM SEA member libraries. Applications are due January 11.


Questions? Please contact Nina Exner, nexner@vcu.edu.

The post Call for pilot course participants: FAIR data for non-data librarians first appeared on SEA Currents.

Categories: Data Science

Living on the Data Fringe: Big Data, Small Data, Thick Data, Oh My …

MCR Data Science - Mon, 2020-12-14 10:32

As I have been watching the COVID-19 daily numbers of cases, hospitalizations, and deaths unfold, I have also been noticing the increase in the number of COVID ‘stories’ being shared across many news and media outlets. Even the hospitals are using a more qualitative approach to COVID-19 by having nurses and doctors tell their COVID-19 experience stories about the challenges of caring for patients, and their concerns about infecting their families, as they plead for people to wear masks. Although quantitative data (numbers and statistics) and qualitative data (words, stories and images) are very different, used together they each contribute to drawing a more holistic picture of our current and dire situation. One data approach is not better than another; in reality they support and enhance each other.

Big data are numerical or quantitative data (for example, the Johns Hopkins University COVID website). Big data analysis involves very large datasets, either structured or unstructured, that are analyzed by specialized software and require advanced data skills to clean, manipulate, and synthesize. The NNLM Data Thesaurus is a great resource to learn more about big data. Big data can also be large textual datasets analyzed using computational processes like text mining, natural language processing, machine learning, and artificial intelligence methods (for example, medical record data). On the other hand, smaller and more manageable datasets of numerical data are called small data. These are data that you can analyze manually in Excel, for example. An example of a small data website is CDC Places: Local Data for Better Health. Built on a larger national dataset, its data are organized so you can easily visualize health outcomes, prevention, and unhealthy behaviors and then download small subsets for further analysis. Another example of small data is using library data such as gate counts, collection usage data, or instructional statistics to take action or make decisions about library work.
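To make the idea of small data concrete, here is a minimal sketch of the kind of summary you might otherwise do by hand in a spreadsheet. The gate counts are invented for illustration, and the `summarize` helper is a hypothetical name, not part of any NNLM tool.

```python
from statistics import mean

# Hypothetical gate counts for one week (made-up numbers, for illustration only).
gate_counts = {
    "Mon": 412,
    "Tue": 455,
    "Wed": 430,
    "Thu": 398,
    "Fri": 301,
    "Sat": 120,
    "Sun": 95,
}


def summarize(counts):
    """Return the total, the daily average, and the busiest and quietest days."""
    return {
        "total": sum(counts.values()),
        "average": mean(counts.values()),
        "busiest": max(counts, key=counts.get),
        "quietest": min(counts, key=counts.get),
    }


summary = summarize(gate_counts)
print(summary)
```

A summary like this could feed a staffing or opening-hours decision, which is the sense in which small data supports everyday library work.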

Textual or qualitative data that are analyzed through a more manual process are called thick data. Examples of qualitative data are interviews or focus group transcripts, observation or field notes, and open-ended survey questions. Social media text can also be analyzed using qualitative methods such as thematic or sentiment analysis. For a COVID-19 example, the Voces of a Pandemic Collection at the University of Texas at Austin is a database of oral histories that presents Latinx COVID-19 experiences; transcripts of these COVID stories could be analyzed for patterns and themes using qualitative thematic analysis.

Qualitative data analysis can provide a rich description of the quantitative data findings if the two types of data are used together. Quantitative data can be used to explore what is happening, and qualitative data can be used to get at the why and how of what is happening. This mixed methods research design (using both quantitative and qualitative data collection methods together) is becoming a more common method of data analysis for improving business organizations, exploring health science or medical topics, doing assessment and evaluation, designing products, and studying innovation practices. Tricia Wang, a technology ethnographer, makes a case for why big data needs thick data. Want to read more about ethnography? The Association for Medical Education in Europe (AMEE) has created an informative guide on ethnography and how it is being used in medical education.

Image Sources: Numbers & Letters

The post Living on the Data Fringe: Big Data, Small Data, Thick Data, Oh My … first appeared on MidContinental Region News.


Join the NNLM in December for Two New Data Literacy Webinars!

SEA Data Science - Mon, 2020-11-30 15:48
Better than Best Practices: Inclusive Data Visualization

Description: Data visualization design “best practices” often do not prioritize (or outright reject) efforts to be inclusive. Libraries have an opportunity to step into the world of data visualization and empower historically marginalized voices in data creation and sharing. This webinar will explore the intersections of equity, inclusion, accessibility, and data visualization to consider who we’re visualizing for, what we’re visualizing, and how and why we’re visualizing it.

Presenter: Negeen Aghassibake is the Data Visualization Librarian at the University of Washington Libraries. Her goal is to help researchers think critically about data visualization and how it might play a role in their work. Before coming to UW, Negeen was a graduate student at the University of Texas at Austin School of Information, where she worked as an Assessment Graduate Research Assistant at UT Libraries. She holds a BA in Historical Studies and Literary Studies from the University of Texas at Dallas.

Date: Thursday, December 10th, 11 PT/Noon MT/1 CT/2 ET

Register: https://nnlm.gov/class/better-best-practices-inclusive-data-visualization/28122


How to “Speak Data”: Librarians as Public Data Ambassadors

Description: Data has become central to many aspects of civic life – governments run open data portals, organizations release public datasets, newspapers publish data-driven stories. How can librarians navigate this all more effectively? How can you help library patrons learn to use what they find in these data resources? Librarians can play an impactful role as “data ambassadors”, assisting their communities in finding and using data as part of their life as citizens. Join Prof. Bhargava for an interactive virtual workshop that will introduce participatory approaches to building your ability to “speak data”. You’ll leave the session with more confidence in your own ability to work with data, a language for talking about the role data can play in civic engagement, and experience with playful activities you can run yourself to build data literacy with your colleagues or patrons.

Speaker: Rahul Bhargava leads hands-on data workshops around the world. His Data Therapy workshops have been bringing people together around data with engaging activities for over 10 years. Rahul is co-creator of the Data Culture Project, which helps individuals and organizations build their data capacity in creative ways. He combines a background in interactive robotics, education, and effective data presentation to build creative and playful activities that introduce data literacy in appropriate ways to a variety of audiences. Rahul is an Assistant Professor in Journalism and Art + Design at Northeastern University, where he leads the Data Culture Group. He is an educator, technologist, drummer, and father. Twitter: @rahulbot

Date: Tuesday, December 15th, 11 PT/12 MT/1 CT/2 ET

Register: https://nnlm.gov/how-to-speak-data

Learn more about the Data Culture Group: https://medium.com/data-culture-group/librarians-as-civic-data-ambassadors-407fc9fd608

The post Join the NNLM in December for Two New Data Literacy Webinars! first appeared on SEA Currents.


Living on the Data Fringe: Vaccines on the Mind

MCR Data Science - Wed, 2020-11-18 10:32


The GOOD news on the vaccine front over the past few weeks, related to the progress of the pharmaceutical companies, may be an indicator that we are seeing the light at the end of this dark COVID-19 tunnel. Although no vaccine is 100% effective (WHO, 2020), numbers like 90 – 95% efficacy should bring us hope that the rising hospitalization numbers and death tolls will eventually decrease. However, we need to be diligent about wearing masks and social distancing now more than ever, because it will take time to implement a plan to vaccinate over 300 million people.

This good vaccine news made me think about some visualizations I saw in the past that were created to show just how effective vaccines can be. Before COVID, the Wall Street Journal in 2015 published a series of visualizations that depict the impact of several vaccines. This type of visualization is called a heat map and shows, through a range of color squares, how cases of disease have decreased across time and especially after the point where vaccines have been introduced. I hope to see the COVID-19 visualization get added to this list soon so that we can watch our states slowly move from red to blue. Not only is a heat map a compelling image that tells a story, it is also interactive and you can mouse over the color squares to see the data behind the square and explore the numbers in your own state.
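The logic behind a heat map can be sketched in a few lines: each cell of a state-by-year table is assigned a color according to its value. The sketch below is only an illustration under stated assumptions. The case numbers, state names, and color thresholds are all invented, and it does not reproduce the Wall Street Journal's data or method.

```python
# Hypothetical case counts per 100,000 people (invented for illustration).
# Rows are states, columns are years.
years = [1961, 1962, 1963, 1964, 1965]
cases = {
    "State A": [320, 290, 150, 40, 5],
    "State B": [210, 260, 90, 20, 0],
}

# Color scale from low (blue) to high (red); the thresholds are arbitrary assumptions.
SCALE = [(0, "blue"), (50, "yellow"), (200, "red")]


def color_for(value):
    """Return the color of the highest threshold that the value reaches."""
    chosen = SCALE[0][1]
    for threshold, color in SCALE:
        if value >= threshold:
            chosen = color
    return chosen


def heat_map(table, columns):
    """Map each state to a list of (year, value, color) cells."""
    return {
        state: [(year, value, color_for(value)) for year, value in zip(columns, row)]
        for state, row in table.items()
    }


grid = heat_map(cases, years)
for state, cells in grid.items():
    print(state, [color for _, _, color in cells])
```

Keeping the year and value alongside each color mirrors the mouse-over behavior described above, where the number behind each square stays available.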

Does this pique your interest in seeing more visualizations? Here is a gallery of visualizations created in Tableau Public, a free visualization software. In addition, The New York Times has a great website called “What’s Going On in This Graph?” that is being used to teach students about statistics.

Want to learn more about creating visualizations? NNLM has some great additional resources you can explore. The recorded webinar Data Visualization: Theory to Practice provides an overview of data visualization and an introduction to some tools for creating visualizations. The webinar recording What’s in a Data Story? Understanding the Basics of Data Storytelling focuses on how storytelling and data visualizations are connected.

Sometimes a picture really is worth a thousand words!!

Photo source: Pickpik

The post Living on the Data Fringe: Vaccines on the Mind first appeared on MidContinental Region News.


Reflecting on the 2019 American Medical Informatics Association Meeting, A Year Later

PSR Data Science - Tue, 2020-11-17 18:10

by John Borghi
Manager, Research and Instruction
Stanford University, Lane Medical Library

A little over a year ago, I boarded a plane to Washington DC to attend the 2019 meeting of the American Medical Informatics Association (AMIA). At this point in my career, I had been working in academic libraries for over six years. For much of that time, I had worked in biomedical settings and focused my activities on research data. I teach classes on data management and data sharing, but I had come to AMIA because I wanted to learn more about clinical data, informatics, and health information technology.

Over the next few days, I attended sessions on ethics in biomedical informatics, the emergence of artificial intelligence in healthcare, and so many other interesting topics that I was constantly exhausted and in search of coffee. Because the conference was in D.C., I also learned a lot about data-related initiatives at federal agencies, especially the National Institutes of Health and National Library of Medicine.

So why am I writing about this now? As I sit down to write this, the 2020 AMIA meeting is occurring. But rather than being held in a conference center it is, like so many other meetings in the last year, entirely virtual. Shortly after I returned from the 2019 meeting, the first cases of the disease we now know as COVID-19 began to emerge. I can’t even begin to summarize or even characterize the year that followed. But topics related to how researchers and clinicians collect, analyze, and apply data to healthcare decisions now consume so many of our personal, professional, and political conversations and activities. Everything I learned at last year’s meeting resonates very differently in the time of COVID.

The session I was most eager to attend last year was about the data-related initiatives at the NIH. At the time, I had just contributed to my institution’s response to a request for comments on a draft data management and sharing policy, and I was eager to hear more about what was happening and what was planned for the future. A year later, the final policy has been announced, and I’m glad to see that the suggestions made by my peers and me, both in the meeting and in our written comments, have been integrated into the new policy. The necessity of biomedical and health science researchers making the products of their work available (and in a usable form) to one another could not be clearer than during a global pandemic.

Another standout session I attended at the AMIA meeting concerned the All of Us Research Program, an effort to gather genetic and health data from one million or more people living in the United States in order to accelerate medical breakthroughs. At the time, I was amazed at the sheer scale of the project and interested in how the data would be curated and made available to the research community. Now, when I check the project’s website, I see there are a series of efforts to leverage the dataset to study COVID antibodies, survey the pandemic’s effect on community health, and use the electronic health record to study patterns and learn about COVID-related symptoms. Rather than a redirection of the project, this represents its immediate application.

When I proposed attending the 2019 AMIA meeting, I told my colleagues I wanted to explore another dimension of our profession: to understand more about how clinical data was actually being applied and used. Looking back now at all of the notes I took during the meeting, I am struck by two things. The first is that the meeting feels like it occurred a lifetime ago. Everything surrounding my attendance at the meeting, from walking through a crowded airport to catch my flight to D.C. to presenting on what I saw to a room full of my colleagues upon my return, feels so remote now. But I am also struck by the immediacy of everything I learned at the meeting. Understanding and working to improve how clinical data is collected, analyzed, and applied are always absolutely vital pursuits. But the last year has shined a light on just how vital.

The post Reflecting on the 2019 American Medical Informatics Association Meeting, A Year Later first appeared on Latitudes.


DataFlash: NIH Issues New Policy for Data Management and Sharing

PNR Data Science - Fri, 2020-11-06 12:24

The National Institutes of Health (NIH) has released its Final NIH Policy for Data Management and Sharing, which requires NIH-funded researchers to prospectively submit a plan outlining how scientific data will be managed and shared. This new policy will replace the 2003 NIH Data Sharing Policy. NIH will continue to engage the community to support the change and implementation of this new Policy, which will take effect January 25, 2023.

For more information, please read an NIH Director’s statement by Dr. Francis Collins as well as an “Under the Poliscope” blog by Dr. Carrie D. Wolinetz:

NIH Director’s Statement

Under the Poliscope: NIH Releases New Policy for Data Management and Sharing

For questions, please contact: SciencePolicy@od.nih.gov

The post DataFlash: NIH Issues New Policy for Data Management and Sharing first appeared on Dragonfly.


Living on the Data Fringe: Through a Library Liaison Lens #1

MCR Data Science - Wed, 2020-10-28 15:40

Data can be scary! When we think about data, we often think about ‘big data’ or data science and how data scientists use programming skills to extract large datasets for analysis or use visualization software to display complex datasets. However, you don’t have to be a data scientist, a data librarian, or a health science librarian to be interested in data or use data in your daily life. I am proof of that concept. On a daily basis I need or use data in my librarian practice, my teaching, and/or my research. I contend that ALL librarians (academic librarians, school librarians, and public librarians) should consider how data integrates or impacts their own practice.

As the data coordinator for the MidContinental Region (MCR) in the Network of the National Library of Medicine, I like to think about a broader data vision. In addition to biomedical or clinical research data, data can be assessment data we collect in libraries, statistical data that our students need to find to support an argument in a paper, or numerical or textual data collected and analyzed around a topic when conducting research. Helping students manipulate data can be one of the conduits for teaching digital literacy skills to students at a variety of levels. Data does not have to be ‘big’ data; it can be ‘small’ data or ‘thick’ data (more on this to come in a future post). Starting small and learning about data as it impacts your daily work or life can be a great way to dip your toes into the data science world. Start by checking out the first level of the NNLM Data Roadmap (Data Demystified) to begin your data journey, scaffold up your knowledge and skills, and find data topics of interest that are relevant to YOUR OWN librarian context!

This is the first blog post in a blog series, Living on the Data Fringe: Through a Library Liaison Lens, that will appear on this MCR blog over the next few months to help you scaffold up with data and rethink using data in your practice. Two blog posts each month will help take the scariness out of data and provide a context to help you learn more about alternative data topics, understand the different levels of data usage and expertise, and try out some resources and tools. As librarians we can enhance our practice by learning more about data, using data in our teaching and librarianship (using data to learn about our libraries) and in our own research, and even helping others find data for their research. Reach out with questions or suggestions for future data blog topics!

Stay tuned and Happy Halloween!
Photo Source: Needpix.com

The post Living on the Data Fringe: Through a Library Liaison Lens #1 first appeared on MidContinental Region News.


NNLM’s Data Thesaurus Provides Key Tools for Data-Driven Exploration

SEA Data Science - Thu, 2020-10-22 14:57

Are you struggling to find a simple definition for key data terminologies? Wondering where to find resources and relevant literature regarding data vocabularies? Look no further! The Network of the National Library of Medicine’s Data Thesaurus provides key tools for data-driven exploration.

The Data Thesaurus is a resource connecting and defining concepts, services, and tools relevant to librarians working in data-driven discovery. A definition, relevant literature, and web resources accompany each term along with links to related terms. Users can search or browse the 70 different terms.

Launched in 2013, the original data thesaurus has undergone updates and transformations. As the world of data evolves, so too does the thesaurus. In fact, over the past year, a group of dedicated librarians from across the country have come together to serve on the NNLM Data Thesaurus Advisory Group. Members of the Advisory Group are working on evaluating and updating the current thesaurus with new resources, terms, and definitions. As you explore the thesaurus, please share your feedback! Do you see missing terms? Broken links? General feedback? We’re open to hearing it all!

We hope the Data Thesaurus proves to be a useful resource for you and your stakeholders!

The post NNLM’s Data Thesaurus Provides Key Tools for Data-Driven Exploration first appeared on SEA Currents.


DataFlash: Library Carpentry Workshops (October ’20 – January ’21)

PNR Data Science - Mon, 2020-10-19 14:32

In partnership with The Carpentries, NNLM’s National Training Office (NTO) is thrilled to bring core lessons of Library Carpentry virtually to NNLM. Library Carpentry focuses on building software and data skills within library and information-related communities. Their hands-on, approachable workshops empower people in a variety of roles to use software and data in their own work and support effective, efficient, reproducible practices.

The NNLM Training Office is pleased to announce a new opportunity for information professionals to build data skills through online Library Carpentry workshops, at no cost to participants. Five workshops will be offered from October 2020 through January 2021. Applications and more information are available here. Questions can be directed to nto@utah.edu.

The post DataFlash: Library Carpentry Workshops (October '20 - January '21) first appeared on Dragonfly.


Online Library Carpentry: Seats available November 3-6!

SEA Data Science - Fri, 2020-10-16 14:14

The University of Maryland, Baltimore is hosting an online Library Carpentry workshop from noon – 4 pm, November 3rd – 6th.

Library Carpentry focuses on building software and data skills within library and information-related communities. Their hands-on, approachable workshops empower people in a variety of roles to use software and data in their own work and support effective, efficient, reproducible practices.

This opportunity is FREE for SEA members, and eligible for 20 MLA CEs. Seats are limited.

Contact Kiri Burcat at kburcat@hshsl.umaryland.edu with questions and visit the workshop webpage for information and to register.

The post Online Library Carpentry: Seats available November 3-6! first appeared on SEA Currents.


The Charts are Off: Approaches to Ethical Decision-Making in Data Visualization

SEA Data Science - Wed, 2020-10-14 10:47

Description: Many of us work in environments filled with “data-driven” decision-making and regular reporting of data to justify our budgets and planning. Data Visualization can be a powerful tool for telling our story and presenting the facts on the ground, but how can we make ethically informed decisions when visualizing our data? This talk will discuss different ethical frameworks and how they can inform the decisions we make in data visualization. We aim to go beyond discussion of avoiding misleading charts and into the ethical decision-making frameworks that inform how to present our data.

This is part of the Research Data Management Webinar Series.


Nicole Contaxis, MLIS is the Project Lead for the NYU Data Catalog at the NYU Health Sciences Library. Nicole develops the vision and strategy for the future of the NYU Data Catalog, including software development, curation, and partnerships with allied departments at the institutions. She leads NYU’s participation in the Data Discovery Collaboration, a national effort to improve institutional data discovery. Her areas of interest include data sharing, data ethics, and community engagement. Nicole is a former National Digital Stewardship Resident at the National Library of Medicine. She received her MLIS from UCLA and is currently working on her M.A. in Bioethics at NYU.

Fred LaPolla, MLS is a Research and Data Librarian and Lead of Data Education at NYU Health Sciences Library. He works with the library’s Data Services team and serves as liaison to the Departments of General Internal Medicine and Clinical Innovation (DGIMCI) and Radiology. Fred also teaches Rigor and Reproducibility and R Programming in the Grossman School of Medicine Sackler Institute of Graduate Biomedical Sciences. He is passionate about professional education and finding ways to facilitate learning around data collection, management, visualization and analysis. Fred holds a Master of Library Science (MLS) from Queens College, CUNY.

When: October 22nd, 2020 | 11 AM PT/ 12 PM MT/ 1 PM CT/ 2 PM ET

Register: https://nnlm.gov/rdm-webinar-oct2020

The post The Charts are Off: Approaches to Ethical Decision-Making in Data Visualization first appeared on SEA Currents.


Recordings for the NNLM PSR Subawards Webinar & Research Data Management (RDM) Webinar Series Now Available!

PSR Data Science - Mon, 2020-09-28 22:30

On September 22, 2020, NNLM PSR hosted a special funding webinar, NNLM PSR Subawards: Guidelines, Resources, and Answers! This webinar welcomed awarded project liaisons to the 2020-2021 NNLM PSR subaward program, going over some key points in the subaward guidelines, demonstrating the DRS data reporting system, providing an overview of some of the resources NNLM PSR offers for subawardees, and answering questions subawardees had for NNLM staff. To view the webinar, visit the NNLM PSR YouTube playlist.

On September 24, 2020, the NNLM Research Data Management (RDM) Webinar Series presented Operationalizing the CARE Principles for Indigenous Data Governance in Research Data Management, with speaker Stephanie R. Carroll, Assistant Professor and Associate Director of the Native Nations Institute at the University of Arizona. This webinar focuses on the CARE Principles and identifies practical tools for implementing the CARE Principles alongside the FAIR Principles in the context of the open science and open data environments. This webinar may be of interest to those working with Indigenous data or collections, as well as metadata librarians and those interested in open access policies and managing institutional repositories. To view the webinar, visit the NNLM Research Data Management YouTube playlist.

The post Recordings for the NNLM PSR Subawards Webinar & Research Data Management (RDM) Webinar Series Now Available! first appeared on Latitudes.
