Archive for the ‘Data’ Category
Tuesday, February 24th, 2015
The Next Generation of Access to Sequencing Data: Using NCBI’s SRA Toolkit to Access Data from dbGaP and SRA
Next Wednesday, February 25, NCBI staff will present a webinar on the SRA Toolkit, a system for accessing the approximately 3.4 petabases of next-generation genomic and expressed sequence data housed in the NCBI Sequence Read Archive (SRA). As data sets become larger, mining information and performing comparisons directly from structured databases becomes increasingly necessary. The SRA Toolkit can not only dump the data out as a FASTQ or SAM file, but also supports direct analysis and comparison of specific genomic regions across hundreds or thousands of samples.
In the webinar, we will show examples of configuration and use of the Toolkit for both public SRA and controlled access data associated with studies in the Database of Genotypes and Phenotypes (dbGaP).
To register for this webinar, please go here: https://attendee.gotowebinar.com/register/2847950984085163009
Monday, November 18th, 2013
The Lamar Soutter Library at the University of Massachusetts Medical School has recently released the New England Collaborative Data Management Curriculum, which offers openly available materials that librarians can use to teach research data management (RDM) best practices to students in the sciences, health sciences, and engineering fields at the undergraduate and graduate levels. The materials include lecture notes and slide presentations that librarians teaching RDM can customize for their particular audiences. The curriculum also has a database of real-life research cases that can be integrated into the curriculum to address discipline-specific data management topics.
Each of the curriculum’s six online instructional modules aligns with the National Science Foundation’s data management plan recommendations and addresses universal data management challenges. Included in the curriculum is a collection of actual research cases that provides a discipline specific context to the content of the instructional modules. These cases come from a range of research settings such as clinical research, biomedical labs, an engineering project, and a qualitative behavioral health study. Additional research cases will be added to the collection on an ongoing basis. Each of the modules can be taught as a stand-alone class or as part of a series of classes. Instructors are welcome to customize the content of the instructional modules to meet the learning needs of their students and the policies and resources at their institutions.
Monday, September 30th, 2013
Guest Author: Susan Barnes, Assistant Director, NN/LM Outreach Evaluation Resource Center (OERC), Health Sciences Libraries and Information Center, University of Washington
The 2nd Edition of the Planning and Evaluating Health Information Outreach Projects series of 3 booklets http://nnlm.gov/evaluation/guides.html#A2 is now available online from the NN/LM Outreach Evaluation Resource Center (OERC).
Getting Started with Community-Based Outreach (Booklet 1) http://nnlm.gov/evaluation/booklets508/bookletOne508.html
What’s new? More emphasis and background on the value of health information outreach, including its relationship to the Healthy People 2020 Health Communication and Health Information Technology topic areas
Planning Outcomes-Based Outreach Projects (Booklet 2) http://nnlm.gov/evaluation/booklets508/bookletTwo508.html
What’s new? Focus on uses of the logic model planning tool beyond project planning, such as providing approaches to writing proposals and reports.
Collecting and Analyzing Evaluation Data (Booklet 3) http://nnlm.gov/evaluation/booklets508/bookletThree508.html
What’s new? Step-by-step guide to collecting, analyzing, and assessing the validity (or trustworthiness) of quantitative and qualitative data, using questionnaires and interviews as examples.
These are all available free to network members. To request printed copies, send an email to firstname.lastname@example.org. PDF versions of all three booklets are available here: http://nnlm.gov/evaluation/guides.html#A2 .
The Planning and Evaluating Health Information Outreach Projects series, by Cynthia Olney and Susan Barnes, supplements and summarizes material in Cathy Burroughs’ groundbreaking work from 2000, Measuring the Difference: Guide to Planning and Evaluating Health Information Outreach. Printed copies of Burroughs’ book are also available free—just send an email request to email@example.com.
Wednesday, September 25th, 2013
Is a table worth a thousand words? Sometimes you need an understandable and memorable diagram that will illustrate what you are trying to say. This Periodic Table of Visualization Methods http://www.visual-literacy.org/periodic_table/periodic_table.html# (a resource from Visual-Literacy.org http://www.visual-literacy.org/) provides examples of 100 visualization methods.
This table is not just a cool looking list of visualization methods, but it also uses the format that we are familiar with from the Periodic Table of Elements to organize the visualization techniques into different types and purposes.
For example, the chart is color coded from yellow to purple. The colors represent different kinds of visualization types: Data Visualization; Information Visualization; Concept Visualization; Strategy Visualization; Metaphor Visualization; and Compound Visualization (examples below). Scrolling over each box in the Table will bring a pop-up window with an example in it.
In addition, several other pieces of information about the methods are contained in this table. There are icons that show if the visualization method is process visualization (depicting a temporal sequence) or structure visualization (depicting conceptual relationships), and whether the different methods show macro patterns (overview) or micro patterns (detail), and finally whether the methods demonstrate divergent or convergent thinking. These can help you determine whether this visualization technique might be right for you.
Here are some examples from each of the main categories of visualization methods:
Data Visualization: example Area Chart
Information Visualization: example Radar Chart
Concept Visualization: example Argument Slide
Strategy Visualization: example Fishbone Diagram
Metaphor Visualization: example Tree
Compound Visualization: example Knowledge Map
Friday, September 6th, 2013
One of the Centers in The National Network of Libraries of Medicine is the Outreach and Evaluation Resource Center (OERC). This center is located at the University of Washington in Seattle, WA http://nnlm.gov/evaluation. The OERC has created a Guide to evaluation tools and other resources that you and your library can use to evaluate your programs: http://guides.nnlm.gov/content.php?pid=494137&sid=4058311. Here are some of the tools and resources described in the Guide:
Community Oriented Outreach
- Building Partnerships: tips on successful collaborations, tools for improving collaboration with community networks
- Participatory Evaluation: toolkits for practical participatory evaluation, processes for conducting outcome-based evaluations
- OERC Guides to incorporating evaluation planning into your outreach projects
- Evaluation planning resources from other organizations, including logic models
- List of outreach projects funded by NLM
Data Collection and Analysis
- Needs Assessments and Data Collection: access to data indicators, tips for questionnaire development, guides for using Appreciative Inquiry for evaluation
- Data Analysis: Resources for statistical methods and guides for analyzing qualitative and quantitative data
Reporting and Visualizing:
- Data Dashboards: guides for creating popular data dashboards
- Data Visualization: teachings of Edward Tufte and lists of visualization methods
- Reporting: tools for presentation design and TEDtalks about presentation structure
Monday, August 26th, 2013
When demonstrating your library’s impact to your institution, you will need to organize all the data that you have collected – gate counts, reference statistics, cost/benefit analyses, anecdotal data, etc. – and present them to your administration in some format. Your goal is that your presentation gets the attention of your administration, makes the case that your library has a huge positive impact on the institution, and convinces them that support for the library needs to be maintained or increased.
Part 3 in the Demonstrating Your Impact series is called “Telling Your Story.” This section is about exploring the idea of using storytelling as a means of organizing your data and having the most impact.
Andy Goodman, the author of Why Bad Presentations Happen to Good Causes (free download http://www.thegoodmancenter.com/resources/) says that “stories are a terrific way to bring large issues down to ground level where people can get their minds (and hearts) around them. But after you have told your story, you must back it up with the numbers that prove you have more than one story to tell.” In this video of a Plenary address for the National Assembly on School-Based Health Care, Andy Goodman gives a powerful demonstration of the importance of storytelling in engaging decision makers: http://www.ustream.tv/recorded/15665748.
How can you take this concept of storytelling and apply it to the data that you have been collecting on your library? Cindy Olney, with the NN/LM Outreach and Evaluation Resource Center, describes a very do-able process in her April 17, 2013 SCR CONNECTions webinar, Once Upon a Time: Using Evaluation Findings to Tell Your Project’s Story (recorded webinar: https://webmeeting.nih.gov/p18217101/). In her description of how to organize your presentation, Olney suggests
- analyzing the data that you have collected,
- articulating the key findings from charts and graphs into sentences,
- deciding what the most important findings are, and
- weaving them into one of two story systems: Sparkline or Storybook
Sparkline: This system, described by Nancy Duarte in a TED Talk (http://www.ted.com/talks/nancy_duarte_the_secret_structure_of_great_talks.html), is designed for persuasive arguments (like convincing your employers to expand the role of the library). In this system, the presentation goes back and forth between the vision of what could be and the situation as it is now. The presentation ends with a call to action. This Sparkline system can be shown to underlie great persuasive speeches, such as Abraham Lincoln’s Gettysburg Address and Martin Luther King Jr.’s I Have a Dream speech.
Storybook: Olney suggests the storybook format is best for presenting the results of a completed project. Three important elements should be included for a good story:
- a likeable main character in undesirable circumstances
- this main character takes steps toward improving those circumstances – their progress is rife with obstacles
- at the end, the main character is transformed
Whether you use the Storybook or the Sparkline system, to keep your story interesting and memorable, Olney adds “don’t let the data get in the way of a good story – write your story, then weave the data into it.”
Read parts 1 and 2 of the Demonstrating Your Impact series (Return on Investment and Collecting Stories).
Monday, August 19th, 2013
To demonstrate your library’s impact to decision makers, it can be helpful to bring your data to life with some great success stories: researchers that were helped by your librarians, doctors’ time saved, or patients understanding their follow-up instructions. Even better than success stories you tell your administration are stories told about your library by satisfied customers, for example, satisfied doctors whose time is valuable to their hospital as well as themselves, satisfied patients who can recommend your hospital to others, or satisfied researchers who can vote where their city dollars go. In addition, there is evidence that anecdotal data can influence the outcomes of decisions (http://mande.co.uk/2010/uncategorized/stories-vs-statistics-the-impact-of-anecdotal-data-on-accounting-decision-making/).
Part 2 in the Demonstrating Your Impact series is about collecting and telling those success stories. The Centers for Disease Control and Prevention (CDC) has a publication called Impact and Value: Telling Your Program’s Story http://www.cdc.gov/oralhealth/publications/library/pdf/success_story_workbook.pdf. This document is intended for program managers to provide steps they can use to systematically collect and create success stories: “with attention to detail, a system of regular data collection and practice, this tool can become a powerful instrument to spread the word about your program.”
According to the Impact and Value publication, stories should not be the main method of presenting data, but they put a face to the numbers of research and evaluation data: “What does it really mean when you report that you have provided ‘X’ amount of services to ‘Y’ amount of people? How are the lives of the program participants [or your library customers] changed because of your services?”
A great example of systematic story collection can be found in the article, “MedlinePlus and the challenge of low health literacy: findings from the Colonias project,” (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1773027/pdf/i0025-7338-095-01-0031.pdf) which describes a project funded by the National Library of Medicine in which community health workers, known as promotoras, were trained to help members of some Texas-Mexico border communities find health information using MedlinePlus. These promotoras were asked to collect up to two stories every week on how they used online resources to help residents with health concerns. The 157 stories that resulted from this technique were treated as data: thematically coded, checked for validity, and studied to show the degree of success of the promotoras project.
What to do with all this data? Stay tuned for part 3 of the Demonstrating Your Impact series: Telling Your Story
Tuesday, August 6th, 2013
Are you looking for ways to demonstrate your impact to your administration? This is Part 1 in a 3 part series on demonstrating your impact.
The NN/LM MidContinental Region (MCR) has created three online tools that can be used to enable a library to put actual figures to their importance within an organization http://nnlm.gov/mcr/evaluation/tools.html.
CBA/ROI Calculator: Sometimes it’s a good idea to speak the language of the administration. Cost/Benefit Analysis (CBA) and Return on Investment (ROI) are measures used by financial managers to indicate if their money is going in the right place. In a cost/benefit analysis, the goal is to show how much benefit the organization receives for the cost of the library. For a cost/benefit analysis, the result is actually a number: the benefit to cost ratio. If $50 was spent and the benefits to the organization could be seen as worth $25, then the benefit/cost ratio would be $25/$50 or 1/2 (50 cents of benefit for every $1 spent). This would obviously not look good for your library. However, if you could show that for the $50 that was spent, the benefit to the organization could be valued at $150, then the benefit/cost ratio is $150/$50, or $3 of benefit for every $1 spent. This could make your library look like a real asset!
Return on Investment is a very similar concept. To get the final percentage, the benefit of an investment (minus the original cost) is divided by the cost of the investment. So using the figures from the second case above, if $50 had been spent and a $150 benefit was achieved, $50 is subtracted from $150 to show a total return of $100. Dividing that by the original investment ($100/$50) gives 2.00, or 200%. A 200% return on investment would make your library look very good!
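The two calculations described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the MCR calculator itself; the function names benefit_cost_ratio and return_on_investment are made up for this example, and the dollar figures are the ones used in the text.

```python
def benefit_cost_ratio(benefit, cost):
    """Dollars of benefit received for every dollar spent."""
    if cost <= 0:
        raise ValueError("cost must be greater than zero")
    return benefit / cost


def return_on_investment(benefit, cost):
    """Net return (benefit minus cost) as a fraction of the cost."""
    if cost <= 0:
        raise ValueError("cost must be greater than zero")
    return (benefit - cost) / cost


# Figures from the examples in the text: $50 spent in both cases.
weak_case = benefit_cost_ratio(25, 50)       # $25 of benefit
strong_case = benefit_cost_ratio(150, 50)    # $150 of benefit
roi = return_on_investment(150, 50)

print(f"Benefit/cost ratio, first case: {weak_case:.2f}")
print(f"Benefit/cost ratio, second case: {strong_case:.2f}")
print(f"Return on investment, second case: {roi:.0%}")
```

The hard part in practice is not the division but deciding what dollar value to assign to the benefits, which is what the fill-in-the-blank calculator helps with.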
“How can I apply this to my library?” you might ask. The CBA/ROI Calculator from the MidContinental Region does most of the work for you. You simply fill in the blanks with the cost of books, cost of staff time, time saved, etc., and the final costs, benefits and ratios are determined at the bottom.
Database ROI Calculator: The calculator above is mostly designed for the books in your library’s collection. The MCR also provides a CBA/ROI calculator for databases. Getting statistics for databases can be a little more difficult than for other library services. Databases are often bundled with other products, and vendors define use statistics in multiple ways that make it difficult to compare across databases. Nevertheless, the MidContinental Region has some helpful tips for deciding which statistics to enter.
Valuing Library Services Calculator: Isn’t this what we all want – to explain that our library services have a financial value to the organization? Using this calculator, you can assign a dollar amount to the services you supply based on their retail value. You type in the number of times a particular service is used, and the calculator multiplies it by the retail value of that service. And at the bottom, it sums up your library’s total retail value.
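The multiply-and-sum step this calculator performs can be sketched as follows. The service names, usage counts, and retail prices here are hypothetical placeholders chosen for illustration; they are not values from the MCR calculator, and total_retail_value is a made-up function name.

```python
def total_retail_value(usage_counts, retail_values):
    """Multiply each service's use count by its retail value and sum the results."""
    return sum(
        count * retail_values[service]
        for service, count in usage_counts.items()
    )


# Hypothetical retail value per use of each service, in dollars.
retail_values = {
    "book_loan": 20.0,
    "article_request": 30.0,
    "literature_search": 75.0,
}

# Hypothetical number of times each service was used.
usage_counts = {
    "book_loan": 100,
    "article_request": 40,
    "literature_search": 10,
}

total = total_retail_value(usage_counts, retail_values)
print(f"Total retail value of services: ${total:,.2f}")
```

Substituting your own counts and locally appropriate prices gives the same kind of bottom-line total the calculator reports.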
The MCR is gathering data for advocacy purposes. If you would like your data included, be sure to fill out the form completely including the CAPTCHA box, and hit “submit data.” Librarians everywhere will appreciate your thoughtfulness.
Stay tuned for Part II – Collecting Stories.
Thursday, July 25th, 2013
The National Institutes of Health (NIH) recently announced plans to fund up to $24 million per year for four years to establish six to eight investigator-initiated Big Data to Knowledge Centers of Excellence. The centers will improve the ability of the research community to use increasingly large and complex datasets through the development and distribution of innovative approaches, methods, software, and tools for data sharing, integration, analysis and management. The centers will also provide training for students and researchers to use and develop data science methods.
Biomedical research is increasingly data-intensive, with researchers routinely generating and using large, diverse datasets. Yet the ability to manage, integrate and analyze such data, and to locate and use data generated by others, is often limited due to a lack of tools, accessibility, and training. In response, NIH launched the Big Data to Knowledge (BD2K) initiative in December. This initiative supports research, implementation, and training in data science that will enable biomedical scientists to capitalize on the transformative opportunities that large datasets provide. The investigator-initiated BD2K Center of Excellence funding opportunity is the first of several BD2K funding opportunities to be announced in coming months.
An information webinar for prospective applicants will be held on Thursday, Sept. 12, 2013, from 3 p.m. to 5 p.m. EDT. More details about this event and the overall BD2K initiative can be found at the NIH Big Data to Knowledge (BD2K) website. Applications will be due on Nov. 20, 2013.
Tuesday, July 23rd, 2013
The National Institutes of Health (NIH) Common Fund recently launched the Big Data to Knowledge (BD2K) initiative. The mission of the BD2K initiative is to enable biomedical scientists to capitalize more fully on the Big Data being generated by those research communities.
With advances in technologies, these investigators are increasingly generating and using large, complex, and diverse datasets. Consequently, the biomedical research enterprise is increasingly becoming data-intensive and data-driven. However, the ability of researchers to locate, analyze, and use Big Data (and more generally all biomedical and behavioral data) is often limited for reasons related to access to relevant software and tools, expertise, and other factors. BD2K aims to develop the new approaches, standards, methods, tools, software, and competencies that will enhance the use of biomedical Big Data by supporting research, implementation, and training in data science and other relevant fields that will lead to:
- Appropriate access to shareable biomedical data through technologies, approaches, and policies that enable and facilitate widespread data sharing, discoverability, management, curation, and meaningful re-use;
- Development of and access to appropriate algorithms, methods, software, and tools for all aspects of the use of Big Data, including data processing, storage, analysis, integration, and visualization;
- Appropriate protections for privacy and intellectual property;
- Development of a sufficient cadre of researchers skilled in the science of Big Data, in addition to elevating general competencies in data usage and analysis across the biomedical research workforce.
Overall, the focus of the BD2K initiative is the development of innovative and transforming approaches as well as tools for making Big Data and data science a more prominent component of biomedical research.
As biomedical tools and technologies rapidly improve, researchers are producing and analyzing an ever-expanding amount of complex biological data. New analytics tools are needed to extract critical knowledge from this vast amount of data, and new policies must be developed to encourage data and software sharing to maximize the value of the data for all researchers across the spectrum of biomedical research. In addition, data and metadata standards to ensure data quality and uniformity must be developed, with broad input from the scientific community to ensure that these standards will have maximum utility and value.
Funding and educational opportunities are provided through the BD2K initiative.
Each day more and more data is generated. Through efforts such as the BD2K initiative, it is hoped that the data can be widely used across disciplines and lead to scientific discoveries or breakthroughs, particularly in the fields of health and medicine. Health science librarians also play an important role in the organization and curation of data. With expert skills in the organization of information, librarians are well suited to participate with researchers in data organization processes.