
Archive for the ‘E-Science’ Category

How to Throw a Data Party

A data party is another name for a kind of participatory data analysis, where stakeholders are gathered together to help analyze data that you have collected. Here are some reasons to include stakeholders in the data analysis stage:

  • It allows stakeholders to get to know and engage with the data.
  • Stakeholders may bring context to the data that will help explain some of the results.
  • When stakeholders participate in analyzing the data, they are more likely to understand and use it.
  • Watching their interactions often reveals the person with the power to act on your recommendations.

To begin the process, you need to know what you hope to gain from the attendees, since you may only be able to hold an event like this one time. There are a number of different ways to organize the event, such as the World Cafe format, where everyone works together to explore a set of questions, or an Open Space system in which attendees create their own agenda about which questions they want to discuss. Recently the American Evaluation Association held a very successful online unconference using MIT’s Unhangout, an approach that could be used for an online data party with people from multiple locations.

Here are suggested questions to ask at a data party:

  • What does this data tell you?
  • How does this align with your expectations?
  • What do you think is occurring here and why?
  • What other information do you need to make this actionable?

At the end of the party, it may be time to present some of your findings and recommendations. Having worked with the data themselves, stakeholders may be more willing to listen, since people tend to support what they helped to create.

NCBI Advanced Genomics Hackathon in January 2016

From January 4-6, 2016, NCBI will host a genomics hackathon focusing on advanced bioinformatics analysis of next generation sequencing data. This event is for students, post-doctorates, and investigators already engaged in the use of pipelines for genomic analyses from next generation sequencing data. (Specific projects are available to other developers or mathematicians.) Twelve working groups of 5-6 individuals each will be formed in the following sections: Network Analysis of Variants, Structural Variation, RNA-Seq, Streaming Data and Metadata, and Neuroscience/Immunity. The working groups will build pipelines to analyze large datasets within a cloud infrastructure. Please see the application link below for specific team projects.

After a brief organizational session, teams will spend three days analyzing a challenging set of scientific problems related to a group of datasets. Participants will analyze and combine datasets in order to work on these problems. This course will take place at the National Library of Medicine on the NIH main campus in Bethesda, MD. Datasets will come from the public repositories housed at NCBI. During the course, participants will have an opportunity to include other datasets and tools for analysis. Please note, if you use your own data during the course, you will be asked to submit it to a public database within six months of the end of the hackathon. All pipelines and other scripts, software, and programs generated in this course will be added to a public GitHub repository designed for that purpose. A manuscript outlining the design of the hackathon and describing participant processes, products and scientific outcomes will be submitted to an appropriate journal.

To apply, complete the online form, which takes approximately ten minutes. Applications are due by December 1 at 2:00pm PST. Participants will be selected from a pool of applicants; prior students and prior applicants will be given priority in the event of a tie. Please note: applicants are judged based on the motivation and experience outlined in the form itself. Accepted applicants will be notified on December 4 by 11:00am PST, and have until December 7 at 2:00pm PST to confirm their participation. Please include a monitored email address, in case there are follow-up questions.

Participants will need to bring their own laptop to this program. A working knowledge of scripting (e.g., Shell, Python) is necessary to be successful in this event. Employment of higher level scripting or programming languages may also be useful. Applicants must be willing to commit to all three days of the event. No financial support for travel, lodging or meals can be provided for this event. Also note that the course may extend into the evening hours on Monday and/or Tuesday. Please make any necessary arrangements to accommodate this possibility. Please contact with any questions.

NLM Informatics Lecture Series on November 4: Use of Clinical Big Data to Inform Precision Medicine

The next session of the National Library of Medicine Informatics Lecture Series will be held on November 4 from 11:00am to 12:00pm PST, with the feature presentation Use of Clinical Big Data to Inform Precision Medicine. The speaker will be Joshua Denny, MD, Associate Professor in the Departments of Biomedical Informatics and Medicine at Vanderbilt University Medical Center. This talk will be broadcast live and archived.

At Vanderbilt, Dr. Denny and his team have linked phenotypic information from de-identified electronic health records (EHRs) to a DNA repository of nearly 200,000 samples, creating a ‘virtual’ cohort. This approach allows the study of the genomic basis of disease and drug response using real-world clinical data. Finding the right information in the EHR can be challenging, but the combination of billing data, laboratory data, medication exposures, and natural language processing has enabled efficient study of genomic and pharmacogenomic phenotypes. The Vanderbilt research team has put many of these discovered pharmacogenomic characteristics into practice through clinical decision support. The EHR also enables the inverse experiment – starting with a genotype and discovering all the phenotypes with which it is associated – a phenome-wide association study (PheWAS). Dr. Denny’s research team has used PheWAS to replicate more than 300 genotype-phenotype associations, characterize pleiotropy, and discover new associations. They have also used PheWAS to identify characteristics within disease subtypes.
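The "inverse experiment" described above can be illustrated with a toy sketch. This is not the Vanderbilt team's method: it assumes a made-up cohort, hypothetical phenotype names, a plain 2x2 chi-square test per phenotype, and a Bonferroni correction for the number of phenotypes tested. The function names `chi_square_2x2` and `phewas` are invented for this illustration.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom).

    a: carriers with the phenotype      b: carriers without it
    c: non-carriers with the phenotype  d: non-carriers without it
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A margin is empty, so no association can be tested.
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def phewas(carrier, phenotypes, alpha=0.05):
    """Test one genotype against every phenotype.

    carrier:    dict mapping patient id -> True if the patient carries the variant
    phenotypes: dict mapping phenotype name -> set of patient ids with that phenotype
    Returns (name, p_value, significant) tuples sorted by p-value.
    """
    if not phenotypes:
        return []
    threshold = alpha / len(phenotypes)  # Bonferroni correction
    results = []
    for name, cases in phenotypes.items():
        a = sum(1 for pid, has in carrier.items() if has and pid in cases)
        b = sum(1 for pid, has in carrier.items() if has and pid not in cases)
        c = sum(1 for pid, has in carrier.items() if not has and pid in cases)
        d = sum(1 for pid, has in carrier.items() if not has and pid not in cases)
        _, p = chi_square_2x2(a, b, c, d)
        results.append((name, p, p < threshold))
    return sorted(results, key=lambda r: r[1])


# Made-up cohort: patients 0-19 carry the variant, patients 20-39 do not.
carrier = {pid: pid < 20 for pid in range(40)}
# Hypothetical phenotypes, defined by which patients have them.
phenotypes = {
    "phenotype_A": set(range(18)),  # concentrated among carriers
    "phenotype_B": {0, 1, 20, 21},  # evenly split between groups
}
results = phewas(carrier, phenotypes)
for name, p, significant in results:
    print(name, p, significant)
```

A real analysis would likely differ in important ways, for example by using regression models that adjust for covariates such as age and sex, and by deriving phenotypes from billing codes and other EHR data rather than from hand-built sets.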

Dr. Denny is part of the NIH-supported Electronic Medical Records and Genomics (eMERGE) network, Pharmacogenomics Research Network (PGRN), and Implementing Genomics in Practice (IGNITE) networks. He is a past recipient of the American Medical Informatics Association New Investigator Award, Homer Warner Award, and Vanderbilt Chancellor’s Award for Research. Dr. Denny remains active in clinical care and in teaching students. He is also a member of the National Library of Medicine Biomedical Library and Informatics Review Committee.

New Videos Available on the NCBI YouTube Channel

Two new three-minute videos on the NCBI YouTube channel will provide information about how to view track sets in all of the NCBI genome browsers and Sequence Viewer displays and how to store and share custom sets of tracks in track collections. NCBI Recommended Tracks presents track sets, which allow you to instantly tailor your display to a specific need, while My NCBI Track Collections: Introduction shows how to store and share tracks in custom sets called track collections. To learn more about track sets and collections, visit the FAQ on the Sequence Viewer page. Subscribe to the NCBI YouTube channel to receive alerts about new videos ranging from quick tips to full webinar presentations.

Now Available: Presentation Slides and Session Recordings for 2015 Science Boot Camp West for Librarians

Video recordings and slide presentations for most sessions of the 2015 Science Boot Camp West for Librarians are now available. The meeting was held July 27-29, 2015, at Stanford University. Video files are large and best viewed by downloading rather than watching online. The full meeting agenda is also available.

First Offering of NCBI NOW (Next Generation Sequencing Online Workshop) Begins October 13!

NCBI will present the first iteration of NCBI NOW, a free online experience aimed at those new to next generation sequencing (NGS) analysis, from October 13-23. Enrollment in this course is limited to the first 1,000 participants who sign up through the ORAU Portal. Since enrollment is on a first-come, first-served basis, please only sign up for this educational opportunity if you will be able to participate fully. Learners will watch 6-7 videos (average video duration: 45-60 minutes) online during the first seven days of the course. These videos will cover the basics of NGS data, preprocessing, quality control and alignment strategies for both DNA-Seq and RNA-Seq, as well as a brief discussion of downstream analysis. Additionally, there will be demonstrations about leveraging BLAST tools for NGS analysis.

Next, participants will apply a selection of RNA-Seq alignment algorithms over three days (1-2 hours per day), mapping RNA-Seq data to GRCh38 chromosome 20. Finally, participants will compare the results of these mappers for specific genes. Throughout the course, participants will be able to post questions at Biostars; experts from NCBI and elsewhere will be available online to answer questions. Learners will emerge from the course equipped to map their own RNA-Seq or DNA-Seq data to the human genome, understand the options for downstream analysis, and use their understanding of the basic steps of data processing to interact more effectively with bioinformatician collaborators.

NLM Announces Inaugural Annual Donald Lindberg and Donald King Lecture on October 7

The National Library of Medicine is pleased to announce the first annual Donald A.B. Lindberg & Donald West King Lecture, which will be held on Wednesday, October 7, at 10:00 AM PDT in Bethesda, MD, and will be videocast (and archived for later viewing). The inaugural lecture, which honors recently retired NLM Director Dr. Lindberg and former NLM Deputy Director of Research and Education Dr. King, is titled Integrating Multi-scale Data for Biomedical Discovery and Clinical Implementation. It will be given by Russell Altman, MD, PhD, of Stanford University. Dr. Altman’s primary interests are in the field of bioinformatics. He is particularly interested in the analysis of protein and RNA structure and function, both in an individual problem-centered manner and on a functional genomic scale. Dr. Altman currently serves as a member of the Advisory Committee to the NIH Director (ACD).

Webinar Recording Available: NIH Common Data Element (CDE) Initiatives Overview

NIH encourages the use of common data elements (CDEs) in clinical research, patient registries, and other human subject research in order to improve data quality and opportunities for comparison and combination of data from multiple studies and with electronic health records. The NIH Common Data Element Resource Portal provides access to information about NIH-supported CDEs, as well as tools and resources to assist investigators developing protocols for data collection. In addition, the session recording and presentation slides for the 90-minute webinar “NIH Common Data Element (CDE) Initiatives – Overview,” held on September 8, are available for viewing.

Data Management Training Opportunities

Data management activities present opportunities for librarians to adopt new roles and support the research process in their institutions. A variety of educational resources is available to librarians wishing to get started in this field and learn more about data management and related functions. One example is MANTRA: Research Data Management Training, an online course sponsored by the University of Edinburgh, which is freely available to anyone to explore. It consists of nine online units, such as “Organising Data,” “Storage & Security,” and “Sharing, Preservation, & Licensing.” Each unit takes up to one hour to complete, plus time for further reading and data handling exercises. The current course content represents the fourth release of MANTRA, issued in September 2014. Data Management for Clinical Research is a five-week free online course offered by Coursera. It utilizes best-practice guidelines, along with hands-on demonstrations and exercises, to cover important concepts related to research data collection and management, with a primary focus on data management for patient-centered research. The Medical Library Association also offers continuing education opportunities related to data management.

In addition to these courses, a Mendeley group, Data Management for Librarians, is an active community created for librarians of all disciplines to share literature and resources about data management and related areas. Members are also encouraged to share their experiences working with data in their institutions. Another introductory resource is the article “Research Data Management,” by Alisa Surkis, PhD, MLS; and Kevin Read, MLIS, MAS; both of NYU Health Sciences Library, published in the July 2015 issue of the Journal of the Medical Library Association.

New e-Science Portal Launched!

The NN/LM New England Region has announced the launch of the newly redesigned e-Science Portal for New England Librarians. Along with a new look, features of the e-Science Portal 2.0 include:

  • A “Getting started with e-Science” quick guide
  • Events calendar and Twitter feed
  • Links to recent e-Science Community blog posts
  • Prominent hyperlinks to e-Science Partner Projects (New England e-Science Symposium, Science Boot Camp, New England Collaborative Data Management Curriculum (NECDMC), Journal of eScience Librarianship)
  • Reorganized content headings (e.g. About, Connect with Others, Data Management, and Research Environment sections)
  • Links to data tools posted on a dedicated “Data Tools” page as well as on relevant pages such as data curation
  • Editor’s photo, biography, and contact form on portal pages

For further details about the portal redesign, visit the e-Science Community blog post Portal 2.0 is here, by Jen Ferguson, co-chair of the e-Science Portal Editorial Board.