Archive for the ‘E-Science’ Category
Registration is now available for the full-day workshop, Teaching Research Data Management with the New England Collaborative Data Management Curriculum, that will be held on Friday, November 8, at the Beechwood Hotel, 367 Plantation St., Worcester, MA. This is a “train the trainer” class, intended for librarians who will be teaching best practices in research data management to science, health science, and/or engineering students and faculty. During the workshop, Elaine Martin, Andrew Creamer, and Donna Kafel will be demonstrating the components of the New England Collaborative Data Management Curriculum and discussing ways that the curriculum materials can be used and customized.
Registrants for the workshop must attend a prerequisite webinar, Best Practices for Teaching Research Data Management and Consulting on Data Management Plans in New England, that will be held on Thursday, October 31, from 9-10 AM PDT. The webinar will be archived so that anyone unable to attend the live session may view it prior to the November 8 class. The number of attendees for the in-person workshop will be limited to 40. Registration for the workshop is on a first-come, first-served basis. The fee for the workshop is $35 (no refunds will be issued). The webinar is free, but registration is required to attend the live session on 10/31.
The National Library of Medicine (NLM) has announced its next initiative as part of its ongoing partnership with the National Endowment for the Humanities (NEH). Working with NEH’s Office of Digital Humanities, the National Science Foundation, and Virginia Polytechnic Institute and State University (Virginia Tech), the NLM will be a part of An Epidemiology of Information: New Methods for Interpreting Disease and Data, an interdisciplinary symposium exploring new methods for large-scale data analysis of epidemic disease.
Scheduled to take place at the Virginia Tech Research Center in Arlington, VA, on October 17, 2013, from 8:30 AM to 5:00 PM, “An Epidemiology of Information” will be a unique public forum through which policy makers, public health experts, and scholars can address pressing questions about how new methods of analyzing large-scale datasets can inform research and policy approaches to epidemic disease. Panelists will consider what these new methods suggest for contemporary infodemiology and epidemic intelligence, as well as the implications of data mining as a disease surveillance mechanism, and how new forms of reporting and public health surveillance affect public health policy. The symposium will also explore how these new methods can inform research on the 1918 influenza pandemic, and help to answer lingering questions about the spread of the disease, its pathogenicity, the unusual mortality rates, or the effectiveness of public health responses.
Featured speakers will include Dr. Jeffery Taubenberger, Chief, Viral Pathogenesis and Evolution Section, National Institute of Allergy and Infectious Diseases (NIAID), and Dr. David Morens, Senior Advisor to the Director, NIAID, whose research in data analysis and historical epidemiology has influenced the approaches being adopted and adapted by digital humanities scholars working in the history of medicine. “An Epidemiology of Information” is made possible in part by support received by Virginia Tech through the international Digging into Data Challenge competition sponsored by NEH. Funding for Virginia Tech’s Canadian partner, the Center for E-Health Initiatives of the University of Toronto, comes from the Social Science and Humanities Research Council of Canada. The symposium is free and open to the public, but registration is required.
The National Library of Medicine (NLM) will join with other health data leaders and innovators for the fourth annual Health Datapalooza. The unique event will be held June 3-4, 2013, at the Omni Shoreham Hotel in Washington, DC. Health Datapalooza IV highlights new, innovative, and effective ways health data is being used by companies, startups, academics, government agencies, and individuals. More than 1,500 people are expected to attend. The event is organized by a consortium of private sector, non-profit and government agencies, including the Department of Health and Human Services (HHS). NLM has participated in the event every year.
As the world’s largest medical library, NLM has made its electronic data freely available for decades, so that others can use it to develop new products and services. Additionally, NLM provides application programming interfaces (APIs) so that external products and services, such as electronic health records, can easily access its data. NLM experts will be in the Health Datapalooza exhibit hall (Booth 12), to explain how developers can utilize the variety of available NLM data, including medical literature; consumer health information; clinical trials; medical terminology; and drugs. NLM will also participate in the “Datalab” breakout session, featuring federal government data experts.
The National Library of Medicine has announced that Extensible Markup Language (XML) data from the IndexCat™ database is now available for free download. Released with a Document Type Definition (DTD) that allows researchers to validate the data, this new XML release includes the digitized content of more than 3.7 million bibliographic items from the printed, 61-volume Index-Catalogue of the Library of the Surgeon-General’s Office, originally published from 1880 to 1961. The XML describes items spanning five centuries, including millions of journal and newspaper articles, obituaries, and letters; hundreds of thousands of monographs and dissertations; and thousands of portraits. Together, these items cover a wide range of subjects such as the basic sciences, scientific research, civilian and military medicine, public health, and hospital administration.
The NLM release of the Index-Catalogue in XML format opens this key resource in the history of medicine and science to new uses and users. It is one of the monuments of the Library’s longstanding, systematic indexing of the medical literature, an effort which William Henry Welch (1850-1934), the great pathologist and bibliophile, considered to be “America’s greatest contribution to medical knowledge.” This indexing, begun by John Shaw Billings in the nineteenth century at the Library of the Surgeon-General’s Office, United States Army (known today as the NLM), eventually created two distinct products: the Index-Catalogue of the Library of the Surgeon-General’s Office, United States Army, and the Index Medicus, forerunner of MEDLINE®, and now the largest component of PubMed®.
Released alongside the Index-Catalogue XML are an integrated XML file and associated DTD for two collections developed from the electronic database of A Catalogue of Incipits of Mediaeval Scientific Writings in Latin (rev.), by Lynn Thorndike and Pearl Kibre (eTK), and the updated and expanded version of Scientific and Medical Writings in Old and Middle English: An Electronic Reference (eVK2), edited by Linda Ehrsam Voigts and Patricia Deery Kurtz. Also available via the online IndexCat, these resources encompass over 42,000 records of incipits, or the beginning words of a medieval manuscript or early printed book, covering various medical and scientific writings on topics as diverse as astronomy, astrology, geometry, agriculture, household skills, book production, occult science, natural science, and mathematics, as these disciplines and others were largely intermingled in the medieval period of European history. The NLM release of these resources in XML format joins many other freely downloadable resources, including the XML for MEDLINE®/PubMed® data, which includes over 22 million references to biomedical and life sciences journal articles back to 1946, and, for some journals, much earlier.
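For readers who want to explore a large XML download such as those described above, the following is a minimal sketch in Python of one way to stream through a file and tally records by type. It is an illustration only: the element names (records, record, type) and the sample data are hypothetical and do not reflect the structure defined in NLM's DTD, which should be consulted for the real element names. Note also that the standard-library parser used here does not validate against a DTD; validation would require a separate validating parser.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data. The real element names are defined by the DTD
# distributed with each release and are NOT reproduced here.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<records>
  <record><type>article</type><title>On fevers</title></record>
  <record><type>monograph</type><title>Military hygiene</title></record>
  <record><type>article</type><title>Notes on cholera</title></record>
</records>
"""


def count_by_type(source, record_tag="record", type_tag="type"):
    """Stream through an XML source and tally records by the text of type_tag."""
    counts = Counter()
    # iterparse hands back each element once its closing tag has been read,
    # so a multi-gigabyte file does not need to be loaded in one piece.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            counts[elem.findtext(type_tag, default="unknown")] += 1
            elem.clear()  # drop the record's children once it is tallied
    return counts


# A file path or an open file object can be passed instead of BytesIO.
counts = count_by_type(io.BytesIO(SAMPLE_XML))
print(dict(counts))
```

With a real download, the record and type tag names would be replaced by whatever the accompanying DTD specifies.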
The release also coincides with the NLM’s participation in “Shared Horizons: Data, Biomedicine, and the Digital Humanities,” an interdisciplinary symposium exploring the intersection of digital humanities and biomedicine, being held April 10-12, 2013, in partnership with the National Endowment for the Humanities’ Office of Digital Humanities, Maryland Institute for Technology in the Humanities at the University of Maryland, and Research Councils UK. Shared Horizons will create opportunities for disciplinary cross-fertilization through a mix of formal and informal presentations, combined with breakout sessions designed to promote a rich exchange of ideas about how large-scale quantitative methods can lead to new understandings of human culture. Bringing together researchers from the digital humanities and bioinformatics communities, the symposium will explore ways in which these two communities might fruitfully collaborate on projects that bridge the humanities and medicine around the topics of sequence alignment and network analysis, two modes of analysis that intersect with “big data.” All Shared Horizons sessions will be live-streamed with a monitored back channel for the public to post/tweet comments. Recordings of all talks will also be posted to the Shared Horizons website, with the ability to comment pre- and post-event.
Data dashboards provide a mechanism to use visualization, rather than words, to get a quick overview of progress made towards programmatic goals, and to engage stakeholders in the evaluation process. To use data dashboards effectively, it is important to define the user group(s) involved and to select recognizable metrics from trusted sources. A variety of resources are available to assist with producing dashboards for web sites, blogs, etc., including Juice Analytics, Tableau Software, and Google Analytics. One resource to consider, available after registering with Juice Analytics, is a white paper listed in the “Visualization Resources” category, called A Guide to Creating Dashboards People Love to Use. Once established, data dashboards can monitor the progress of a program, communicate progress to stakeholders, and provide early signs of problems that may be arising.
To get an idea of a final product, a good example to view is the Health IT Dashboard showing the implementation of the Regional Extension Center (REC) Cooperative Agreement Program, coordinated by the federal Office of the National Coordinator for Health IT (ONC). The REC program is funded to provide technical assistance for EHR implementation to 100,000 primary care providers, through 62 nationwide sites. The dashboard charts the enrollment of primary care providers in this program, and monitors their efforts to become meaningful users of electronic health records (EHRs). Dashboards could be a colorful, visual way for you to show what you do to benefit the overall institution!
The ENCODE Project was planned as a follow-up to the Human Genome Project. The Human Genome Project sequenced the DNA that makes up the human genome; the ENCODE Project seeks to interpret this sequence. Coinciding with the completion of the Human Genome Project in 2003, the National Human Genome Research Institute (NHGRI) organized the launching of the ENCODE Project, as a worldwide effort involving more than 30 research groups and 400 scientists. The approximately 20,000 genes that provide instructions for making proteins account for only about 1% of the human genome. Researchers embarked on the ENCODE Project to figure out the purpose of the remaining 99% of the genome. Scientists discovered that more than 80 percent of this non-gene component of the genome, which was once considered “junk DNA,” actually has a role in regulating the activity of particular genes (gene expression).
Researchers think that changes in the regulation of gene activity may disrupt protein production and cell processes and result in disease. A goal of the ENCODE Project is to link variations in the expression of certain genes to the development of disease. The ENCODE Project has given researchers insight into how the human genome functions. As researchers learn more about the regulation of gene activity and how genes are expressed, the scientific community will be able to better understand how the entire genome can affect human health.
NHGRI recently announced updated results of the ENCODE Project in a press release. Further detailed information about the findings is available from the ENCODE Project portal. Published research findings are also available through the new web site, Nature Encode Explorer, which provides public access to scientific information collected from the ENCODE Project.
The National Library of Medicine is pleased to announce its first initiative as part of its recently established partnership with the National Endowment for the Humanities (NEH), which lays groundwork for the two institutions to cooperate on initiatives of common interest. Working in cooperation with the NEH’s Office of Digital Humanities; Maryland Institute for Technology in the Humanities at the University of Maryland; and Research Councils UK, the NLM will be a part of “Shared Horizons: Data, Biomedicine, and the Digital Humanities,” an interdisciplinary symposium exploring the intersection of digital humanities and biomedicine.
Scheduled to take place April 10-12, 2013, at the University of Maryland, College Park Campus, Shared Horizons will be a unique forum, through which participants and their institutions will be able to address questions about collaboration, research methodologies, and the interpretation of evidence arising from the interdisciplinary opportunities in this burgeoning area of biomedical-driven humanities scholarship. Shared Horizons will create opportunities for disciplinary cross-fertilization through a mix of formal and informal presentations, combined with breakout sessions, all designed to promote a rich exchange of ideas about how large-scale quantitative methods can lead to new understandings of human culture. Bringing together researchers from the digital humanities and bioinformatics communities, the symposium will explore ways in which these two communities might fruitfully collaborate on projects that bridge the humanities and medicine around the topics of sequence alignment and network analysis, two modes of analysis that intersect with “big data.”
The Symposium’s Call for Papers is now available, with a submission deadline of November 10, 2012. Applicants will be selected by the Advisory Board, in consultation with the Shared Horizons Staff and Sponsors, based on the following criteria, with each area being weighted equally: scholarly engagement with sequence alignment and/or network analysis; quality of proposed paper; and collaborative potential. Notification of selection will be made by January 10, 2013.
The National Science Foundation and the National Institutes of Health are co-sponsoring a webinar regarding their joint Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA) solicitation. The webinar will be held from 8-9 AM PDT on May 8, 2012. Questions about the solicitation can be submitted during the webinar. Please register for the webinar by May 7, 2012. After your registration is accepted, you will get an email with a URL to join the meeting. The webinar will be archived for later viewing, and linked to the BIGDATA program web page.
The BIGDATA solicitation aims to advance the core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large, diverse, distributed, and heterogeneous data sets so as to accelerate the progress of scientific discovery and innovation; lead to new fields of inquiry that would not otherwise be possible; encourage the development of new data analytic tools and algorithms; facilitate scalable, accessible, and sustainable data infrastructure; increase understanding of human and social processes and interactions; and promote economic growth and improved health and quality of life.
The phrase “big data” in this solicitation does not refer just to the volume of data, but also to its variety and velocity. Big data includes large, diverse, complex, longitudinal, and/or distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources. Proposal submission deadlines are June 13, 2012, for mid-scale projects; and July 11, 2012, for small projects. Fifteen to twenty projects will be funded, subject to availability of funds.
In partnership with librarians at the University of Minnesota, the University of Oregon, and Cornell University, the Purdue University Libraries received nearly $250,000 from the Institute of Museum and Library Services (IMLS) to develop training programs for the next generation of scientists, to enable them to find, organize, use, and share data efficiently and effectively. This training will be vital to scientists as they look to secure research funding. In 2007, the National Science Foundation issued a report on the need to build public collections of research data and since 2011 has required scientists to include data management plans in their grant applications.
The Data Information Literacy research project will be carried out over a two-year period by five project teams, to develop and implement a data information literacy curriculum. Two of the teams, each consisting of a data librarian, a subject librarian, and a disciplinary faculty researcher, are based at Purdue, with one team at each of the other institutions. The program is intended for graduate students in engineering and science disciplines who are working toward careers as research scientists. With the continued evolution of technology-driven research, or e-science, changing the skills necessary for effective data management and curation, a curriculum designed to prepare the next generation of scientists for the dynamic nature of research is essential.
The teams are constructed to represent a variety of subject areas, from electrical and computer engineering to landscape architecture, so that commonalities and differences in data curation needs across disciplines can be explored. Each team will conduct an assessment of data needs of their discipline, including interviewing and observing researchers. The teams will then develop and implement targeted instruction and assess the impact of that instruction in developing the data information literacy skills of graduate students.
The results of this first-ever effort at articulating and addressing data information literacy skills will help future scientists and engineers contribute to and take full advantage of the potential that cyberinfrastructure and information technologies provide. The collaboration between librarians and faculty will identify the educational needs of future e-scientists in organizing, describing, disseminating, and preserving their data, and teach them these skills in ways that can be applied in their day-to-day research activities.
E-Science is a very timely subject for libraries. PSR, with the University of California Davis, hosted an E-Science Day in December 2011. Here is another offering to consider attending:
You are invited to join the faculty and staff of the Spencer S. Eccles Health Sciences Library for the Priscilla M. Mayden Lecture on Wednesday, February 22 at 1:00 p.m. MT in the Eccles Institute of Human Genetics Auditorium or via the program link for viewing from a distance. The lecture and broadcast are offered free of charge, and prior registration is not required.
This year’s Mayden lecturer is Bart Ragon, Associate Director for Library Technology Services and Development. Mr. Ragon’s lecture focuses on eScience and the Evolution of Library Services. Not just for librarians, eScience/eResearch potentially impacts faculty, staff and student access to the data, tools and resources needed to collaborate, share and move science forward.
Mr. Ragon’s topic description: “Science is changing and changing fast. Concepts like the data life cycle, data curation, translational science, high performance computing, and data sharing are having an impact on how science is conducted. At the same time, libraries are adjusting services to meet the needs of highly networked and technically savvy patron groups. eScience is a term that describes the dynamic re-shaping of collaboration and workflows in science and creating unique and important opportunities for librarianship. This presentation explores potential roles for librarians in eScience, how new collaborations might form, and the role of the libraries in the data life cycle.”
A conversation break with light refreshments is scheduled from 2:00-2:30 in the EIHG atrium. At 2:30 MT a Meet the Experts panel convenes to further define and discuss issues related to eScience and eResearch. Panelists include:
- Bill Barnett, Ph.D.
- Steve Corbató, Ph.D.
- Donald McClain, M.D., Ph.D.
- Daureen Nesdill, M.L.I.S.
- Ellie Phillipo
- Bart Ragon – moderator
The program link will be available on the Mayden Lecture page for viewing from a distance. The broadcast will be archived for on-demand viewing.
For more information contact Jeanne Le Ber; 801-585-6744.