Archive for the ‘E-Science’ Category
The National Library of Medicine has announced that Extensible Markup Language (XML) data from the IndexCat™ database is now available for free download. Released with a Document Type Definition (DTD) that allows researchers to validate the data, this new XML release includes the digitized content of more than 3.7 million bibliographic items from the printed, 61-volume Index-Catalogue of the Library of the Surgeon-General’s Office, originally published from 1880 to 1961. The XML describes items spanning five centuries, including millions of journal and newspaper articles, obituaries, and letters; hundreds of thousands of monographs and dissertations; and thousands of portraits. Together, these items cover a wide range of subjects such as the basic sciences, scientific research, civilian and military medicine, public health, and hospital administration.
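For researchers planning to work with a download of this size, a streaming parse keeps memory use modest. The following is only a minimal sketch: the element names (`records`, `record`, `title`, `year`) and the sample data are hypothetical placeholders, not taken from the actual IndexCat DTD, so they would need to be replaced with the names that the NLM DTD defines. Note also that Python's standard-library ElementTree parser does not validate against a DTD; checking the data against the released DTD would require a validating parser.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data. The real IndexCat element names are defined by
# the NLM DTD and will differ from the ones assumed here.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<records>
  <record><title>On fevers</title><year>1885</year></record>
  <record><title>Army hospitals</title><year>1885</year></record>
  <record><title>Public health notes</title><year>1902</year></record>
</records>
"""


def count_by_year(source, record_tag="record", year_tag="year"):
    """Stream through an XML source and count records per year value."""
    counts = Counter()
    # "end" events fire once an element has been fully read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            year = elem.findtext(year_tag)
            if year:
                counts[year.strip()] += 1
            # Drop the finished record's children so memory stays small.
            elem.clear()
    return counts


counts = count_by_year(io.BytesIO(SAMPLE))
print(sorted(counts.items()))
```

With a real download, the `io.BytesIO(SAMPLE)` argument would be replaced by the path to the XML file, and the tag names by those in the DTD.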
The NLM release of the Index-Catalogue in XML format opens this key resource in the history of medicine and science to new uses and users. It is one of the monuments of the Library’s longstanding, systematic indexing of the medical literature, an effort which William Henry Welch (1850-1934), the great pathologist and bibliophile, considered to be “America’s greatest contribution to medical knowledge.” This indexing, begun by John Shaw Billings in the nineteenth century at the Library of the Surgeon-General’s Office, United States Army (known today as the NLM), eventually created two distinct products: the Index-Catalogue of the Library of the Surgeon-General’s Office, United States Army, and the Index Medicus, forerunner of MEDLINE® and now the largest component of PubMed®.
Released alongside the Index-Catalogue XML are an integrated XML file and associated DTD for two collections developed from the electronic database of A Catalogue of Incipits of Mediaeval Scientific Writings in Latin (rev.), by Lynn Thorndike and Pearl Kibre (eTK), and the updated and expanded version of Scientific and Medical Writings in Old and Middle English: An Electronic Reference (eVK2), edited by Linda Ehrsam Voigts and Patricia Deery Kurtz. Also available via the online IndexCat, these resources encompass over 42,000 records of incipits, or the beginning words of a medieval manuscript or early printed book, covering various medical and scientific writings on topics as diverse as astronomy, astrology, geometry, agriculture, household skills, book production, occult science, natural science, and mathematics, as these disciplines and others were largely intermingled in the medieval period of European history. The NLM release of these resources in XML format joins many other freely downloadable resources, including the XML for MEDLINE®/PubMed® data, which includes over 22 million references to biomedical and life sciences journal articles back to 1946, and, for some journals, much earlier.
The release also coincides with the NLM’s participation in “Shared Horizons: Data, Biomedicine, and the Digital Humanities,” an interdisciplinary symposium exploring the intersection of digital humanities and biomedicine, being held April 10-12, 2013, in partnership with the National Endowment for the Humanities’ Office of Digital Humanities, Maryland Institute for Technology in the Humanities at the University of Maryland, and Research Councils UK. Shared Horizons will create opportunities for disciplinary cross-fertilization through a mix of formal and informal presentations, combined with breakout sessions designed to promote a rich exchange of ideas about how large-scale quantitative methods can lead to new understandings of human culture. Bringing together researchers from the digital humanities and bioinformatics communities, the symposium will explore ways in which these two communities might fruitfully collaborate on projects that bridge the humanities and medicine around the topics of sequence alignment and network analysis, two modes of analysis that intersect with “big data.” All Shared Horizons sessions will be live-streamed with a monitored back channel for the public to post/tweet comments. Recordings of all talks will also be posted to the Shared Horizons website, with the ability to comment pre- and post-event.
Data dashboards provide a mechanism to use visualization, rather than words, to get a quick overview of progress made towards programmatic goals, and to engage stakeholders in the evaluation process. To use data dashboards effectively, it is important to define the user group(s) involved and to select recognizable metrics from trusted sources. There are a variety of resources available to assist with producing dashboards for web sites, blogs, etc., including Juice Analytics, Tableau Software, and Google Analytics. After registering with Juice Analytics, one resource to consider is a white paper listed in the “Visualization Resources” category, called A Guide to Creating Dashboards People Love to Use. Once established, data dashboards can monitor the progress of a program, communicate progress to stakeholders, and provide early signs of problems that may be arising.
To get an idea of a final product, a good example to view is the Health IT Dashboard showing the implementation of the Regional Extension Center (REC) Cooperative Agreement Program, coordinated by the federal Office of the National Coordinator for Health IT (ONC). The REC program is funded to provide technical assistance for EHR implementation to 100,000 primary care providers, through 62 nationwide sites. The dashboard charts the enrollment of primary care providers in this program, and monitors their efforts to become meaningful users of electronic health records (EHRs). Dashboards could be a colorful, visual way for you to show what you do to benefit the overall institution!
The ENCODE Project was planned as a follow-up to the Human Genome Project. The Human Genome Project sequenced the DNA that makes up the human genome; the ENCODE Project seeks to interpret this sequence. Coinciding with the completion of the Human Genome Project in 2003, the National Human Genome Research Institute (NHGRI) organized the launching of the ENCODE Project, as a worldwide effort involving more than 30 research groups and 400 scientists. The approximately 20,000 genes that provide instructions for making proteins account for only about 1% of the human genome. Researchers embarked on the ENCODE Project to figure out the purpose of the remaining 99% of the genome. Scientists discovered that more than 80 percent of this non-gene component of the genome, which was once considered “junk DNA,” actually has a role in regulating the activity of particular genes (gene expression).
Researchers think that changes in the regulation of gene activity may disrupt protein production and cell processes and result in disease. A goal of the ENCODE Project is to link variations in the expression of certain genes to the development of disease. The ENCODE Project has given researchers insight into how the human genome functions. As researchers learn more about the regulation of gene activity and how genes are expressed, the scientific community will be able to better understand how the entire genome can affect human health.
NHGRI recently announced updated results of the ENCODE Project in a press release. Further detailed information about the findings is available from the ENCODE Project portal. Published research findings are also available through the new web site, Nature Encode Explorer, which provides public access to scientific information collected from the ENCODE Project.
The National Library of Medicine is pleased to announce its first initiative as part of its recently established partnership with the National Endowment for the Humanities (NEH), which lays groundwork for the two institutions to cooperate on initiatives of common interest. Working in cooperation with the NEH’s Office of Digital Humanities; Maryland Institute for Technology in the Humanities at the University of Maryland; and Research Councils UK, the NLM will be a part of “Shared Horizons: Data, Biomedicine, and the Digital Humanities,” an interdisciplinary symposium exploring the intersection of digital humanities and biomedicine.
Scheduled to take place April 10-12, 2013, at the University of Maryland, College Park Campus, Shared Horizons will be a unique forum, through which participants and their institutions will be able to address questions about collaboration, research methodologies, and the interpretation of evidence arising from the interdisciplinary opportunities in this burgeoning area of biomedical-driven humanities scholarship. Shared Horizons will create opportunities for disciplinary cross-fertilization through a mix of formal and informal presentations, combined with breakout sessions, all designed to promote a rich exchange of ideas about how large-scale quantitative methods can lead to new understandings of human culture. Bringing together researchers from the digital humanities and bioinformatics communities, the symposium will explore ways in which these two communities might fruitfully collaborate on projects that bridge the humanities and medicine around the topics of sequence alignment and network analysis, two modes of analysis that intersect with “big data.”
The Symposium’s Call for Papers is now available, with a submission deadline of November 10, 2012. Applicants will be selected by the Advisory Board, in consultation with the Shared Horizons Staff and Sponsors, based on the following criteria, with each area being weighted equally: scholarly engagement with sequence alignment and/or network analysis; quality of proposed paper; and collaborative potential. Notification of selection will be made by January 10, 2013.
The National Science Foundation and the National Institutes of Health are co-sponsoring a webinar regarding their joint Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA) solicitation. The webinar will be held from 8-9am PDT on May 8, 2012. Questions about the solicitation can be submitted during the webinar. Please register for the webinar by May 7, 2012. After your registration is accepted, you will get an email with a URL to join the meeting. The webinar will be archived for later viewing, and linked to the BIGDATA program web page.
The BIGDATA solicitation aims to advance the core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large, diverse, distributed, and heterogeneous data sets so as to accelerate the progress of scientific discovery and innovation; lead to new fields of inquiry that would not otherwise be possible; encourage the development of new data analytic tools and algorithms; facilitate scalable, accessible, and sustainable data infrastructure; increase understanding of human and social processes and interactions; and promote economic growth and improved health and quality of life.
The phrase “big data” in this solicitation does not refer just to the volume of data, but also to its variety and velocity. Big data includes large, diverse, complex, longitudinal, and/or distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources. Proposal submission deadlines are June 13, 2012, for mid-scale projects; and July 11, 2012, for small projects. Fifteen to twenty projects will be funded, subject to availability of funds.
In partnership with librarians at the University of Minnesota, the University of Oregon, and Cornell University, the Purdue University Libraries received nearly $250,000 from the Institute of Museum and Library Services (IMLS) to develop training programs for the next generation of scientists, to enable them to find, organize, use, and share data efficiently and effectively. This training will be vital to scientists as they look to secure research funding. In 2007, the National Science Foundation issued a report on the need to build public collections of research data and since 2011 has required scientists to include data management plans in their grant applications.
The Data Information Literacy research project will be carried out over a two-year period by five project teams, to develop and implement a data information literacy curriculum. Two of the teams, consisting of a data librarian, a subject librarian and a disciplinary faculty researcher, are based at Purdue, with one team each at the other institutions. The program is intended for graduate students in engineering and science disciplines who are working toward careers as research scientists. With the continued evolution of technology-driven research, or e-science, changing the skills necessary for effective data management and curation, a curriculum designed to prepare the next generation of scientists for the dynamic nature of research is essential.
The teams are constructed to represent a variety of subject areas, from electrical and computer engineering to landscape architecture, so that commonalities and differences in data curation needs across disciplines can be explored. Each team will conduct an assessment of data needs of their discipline, including interviewing and observing researchers. The teams will then develop and implement targeted instruction and assess the impact of that instruction in developing the data information literacy skills of graduate students.
The results of this first-ever effort at articulating and addressing data information literacy skills will help future scientists and engineers contribute to and take full advantage of the potential that cyberinfrastructure and information technologies provide. The collaboration between librarians and faculty will identify the educational needs of future e-scientists in organizing, describing, disseminating and preserving their data, and teach them these skills in ways that can be applied in their day-to-day research activities.
E-Science is a very timely subject for libraries. PSR, with the University of California Davis, hosted an E-Science Day in December 2011. Here is another offering to consider attending:
You are invited to join the faculty and staff of the Spencer S. Eccles Health Sciences Library for the Priscilla M. Mayden Lecture on Wednesday, February 22 at 1:00 p.m. MT in the Eccles Institute of Human Genetics Auditorium or via the program link for viewing from a distance. The lecture and broadcast are offered free of charge, and prior registration is not required.
This year’s Mayden lecturer is Bart Ragon, Associate Director for Library Technology Services and Development. Mr. Ragon’s lecture focuses on eScience and the Evolution of Library Services. Not just for librarians, eScience/eResearch potentially impacts faculty, staff and student access to the data, tools and resources needed to collaborate, share and move science forward.
Mr. Ragon’s topic description: “Science is changing and changing fast. Concepts like the data life cycle, data curation, translational science, high performance computing, and data sharing are having an impact on how science is conducted. At the same time, libraries are adjusting services to meet the needs of highly networked and technically savvy patron groups. eScience is a term that describes the dynamic re-shaping of collaboration and workflows in science and creating unique and important opportunities for librarianship. This presentation explores potential roles for librarians in eScience, how new collaborations might form, and the role of the libraries in the data life cycle.”
A conversation break with light refreshments is scheduled from 2:00-2:30 in the EIHG atrium. At 2:30 MT a Meet the Experts panel convenes to further define and discuss issues related to eScience and eResearch. Panelists include:
- Bill Barnett, Ph.D.
- Steve Corbató, Ph.D.
- Donald McClain, M.D., Ph.D.
- Daureen Nesdill, M.L.I.S.
- Ellie Phillipo
- Bart Ragon – moderator
The program link will be available on the Mayden Lecture page for viewing from a distance. The broadcast will be archived for on-demand viewing.
For more information, contact Jeanne Le Ber at 801-585-6744.
Announcing priority registration for E-Science Day: An Opportunity for Education & Networking, held on Tuesday, December 6, 2011 from 9:30 AM until 4:00 PM at the University of California Davis Medical Center in Sacramento, California! This event is hosted by the University of California Davis Health Sciences Libraries and funded by the National Network of Libraries of Medicine Pacific Southwest Region.
This day-long meeting will examine e-science in terms of the role that technology has played in transforming the environments of the health and life sciences. Presentations on this topic, the electronic medical record, outreach & community engagement, and e-science initiatives developed at other institutions will provide the framework for the day. The agenda features a keynote speaker, a panel presentation, poster and paper presentations, lightning rounds, and afternoon break-out sessions. There is no cost to attend this event. For remote participants, the event will also be available through web conferencing. Attendees (in person and remote) will be eligible for 5.0 Continuing Education credits from the Medical Library Association.
To register, please visit the E-Science Day website at http://nnlm.gov/psr/training/e-science_day.html and complete the registration form linked from the Registration information on the website. In-person attendance is limited, and registration will be confirmed on a first-come, first-served basis. Priority registration is available until Wednesday, October 19 for medical librarians located in the areas served by NN/LM PSR (Arizona, California, Hawaii, Nevada and the U.S. Territories in the Pacific Basin). The deadline for all registrations is Tuesday, November 8.
For questions, please contact Raquel Abad at firstname.lastname@example.org or 916.734.3870.