NCBI has just released Entrez Direct, a new software suite that lets you access NCBI’s data servers directly from the UNIX command line and parse and format the results into customized data files. The latest NCBI News story discusses Entrez Direct, gives several examples of how the programs may be used, and links to the FTP download and to the extensive documentation on the NCBI web site.
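Entrez Direct itself is a set of command-line programs, and the announcement does not show their usage. As a rough illustration of the kind of request such tools send to NCBI’s servers, the following Python sketch builds a query URL for NCBI’s E-utilities ESearch service and parses an ESearch-style XML response. The helper names and the sample response (with made-up IDs) are assumptions for illustration only; the sketch makes no network call and is not part of Entrez Direct.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base address of NCBI's E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for a database name and a query term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def parse_esearch_ids(xml_text):
    """Extract the hit count and the list of record IDs from ESearch XML."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("Count"))
    ids = [node.text for node in root.findall("./IdList/Id")]
    return count, ids


# Hypothetical response with made-up IDs, shaped like ESearch XML output.
SAMPLE_RESPONSE = """<eSearchResult>
  <Count>2</Count>
  <IdList><Id>111</Id><Id>222</Id></IdList>
</eSearchResult>"""

url = build_esearch_url("pubmed", "medical informatics")
count, ids = parse_esearch_ids(SAMPLE_RESPONSE)
print(url)
print(count, ids)
```

In real use, the URL would be fetched over HTTP and the returned XML passed to `parse_esearch_ids`; the Entrez Direct programs are designed to handle these steps from the shell.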
The Coalition for Networked Information (CNI), the Association of Research Libraries (ARL), and EDUCAUSE have announced that Donald A.B. Lindberg, Director of the National Library of Medicine, has been named the 2014 recipient of the Paul Evan Peters Award. The award recognizes notable, lasting achievements in the creation and innovative use of network-based information resources and services that advance scholarship and intellectual productivity. Named for CNI’s founding director, the award honors the memory and accomplishments of Paul Evan Peters (1947–1996), a visionary and a coalition builder in higher education and the world of scholarly communication, who led CNI from its founding in 1990. The award will be presented during the CNI membership meeting in St. Louis, MO, to be held March 31–April 1, 2014, where Dr. Lindberg will deliver the Paul Evan Peters Memorial Lecture. The talk will be recorded and made available on CNI’s YouTube and Vimeo channels after the meeting concludes.
Dr. Lindberg’s interest in the potential intersection between information technology and the biological sciences stretches back to the early days of his career. He joined the pathology faculty at the University of Missouri in 1960, where he developed the first automated lab system and an automated patient history acquisition system. He implemented an automated statewide system for interpreting electrocardiograms, as well as other medical applications for the computer. Around this time, he also began publishing articles in a field that would come to be known as medical informatics, including The Computer and Medical Care, which appeared in 1968.
Dr. Lindberg has worked as a scientist for over 50 years, becoming widely recognized as an innovator in applying computer technology to health care, medical diagnosis, artificial intelligence, and educational programs. In 1984 he was appointed director of the National Library of Medicine (NLM), the world’s largest biomedical library, a post that he still holds. As NLM’s Director, Dr. Lindberg convinced the United States Congress that the Library was an essential information conduit, facilitating the decision-making process of scientists and pharmaceutical companies, and, ultimately, benefiting patients and the general public, thereby securing the organization’s robust future. He has spearheaded countless transformative programs in medical informatics, including the Unified Medical Language System, which makes it possible to link health information, medical terms, drug names, and billing codes across different computer systems; the Visible Human Project, a digital image library of complete, anatomically detailed, three-dimensional representations of the normal male and female human bodies; the production and implementation of ClinicalTrials.gov, a registry and results database of publicly and privately supported clinical studies of human participants conducted around the world; and the establishment of the National Center for Biotechnology Information, a national resource for molecular biology information and genetic processes that control health and disease. Today, NLM has a budget of $327 million, more than 800 employees, and digital information services that are used billions of times a year by millions of scientists, health professionals, and members of the public.
Dr. Lindberg is a member of the Institute of Medicine of the National Academy of Sciences, and has received numerous honors and awards, including the prestigious Morris F. Collen, MD, Award of Excellence of the American College of Medical Informatics, and the Surgeon General’s Medallion of the US Public Health Service. He received his medical degree from the College of Physicians and Surgeons, Columbia University, and an undergraduate degree from Amherst College. A four-member committee selected Dr. Lindberg for the award: the late Ann J. Wolpert, director of libraries at the Massachusetts Institute of Technology; George O. Strawn, director of the Federal Networking and Information Technology Research and Development (NITRD) National Coordination Office (NCO); Sally Jackson, professor of communication at the University of Illinois at Urbana-Champaign; and Joan Lippincott, associate executive director of the Coalition for Networked Information.
The NCBI, in partnership with the National Library of Medicine Training Center (NTC), will offer the Librarian’s Guide to NCBI course on the NIH campus in April 2014. This will be the second presentation of the course; it was previously offered in the spring of 2013. After the course, lecture slides and hands-on practical exercises will be posted on the education area of the NCBI FTP site and video tutorials of the course lectures will be available on the NCBI YouTube channel. Materials from the 2013 course are currently available.
A Librarian’s Guide is an intense five-day exploration of modern molecular biology, genetic, and other biomedical data as represented at the NCBI. The course explains how and why these data are generated, their importance in modern biomedical research, and how to access them through the NCBI Web site. It is intended for medical librarians in the United States who currently are offering bioinformatics education and support services to their patrons or are planning to offer such services in the future. More information is available in the newest NCBI Insights blog post.
All applicants for A Librarian’s Guide must have successfully completed the asynchronous online Fundamentals of Bioinformatics and Searching class, which is a six-week introduction to molecular biology and bioinformatics taught by Diane Rein, Ph.D., MLS, and offered through the NTC. The Fundamentals course is open to any medical librarian in the United States interested in an introduction to bioinformatics and NCBI resources. A winter 2014 Fundamentals class, which runs from February 10 to March 21, 2014, is open for applications. Only people who have successfully completed the Fundamentals class may apply to A Librarian’s Guide to NCBI. The application process for eligible Fundamentals candidates will be announced in February 2014.
Health science librarians are invited to participate in an online bioinformatics training course, Fundamentals of Bioinformatics and Searching, sponsored by the National Library of Medicine (NLM), the National Center for Biotechnology Information (NCBI), and the National Network of Libraries of Medicine, NLM Training Center (NTC). The course provides basic knowledge and skills for librarians interested in helping patrons use online molecular databases and tools from the NCBI. Attending this course will improve your ability to initiate or extend bioinformatics services at your institution. Prior knowledge of molecular biology and genetics is not required.
The major goal of this course is to provide an introduction to bioinformatics theory and practice in support of developing and implementing library-based bioinformatics products and services. This material is essential for decision-making and implementation of these programs, particularly instructional and reference services. The course encompasses visualizing bioinformatics end-user practice and places a strong emphasis on acquiring NCBI search competencies and a working molecular biology vocabulary through self-paced, hands-on exercises.
This course is offered online (asynchronous) from October 21 to December 2, 2013. The course format includes video lectures, readings, a molecular vocabulary exercise, an NCBI discovery exercise, and other hands-on exercises. The instructor is Diane Rein, Ph.D., MLS, Bioinformatics and Molecular Biology Liaison from the Health Science Library, University at Buffalo. Due to limited enrollment, interested participants are required to complete an application form. The deadline for completing the application is September 9, 2013; participants will be notified of acceptance on September 23, 2013.
The course is offered at no cost to participants. Participants who complete all assignments and the course evaluation by the due dates within the course will receive 15 hours of MLA CE credit. No partial CE credit is granted. This course is a prerequisite for the face-to-face workshop, Librarian’s Guide to NCBI. Participants who complete the required coursework and earn full continuing education credit will be eligible to apply to attend the five-day Librarian’s Guide that will be offered in April of 2014. Questions about the online course may be directed to the course organizers.
The National Library of Medicine’s (NLM) Medical Text Indexer is being used as one of the baselines for the international BioASQ challenge. The Medical Text Indexer (MTI), a system for producing indexing recommendations, assists in the indexing process at NLM. BioASQ is a series of challenges on biomedical semantic indexing and question answering that aims to advance the state of the art in making biomedical text accessible to researchers and clinicians. The MTI indexing results provide one of the baselines used in the “Large-scale online biomedical semantic indexing” part of the challenge, which is designed to parallel the human indexing currently being done at NLM. Alan R. Aronson, PhD, Principal Investigator for the MTI project, will also deliver an invited talk, “Indexing the Biomedical Literature in a Time of Increased Demand and Limited Resources,” at the BioASQ Workshop in September. Dr. Aronson is a Principal Investigator at the Lister Hill National Center for Biomedical Communications, an Intramural Research Division of the National Library of Medicine.
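To make the idea of a baseline concrete, here is a minimal Python sketch of one common way to score automatic indexing against human indexing: a micro-averaged F-measure over the terms assigned to each article. The post does not describe the challenge’s scoring procedure, so the choice of measure, the function name, and the toy term labels below are all assumptions for illustration, not MTI output or BioASQ data.

```python
def micro_f1(predicted, gold):
    """Micro-averaged F1 over per-article sets of indexing terms.

    `gold` maps article IDs to the human-assigned term sets; `predicted`
    maps article IDs to system-recommended term sets. Articles missing
    from `predicted` are treated as having no recommendations.
    """
    tp = fp = fn = 0
    for article_id, gold_terms in gold.items():
        pred_terms = predicted.get(article_id, set())
        tp += len(pred_terms & gold_terms)   # terms both agree on
        fp += len(pred_terms - gold_terms)   # recommended but not assigned
        fn += len(gold_terms - pred_terms)   # assigned but not recommended
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: placeholder term labels, not real indexing.
gold = {"a1": {"term_a", "term_b"}, "a2": {"term_a", "term_c"}}
pred = {"a1": {"term_a", "term_d"}, "a2": {"term_a", "term_c"}}

score = micro_f1(pred, gold)
print(round(score, 2))
```

Under a measure like this, a participating system would be compared with the baseline by computing the same score for both against the same human-indexed articles.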
NLM to Participate with Partners in “An Epidemiology of Information: New Methods for Interpreting Disease and Data”
The National Library of Medicine (NLM) has announced its next initiative as part of its ongoing partnership with the National Endowment for the Humanities (NEH). Working with NEH’s Office of Digital Humanities, the National Science Foundation, and Virginia Polytechnic Institute and State University (Virginia Tech), the NLM will be a part of An Epidemiology of Information: New Methods for Interpreting Disease and Data, an interdisciplinary symposium exploring new methods for large-scale data analysis of epidemic disease.
Scheduled to take place at the Virginia Tech Research Center in Arlington, VA, on October 17, 2013, from 8:30 AM to 5:00 PM, “An Epidemiology of Information” will be a unique public forum through which policy makers, public health experts, and scholars can address pressing questions about how new methods of analyzing large-scale datasets can inform research and policy approaches to epidemic disease. Panelists will consider what these new methods suggest for contemporary infodemiology and epidemic intelligence, as well as the implications of data mining as a disease surveillance mechanism, and how new forms of reporting and public health surveillance affect public health policy. The symposium will also explore how these new methods can inform research on the 1918 influenza pandemic, and help to answer lingering questions about the spread of the disease, its pathogenicity, the unusual mortality rates, or the effectiveness of public health responses.
Featured speakers will include Dr. Jeffery Taubenberger, Chief, Viral Pathogenesis and Evolution Section, National Institute of Allergy and Infectious Diseases (NIAID), and Dr. David Morens, Senior Advisor to the Director, NIAID, whose research in data analysis and historical epidemiology has influenced the approaches being adopted and adapted by digital humanities scholars working in the history of medicine. “An Epidemiology of Information” is made possible in part from support received by Virginia Tech through the international Digging into Data Challenge competition sponsored by NEH. Funding for Virginia Tech’s Canadian partner, the Center for E-Health Initiatives of the University of Toronto, comes from the Social Science and Humanities Research Council of Canada. The symposium is free and open to the public, but registration is required.
The National Library of Medicine has announced that Extensible Markup Language (XML) data from the IndexCat™ database is now available for free download. Released with a Document Type Definition (DTD) that allows researchers to validate the data, this new XML release includes the digitized content of more than 3.7 million bibliographic items from the printed, 61-volume Index-Catalogue of the Library of the Surgeon-General’s Office, originally published from 1880 to 1961. The XML describes items spanning five centuries, including millions of journal and newspaper articles, obituaries, and letters; hundreds of thousands of monographs and dissertations; and thousands of portraits. Together, these items cover a wide range of subjects such as the basic sciences, scientific research, civilian and military medicine, public health, and hospital administration.
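For researchers planning to work with a download of this size, a streaming parser keeps memory use manageable. The Python sketch below tallies records by type while reading an XML file incrementally. The element and attribute names (`item`, `kind`) and the sample data are hypothetical placeholders, since the actual IndexCat™ schema is defined by the released DTD; the sketch also does not validate against the DTD, which would require a validating parser.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_record_types(xml_source, record_tag, type_attr):
    """Stream through an XML file and tally records by an attribute value."""
    counts = Counter()
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            counts[elem.get(type_attr, "unknown")] += 1
            elem.clear()  # drop the record's content once it has been counted
    return counts


# Hypothetical sample; tag and attribute names are NOT the IndexCat schema.
SAMPLE = b"""<catalogue>
  <item kind="article"><title>Example A</title></item>
  <item kind="monograph"><title>Example B</title></item>
  <item kind="article"><title>Example C</title></item>
</catalogue>"""

counts = count_record_types(io.BytesIO(SAMPLE), "item", "kind")
print(dict(counts))
```

With the real files, `xml_source` would be the path to a downloaded XML file, and the tag and attribute names would be taken from the DTD.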
The NLM release of the Index-Catalogue in XML format opens this key resource in the history of medicine and science to new uses and users. It is one of the monuments of the Library’s longstanding, systematic indexing of the medical literature, an effort which William Henry Welch (1850-1934), the great pathologist and bibliophile, considered to be “America’s greatest contribution to medical knowledge.” This indexing, begun by John Shaw Billings in the nineteenth century at the Library of the Surgeon-General’s Office, United States Army (known today as the NLM), eventually created two distinct products: the Index-Catalogue of the Library of the Surgeon-General’s Office, United States Army, and the Index Medicus, forerunner of MEDLINE®, and now the largest component of PubMed®.
Released alongside the Index-Catalogue XML are an integrated XML file and associated DTD for two collections developed from the electronic database of A Catalogue of Incipits of Mediaeval Scientific Writings in Latin (rev.), by Lynn Thorndike and Pearl Kibre (eTK), and the updated and expanded version of Scientific and Medical Writings in Old and Middle English: An Electronic Reference (eVK2), edited by Linda Ehrsam Voigts and Patricia Deery Kurtz. Also available via the online IndexCat, these resources encompass over 42,000 records of incipits, or the beginning words of a medieval manuscript or early printed book, covering various medical and scientific writings on topics as diverse as astronomy, astrology, geometry, agriculture, household skills, book production, occult science, natural science, and mathematics, as these disciplines and others were largely intermingled in the medieval period of European history. The NLM release of these resources in XML format joins many other freely downloadable resources, including the XML for MEDLINE®/PubMed® data, which includes over 22 million references to biomedical and life sciences journal articles back to 1946, and, for some journals, much earlier.
The release also coincides with the NLM’s participation in “Shared Horizons: Data, Biomedicine, and the Digital Humanities,” an interdisciplinary symposium exploring the intersection of digital humanities and biomedicine, being held April 10-12, 2013, in partnership with the National Endowment for the Humanities’ Office of Digital Humanities, Maryland Institute for Technology in the Humanities at the University of Maryland, and Research Councils UK. Shared Horizons will create opportunities for disciplinary cross-fertilization through a mix of formal and informal presentations, combined with breakout sessions designed to promote a rich exchange of ideas about how large-scale quantitative methods can lead to new understandings of human culture. Bringing together researchers from the digital humanities and bioinformatics communities, the symposium will explore ways in which these two communities might fruitfully collaborate on projects that bridge the humanities and medicine around the topics of sequence alignment and network analysis, two modes of analysis that intersect with “big data.” All Shared Horizons sessions will be live-streamed with a monitored back channel for the public to post/tweet comments. Recordings of all talks will also be posted to the Shared Horizons website, with the ability to comment pre- and post-event.
Research funded by the National Library of Medicine provides new insight into why patients stop taking drugs that lower their cholesterol, and what happens when patients try those drugs, known as statins, a second time. Researchers found that more than 90% of patients who stopped taking statins because of an adverse reaction could tolerate the medication when tried again. The study is published in the April 2, 2013, issue of the Annals of Internal Medicine.
NLM grantee Alexander Turchin, MD, MS, of Brigham and Women’s Hospital, a teaching affiliate of Harvard Medical School, notes that statins are commonly stopped even though their benefits are well documented. He and colleagues wanted to better understand why statins are discontinued and whether adverse reactions play a role. They conducted a retrospective study, analyzing clinical data in an electronic medical record (EMR) system. Researchers examined structured data as well as the narrative electronic notes of health providers. Those notes frequently are the only place in an EMR where adverse reactions to medications are documented. Using the NLM grant, researchers developed natural language processing software and scoured more than 5 million notes, on more than 107,000 patients, recorded over nearly a decade. The software analyzed data on a scale that could not have been achieved manually. Researchers say the next step would be to conduct a clinical trial to determine whether outcomes improve when statins are tried again after an adverse event.
The National Library of Medicine, part of the National Institutes of Health, conducts and funds research in biomedical informatics, which involves applying computers and communications technology to the field of health. This research was supported by NLM’s Division of Extramural Programs grant RC1-LM010460. This was an NIH Challenge Grant, supported by NLM with funds from the American Recovery and Reinvestment Act. For additional information, visit the Brigham and Women’s Hospital News Release.
The National Library of Medicine Division of Extramural Programs has announced the 2013 NLM Informatics Lecture series showcasing NLM-funded research in biomedical informatics. The program kicks off March 6 when Chunhua Weng, PhD, presents “Bridging the Semantic Gap Between Research Eligibility Criteria and Clinical Data: Methods and Issues.” Dr. Weng is the Florence Irving Assistant Professor of Biomedical Informatics at Columbia University. Her research centers on developing human-computer collaborative approaches to help clinical researchers make the best use of health information technology. She currently is focusing on problems that include interactive query formulation to assist clinical researchers in interrogating large clinical databases.
On June 5, Graciela Gonzalez, PhD, of Arizona State University, will present “Mining Social Network Postings for Mentions of Potential Adverse Drug Reactions.” The final 2013 lecture, November 13, will be presented by Timothy Cardozo, MD, PhD, of the New York University School of Medicine. He will discuss “A Chemical Biology Network for Personalized Medicine.” The lectures will be held 2-3pm (Eastern) in Balcony A of the Natcher Building (Building 45) on the NIH campus. All three talks will be archived at http://videocast.nih.gov/.
New Course Announced for Spring 2013 at the National Library of Medicine: A Librarian’s Guide to NCBI!
A Librarian’s Guide to NCBI comprises a pre-course, “Fundamentals in Bioinformatics and Searching,” offered online (asynchronous) during March 2013, and a five-day, in-person course offered at the National Library of Medicine, April 15-19, 2013. The course provides basic knowledge and skills for librarians interested in helping their clientele use online molecular databases and tools from the NLM’s National Center for Biotechnology Information (NCBI). Topics include using the BLAST sequence similarity search and Entrez text search systems to find relevant data. The course describes the various kinds of molecular data available, and explains how these are generated and used in modern biomedical research. Instructors will be NCBI staff and Diane Rein, Ph.D., from the University at Buffalo. The course will be limited to 18 students. More information and an online application will be available in December 2012.