Medical Library Project 134: UNC Genomics Data Catalog
The University of North Carolina at Chapel Hill (UNC) Health Sciences Library (HSL) will implement a pilot data catalog to facilitate discovery, access, and re-use of UNC-generated genomic data produced by research projects funded by NIH and other federal agencies. A 2014 UNC Genomics Data Task Force assessed UNC's data stewardship needs and recommended a data catalog as a critical need. This project will be the first initiative to pursue the task force's recommendation. Genomic and genetic data are of critical importance because they allow medical practitioners to create customized treatments targeted to an individual's genetic background. By making UNC's at-risk genomic data discoverable by the wider research community, the pilot project will contribute to the progress of medicine and public health and help provide U.S. health professionals with equal access to biomedical information. To achieve these aims, the HSL data catalog team will use the New York University (NYU) open-source data catalog model. From initial discussions with the NYU data catalog team, HSL determined that the NYU model would match UNC's needs for a genomic data catalog. Its flexible metadata scheme and database structure can accommodate a variety of data types and can be customized, if needed, to accommodate more. UNC's catalog pilot will in turn contribute to the NYU model by demonstrating how it can be adapted to fit genomic data. To populate the pilot catalog, the HSL team will work with existing contacts in UNC's Genome Sequencing Center, Lineberger Comprehensive Cancer Center, Center for Genome Sciences, and Center for Bioinformatics to identify interested principal investigators. The team will conduct outreach to the identified investigators to learn about their research and data, raise awareness of the data catalog, promote its benefits for supporting the FAIR (Findable, Accessible, Interoperable, Reusable) Data Principles, and explain its potential to track schedules for data object review and retention.