National Network of Libraries of Medicine

Dataset

A dataset can be defined simply as a collection of data represented in a particular form. Datasets vary with the type of research study being conducted and with how the researchers have chosen to organize their data at collection. "Dataset" is essentially a heterogeneous term: it can refer to any kind of collection of any type of data. For example, a single dataset may contain multiple files, such as raw imaging files, 3D reconstructions, protein sequences, DNA sample data, and a variety of segmentations. For a librarian, it is important to understand how researchers define their datasets in order to create a data management plan that describes the data and plans for its preservation and future use.