Big Data

Definition

Big data refers to datasets that are too large or complex to store, manage, and analyze on a personal computer. Compared to traditional, smaller datasets, big datasets are much larger, are created or added to more quickly, are more varied in structure, and are typically stored on large, cloud-based storage systems.

Researchers working with big data use specialized software tools, supercomputers, and high-performance computing clusters designed to handle the volume and complexity of the datasets. Creators of artificial intelligence often train their programs with big data, and researchers may use machine learning to better understand or describe large datasets.
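One common way software copes with volume is to read a dataset as a stream, one record at a time, instead of loading the whole file into memory. The Python sketch below is illustrative only: the function name mean_of_column, the column name "value", and the sample rows are hypothetical, and real big data tools distribute this kind of work across many machines.

```python
import csv
import io


def mean_of_column(lines, column):
    """Compute the mean of one numeric column, holding only one row in memory at a time."""
    reader = csv.DictReader(lines)
    total = 0.0
    count = 0
    for row in reader:
        total += float(row[column])
        count += 1
    # Return None for an empty dataset rather than dividing by zero.
    return total / count if count else None


# A tiny in-memory stand-in for a file too large to load at once (hypothetical data).
sample = io.StringIO("id,value\n1,10\n2,20\n3,30\n")
print(mean_of_column(sample, "value"))  # 20.0
```

Because the function accepts any iterable of text lines, the same code would work on an open file handle, so memory use stays roughly constant no matter how many rows the file contains.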

Examples

Awesome Public Datasets is a long list of big datasets taken from public data sources and arranged into categories:

https://github.com/awesomedata/awesome-public-datasets 

Google also hosts and provides access to large public datasets through BigQuery:

https://cloud.google.com/bigquery/public-data/

Further Resources

This animated video presents a short history of the European Organization for Nuclear Research, CERN, as a way of describing big data and providing an example of working with massive data sets:

https://www.youtube.com/watch?v=j-0cUmUyb-Y 

This article shows that big data researchers in psychology and sociology do not share one standard definition for big data, but associate various terms and methodologies with it:

Favaretto M, De Clercq E, Schneble CO, Elger BS (2020) What is your definition of Big Data? Researchers’ understanding of the phenomenon of the decade. PLOS ONE 15(2): e0228987. https://doi.org/10.1371/journal.pone.0228987

This article describes some of the ethical issues involved in working with big data that includes biomedical information:

Saqr M. (2017). Big data and the emerging ethical challenges. International Journal of Health Sciences, 11(4), 1–2. https://pubmed.ncbi.nlm.nih.gov/29085259/
