Data Mining

Definition

Data mining is the process of identifying and extracting patterns and relationships in large datasets, using statistical and/or machine learning techniques. It differs from traditional data analysis in that it is typically exploratory: rather than testing a predefined hypothesis, it searches for previously unknown or interesting patterns. Data mining often involves the automated collection and processing of large quantities of data.

Examples

One use of data mining in healthcare is searching large sets of electronic health record (EHR) data for patterns that may indicate harmful drug interactions.
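As a rough illustration of this kind of search, the sketch below counts how often each pair of drugs appears together and how often an adverse event is recorded alongside that pair. The records, column names, and the `pair_event_rates` helper are all hypothetical and made up for this example; real EHR data is far larger and messier, and a high co-occurrence rate is only a lead for further study, not evidence of an interaction.

```python
from itertools import combinations

import pandas as pd

# Hypothetical toy records (not real EHR data): one row per patient.
records = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "drugs": [
        {"warfarin", "aspirin"},
        {"warfarin", "aspirin"},
        {"warfarin"},
        {"aspirin"},
        {"metformin"},
        {"warfarin", "aspirin", "metformin"},
    ],
    "adverse_event": [True, True, False, False, False, True],
})


def pair_event_rates(df, min_support=2):
    """Summarize adverse-event rates for every co-prescribed drug pair.

    min_support is an assumed cutoff: pairs seen in fewer patients are dropped.
    """
    rows = []
    for drugs, event in zip(df["drugs"], df["adverse_event"]):
        # Sort so that each pair always gets the same label.
        for first, second in combinations(sorted(drugs), 2):
            rows.append({"pair": f"{first} + {second}", "adverse_event": event})
    pairs = pd.DataFrame(rows)
    summary = (
        pairs.groupby("pair")["adverse_event"]
        .agg(n_patients="count", event_rate="mean")
        .reset_index()
    )
    frequent = summary[summary["n_patients"] >= min_support]
    return frequent.sort_values("event_rate", ascending=False)


result = pair_event_rates(records)
print(result)
```

No hypothesis about a particular drug pair is supplied up front; the code enumerates every pair and lets the counts suggest which ones deserve a closer look.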

Similar Terms

Text Mining

Tools

The tidyverse is a widely used, well-supported collection of R packages for data cleaning, analysis, and visualization.

Pandas is a Python library for data cleaning and analysis, with some basic data visualization functionality.
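A minimal sketch of a typical cleaning-and-summarizing step in pandas follows. The table and its column names are invented for illustration; the calls used (`dropna`, `assign`, `str.strip`, `str.lower`, `pd.to_numeric`, `groupby`, `agg`) are standard pandas functions.

```python
import pandas as pd

# Hypothetical messy input: inconsistent spacing and case, a missing name,
# and a dose that is not a number.
raw = pd.DataFrame({
    "drug": [" Aspirin", "aspirin", "Warfarin ", None, "warfarin"],
    "dose_mg": ["81", "81", "5", "10", "n/a"],
})

clean = (
    raw.dropna(subset=["drug"])  # drop rows with no drug name
    .assign(
        # Normalize names so "Aspirin" and " aspirin" are treated as one drug.
        drug=lambda d: d["drug"].str.strip().str.lower(),
        # Convert doses to numbers; unparseable values become NaN.
        dose_mg=lambda d: pd.to_numeric(d["dose_mg"], errors="coerce"),
    )
)

# Count the valid doses and compute the mean dose per drug.
summary = clean.groupby("drug")["dose_mg"].agg(["count", "mean"])
print(summary)
```

Cleaning of this kind usually comes before any pattern search, since inconsistent labels would otherwise split one drug into several apparent categories.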

Further Resources

Sadiku, M. N. O., Shadare, A. E., & Musa, S. M. (2015). Data Mining: A Brief Introduction. European Scientific Journal, ESJ, 11(21). Retrieved from https://eujournal.org/index.php/esj/article/view/6017

Gupta, S. (2022). Introduction to Data Mining: A Complete Guide. Springboard Blog. Retrieved from https://www.springboard.com/blog/data-science/data-mining
