Text and data mining

Introduction

What is it?

Text and data mining (TDM) is the automated analysis of scholarly materials. The UK Government defines TDM as “the use of automated analytical techniques to analyse text and data for patterns, trends and other useful information” (HM Government, 2014).

It allows researchers to analyse large amounts of data and gain knowledge that could not be obtained by reading the individual documents one at a time. Data mining involves extracting meaning from large data sets, while text mining is a form of data mining in which unstructured textual data is given structure to enable analysis.
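As a rough illustration of what "giving structure" to unstructured text can mean, the sketch below turns free text into a table of word counts using only the Python standard library. The function names (term_frequencies, corpus_term_counts) and the two sample sentences are invented for this example. Real TDM projects typically use far larger collections and more sophisticated techniques.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Split raw text into lowercase words and count each one.

    Hypothetical helper for illustration: the tokenisation rule
    (runs of the letters a-z) is a simplifying assumption.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


def corpus_term_counts(documents):
    """Combine the word counts of several documents into one table."""
    total = Counter()
    for doc in documents:
        total.update(term_frequencies(doc))
    return total


# Sample sentences made up for this example, not taken from any real corpus.
documents = [
    "Text mining gives structure to unstructured text.",
    "Data mining extracts meaning from large data sets.",
]

counts = corpus_term_counts(documents)

# Show the most frequent words across both documents.
print(counts.most_common(3))
```

Once the text is in a structured form like this, it can be sorted, compared and searched for patterns in ways that are not practical when reading each document individually.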

HM Government. (2014). Exceptions to copyright. Intellectual Property Office. https://www.gov.uk/guidance/exceptions-to-copyright

Why is it important?

TDM can help realise the potential of big data to gain new insights, extending the possibilities for research and innovation. This short video from the Royal Society of Chemistry explains the benefits of materials being made available for use in this way: