
Text and data mining

Introduction

Please get in touch with us before you download large amounts of data from a library-subscribed resource, as you may trigger automated misuse-detection tools, which could lead to our access being suspended.

If you have been prevented from using a resource due to a licensing or technical issue, or you have any other questions, please let us know as soon as you can.

What is it?

The UK Government defines text and data mining (TDM) as “the use of automated analytical techniques to analyse text and data for patterns, trends and other useful information” (HM Government, 2014).

In a research context it can be used to select, search and analyse substantial amounts of scholarly material. This could be used to extract meaning from large data sets or to find patterns, discover relationships and enable semantic analysis in text to understand how the content relates to ideas and needs.

TDM usually requires copying large amounts of content for analysis, and that copying is subject to copyright. It therefore needs either permission from the rightsholders or to fall within the copyright legislation exception described in the following section.

HM Government. (2014). Exceptions to copyright. Intellectual Property Office. https://www.gov.uk/guidance/exceptions-to-copyright

Why is it important?

TDM can help realise the potential of big data to gain new insights, extending the possibilities for research and innovation. This short video from the Royal Society of Chemistry explains the benefits of materials being made available for use in this way: