Marianne Pouwer

Will named entity recognition make reading obsolete?

Language is all around us. You can find it in the news you read online, in the recipe you found for dinner tonight, and in the scientific article you’re reading for your thesis. It all comes down to words made of letters, on paper or on a screen, which your brain processes into a cohesive story. But what if a computer could do the processing for you? Would you never have to read a word again?

What is named entity recognition?

Named entity recognition (NER) is a part of natural language processing (NLP), which in turn is a part of artificial intelligence (AI). That’s a lot of abbreviations, so let’s break them down. AI is the name we give to intelligence displayed by machines rather than by humans. If you’d like to read more about artificial intelligence, make sure to read our blog post ‘What is artificial intelligence?’ NLP is about processing natural language. The languages we speak as humans are all natural languages, and if you think about it for a minute, you’ll realise they are very complex. Artificial languages, such as programming languages, are different: they are made to be simple and unambiguous so that computers can interpret them. Within the field of NLP, we find NER. NER is a technique used to extract important information from texts, and to identify and categorise it. The information consists of so-called entities, each of which can be a single word or a string of words. In a given context, an entity refers to one specific thing, so it can be classified into a predetermined category.

Why should I use NER if I have Ctrl+F?

The keyboard shortcut Ctrl+F lets you easily find a word in a text. That is helpful, but it also causes some problems. Luckily, these problems can be solved with NER. Try using Ctrl+F to search for ‘way’ in the following text to find out the best way to Utrecht.

‘Which way to Utrecht?’ asks Lisa at a busy four-way intersection in Amsterdam. ‘Do you want to go all the way to Utrecht by bike?’ Tom answers, surprised. ‘No way, you’re crazy. I think it’s way too far. But if you’re sure you want to go, you should take the first left, take a right over the bridge and follow the path alongside the Amsterdam Rijnkanaal.’ Lisa thanks Tom and before you know it, she’s on her way to Utrecht!

It’s easy to see that the word ‘way’ has multiple meanings here. And the way to Utrecht can be described in multiple ways: as a path, as a route, or as directions to go left or right. With NER, you’ll find the occurrences of ‘way’ that actually refer to a route, and you’ll also find words such as ‘path’, ‘route’ or ‘road’.
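To make the Ctrl+F problem concrete, here is a minimal Python sketch of a plain substring search run on a lightly edited copy of the dialogue above. The function names `find_all` and `context` are made up for this illustration. The search does not know what any of the words mean, so it returns every ‘way’, whether or not it refers to a route, and it finds ‘path’ only if you think of searching for it separately.

```python
import re

# A lightly edited copy of the dialogue, typed with straight quotes.
PASSAGE = (
    "'Which way to Utrecht?' asks Lisa at a busy four-way intersection in Amsterdam. "
    "'Do you want to go all the way to Utrecht by bike?' Tom answers, surprised. "
    "'No way, you're crazy. I think it's way too far. But if you're sure you want to go, "
    "you should take the first left, take a right over the bridge and follow the path "
    "alongside the Amsterdam Rijnkanaal.' Lisa thanks Tom and before you know it, "
    "she's on her way to Utrecht!"
)


def find_all(word, text):
    """Return the start offset of every case-insensitive match, like Ctrl+F."""
    pattern = re.escape(word)
    return [m.start() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


def context(text, offset, width=15):
    """Return a few characters around a match so a human can judge its meaning."""
    return text[max(0, offset - width): offset + width]


way_hits = find_all("way", PASSAGE)
path_hits = find_all("path", PASSAGE)

# Every hit is printed, including 'No way' and 'way too far',
# which have nothing to do with a route.
for offset in way_hits:
    print(repr(context(PASSAGE, offset)))

print(len(way_hits), "hits for 'way';", len(path_hits), "hit for 'path'")
```

Sorting the relevant hits from the irrelevant ones is still left to the reader, and that is exactly the step NER aims to take over.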

Cool, can you give me an example?

Of course! Before you can apply NER to a text, you’ll have to decide which categories you want the entities to go into. The figure below shows the six categories needed for this text.

As you can see, the different entities have been categorised according to their specific context. Amsterdam is labelled as a city, but elsewhere Amsterdam Rijnkanaal is labelled as a waterway. And ‘way’ in ‘No way, you’re crazy’ hasn’t been given a label, because in this context it isn’t actually part of the infrastructure.
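One way to picture the result of NER is as a list of labelled spans: a piece of text, its position, and its category. The Python sketch below writes such labels by hand for a shortened version of the example; it is not the output of a trained model. The `Entity` class, the helper functions, and the label names `city`, `infrastructure` and `waterway` are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str    # the word or string of words
    start: int   # character offset where the entity begins
    end: int     # character offset just past the entity
    label: str   # the predetermined category


def label_phrase(source, phrase, label, occurrence=0):
    """Build an Entity for the n-th occurrence (0-based) of `phrase` in `source`."""
    start = -1
    for _ in range(occurrence + 1):
        start = source.index(phrase, start + 1)
    return Entity(phrase, start, start + len(phrase), label)


def with_label(entities, label):
    """Return the text of every entity in the given category."""
    return [e.text for e in entities if e.label == label]


# Shortened, hand-labelled version of the example text.
SNIPPET = (
    "Lisa stands at an intersection in Amsterdam. "
    "Follow the path alongside the Amsterdam Rijnkanaal."
)

entities = [
    label_phrase(SNIPPET, "Amsterdam", "city"),               # first mention: the city
    label_phrase(SNIPPET, "path", "infrastructure"),
    label_phrase(SNIPPET, "Amsterdam Rijnkanaal", "waterway"),  # same word, other category
]

print(with_label(entities, "city"))
print(with_label(entities, "waterway"))
```

Because each entity carries its own position and label, the word ‘Amsterdam’ can be a city in one place and part of a waterway in another, and asking for everything in one category becomes a simple filter.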

What can we do with NER?

NER can be particularly helpful for reading unstructured, complex texts, such as doctors’ notes in a patient file, or for analysing scientific studies. Every doctor or researcher has their own way of writing down findings, so different words end up being used for the same findings. Let’s say you’re looking for side effects of a certain type of medication in the electronic patient file. If you’re using Ctrl+F, you’ll have to think of all the different ways the side effects can be described, and search for every single one of them. With NER, you can simply ask for every entity in the category ‘side effects’. Easy!

How can the algorithm know all of this?

Well, an algorithm learns much like you and I do. Remember how you learned to read as a child? By practising and reading more and more, you kept expanding your vocabulary, learning new words that you could use later in life. Training an algorithm works in much the same way. You give the algorithm lots of texts as input, in which the words have been labelled with the right category. The more types of text you give it, the better it can learn. It can then even recognise new words from their context, without having seen them in a labelled text before.
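What might such labelled input look like? A common convention in NER, though not necessarily the one used for the example in this post, is the BIO scheme: every word gets a tag that says whether it begins an entity (B), continues one (I), or lies outside any entity (O). The Python sketch below turns hand-made entity spans into one tag per word. The function name `to_bio` and the labels `INFRA` and `WATERWAY` are assumptions made for this illustration.

```python
def to_bio(tokens, spans):
    """Turn entity spans into one BIO tag per token.

    `spans` holds (start, end, label) triples of token indices,
    with `end` exclusive, and the spans must not overlap.
    """
    tags = ["O"] * len(tokens)          # by default a word is outside any entity
    for start, end, label in spans:
        tags[start] = "B-" + label      # first word of the entity
        for i in range(start + 1, end):
            tags[i] = "I-" + label      # remaining words of the same entity
    return tags


# One sentence fragment from the example, split into words by hand.
tokens = ["follow", "the", "path", "alongside", "the", "Amsterdam", "Rijnkanaal"]

# Hand-made labels: 'path' is one entity, 'Amsterdam Rijnkanaal' another.
spans = [(2, 3, "INFRA"), (5, 7, "WATERWAY")]

for token, tag in zip(tokens, to_bio(tokens, spans)):
    print(f"{token:12} {tag}")
```

A training set is then simply many sentences in this shape, from which the algorithm learns which words, in which contexts, tend to receive which tags.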

How does Medstone use NER?

At Medstone, we use NER to analyse scientific articles, so that we can rapidly select relevant articles for systematic literature reviews. Just like the word ‘way’ in the example above, lots of words in academic literature are ambiguous. On top of that, new medicines are developed every day, each with its own new name. By using NER, we can recognise these new medicines automatically!

So, will reading become obsolete?

Well, unfortunately, not entirely. For one thing, you still have to read the labelled entities the algorithm has recognised. More importantly, the algorithm shouldn’t replace your work; rather, you work together with it. Human interpretation will always be important for understanding some findings, and an algorithm can get it wrong every once in a while. That’s quite human, actually.

Want to find out more?

If you want to know more about NER, or if you would like to apply NER in your own scientific research, please do not hesitate to contact us.

This blog post was previously published on medstone.com
