Dependency parsing: the next step in text analysis

In our previous blog post about NER, Marianne explained how you can use artificial intelligence to analyse a text without having to read it yourself. NER stands for Named Entity Recognition, a technique that automatically recognises and categorises words based on their meaning. But if you want to fully understand a text, you need more than just the meanings of separate words. You need to know how those words relate to each other. This is where semantic dependency parsing comes in, another artificial intelligence (AI) technique. In this blog post I’ll tell you a little more about what semantic dependency parsing is, and how we can use it to our benefit.

Words turn into stories

Dependency parsing lets a computer read texts automatically. Automating this process makes it possible to read numerous pieces of text in a matter of seconds, saving a lot of time and effort. But how exactly is it possible for a computer to even begin to comprehend the story that can be found in a pile of letters?

Well, first of all, those letters make up words. And we now know that a computer can recognise and label words using NER. The next step is to connect the words in a meaningful and logical way. This can be done using dependency parsing. Dependency parsing is based on dependency grammar, a linguistic theory that describes how to recognise structure within a text. Textual structure can be recognised by looking at mutual dependency, or asking: which words belong together? These words are then linked to each other, like in the example below:

Calculating connections

Remember those grammar lessons in school? Where you had to hunt for the subject, finite verb, and direct object in sentences… Computers can hunt for (and find!) such patterns as well. Once you’ve found those elements, it’s easy to connect them to each other: the verb takes a subject, a direct object and, in some sentences, an indirect object. As long as two sentences contain the same elements, they’ll follow the same structure, which makes it easy for a computer to calculate. This calculation process is what we call dependency parsing.
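The pattern-based linking described above can be sketched in a few lines of Python. This is only a minimal illustration, not how a production parser works: the example sentence, the role names and the function `link_roles` are all assumptions made for this sketch. It assumes the grammatical roles have already been found, and attaches each of them directly to the verb, as is usual in dependency grammar.

```python
# Hypothetical sketch: link already-labelled sentence parts to the verb.
# The role names and the example sentence are assumptions for illustration.

def link_roles(roles):
    """Return (head, relation, dependent) arcs with the verb as head."""
    verb = roles["verb"]
    arcs = []
    for relation in ("subject", "direct_object", "indirect_object"):
        if relation in roles:  # not every sentence has every role
            arcs.append((verb, relation, roles[relation]))
    return arcs

# "The doctor gives the patient medicine", reduced to its core elements.
sentence_roles = {
    "subject": "doctor",
    "verb": "gives",
    "indirect_object": "patient",
    "direct_object": "medicine",
}

arcs = link_roles(sentence_roles)
for head, relation, dependent in arcs:
    print(f"{head} --{relation}--> {dependent}")
```

Any sentence with the same elements produces the same set of arcs, which is why this kind of simple sentence is easy for a computer to handle.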

Sounds easy, right? And it is, as long as your sentence is simple. It gets more complicated when we’re looking at compound sentences, which can have multiple subjects and predicates. If you want a computer to be able to calculate the structure of a compound sentence, you’ll have to turn to statistics and probability. Which is why we use artificial intelligence in dependency parsing.
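To give a rough idea of what “calculating probability” can mean here, the sketch below picks, for each word, the head it most probably belongs to. The probabilities are made up for illustration, and the names `head_probs` and `pick_heads` are hypothetical; a real parser would learn such scores from annotated sentences.

```python
# Hypothetical sketch: choose the most probable head for each word.
# All probabilities below are invented for illustration only.

# head_probs[word][candidate] = assumed probability that `candidate`
# is the head of `word`; "ROOT" marks the top of the sentence.
head_probs = {
    "I": {"ROOT": 0.1, "made": 0.8, "money": 0.1},
    "made": {"ROOT": 0.9, "I": 0.05, "money": 0.05},
    "money": {"ROOT": 0.1, "I": 0.1, "made": 0.8},
}

def pick_heads(probs):
    """For each word, return the candidate head with the highest probability."""
    return {
        word: max(candidates, key=candidates.get)
        for word, candidates in probs.items()
    }

heads = pick_heads(head_probs)
for word, head in heads.items():
    print(f"{word} <- {head}")
```

Choosing the best head for each word separately is a simplification: it does not guarantee that the result forms one well-formed tree, so practical parsers add further constraints on top of the scores.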

And it looks like this

In the example shown above, we see a simple sentence that can easily be analysed by a computer, based on simple patterns. But let’s take a look at a more complex sentence:

“My stocks are doing well! I already made €2000 after three months, and double that after a year.”

Say we want to know how much money is made after a certain time period. Labels such as finite verb, subject and object won’t be very helpful in answering that question. So we’ve made our own labels, and thanks to NER our computer can already label the entities in the sentence correctly. What our computer can’t tell us yet is the amount of money made after a set amount of time. Which is why we’ve connected the entities to each other:
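In code, connections of this kind might be represented as follows. This is a sketch under assumptions: the label names `MONEY`, `PERIOD` and `EARNED_AFTER`, and the helper `earned_after`, are invented for illustration and are not necessarily the labels we use in practice.

```python
from dataclasses import dataclass

# Hypothetical sketch: entities from the example sentence and the
# connections between them. All label names are assumptions.

@dataclass(frozen=True)
class Entity:
    text: str
    label: str

@dataclass(frozen=True)
class Relation:
    head: Entity       # the amount of money
    dependent: Entity  # the period after which it was made
    label: str

amount_1 = Entity("€2000", "MONEY")
period_1 = Entity("three months", "PERIOD")
amount_2 = Entity("double that", "MONEY")
period_2 = Entity("a year", "PERIOD")

relations = [
    Relation(amount_1, period_1, "EARNED_AFTER"),
    Relation(amount_2, period_2, "EARNED_AFTER"),
]

def earned_after(relations):
    """Map each period to the amount that was made after it."""
    return {
        r.dependent.text: r.head.text
        for r in relations
        if r.label == "EARNED_AFTER"
    }

for period, amount in earned_after(relations).items():
    print(f"After {period}: {amount}")
```

With the entities connected like this, the question “how much money is made after a certain time period?” becomes a simple lookup.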

Sentences like this, together with the correctly drawn connections, are used as input for an AI model. The model learns from this input and can then predict structures in new compound sentences it has never seen before. This way, a computer can analyse an entire text using NER and dependency parsing. Want to learn more about how a computer can learn, using artificial intelligence? Read about it in our blog post about AI.

Man and machine

Are we, as human readers, becoming obsolete? Not for now, at least. Neither NER nor dependency parsing is completely flawless, and a computer can still make small mistakes. There is an endless number of possible text structures, which makes it difficult to prepare an algorithm for each and every one of those possibilities. And the texts themselves can contain mistakes, such as spelling errors or incorrect grammar, which a computer can’t process properly. So human readers are still very important for final checks, even with the use of these techniques.

The next step

Even though dependency parsing isn’t operating flawlessly (yet), it is the next big step in the analysis of, for example, medical research. By combining NER and dependency parsing, you can easily find answers to questions such as ‘What is the right dose for the administered drug?’, ‘How many times is it administered throughout the study?’ and ‘What results can we find for which treatment?’. At Medstone, we use dependency parsing to answer these and many other questions, by analysing a database of hundreds of scientific articles simultaneously. This way, we can quickly and easily compare multiple studies with each other! If you want to know how you can use AI in your own research, or if you’d like to hear more about the possibilities of dependency parsing, please do not hesitate to contact us!

This blog post was previously published on medstone.com
