Part 1: Why build a synthetic colleague?
The Implementor
April 2020
Writing a CV and learning about natural language processing along the way
Natural language processing is the field concerned with enabling machines and computers to process and extract information from human language. It’s a massive field, and one where many of us in machine learning have spent a lot of time. The fact of the matter is that the business world holds far more text than image, audio or video data.
Natural language processing has another nice attribute: it’s considerably less compute-intensive than image analysis. Text data is often measured in mere gigabytes, even when it contains billions of words, meaning that we can comfortably fit even very large data sets on modern laptops. Even the most advanced models are often less demanding than their image-processing counterparts.
Today, natural language processing is used everywhere. We all know of machine translation, but large-scale sentiment analysis – measuring emotion in text – is common in marketing, as is topic modelling – discovering themes in large bodies of text – along with a variety of classification tasks: Is this a fraudulent text? Does this text conform to criteria X, Y and Z? And so on.
The problem in natural language processing is often the opposite of the one in image processing. In image processing, we have too much data and need to figure out what is hidden in it. A picture may be worth a thousand words, but most often, we just need to know e.g. that it’s a “dog”.
In an individual sentence, on the other hand, we’re given only very little information and thus need additional context to fully understand it. Take this example:
Peter threw the ball to Lisa, who smiled and picked it up.
Simple enough, right? But we humans, with all our complexity, context and common sense, immediately start to interpret the sentence. We know that Peter no longer has the ball. We also know that the ball is probably not too big; Lisa could pick it up, after all. Lisa and Peter seem friendly towards each other. Are they perhaps kids?
A machine has a very difficult time with context like this. For all it knows, Peter and Lisa could be code names for any number of things. And what is a ball anyway? A machine has no way of knowing. It has to learn everything from scratch.
And that’s a problem for complex natural language tasks, e.g. when a machine must be able to answer questions in a natural way – like an advanced chatbot – or tell whether a written comment from a user counts as a complaint.
Traditionally, we have had to “brute force” a solution to this. If we wanted to train a machine to tell whether a written user comment was a complaint, we would show it hundreds of thousands of comments, each labelled as a complaint or not, and the machine would learn the context and meaning of all the relevant words in order to solve exactly this problem and nothing else.
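To make the brute-force approach concrete, here is a minimal sketch using scikit-learn: a bag-of-words classifier trained from scratch, where everything the machine knows about language comes from the labelled examples. The comments and labels are of course invented, and a real system would need vastly more of them.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the hundreds of thousands of labelled comments
# the brute-force approach really needs.
comments = [
    "The product arrived broken and nobody answers my emails.",
    "Great service and fast delivery, very happy!",
    "I was charged twice and I want a refund.",
    "Thanks for the quick response to my question.",
]
is_complaint = [1, 0, 1, 0]

# Bag-of-words features plus a linear classifier, trained from scratch:
# the model knows nothing about language beyond these labelled examples.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(comments, is_complaint)

# Classify a new, unseen comment (1 = complaint, 0 = not a complaint).
print(model.predict(["My order never showed up."]))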
This wouldn’t be an issue if we had a lot of data and/or a sufficiently simple problem. But what if the problem gets more complex, and we don’t have enough data?
Like the GANs from the previous post, a natural language understanding (NLU) model is a general model that “understands” language. “Understands” is a bit of a stretch: what it really does is model language, and thus it can generate language convincingly.
Usually, an NLU model will take a section of text, analyse it word by word from left to right – the way we would read it – and push the information through a deep artificial neural network, remembering what it has seen at each step.
Once it gets to the end of the sequence, it tries to predict the next word – just like the predictive keyboard on a modern smartphone. It then moves back a number of words and repeats the process, trying to predict the following word or the end of the sentence.
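As a hands-on sketch of this next-word prediction, we can load the freely available small GPT-2 through the Hugging Face transformers library (not necessarily the tooling behind the rest of this post) and ask it what should follow our Peter and Lisa sentence:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the small, publicly released GPT-2 and its tokenizer.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Peter threw the ball to Lisa, who smiled and"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# One forward pass scores every word in the vocabulary at every
# position; we only care about the position after the last word.
with torch.no_grad():
    logits = model(input_ids)[0]      # shape: (1, n_tokens, vocab_size)

probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, 5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(idx)])!r}: {p.item():.3f}")
```

The top candidates are mundane continuations of the scene, which is exactly the point: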
As the machine predicts words in sentences and learns more and more complex relations between words, it will learn that after a sentence like the one with Peter and Lisa, it’s much more likely for another sentence to be about how Peter might say something to Lisa than to suddenly burst out something about nuclear war or monetary policy. The machine will learn a sort of general and reasonable expectation of words and terms in relation to each other.
This is a very powerful starting point for more advanced – and useful – models, and, even better, it requires only raw data and no preparation or context.
In February 2019, there was quite a fuss when OpenAI, one of the leading AI research groups, announced GPT-2, the most advanced natural language model to date. However, they refused to do a full release of the model, citing fears over its use in the automatic generation of fake news and spam, thus earning it the nickname “the model too dangerous to release”.
They did, however, release the architecture of the model and a smaller version. The full version of GPT-2 has about 1.5 billion parameters – think of this as the combined number of neurons and synapses in its “brain” – but today, we’re going to play with a smaller version with just 345 million parameters, still a very robust language model by today’s standards. The model was trained on the so-called WebText corpus: a solid slice of the general internet, scraped by following outbound links from, believe it or not, reddit.com.
GPT-2 can generate natural-sounding English text from prompts. Below is an example where we’ve taken the first sentence of Michael Borges’ CV and allowed the machine to finish it (the actual CV text in bold):
“For 25 years, Michael Borges has dedicated his career to a passion for business improvement through digital and data-driven strategies and solutions. A pioneer of software as business manager, Borges has helped organisations and employees take greater control of their workflow through his innovative software solutions and his unique management platform ProtonMUD. He co-founded PIMA Systems (Pulse I/O, InfiniDB, InfiniDB SQL and Web/Cortana), a leading provider of cloud-based technologies for the entertainment industry.”
The statements are obviously false. Michael never built a software platform called ProtonMUD, and we don’t know whether he is actually a prominent figure in the entertainment industry. But even so, it still sounds pretty plausible. We’re in the realm of professional software and management, and in a different world, this could definitely be true. By the way, Michael did actually found the firm Platon. So, it wasn’t too far off.
GPT-2 knows nothing about the layout of our CVs, nor about the special terminology we use at Implement – and we can’t simply tell it. What we can do, however, is fine-tune it, and we have 1,000 CVs for exactly that purpose (the process is sketched in code further below). The fine-tuning took just about an hour of compute on a powerful laptop, and suddenly, we had this:
“For 25 years, Michael Borges has dedicated his career to a passion for business improvement through digital and data-driven strategies and solutions. Michael currently holds an MSc in International Business in Public Administration and is a partner at Implement Consulting Group. Michael has worked with a large variety of issues in both the public and private sector, spanning more than 30 years as a consultant, and has been involved in more than 20 large-scale transformations.”
The model still readily makes stuff up (as far as we know, Michael doesn’t hold an MSc in International Business), but it does so within the correct realm, i.e. consulting. And while the above sounds pretty reasonable, arguably even true, that is down to sheer dumb luck, as this pretty over-the-top second run shows:
“For 25 years, Michael Borges has dedicated his career to a passion for business improvement through digital and data-driven strategies and solutions. His career in building and managing several major technology companies, including Microsoft, IBM, PwC, Intel, Deloitte, Deloitte Business Consulting, Danfoss and several smaller companies. Michael has profound knowledge of how the business model works; the current landscape of strategic thinking and capability building; and the challenges facing the European market. Michael has a deep understanding of the value, cost and complexity of strategic solutions. Michael is known for his ability to lead change, change with impact.”
Wow, Michael is responsible for everything from PwC to Microsoft, he knows “the business model” and the European market, and he does it all with impact. It’s almost a psychedelic form of Implement poetry. But more importantly, it sounds correct. This is what we need for the Implementor.
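For the curious, a fine-tuning run like the one behind these examples can be reproduced with the open-source gpt-2-simple library. This is a sketch, not our exact setup: the file name, step count and run name below are illustrative assumptions.

```python
import gpt_2_simple as gpt2

# Fetch the medium GPT-2 checkpoint (announced as 345M parameters,
# distributed under the name "355M" after a recount).
gpt2.download_gpt2(model_name="355M")

sess = gpt2.start_tf_sess()

# Fine-tune on a plain-text file of concatenated CVs.
# "cvs.txt" and 1,000 steps are illustrative values; expect roughly
# an hour of compute on decent hardware.
gpt2.finetune(sess,
              dataset="cvs.txt",
              model_name="355M",
              steps=1000,
              run_name="implement_cvs")
```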
Now we move on to creating a CV for the Implementor. We just need a starting sentence, and then we can generate some CVs.
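Continuing the gpt-2-simple session from the sketch above, generation is a single call; the length and sampling settings here are illustrative assumptions:

```python
# Continuing the session from the fine-tuning sketch above.
cvs = gpt2.generate(sess,
                    run_name="implement_cvs",
                    prefix="The Implementor is an artificial intelligence "
                           "working at Implement Consulting Group",
                    length=200,        # roughly a CV-sized chunk of text
                    temperature=0.8,   # illustrative sampling settings
                    nsamples=3,
                    return_as_list=True)

for cv in cvs:
    print(cv, "\n...")
```

Three of the results: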
“The Implementor is an artificial intelligence working at Implement Consulting Group in the Operations Excellence practice. He holds a bachelor’s and master’s degree in Applied Psychology and Management Science from a range of business-critical business areas. Based within operations management, finance and sales, supply chain and operations management, E&G, business transformation, operations management and Lean, leadership excellence and innovation, as well as a master’s in Applied Economics and Management Science, where he has four years’ experience and is currently pursuing an MNE in Applied Economics and is a strong competition player and leader within global business industries. In his last few years at Implement, the Implementor led a global strategy project, in which the Implementor was responsible for one of the four largest business cases in the case work.”
…
“The Implementor is an artificial intelligence working at Implement Consulting Group in Zurich, where he is also a part of the HR team, and is part of Implement’s Outsourcing Leadership team. He has 10 years of experience within supply chain management, primarily from a management point of view. The Implementor is an expert within HR strategy and has held management positions for large public and private companies. He has extensive experience within global HR strategy development through his 25 years of management experience within HR management as well as within large public and private companies.”
…
“The Implementor is an artificial intelligence working at Implement Consulting Group. His primary focus is on designing, implementing and improving processes using machine learning, artificial intelligence and data science methodology. He is currently part of the Operations Strategy team and has led processes implementing the operating model of Implement as leader of teams. He has an MSc in Applied Statistics and Financial Economics from the University of Copenhagen.”
We see that, to some degree, GPT-2 is able to pick up typical Implement topics. One version stands out – the one we’ve dubbed “the overeducated guy from operations strategy”. We also see one who is apparently part of our Zurich HR team and does outsourcing. And then, finally, a process consultant who sounds a lot like a normal process consultant.
First of all, we now have a text for the Implementor’s CV, so we can finally finish it off and get him onto a project.
Secondly, we have a model which seems to have a reasonable grasp of the kinds of words we use at Implement. It’s far from perfect – in fact, it mostly produces lies and gibberish – but we do see that it’s learning.
The model is learning about our practices and which skills belong together, and the mistakes it does make are understandable – in a way, they’re proof that it’s really trying to grasp the context of words. Take the HR case, where it mentions “outsourcing”, which is indeed an HR term, but then draws inspiration from supply chain management, where outsourcing is also a thing – just a different thing. It hasn’t yet learnt the full contextuality of “outsourcing”.
But this is just the result of 1,000 CVs, 345 million parameters and a couple of hours of compute, so the potential is pretty impressive. OpenAI has since released the full 1.5-billion-parameter version, and given enough time and data, we’re sure that the model will learn to make sense of “outsourcing”.
Admittedly, the direct use is very limited at this point. Commercially, text-generative models power little beyond our fancy predictive keyboards. However, the use of models like GPT-2 for pre-training is very real, and it means that the kinds of questions we can expect machines to handle in e.g. classification can become much more complicated than they are today. A question such as “Does the text contain personal data?”, which matters for GDPR, can be very complicated: there are a million ways for a person to express their religious views or sexuality. Today, it takes a human being to screen for these kinds of things. But maybe not for much longer.
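As a sketch of where this could go: one simple way to exploit pre-training is to freeze GPT-2, use it purely as a feature extractor and train a small classifier on top, so that far fewer labelled examples are needed than in the brute-force setup. The texts, labels and pooling choice below are invented for illustration; a production GDPR screener would be fine-tuned end to end on real, reviewed data.

```python
import torch
from sklearn.linear_model import LogisticRegression
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

def embed(text):
    # Mean-pool the pre-trained model's hidden states into one vector.
    ids = tokenizer.encode(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(ids)[0]        # shape: (1, n_tokens, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()

# Invented toy examples: does the text reveal personal data?
texts = [
    "He mentioned attending mass every Sunday.",   # religious views
    "She wrote openly about her girlfriend.",      # sexuality
    "The quarterly report is due on Friday.",
    "Please book the meeting room for Tuesday.",
]
labels = [1, 1, 0, 0]

# Only this tiny classifier is trained; GPT-2's knowledge of language
# comes for free from pre-training.
clf = LogisticRegression(max_iter=1000)
clf.fit([embed(t) for t in texts], labels)

print(clf.predict([embed("He fasts during Ramadan every year.")]))
```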