Part 1: Why build a synthetic colleague?
The Implementor
May 2020
How did the Implementor turn out? What we have learnt, and what it might be used for some day.
In the final CV, the Implementor turned out to be a pretty decent individual with an interest in machine learning and data. He has studied statistics and financial economics at the University of Copenhagen, and for now, he works in the Operations Strategy team.
As you can see from the article's cover picture below, we're just a few keystrokes away from generating another dozen Implementors, some of whom are so decent that they border on boring, while others produce wild lies and statements. But in general, they're all at least pretty sensible at first glance – except, perhaps, for the creepy pictures.
However, one of the biggest issues is still temporal inconsistency: the model often creates timelines that are simply impossible.
While generative models in and of themselves are currently mostly for fun, they have very real, if limited, use in the creative industry. Here, "deepfakes", for example, are already doing a remarkable job of reviving dead actors and generating increasingly impressive music demos. Creepy, weaponised usage also exists, such as the mass creation of false identities complete with pictures, names and back stories, or the generation of unlimited quantities of fake news to drown out the media.
In the longer term, the field of generative models is, at its core, research into perception. Technologies such as these are what finally allow machines to leave their very binary and logical origins as databases and rules and start engaging with nuanced concepts such as language and art.
For a long time, a concept such as language was deemed too complex for humans to ever really grasp in full. A lot of philosophical material has been written on the role of language in our perception of the world, and the idea of a computer programme fully grasping the complexity of language would have been laughable. Today, it turns out that "all" that language really required was 1.5 billion parameters – more than any human can consciously hold in their cognition, though easily within what the brain holds in broader terms. But with the advance of compute and data, it's very doable for a machine today.
Admittedly, “grasping language” is a contested concept, and what we have witnessed from e.g. GPT-2 is language decoupled from common sense and a “world model”. And we respect anyone who says it’s unfair to decouple pure language ability from an understanding of the world. Nonetheless, that is the absurd technology we wield today.
What exactly this is going to be used for is really difficult to say. At the advent of the internet or electricity – or the Segway for that matter – it was also very hard to say exactly what the broader use case was going to be.
Below, we’ll give some examples of the most commonly envisioned applications of generative models.
As we start to grasp generative processes, generative product design might see some new opportunities, e.g. radical customisation, instant generation of artwork, or this airplane piece by Autodesk Research.
Right now, controlling the full range of requirements of a successful design, e.g. production costs or complexity, is not possible, and it requires either a very good simulation of the product's use case, e.g. airplane simulators, or a huge bank of data on similar products as a basis for development. As an example, Zalando is successfully sparking some very cool research with their release of 70,000 pieces of design data. But this is a rapidly improving field of research. Thanks to large databases, faces are currently at the forefront: just a few years ago we had no control at all, whereas today we can generally control lighting and facial expression, and to some degree pose and facial features.
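To give a feel for what this kind of control looks like in code, here is a minimal, hypothetical sketch of latent-space editing. The tiny untrained generator and the random "smile" direction are only stand-ins for a real trained GAN and a learned attribute direction:

```python
# Sketch of latent-space editing: walk a latent vector along an attribute
# direction and watch the generated face change. Both the generator and the
# "smile" direction below are untrained stand-ins, not a real model.
import torch
import torch.nn as nn
from torchvision.utils import save_image

latent_dim = 128
generator = nn.Sequential(                 # stand-in for a trained face generator
    nn.Linear(latent_dim, 64 * 64 * 3),
    nn.Tanh(),
    nn.Unflatten(1, (3, 64, 64)),
)
smile_direction = torch.randn(latent_dim)  # stand-in for a learned "smile" direction
smile_direction /= smile_direction.norm()

z = torch.randn(1, latent_dim)             # one random "face" in latent space
for alpha in [-3.0, -1.0, 0.0, 1.0, 3.0]:
    z_edit = z + alpha * smile_direction   # move along the attribute direction
    image = generator(z_edit)              # same identity, attribute gradually changes
    save_image(image, f"face_smile_{alpha:+.0f}.png")
```

The point is simply that once a model has organised faces in a latent space, an attribute such as lighting or a smile becomes a direction you can walk along.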
Several Microsoft Office products already include "helpful" AI-powered design recommendations. Right now, they are somewhere between passable and annoying, but that might be about to change.
Language-correction technology is getting very good very quickly, and while it might not replace proper masters of language anytime soon, it might just lift the masses up to a surprisingly high standard.
Similarly, we might start to see PowerPoint make helpful recommendations on how to rearrange disorganised lists and tables, or how to create basic icons and graphics based on keywords, mutation and style.
Anomaly detection is a huge research field in itself, and the discriminator side of a well-trained Generative Adversarial Network (GAN) is immediately useful for it. If you have a system that relies on being fed healthy data, train a model to understand what healthy data looks like and use it as a filter – just make sure to check the filter occasionally. Or apply it in fields such as tax and fraud, where the very presence of an anomaly is noteworthy. In data science, running anomaly detection on any given dataset before proceeding with further analysis is considered good practice.
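As a minimal sketch of the filter idea, assuming you already have the discriminator from a GAN trained on known-healthy data (the untrained stand-in below is only there so the snippet runs):

```python
# Use a GAN discriminator as an anomaly filter: keep rows it considers
# "realistic", flag or drop the rest. Replace the stand-in network with the
# discriminator from your trained GAN.
import torch
import torch.nn as nn

n_features = 20
discriminator = nn.Sequential(     # stand-in for a trained GAN discriminator
    nn.Linear(n_features, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
    nn.Sigmoid(),                  # score: how much does this row look like healthy data?
)

def filter_healthy(batch: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Keep only the rows the discriminator scores above the threshold."""
    with torch.no_grad():
        scores = discriminator(batch).squeeze(1)
    return batch[scores >= threshold]

incoming = torch.randn(1000, n_features)   # the data feed you rely on
healthy = filter_healthy(incoming)
print(f"kept {len(healthy)} of {len(incoming)} rows")
```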
Another cool use of generative models is style transfer. Currently, it's mostly used for producing interesting artwork. Check out Prisma for a popular example or FaceApp for another, creepier example. Also note that FaceApp has drawn some controversy over data protection and its Russian origins.
But style transfer algorithms go beyond artistic style and beyond images. In everyday photography, you can benefit from image denoising, but you can also enhance the performance of a lot of operational equipment, especially in medicine. Another example is image restoration, where some of the data has been lost, or simple image enhancement that allows better upscaling of images – so-called super-resolution. Beyond images, it can be applied to audio as well as text.
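For the curious, here is a rough sketch of the classic optimisation-based approach to style transfer, matching VGG features for content and Gram matrices for style. The random tensors stand in for properly loaded and VGG-normalised images:

```python
# Classic neural style transfer, sketched: optimise the pixels of an image so
# its VGG features match a content image and its Gram matrices match a style
# image. Image loading/normalisation is omitted for brevity.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

vgg = vgg16(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)        # we optimise the image, not the network

def features(x, layers=(3, 8, 15, 22)):
    """Collect activations from a handful of VGG layers."""
    out = []
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in layers:
            out.append(x)
    return out

def gram(f):
    # Style is captured by channel-to-channel correlations (the Gram matrix).
    b, c, h, w = f.shape           # assumes batch size 1
    f = f.view(c, h * w)
    return f @ f.t() / (c * h * w)

content = torch.rand(1, 3, 256, 256)   # stand-ins for real images
style = torch.rand(1, 3, 256, 256)

with torch.no_grad():
    content_feats = features(content)
    style_grams = [gram(f) for f in features(style)]

image = content.clone().requires_grad_(True)   # optimise the pixels directly
opt = torch.optim.Adam([image], lr=0.02)

for step in range(200):
    opt.zero_grad()
    feats = features(image)
    content_loss = F.mse_loss(feats[-1], content_feats[-1])
    style_loss = sum(F.mse_loss(gram(f), g) for f, g in zip(feats, style_grams))
    (content_loss + 1e3 * style_loss).backward()
    opt.step()
```

Denoising and super-resolution models are usually trained feed-forward networks rather than per-image optimisations like this, but the underlying idea of learning what "good" images look like is the same.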
Finally, what we’re most excited about in the short term is the use of data and model augmentation. One of the main things holding artificial intelligence back is the demand for massive and high-quality datasets.
Generative models allow us – with some care – to generate new data. Need to know the difference between fraudulent and legitimate emails? Only have 100,000 emails available? Teach one version of GPT-2 to write legitimate emails and another to write fraudulent ones. Now train a third model on both real and generated emails of both kinds. If the generative models are doing a good job, this should improve the final performance of the fraud detection model.
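A rough sketch of the generation step using the Hugging Face transformers library might look as follows. The prompts are invented, and in practice you would swap in GPT-2 checkpoints fine-tuned on your own legitimate and fraudulent emails rather than the stock model:

```python
# Data augmentation with GPT-2: generate synthetic emails to mix into the
# training set of a downstream fraud classifier. Uses the stock pretrained
# GPT-2 here; fine-tuned checkpoints would go in its place.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")   # swap in your fine-tuned checkpoint

def generate_emails(prompt: str, n: int = 5, max_length: int = 120):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,                      # sampling gives varied, non-repetitive text
        top_p=0.9,
        max_length=max_length,
        num_return_sequences=n,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

synthetic_legit = generate_emails("Dear customer, thank you for your order.")
synthetic_fraud = generate_emails("URGENT: your account has been suspended.")
# Mix these with the real emails when training the final fraud detection model.
```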
This kind of data augmentation can be used for every kind of AI currently in existence, which is probably also why Yann LeCun was so excited about the technology in the first article.
What we've mainly touched upon here is the related method of extracting the early layers of a model to reuse some of its general knowledge for other tasks. This is similar in practice to augmentation, and it improves performance as well.
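A minimal sketch of that idea, freezing the early layers of a pretrained torchvision ResNet and training only a new head for a hypothetical two-class task:

```python
# Transfer learning sketch: keep the general-purpose early layers frozen and
# train only a small, task-specific head on top.
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(pretrained=True)
for param in model.parameters():                 # freeze the pretrained backbone
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)    # new, trainable two-class head

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)
# ...train as usual; only the new head is updated.
```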
For now, generative models are fun and somewhat speculative. They are immensely useful to play around with in order to grasp concepts such as "latent space" or "embedding vector". But we hope that most people will be content to enjoy the pretty pictures and await further breakthroughs as they happen. Personally, many of us are about as excited as Yann LeCun about the prospect of GANs, but for now, we'll just have to wait and see.