
The modern history of AI

From the dawn of ChatGPT to today’s generative AI models
Published

28 January 2025

The article is co-authored by Gemini Experiment 1206. 


The field of Artificial Intelligence has always been filled with promises, some kept and some broken. But this time something feels different. We are at the dawn of a new era, the age of generative AI, and the implications are profound – and still unfolding before our eyes. In this guide, we will cover the fundamental must-reads on the generative AI revolution: from conversation to creation.


While many traditional AI systems are designed to analyse and classify data, generative AI does exactly what the name implies: it generates new content. Text, images, music, code, even entire virtual worlds – all are now within the creative grasp of these powerful algorithms.


This generative AI revolution did not happen overnight – and one particular date stands out as a clear inflection point: 30th November 2022. That was the day OpenAI released ChatGPT to the world. ChatGPT was not the first large language model (LLM), and at the time, it was not even OpenAI's most powerful creation. But it was a watershed moment: the instant when sophisticated AI became accessible to the masses. The user-friendly chat interface, combined with the model's surprisingly adept conversational abilities, captured the public's imagination in a way that no previous AI system had.


Understanding the significance of ChatGPT


To understand the significance of ChatGPT, we need to rewind. The foundation of these conversational AI systems lies in the Transformer architecture, introduced in the now-famous 2017 paper "Attention is All You Need". This novel approach, based on the concept of "attention" mechanisms, allowed neural networks to process sequential data like language far more effectively than before. It paved the way for the development of ever larger and more complex language models. OpenAI's own GPT-3, released in 2020, was a significant step in this direction, demonstrating an impressive ability to generate coherent and often surprisingly creative text. But it was ChatGPT, a fine-tuned version of GPT-3.5, that truly broke through the barriers of technical complexity and became a household name.
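For the technically curious, the core of the Transformer – scaled dot-product attention – can be sketched in a few lines of plain Python. This is an illustrative toy (a single head, no learned projections, and all function names are ours), not the full architecture from the paper:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention, as in "Attention is All You Need".
    Q, K, V are lists of vectors; each query attends to every key and
    returns a similarity-weighted mix of the values."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out

# Toy self-attention over three 2-dimensional token vectors
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(attention(x, x, x))
```

The key property is that every position can attend to every other position in one step, which is what let Transformers process language so much more effectively than earlier sequential architectures.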


The world quickly became fascinated by the possibilities. People used ChatGPT to write poems, draft emails, summarise articles, generate code, and even engage in philosophical discussions. The model's ability to produce human-quality text and to adapt its style and tone to different prompts was undeniably impressive. It was a "wow" moment for many, a glimpse into a future where machines could not only understand language but also generate it with a fluency that blurred the lines between the human and the artificial. At the time, it was also limited: while it felt eerily human compared to anything that came before, most would agree it was a far cry from professional or college-level intelligence. The underlying technological breakthrough enabling this was reinforcement learning from human feedback (RLHF), a technique in which large teams of human “annotators” show the model what a good response to a prompt looks like.


Of course, the early days were not without their hiccups. The now-infamous "hallucinations" – instances where the model confidently asserts false or nonsensical information – quickly became a common talking point. They were a stark reminder that these systems, for all their apparent intelligence, are fundamentally statistical engines, predicting the next word in a sequence based on patterns learned from massive datasets. They lack true understanding, common sense, and the ability to verify the accuracy of their own output. At the time, very little was known about how the model worked internally; this was when the public first encountered not only hallucinations but also the model's surprising capacity for creativity.
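The "statistical engine" point can be made concrete with a toy bigram model – a deliberately crude sketch (all names are ours) that, like an LLM in miniature, predicts the next word purely from frequencies observed in its training text, with no understanding at all:

```python
import random
from collections import defaultdict, Counter

def train_bigrams(text):
    """Count which word follows which: the crudest possible version of
    'predict the next word from patterns in the data'."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word(counts, word, rng=random.Random(0)):
    """Sample the next word in proportion to how often it followed
    `word` in training: pure statistics, no comprehension."""
    followers = counts[word]
    total = sum(followers.values())
    r = rng.uniform(0, total)
    for w, c in followers.items():
        r -= c
        if r <= 0:
            return w

model = train_bigrams("the cat sat on the mat and the cat slept")
print(next_word(model, "the"))  # one of: cat / mat
```

An LLM replaces the frequency table with a neural network over billions of parameters, but the objective is the same, which is exactly why confident-sounding nonsense is possible: the model optimises for plausible continuations, not for truth.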


Then, in March 2023, came GPT-4. This new iteration was not just a larger version of its predecessor; it was a significant leap forward. While OpenAI remained tight-lipped about the exact details of its architecture and training data, the improvements were immediately evident. GPT-4 was smarter, more nuanced, and, crucially, multimodal. It could now process and generate not only text but also images, paving the way for even more sophisticated applications. The ability to consume images as tokens, and truly reason over them, was an important milestone – a first at the time. GPT-4 represented the "god model", the one that, for the first time, fulfilled the promise that AI technology would improve right before our eyes. However, it should also be noted that GPT-4 was not available to most people; it was behind a paywall from day one and only a small number of users had access. Today, many people have likely moved on to newer models, but for a long time, GPT-3.5 was the primary version the public had access to.


The race was – and is – on


The release of GPT-4 solidified OpenAI's position as a leader in the field, but it also highlighted the rapidly evolving nature of the AI landscape. Other major players, such as Google with its Bard and later Gemini models, and Anthropic with its Claude series, were making significant strides. The emergence of open-weight models like Meta's Llama 3.1 and those from the French startup Mistral further complicated the picture by offering a different path – one where the model weights are publicly available for anyone to download, modify, and deploy. The race was on, and it was clear that no single company held a monopoly on innovation.


As we stand on the cusp of this new era, it is important to remember that we are still in the early stages. The capabilities of these models are evolving at an astonishing pace, and new breakthroughs are being announced seemingly every week. The initial excitement around ChatGPT was justified, but it is crucial to move beyond the "wow" factor and engage with these technologies in a thoughtful and critical manner. We need to understand their limitations, address potential biases, and develop frameworks for their responsible deployment.


The generative AI revolution is not just about chatbots and image generators. It is about a fundamental shift in how we interact with technology, how we create, and how we understand the world around us. The journey from conversation to creation has only just begun, and the path ahead is filled with both immense promise and potential pitfalls. It is up to all of us – quite literally up to humanity – to carve that path, to ensure that this powerful technology is used for the benefit of everyone.


From experiment to enterprise: Generative AI's real-world impact

The initial wave of excitement surrounding ChatGPT and its successors was undeniably driven by a sense of novelty. But as the dust settled, a crucial question emerged: how would this technology evolve from fascinating experiments to tangible, real-world value? The answer, it turns out, is multifaceted and still unfolding.


One of the first major signals that generative AI was more than just a fad came from an MIT study that quantified its impact on productivity. Researchers measured the time it took a group of college-educated professionals to complete a variety of business writing tasks. Those using ChatGPT-4 finished their work significantly faster and with higher-quality results than those without: roughly 40% faster, with a 15% improvement in quality. This was not just a marginal improvement; it was a substantial boost that hinted at AI's potential to reshape entire industries. The experiment was small, but it marked a milestone that helped the broader business world and research community realise the impact this technology would have.


To get a broader view of the emerging landscape, we can turn to Sequoia Capital's "Generative AI's Act Two". Published in the wake of their "Generative AI: A Creative New World" from the year before, this report from the prominent venture capital firm provided a snapshot of a rapidly evolving ecosystem. It highlighted the intense interest from investors, with billions of dollars pouring into AI startups. But it also acknowledged the challenges: the high costs of training and deploying these models and the ongoing debate between open and closed source approaches.

Fact box

The jagged frontier – the unevenly distributed abilities of AI models


The jagged frontier is a particularly important idea to grasp. It describes the reality that AI is not uniformly good at everything. Instead, its abilities are unevenly distributed, excelling in some areas while lagging in others. This makes it difficult to predict exactly where and how AI will be most impactful. It also underscores the need for careful experimentation and a nuanced understanding of each model's strengths and weaknesses.

Early on, businesses struggled with issues like user retention. Many people tried these new AI tools, found them interesting, but then abandoned them after the initial novelty wore off. This was partly due to unrealistic expectations, but also because many early applications were not well-integrated into existing workflows – a pattern Sequoia highlighted as well: a lot of initial excitement, but low retention rates as many customers drifted away from the tools after a while.


However, compelling use cases gradually began to emerge, showcasing AI's ability to deliver real business value. At Implement, we have seen firsthand how companies are leveraging these technologies. The use cases range from Walmart using AI to dynamically adjust and optimise a staggering 850 million data points related to product information, to Klarna deploying a customer service chatbot that effectively replaced 700 full-time employees, to Amazon Transform streamlining the arduous process of updating outdated Java code. These examples demonstrate that AI is not just about automating existing tasks but also about enabling entirely new ways of working. In the case of the Owl children's book, created by a colleague of mine, we saw how AI could be used as a creative tool, co-authoring and illustrating a book that was both charming and educational.


The legal landscape also began to take shape, with two cases in particular setting important precedents. In the Air Canada case, the airline was held liable for a hallucinated discount offered by its chatbot. This established a clear principle: companies are responsible for the information provided by their AI systems, even if that information is inaccurate. The Chevrolet case, on the other hand, demonstrated the importance of user intent. Here, a customer used sophisticated prompt engineering techniques to trick a Chevrolet dealership's chatbot into offering a car for one dollar. The court found that the dealership was not liable in this instance, as the customer was clearly acting in bad faith and attempting to exploit the system. These cases, while seemingly minor, are crucial for establishing the boundaries of liability and user responsibility in the age of AI.


The Ezra Klein Show's interview with Ethan Mollick on best practices for prompt engineering offered valuable insights for individuals seeking to harness the power of these tools. Mollick emphasised the importance of articulating your needs clearly, engaging in iterative conversation with the AI and experimenting with different approaches. He also drew a distinction between useful prompting and "spell casting," the latter referring to the superstitious belief that certain phrases or keywords hold magical power over the model's output. While specific prompts can be helpful, true mastery comes from understanding the underlying principles of how these models work and how to guide them effectively.


As we move forward, it is clear that the experimental phase is giving way to a more mature, enterprise-focused approach. Businesses are no longer just toying with AI; they are deploying it in mission-critical applications, often with impressive results. The focus is shifting from "what can AI do?" to "what should AI do for us?" This requires a deeper understanding of the technology, a clear-eyed assessment of its strengths and limitations, and a willingness to rethink existing processes and workflows.


The generative AI revolution is not just about adopting new tools; it is about transforming the way we work, create, and solve problems.

The road ahead: Innovation & commoditisation

The generative AI revolution is far from over. In fact, we are arguably still in the early stages of a transformation that will reshape industries, redefine work, and potentially alter the very fabric of society.


As we look to the road ahead, two major trends stand out:

  1. Continued, rapid innovation in the capabilities of these AI models
  2. The simultaneous commoditisation of many AI functions

One of the most significant technical advancements on the immediate horizon is exemplified by OpenAI's GPT-4o. This model, with its real-time voice interaction capabilities, represents a major step towards more natural and intuitive human-computer interaction. Imagine being able to have a fluid, real-time conversation with an AI, complete with nuanced tone and emotional intelligence. This is not just a matter of convenience; it opens up entirely new possibilities for how we can use and interact with these systems. The multimodal enhancements, allowing the model to directly generate images and process video and audio inputs as separate, tokenised feeds, further expand the scope of what is possible.


Another key development is the expansion of context windows. Gemini 1.5, for instance, boasts a context window of one million tokens, with hints at even larger windows in the future, potentially up to ten million. This allows the models to draw on vastly more information when generating responses and to tackle bigger, more complex problems. It enables applications like analysing entire codebases, processing documents of thousands of pages in one go, or maintaining a coherent conversation over extended periods. The implications for research, writing, and complex problem-solving are immense.


More recently, we have also gained much faster and much cheaper models, marking the start of the massive price cuts that remain an ongoing feature of the industry to this day.

The rise of agentic architectures


But perhaps even more transformative is the rise of agentic AI, based on the idea of tool use. Andrew Ng, a leading figure in AI, has outlined several key design patterns for building agent-based systems.


These include:

  • Reflection: Allowing the AI to critique and improve its own outputs, leading to higher quality results
  • Tool use: Enabling the AI to interact with external systems, such as search engines, databases, and APIs, greatly expanding its capabilities
  • Planning: Giving the AI the ability to break down complex tasks into smaller, manageable steps
  • Multi-agent collaboration: Having multiple AI agents work together, each with its own specialised skills and knowledge, to achieve a common goal

These agentic architectures, also sometimes called cognitive architectures, represent a significant shift from the single-shot, input-output paradigm of earlier models. They allow for more complex, multi-step reasoning, better integration with existing systems, and the potential for AI to act more autonomously in the world.
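The tool-use pattern, for instance, can be sketched as a simple loop: the model either answers directly or requests a tool, and an outer controller executes the tool and feeds the observation back. Everything here is illustrative – `fake_llm` is a stub standing in for a real model call, and the message format is invented for the example, not any real agent framework's API:

```python
def calculator(expression: str) -> str:
    # A deliberately restricted "tool" the agent may invoke
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        return "error: disallowed characters"
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def fake_llm(history):
    """Stand-in for a real LLM call: requests the calculator once,
    then answers using the tool's observation."""
    observations = [m for m in history if m.startswith("observation:")]
    if not observations:
        return "tool:calculator:12*7"
    return f"final:The answer is {observations[-1].split(':', 1)[1]}"

def run_agent(task, llm=fake_llm, max_steps=5):
    history = [f"task:{task}"]
    for _ in range(max_steps):
        action = llm(history)
        if action.startswith("final:"):
            return action[len("final:"):]
        _, tool_name, tool_input = action.split(":", 2)
        result = TOOLS[tool_name](tool_input)   # execute the requested tool
        history.append(f"observation:{result}")  # feed the result back
    return "gave up"

print(run_agent("What is 12*7?"))  # The answer is 84
```

Reflection, planning, and multi-agent collaboration extend this same loop: the model's own output, rather than only a tool result, is fed back in for critique, decomposition, or hand-off to another agent.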


To this day, though, this is largely still a promise. The agent demos we are seeing are incredibly impressive, but in practice they often fall short: they cannot yet handle the long-term planning or the sustained interaction with external systems needed to be genuinely useful for businesses or everyday users. However, this is exactly the kind of capability that scaling the models has been shown to improve. Once agents get there, the impact will be profound, which makes the development of agents and agentic behaviour one of the most interesting areas to follow.

Rapid commoditisation and increased accessibility


Alongside these advances, we are also witnessing the rapid commoditisation of many AI capabilities. The recent price wars between major providers have driven the cost of using these models down to remarkably low levels. For example, models like GPT-4o are now incredibly cheap to use. This democratisation of access is making it possible for smaller companies and even individuals to leverage the power of AI in ways that were previously unimaginable.


However, this increased accessibility also raises important questions. As noted in a Danish study by Anders Humlum and Emilie Vestergaard, and in a related Norwegian study, there is a significant gender gap in AI usage, with women less likely to utilise these tools both privately and professionally. The Norwegian study indicates that this gap narrows when people are explicitly told to use AI, suggesting that part of the issue may be a difference in perceived permission, or a default assumption that using AI is somehow "cheating".


Another area of intense research is mechanistic interpretability, the quest to understand the inner workings of these complex models. Anthropic's "Mapping the Mind of a Large Language Model" work on Claude 3 Sonnet is a prime example. By analysing the activation patterns of individual neurons and clusters of neurons, they are beginning to create a map of how these models represent and process information. While this field is still in its infancy, it holds the promise of making AI more transparent, debuggable, and ultimately, more trustworthy.


In his essay "Machines of Loving Grace", Dario Amodei, CEO of Anthropic, offers an inspiring and cautionary vision for the AI-powered future. He paints a picture of a "country of geniuses in a datacenter", where AI systems accelerate scientific discovery, drive technological progress, and help solve some of humanity's most pressing problems. But he also acknowledges the risks: the potential for misuse, the exacerbation of existing inequalities, and the existential threat posed by uncontrolled superintelligence. It is a cautiously optimistic vision, emphasising the need for responsible development and deployment.

Concluding remarks


As we conclude this exploration of the generative AI revolution, many open questions remain: Will these models eventually achieve true general intelligence, or even sentience? How will they impact the nature of work, creativity, and human relationships? How do we ensure that the benefits of AI are distributed equitably and that the risks are mitigated effectively?


One thing is certain: the journey from conversation to creation is just beginning. The capabilities of these models will continue to evolve, the use cases will expand, and the societal implications will become ever more profound. It is incumbent upon all of us – researchers, developers, policymakers, and citizens – to engage with these technologies thoughtfully, critically, and ethically.


The future of AI is not predetermined; it is a future we are actively creating, one prompt, one experiment, one deployment at a time. The revolution is here; let us make sure it will be one that benefits all of humanity.
