
A Brief History of Modern AI

  • Writer: Derek Corcoran
  • 19 hours ago
  • 9 min read

Introduction: The ChatGPT Moment (2022)

Original ChatGPT Interface

On November 30, 2022, OpenAI quietly launched a new web tool called ChatGPT, based on its GPT-3.5 language model. Within five days, it reached a million users — faster than Instagram, faster than TikTok. Within three months, over 100 million people were chatting with AI, making ChatGPT the fastest-growing consumer product in history.


The appeal of ChatGPT was immediate. It was a Google-style search box that felt like asking the world’s biggest brain ANYTHING, and it answered with creative, conversational responses. It felt like the dawn of a new era – but not without its issues. AI hallucination, AI bias, and even racist AI outputs became topics everyone was talking about.

But ChatGPT didn’t appear from nowhere. It built upon decades of breakthroughs — in hardware, software, algorithms, and vision — from pioneers at Nvidia, Google, Microsoft, Meta, and OpenAI, and money from Elon Musk, Peter Thiel, Reid Hoffman, YC Research, and AWS. Yes, we had our ChatGPT moment. But it was a long time coming.


AI is such a huge story of our time that I felt I should understand it better. I studied LISP and Prolog as a Computer Science student - and was underwhelmed back in the day. But this is different. VERY different. So I listened to hours of podcasts (thank you to Ben and David from acquired.fm) and tried to summarize what I’ve learned about this story.


For me, it begins at Nvidia around 2009, when a scientist used GPUs for modelling, not graphics. It’s a small story that changed so much.

 


 

Act I – The Spark: How GPUs Became the Brains of AI (2006–2011)

In the mid-2000s, Nvidia was best known for powering video games. But an innocuous event was about to take place that would change Nvidia (and computing) in a way no one could have imagined.


Todd Martinez

A chemistry researcher at Stanford named Todd Martinez was doing research on quantum chemistry and had developed algorithms that took weeks to execute on supercomputers running CPUs. Frustrated by the time it was taking, Todd’s son (a gamer) suggested he go to the local electronics store, Fry's, to purchase some Nvidia GeForce graphics cards and try running the algorithms on GPUs instead of CPUs.


Nvidia had already begun developing CUDA (Compute Unified Device Architecture) – a programming platform that let developers access the parallel processing capabilities of a GPU directly. So Todd re-wrote his algorithm, ran it on the Nvidia GeForce cards, and it executed in a couple of HOURS.


He ran the same algorithm on the Stanford supercomputer, and it took a couple of weeks to finish. He checked the results, and they were identical.


Jensen Huang

So Todd called Jensen Huang, CEO of Nvidia, to say “thank you Jensen, for making my life’s work achievable in my lifetime.” I’m sure Jensen’s response was “We did what now???”


Jensen immediately saw the opportunity to extend the application of Nvidia chips and cards to the sciences. CPUs, designed for serial tasks, were hitting performance ceilings. GPUs, with thousands of smaller cores, could perform massively parallel operations — perfect for the matrix math behind things like neural networks.
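To make that concrete, here is a minimal, illustrative NumPy sketch (not Nvidia code — all names and shapes are my own) of why neural networks map so well onto GPUs: a single network layer is essentially one big matrix multiply, thousands of independent multiply-adds that parallel cores can execute simultaneously.

```python
import numpy as np

# A single neural-network layer is, at its core, one big matrix multiply:
# every output neuron is a weighted sum over every input pixel.
rng = np.random.default_rng(0)

batch = rng.standard_normal((64, 784))     # 64 images, 784 pixels each
weights = rng.standard_normal((784, 128))  # 784 inputs -> 128 neurons
bias = np.zeros(128)

# Forward pass: (64 x 784) @ (784 x 128) = 64 * 128 * 784 multiply-adds,
# all independent of one another — exactly what a GPU parallelizes.
activations = np.maximum(batch @ weights + bias, 0)  # ReLU

print(activations.shape)  # (64, 128)
```

On a CPU those multiply-adds run largely in sequence; on a GPU with thousands of cores they run side by side, which is the whole story of Act I in miniature.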



So Jensen directed the team to double down on CUDA — the software platform that lets developers use GPUs for general-purpose computing. Little did he know that CUDA would transform Nvidia from a gaming chip company into a computing platform company, opening a path that would later make it the beating heart of AI.


At first, the audience was small — scientists, physicists, a few academics experimenting with neural networks. But by the early 2010s, CUDA’s developer ecosystem was growing fast, setting the stage for the promise of AI to finally be realized. CPUs were just too slow for the algorithms to execute neural networks – GPUs, on the other hand, were perfect.

 

Act II – AlexNet: The Big-Bang Moment for AI

 

Fei-Fei Li

In 2006, the ImageNet project was established, eventually creating a database of approximately 14 million labelled images. It was the brainchild of legendary computer scientist Fei-Fei Li.


In 2010, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) began – pitting teams of researchers against each other to try to accurately classify the images in the database. Is it a “cat” or a “car” and so on.



Ilya Sutskever, Geoffrey Hinton and Alex Krizhevsky - AlexNet

In 2012, a team from the University of Toronto — Ilya Sutskever, Geoffrey Hinton, and Alex Krizhevsky — stunned the machine learning world. Their neural network, AlexNet, demolished previous records on the ImageNet visual recognition challenge. What did they do differently? They trained their neural network on two Nvidia GPUs, exploiting CUDA’s parallelism to crunch enormous image datasets faster than any CPU could.


Until this point, neural networks held great promise, but they were impractical: they simply couldn’t run fast enough to be useful. The AlexNet breakthrough, built on CUDA and running on GPUs, changed everything. Neural networks were back in fashion.

 

 

Act III – Big Tech Wakes Up to the Potential of AI (2015–2021)


By the mid-2010s, the giants of Silicon Valley were mobilizing. Many people associate the beginning of AI with the launch of ChatGPT and are stunned at how quickly Google launched (the flawed) Bard and then Gemini, and how quickly Microsoft incorporated OpenAI’s models to launch Copilot. But the reality is that the tech giants had seen the potential of AI 10+ years before ChatGPT and had begun their investments.



OpenAI was founded in December 2015 (7 years before ChatGPT) as a nonprofit artificial intelligence research organization with the mission to ensure that artificial general intelligence (AGI) would be developed in a way that would benefit all of humanity. The founding group included Elon Musk, Sam Altman, Greg Brockman, Ilya Sutskever, Wojciech Zaremba, and John Schulman, among others. Early backers — notably Elon Musk, Sam Altman, Peter Thiel, Reid Hoffman, Jessica Livingston, and Amazon Web Services — pledged a combined $1 billion in funding, though only a portion was contributed initially. The founding vision blended open research with a cautious approach to powerful AI systems, emphasizing transparency and collaboration over secrecy or competition.


Google's "Transformer" Paper

In 2017, Google researchers published a paper titled “Attention Is All You Need.” It introduced the Transformer, a simpler, faster architecture that scaled better than anything before it. Transformers unlocked a new concept: pretrain on everything, fine-tune for anything. The importance of this research paper from Google cannot be overstated. “GPT” stands for Generative Pre-trained Transformer.
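The heart of the Transformer is “attention”: every position in a sequence looks at every other position and decides how much to weigh it. A minimal NumPy sketch of scaled dot-product attention, the paper’s core operation, looks like this (shapes and variable names here are illustrative, not from the paper’s code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to every key; the output is a weighted mix of values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Softmax turns the scores into attention weights that sum to 1 per row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V               # blend the values by those weights

rng = np.random.default_rng(1)
seq_len, d_model = 5, 8
Q = rng.standard_normal((seq_len, d_model))  # queries
K = rng.standard_normal((seq_len, d_model))  # keys
V = rng.standard_normal((seq_len, d_model))  # values
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (5, 8)
```

Notice that the whole thing is matrix multiplies and a softmax — precisely the massively parallel math that GPUs, and CUDA, had spent a decade getting good at.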


Between Nvidia GPUs, CUDA, and the Transformer paper - the foundation of AI as we know it today had been laid.

 


 

Google: From Search Company to AI Company

Google’s business is massively profitable – full year 2024 numbers showed $350B in revenue and $112B in operating income. But like any smart business, they use a portion of their profits to invest in adjacent areas so they are constantly on defense and offense.


In 2011, Google quietly launched Google Brain, a research unit focused on the deep learning branch of AI. And in 2014, it acquired DeepMind, the London lab that would create AlphaGo, the system that defeated the world champion in Go. And with that acquisition, they also acquired DeepMind’s brilliant founder and CEO, Demis Hassabis.


AI investments included the development of an early version of Google Translate that was only available on Nexus One phones (Google’s phone line and the precursor to the modern Pixel line). Google Translate proved incredibly popular but incredibly costly – from a ‘compute’ perspective. It was so CPU-hungry that it led Google research scientist Franz Och to declare that supporting Google Translate on a billion Android smartphones would require “another Google” (in terms of data center capacity). But the potential was obvious. They just needed to solve the performance issues.


Under Sundar Pichai, AI became Google’s defining narrative: “We are transitioning from a mobile-first to an AI-first world.” Research like the Transformer paper and subsequent projects like BERT and later Gemini redefined search, translation, and even ad targeting. And in these early days, Google placed an order for an estimated 40,000 Nvidia chips to power their research and exploration of the possibility of AI. Google was doubling down on the AI future.

 

Microsoft: The Strategic Partner

Meanwhile, Satya Nadella was re-engineering Microsoft around cloud computing, and the company’s Azure platform quietly became an AI backbone. When OpenAI emerged in 2015, Microsoft saw an opportunity: by providing compute, Azure could become the launchpad for the most advanced AI models and OpenAI’s platform of choice. In 2019, OpenAI restructured into a “capped-profit” company called OpenAI LP, allowing it to attract external capital to cover the rising costs of compute while maintaining its nonprofit governance through the OpenAI Nonprofit board.


With the launch of OpenAI LP, the relationship with Microsoft deepened through multi-billion-dollar investments in 2019, 2021, and 2023 — and would eventually make Microsoft the first tech giant to productize OpenAI’s work at global scale, embedding GPT models into Office, Windows, and Bing as Copilot.



In 2021, Microsoft-owned GitHub launched a technical preview of GitHub Copilot, powered by OpenAI’s Codex model. It was an instant success with developers, who quickly learned that they could be more productive: 20–30% of Copilot’s suggestions were being put to use, and developers were automating code QA. But the early adoption did lead to questions about security holes in the generated code and IP protection of the code produced.


 

Meta and Nvidia: Open vs. Infrastructure

At Facebook (Meta), AI research flourished in FAIR (Facebook AI Research – founded by Yann LeCun, one of the godfathers of deep learning) and Meta AI, producing models like LLaMA, which the company open-sourced to democratize access.


Nvidia, meanwhile, cemented its position as the picks-and-shovels supplier of the AI gold rush — every major breakthrough model was trained on its GPUs. As solutions evolved from research to production – the demand for those GPUs grew astronomically.

 

By 2021, the pieces were in place: massive compute, Transformer architectures, global cloud infrastructure, and corporate commitment. All that remained was the spark that would ignite public awareness.


Act IV – ChatGPT and the Consumer Explosion (2022–2023)


Greg Brockman & Sam Altman

That spark arrived on November 30, 2022.

OpenAI, under the technical leadership of Greg Brockman, had already built GPT-3, but it was API-based. Sam Altman pushed the team to productize and democratize the technology with ChatGPT (based on GPT-3.5), which wrapped the model in a simple chat interface, fine-tuned with human feedback.

Suddenly, anyone could talk to a model that could answer, explain, brainstorm, code, or even write essays. And it took the world by storm.


The viral growth was staggering: one million users in five days, 100 million in three months — faster than any other product (technology or otherwise) in history. Students used it for homework, professionals for emails, and developers for debugging. The world’s collective jaw dropped.


Microsoft moved fast, embedding the next iteration, GPT-4, into Bing and the Office Copilot suite by early 2023. Meanwhile, Google declared a “code red,” rushing to release the ill-fated Bard (later improved, rebranded and relaunched under Gemini). Meta released LLaMA 2. Anthropic introduced Claude.


The AI wars had begun — and Nvidia was the quiet victor, as every model ran on its GPUs.


Act V – Infrastructure, Scale, and the AI Arms Race (2023–2025)

Behind the scenes, a different battle raged — the battle for compute.


Training large models cost tens or even hundreds of millions of dollars. GPU clusters became national assets; cloud providers signed billion-dollar supply contracts.


  • Nvidia’s H100 and Blackwell B100 / B200 chips became the gold standard for AI training.

  • Cloud startups like CoreWeave signed multi-billion-dollar deals with Nvidia to provide GPU access through their specialized AI data centers.

  • Nvidia acquired Run:AI to manage large-scale workloads by virtualizing GPU clusters.

  • OpenAI prepared GPT-5 while collaborating with Nvidia on new supercomputing architectures.

  • Google merged Brain and DeepMind into one AI division, focusing on a single model under the brand Gemini.

  • Microsoft integrated Copilot into every Microsoft 365 product.

  • Tech companies around the world were incorporating AI into their everyday products. For example, product pages on Amazon and Home Depot now included AI-generated summaries of customer reviews.


By 2025, Nvidia wasn’t just a chipmaker; it was the backbone of the global AI economy. At the time of writing (October 2025), it had become the most valuable company in the world and the first publicly traded company to reach a $4T market cap, and then $5T before the end of the month.

 

Act VI – Where We Are Now (Late 2025)



As 2025 comes to a close, AI is everywhere — in browsers, in documents, in classrooms, in code editors. I watched my son use ChatGPT to help teach him projectile physics (he genuinely wasn't using it for homework ... it was explaining concepts to him with examples - helping him understand).


Google’s famous “10 Blue Links” are being replaced by conversational search results. Emails draft themselves. Homework comes with disclaimers about AI assistance. Boardrooms debate AI strategy as seriously as financial forecasts. And deepfakes have become a pressing concern as the world deals with increasing AI-enhanced fraud.


Regulators scramble to set guardrails. Nations race to secure chip supply. And business leaders grapple with a tidal wave of change.


But for most of us, it’s simpler: AI has become another everyday tool — like the calculator, spreadsheet, or search engine before it — only this one can think with us.


When historians look back, they’ll likely see 2006–2025 as the twenty-year arc in which computation and cognition fused — and intelligence became a platform.

 
 
 