The Catalyst by Softchoice

Big moments in AI from Y2K to ChatGPT

Softchoice Season 6 Episode 20

The rise of AI isn’t just about smarter machines—it’s about reshaping the way we work, create, and innovate. 

In the season finale of The Catalyst by Softchoice, Heather Haskin sits down with Gil Press, senior contributor at Forbes, to trace the modern history of AI. 

From the early breakthroughs in deep learning to the rise of generative AI, Press explores the key moments that shaped the AI industry and where the technology is headed next. 

Featuring: Gil Press, Senior Contributor at Forbes 

The Catalyst by Softchoice is the podcast dedicated to exploring the intersection of humans and technology.

Heather:

You're listening to The Catalyst by Softchoice, a podcast about unleashing the full potential in people and technology. I'm your host, Heather Haskin. Imagine waking up in a world where machines don't just assist us. They think, create, and even make decisions. That world isn't decades away. It's happening right now. In just a few years, AI has crossed a threshold with platforms like ChatGPT, moving it from an emerging technology to a transformative force reshaping industries, creativity, and society itself. But how did we get here? In today's episode, we're tracing AI's recent history, uncovering the pivotal moments that shaped the industry. We'll explore the breakthroughs that reignited deep learning, the rise of neural networks, and the key milestones that brought generative AI into the mainstream, revolutionizing the way we do business. And the AI market isn't slowing down. With innovations like China's DeepSeek shaking up the industry, what's next? So how did AI go from niche research to a force redefining our world? Today we're joined by Gil Press, a senior contributor at Forbes. Gil covers emerging technologies, startups, big data, and the history and future of artificial intelligence. So, Gil, I'm really excited to be able to interview you today. I looked into your background a little bit, and it seems like you have quite an extensive bird's-eye view of the history of generative AI, and that brings to light so many questions. AI today is such a buzzword, but there seems to be something different about the moment we're in right now. It makes me wonder: have we turned a corner? What's changed? Maybe talking about that history will help us understand. So I'd love to hear what your thoughts are.

Gil:

What's changed recently is that, even though AI has been around for at least 70 years, both as research and as various implemented AI programs, the scope and the reach of AI have lately expanded significantly. Now we have AI programs that are expanding the menu of what computers can do, for example generating text or generating images. And the reach of AI, in the sense of reaching millions of computer users around the world, has grown mostly because of the very recent success of ChatGPT. That's what is really happening now. It's been a long, long journey, especially over the last 25 to 30 years, in terms of very specific advancements in what is called deep learning.

Heather:

It's interesting to hear about some of the changes that have happened over time, Gil, and I'm really excited to talk with you about some of them because there's some things that I hadn't really learned before. So today we're looking at the recent history of AI going back to the turn of the 21st century. An important paper was written in the year 2000. And I'd love to hear about that paper from your perspective. Why was it important? Why did it renew interest in deep learning? And maybe what was the initial impact in the early 2000s after it was published?

Gil:

That year there was an important advancement in deep learning. There were three leading computer scientists who had insisted for many years that this was the right approach to AI, and they had faced a lot of challenges and a lot of skepticism. The three were Geoffrey Hinton, Yann LeCun, and Yoshua Bengio. Bengio and his team published, in the year 2000, a paper on training the computer to understand language, on text processing and language processing. The breakthrough there was handling the many examples in language that mix together phrases with similar words but not similar meanings. We humans understand the difference. Computers at that time could not understand it at all and got confused, and Bengio's team devised a model that overcame that particular challenge. This was basically the launching pad for what we today call large language models: the ability of the computer to understand context, to understand the different meanings of words, even if they're the same words.

Heather:

So we see the next important moment in AI between 2007 and 2009. What happened then? How were data and processing power factors in the technology of the day? Why was deep learning a step forward from machine learning?

Gil:

It's actually an easy question. Deep learning is part of machine learning, but there are a lot of approaches to machine learning. The advantage of deep learning can be summarized in two words: big data. Most of the traditional and very successful machine learning approaches relied on analyzing small sets of data. Deep learning is different in the sense that it shows its advantages, its benefits, specifically when we deal with lots and lots of data. And by the years 2005 to 2010, we had lots and lots of data because of the invention of the World Wide Web in the early 1990s. By the way, what we see today in this sudden expansion and triumph of modern AI is very similar to what happened when the World Wide Web was invented: it was basically a piece of software installed on top of an existing worldwide network, the internet, which had been established decades before. So today's triumph of modern AI is likewise built on a lot of work done over many, many decades.

But specifically, around 2005 to 2010, one AI researcher, Fei-Fei Li, came up with an idea: we have all these images on the internet, and many of these images are labeled, annotated, because people put up a picture of a dog and wrote, "this is my dog, Sasha." So the images are identified. That's very important when you do deep learning, because otherwise you have to label each image by hand to train the computer: "this is an example of a cat." So Fei-Fei Li and others started collecting, scraping the internet for all of these labeled images, and they put together a database they called ImageNet. By 2010, they announced an annual event, an annual competition, in which AI researchers could submit their programs, specifically image recognition programs, to compete with each other.

Also at that time we had another very important element. As I mentioned before, the challenge for deep learning for many years was the lack of computer power. When you deal with lots and lots of data, you need a lot of computer power to analyze it. A company by the name of NVIDIA had been established back in 1993 to develop what they called graphics processing units, GPUs. Originally, when NVIDIA was established, GPUs were used for computer games. So all of this came together, the GPUs, big data, and deep learning's advanced algorithms, to create a perfect storm. A perfect storm that hit us in 2012.

Heather:

It's interesting that the hardware gamers were using, these GPUs, essentially fancier video cards, somehow came to advance deep learning. It's so cool to see how different channels of technology were converging during this time. So when we come to 2010 and we're thinking about ImageNet, you mentioned that competition. Why were image recognition and computer vision so hard to do?

Gil:

Identifying images was hard to do simply because approaches other than deep learning did not use, for example, lots of data. And deep learning itself had a challenge in doing what it does best, analyzing lots of data, simply because it didn't have enough computer power. But then all three came together, and that happened in 2012 in various settings. The most impressive one, the one that influenced the later developments, was the ImageNet competition at the end of 2012. In this competition, Geoffrey Hinton and two of his PhD students submitted their program, which was a deep learning program but based, for the first time, on GPUs. Which means, basically, they processed the images from ImageNet very, very quickly. And most important, it was more accurate than all the other competitors. Its error rate at that particular competition was 15 percent. The second-best program in that competition had an error rate of 26 percent, using a different approach, still machine learning, but a different approach. So at least the academic world completely changed its mind about the benefits of deep learning. There was huge excitement in the academic world. Then the venture capitalists and other investors paid attention. The government started to pay attention. And we were off to the races of modern AI.

Heather:

It makes me wonder what is going on in academia right now that I should be paying attention to, the things we'll hear about in a few years' time when investors make those big moves. So as we think about that change with ImageNet and these big updates back in 2012, what did we see the big industry companies do? Google, and I don't even know if it was AWS or just Amazon back then, but what were the big names doing with all this at that time?

Gil:

This is a very good question, and I actually omitted mentioning big tech in my previous answer. It was the venture capitalists, it was academics, and most important, actually, it was big tech, the Googles of the world. At that time there was a very interesting shift away from this research being centered in academia. For 70 years, the focus of AI research had been mostly academic. There were periods of more business interest, especially with the so-called expert systems of the seventies and eighties, but mostly it was an academic focus. By 2012, though, we had Google as a good example of a company that had, since its inception, almost modeled itself after a university. "Publish or perish" was a motto not just in academia but also at Google. From the beginning, they not only invested a lot in research, mostly around search technology of course, but also in deploying information technology; they did a lot of internal innovation. And certainly around 2012 it was established that, to attract talent, a company like Google, like Facebook, like Amazon, would let researchers pay a little less attention to implementation, to application, to the business value of that research, and would allow them to build their own reputations just as academics do, by publishing. So a lot of academics, the ones who already had experience in deep learning or other approaches to AI, left academia and went to Google and its peers. Yann LeCun is a good example: for a long time he was a tenured professor at New York University, but he became the chief AI scientist at Facebook. Many of them just left academia to work, for a lot more money, at these big tech companies.

Heather:

I'm starting to wonder when we're getting to the start of the modern AI boom. So as we talk about 2012, are we getting there yet?

Gil:

The ImageNet competition of 2012 is the start of the modern AI boom. The big event that made it more accessible, I think, was in 2016, when a program from DeepMind, Google's AI unit at the time, managed to beat a champion at Go, a game that was considered a much, much bigger challenge for AI programs than chess, for example. The AlphaGo program from Google gained a lot of attention, a lot of headlines, and I think this is where the average newspaper reader first found out about what was basically deep learning. By that time, people had stopped using "deep learning" as a label and started using "AI."

Heather:

Going back to 2014, a couple of years after ImageNet, a man named Ian Goodfellow had a big idea that changed the AI game. I'd love to hear more about that. What did he do? What did neural networks talking to each other allow AI to do?

Gil:

So this is when we have the birth of yet another buzzword: generative AI. This is the first time that Goodfellow and some other people came up with the idea that if you have two models compete with each other, that will help the program generate new data: generate images, generate text, eventually generate videos, and so on. A very powerful idea that was taken up by a lot of other people, a lot of other researchers, a lot of other companies. This was in 2014, but the really big milestone, and maybe the last milestone we have in terms of the progress of thinking about and developing deep learning, what we now call AI, came in 2017, when a number of researchers at Google published a paper titled "Attention Is All You Need." In it they suggested a new type of architecture, a new way to design a deep learning model, that allowed the program to understand the context of whatever was written in the text. Context meaning it could read, so to speak, a whole paragraph and understand the connections and relations between words far apart from one another. Before that, the program could really only analyze words one after another. With this new, actually simplified, architecture, the flourishing of large language models began, because other big tech researchers and other entities started competing with each other, releasing better and better large language models, more accurate, performing better, and so on.
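[Editor's note: as a rough illustration of the mechanism Gil describes, not the paper's exact formulation, the core of that 2017 architecture is scaled dot-product self-attention, which can be sketched in a few lines of NumPy. The toy sequence length and embedding size here are made up for demonstration; the point is that every position's output is a weighted mix of every other position, which is how words far apart in a paragraph get related.]

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: each output row is a weighted
    # average of ALL value rows, so a word can draw on context
    # anywhere in the sequence, not just its neighbors.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# Toy "sequence" of 3 words with 4-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))

# Self-attention: queries, keys, and values all come from the same input.
out, w = attention(X, X, X)
print(w.sum(axis=-1))  # each row of attention weights sums to 1
```

In the real Transformer, Q, K, and V are learned linear projections of the input and there are many attention heads in parallel, but the context-mixing step is this same weighted average.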

Heather:

Is this the moment that AI entered the mainstream with ChatGPT? That's what it feels like to me.

Gil:

Yeah. When ChatGPT was released in November of 2022, it became, within two months, the most popular consumer application ever, with, I think, a hundred million users in those two months. It suddenly reached a very wide audience. People got excited about it; people got upset about it. And recently we had a surprising development from a Chinese AI company, DeepSeek, which managed to do much the same with much less computer power and a smaller amount of data.

Heather:

So DeepSeek is a Chinese company. It's also open source. So how does that impact the AI industry? Is there controversy about it being open source and what is the larger business world doing with that?

Gil:

Yeah. So open source is not something that DeepSeek brought to the table. A few years ago, Facebook, which is investing an enormous amount of resources in modern AI, made the decision to open source everything it's doing. So open source is a very important trend, an approach to business, within AI. And even before DeepSeek, it had been, let's say, a threat to companies like OpenAI that keep the nature of their models, the innards of their models, to themselves.

Heather:

When we look at the future, though, what do you think we're going to see in the next few years with AI models? And how do you think that will change the way that we do business?

Gil:

As you know, prediction is difficult, especially about the future. I tend to focus on the history, but history gives us some indication of where we are going. I do think we will see steady but slow improvement in the accuracy and performance of deep learning models. And maybe even more important, we will see steady but very, very slow adoption by businesses of whatever AI can do for them to save money and to generate new revenue streams.

Heather:

Well, that's the big question right now: how businesses can use generative AI to bring ROI to their business. So I'm excited to see how the world changes with all of these incredible technology shifts. And I appreciate having you here on The Catalyst.

Gil:

Thank you for having me.

Heather:

That wraps up our journey through the modern history of AI, from its quiet resurgence in the early 2000s to the breakthroughs that brought us today's AI-driven world. We've seen how deep learning, neural networks, and large language models have shaped industries and transformed the way we work, create, and interact with technology. But this is just the beginning. AI is evolving faster than ever, and the next big shift could be right around the corner. Thank you to Gil Press for coming on the show to share his insights. This is the last episode of the season. We'll be taking a short break and we'll be back soon with more episodes on everything tech and AI. For The Catalyst, I'm Heather Haskin. The Catalyst is brought to you by Softchoice, a leading North American technology solutions provider. It is written and produced by Angela Cope, Philippe Dimas, and Brayden Banks in partnership with Pilgrim Content Marketing.