Opto Sessions – Invest in the Next Big Idea

SoundHound Stock: Is Voice AI the Untapped $50B Frontier?

Haydn Brain & Ed Gotham

Nitesh Sharan, CFO of SoundHound AI, returns to OPTO Sessions to discuss the company’s accelerating momentum and why Voice AI is fast becoming the default interface of the future. As pioneers in the space, with deep AI roots, a strong patent portfolio, and an agnostic architecture spanning cloud, edge, and agentic AI, SoundHound is leading the evolution of natural human-machine interaction.

*$50B Frontier: https://www.marketsandmarkets.com/Market-Reports/conversational-ai-market-49043506.html

-----

The content in this podcast is for informational purposes only. Opto Markets LLC does not recommend any specific securities or investment strategies. Investing involves risk & investments may lose value, including the loss of principal. Past performance does not guarantee future results. Investors should consider their investment objectives and risks carefully before investing. The information provided is not an endorsement of this product and is for information and/or educational purposes only.

Welcome back to the show, Nitesh. How are things?

Things are great. Thanks for having me. Good to see you.

Yeah, great to see you again. My co-host, Ed Gotham, spoke to you in October last year, so we'll link to that episode in the episode description for anyone who wants that previous interview and some extra context. We'll cover some of the topics from the previous round, and then we'll try to move on to the latest goings-on at SoundHound. But for that extra context, for listeners new to SoundHound and the company at large, can you give us a quick overview of what the company does and where it sits in the voice AI landscape?

Yes, SoundHound is a pioneer in voice AI. We've actually been around for nearly 20 years. We've been building ecosystems around voice AI for many years, and we've now been public for the last few years, NASDAQ-listed. Our vision is to voice-enable the world with conversational intelligence, and we've been doing that through an architecture where we voice-enable products, we voice-enable services, and we bring those together in an ecosystem to enable more seamless transactions. If I back up one step to the impetus, what catalyzed SoundHound: our co-founders, with engineering backgrounds, Stanford PhDs, had this vision many years ago, in pre-iPhone days. Inspired by science fiction, they asked, what do we not have in the world that you see permeated through Star Trek and Star Wars? The story is much more elegant when our co-founder Keyvan Mohajer tells it, but fundamentally, he didn't look at Star Trek and say, I think we're going to travel faster than the speed of light in my lifetime. He wasn't sure we were going to have teleportation from this planet to that planet, or replicators and so forth.
But one of the things embedded in the background of all those movies and shows is that people just naturally have conversations with anything and everything: robots, your elevator, the coffee machine. Voice AI through natural conversations was the vision he had. And he said, OK, I'm going to bring that to the world. It doesn't exist today. The state-of-the-art technology at the time was very limited. It was very utility-based. He wanted to bring that to bear, but it's hard AI. Human language is very complicated, and having technology fully understand, comprehend, and then act on natural conversations is very difficult. We believe that's really the ultimate vision for how humans will interact with technology. We believe that now, with generative AI and large language models in particular, we're catalyzing forward into that new horizon where humans and technology will be exchanging natural conversations to get things done. That's really what we've been bringing to bear for a couple of decades. And now we're seeing exponential growth and opportunity because of where the state of the technology is and where the state of the consumer is.

Yeah, fantastic. To that point, then, I think you described voice AI in your previous interview as being in its adolescence. Has it now entered adulthood? Where are we on that maturity scale?

Yeah, the analogy I was trying to convey, which I think is still apropos, and my short answer is yes, I think it is now growing into adulthood with a lot more to go, is this: if you look at the first horizon where we got the most traction, think of the 2010 through 2012 horizon with
Alexa, with Siri, with some other technologies, it was kind of like having a new kid. If you have children, you know that when your kids start speaking at 18 months or two years, you're a little bit blown away: oh, I can have a conversation. You're very excited, and then you quickly find out it's limited in utility. It's like, you know, "eat" or... So that's what the state of a lot of that technology was. Even today, for people with Alexa in their kitchen, it's often: can I set a timer? Can I add to the shopping list? Can I listen to music? It's limited utility. Well, I'd say certainly over the last handful of years, but even over the last couple of years, you're seeing a massive inflection in how this technology can now handle the compound complexity that is part and parcel of how humans interact. And that's been the core of our technology differentiation throughout our history. How we built up hundreds of patents and deep data sets was around the understanding that humans use negation. They say this, not that. They use compound queries with multiple elements. And the technology, catalyzed most notably by large language models, now enables you to hold long conversations. I made the metaphor of a kid growing into adolescence. And maybe puberty is a good analogy, because you get exponential change. My kids are in their teen years now. When you started engaging with ChatGPT or some of the new chatbots that are out there, you were holding longer, complicated conversations. You're asking it to explore the theory of relativity one minute, and the next what's going on with some geopolitical dynamics, and to give you better historical context. You're holding longer conversations and, at the same time, you're sometimes surprised at how much insight it has.
It's kind of like a kid coming home from school. You're like, wow, OK, where'd you learn that? And then one of the things that has coexisted with this new wave of technology has been hallucination rates. It makes stuff up a lot. And that's similar to teenagers, too. There's a whole evolution of learning and growth and storytelling that seems to be aligned. To your question, then, of where are we? Yeah, I absolutely think that over the last couple of years this technology has continued to grow exponentially and improve. Number one, hallucination is getting contained. New models are actually finding more and more complexity around that. At the same time, one of the biggest breakthroughs, even since the last time we spoke, has been the massive acceleration of agentic AI and how these systems are now able to do planning and reasoning, and to act autonomously. Again, it's similar to how a kid can now go live on his or her own, or start to make investments or whatnot. The technology is able to do that. We're seeing that in our manifestations with a lot of our enterprise contracts. It's amazing to see the technology not only hold more natural conversations, but also do a lot more, ultimately in service of end-consumer goals.

Yeah, fantastic. So to work out the practical applications of that technology and how you're actually delivering it to clients, perhaps you can talk us through or characterize the company's product offering today.

Yeah, maybe I'll use this as the platform to describe a bit more of how we're organized and our revenue model. And then I can unpack a few examples of where this is really commercialized today. Number one, for most of the original journey, up until about 2023, the greater share of our revenue came from voice-enabled products. So think of cars,
TVs, and other IoT devices, of which there's massive growth going on in terms of the number of devices around the world. One of the differentiating factors of voice AI versus other modalities, like keyboard input or touch, type, swipe, all those things we've grown to live with in terms of interchange with technology, is that voice AI only requires a small, inexpensive microphone. So for all these infrastructure assets or other technologies that are out there, to unlock the power of the internet and be able to do more and more, voice AI has a competitive advantage because it's so easy to access. You just have to speak. And humans don't need to learn it. We've learned to type with our thumbs really fast over the last decade, but you don't have to learn how to hold a conversation. So voice-enabled products was a huge area of growth, particularly auto. And that, again, up until 2023, was roughly 70% to 80% of our business. Where we've seen a lot of growth is what we call pillar two, voice-enabled services. That's where we enable, think of customer service, but also actions in the restaurant industry and food ordering, healthcare, financial services, insurance. This is, for this year and next year, our greatest growth engine. Think of when you call in and you're used to the traditional IVR, where you have to press one for this, press two for that, and you're screaming, "Operator, I need help." Now the technology can just hold any conversation. It picks up immediately, and it's on 24/7. It can handle all the complex challenges you might have, whether you need to return a pair of shoes, update your insurance plan, or transfer money in a banking portal. All those things can happen through natural conversation. So that's where we're seeing inflected growth. And then our third pillar, which we launched earlier this year.
So, at the Consumer Electronics Show in Las Vegas, in the US, we launched voice commerce, which really brings together these two pillars by connecting voice-enabled services to voice-enabled products. To use an example, imagine you're driving to work and you want coffee because it's really early. If you're already conversing with your car about navigation, destination points, the weather maybe, or any other insight, say you want to know what happened in the game, you can do that. And then if you say, look, I'm looking for coffee, well, your car already knows where you're going. It could say, hey, there's a Starbucks at the next exit, or there's a Peet's Coffee closer to your office. Would you like me to order your cappuccino for pickup? It's a seamless transaction: you keep your hands on the wheel, no complication there. The coffee shop is excited because they've now got a new lead, a new customer. And our model is actually to share the economics of that transaction with the car manufacturer. So now even a car manufacturer or TV device maker can generate new revenue, which is a completely new opportunity for them. So they're leaning in and very excited. We have a lot of proofs of concept going on in that space right now. Ultimately, that's the big ecosystem we're trying to build. And underpinning all of this, I'll just say, is again the spirit of where the world is going. We do believe this is one of the major inflections, maybe the greatest inflection of technology in our lifetime, with generative AI and large language models. In this new gen AI world, we know that natural conversation, as I've been mentioning, is the new modality for how humans will interact with technology, and voice AI is the killer app, and this is what we've been building. So our time is now, and we're seeing that in our business.

Yeah, OK, really exciting time.
Just to close the conversation on the three pillars and your business model overall, is there a particular industry, pillar, or area where you're seeing the most traction? And talk to us about how the business model is working.

Yeah, one of the great things, I'll say, is that over the past several quarters in particular we've achieved really strong diversification of our customer segmentation and concentration. Going back a couple of years, three customers alone comprised over 70% of our revenue. We now have no customer greater than 10%. And that's been a consistent story: we have continued to penetrate and gain share in the auto market. We've really inflected and seen a lot of disruption in the restaurant and food-ordering space. Restaurants have so many challenges with labor shortages and commodity cost pressure, so AI makes a lot of sense for them for consistency of performance. And then beyond that, in enterprise, whether it's financial services, healthcare, or insurance, we're seeing a lot of traction. In fact, what's been really exciting in the last couple of quarters in particular is the new areas we're able to penetrate and grow in, as basically everybody across the board is asking, how can I utilize AI to improve my business, whether that's for cost efficiencies or revenue generation? So we've now gotten traction in the energy sector. In Q1, we announced a new entry and a large deal in that space. The other thing we're very excited about is that, because the technology is becoming more standardized and able to be packaged, we're seeing a lot of opportunity with indirect channel partners too. And so that's a tremendous area of growth for us.

Yeah, really interesting. OK, so to circle back, I want to underline your differentiation overall. What would you pinpoint as the company's core competitive moat, if you had to single one out?
I think it starts with technology, and I'll unpack that a bit more, but our roots, again, were in core engineering. This is hard AI, and it took a long time to develop. We've built up hundreds of patents across all the elements, from deep neural networks to voice recognition, natural language understanding, and text-to-speech. So we've got that. We've also built a deep patent stack on monetization: that voice commerce dynamic I was talking about. We've got a lot of innovation that's built up over time. Associated with that, and just as important, are the data assets we've accumulated and the integrations we've built. All of those are, I think, moats that differentiate us. On top of that core technology set is the product suite and our architecture. First of all, if you look at what's happening in the LLM space, you're kind of seeing fiefdoms being created, right? OpenAI is building its own fiefdom. Anthropic is coming in. Google Gemini. We have an agnostic architecture that can pick and choose the best of all of them. And we also have our own language models that we can deploy for certain use cases for speed and efficiency. As one example of that, we were the first company to partner with OpenAI and integrate ChatGPT into the automotive space, with Stellantis in Europe and their premium DS brand. And now they've deployed across several of their brands, their franchises. Because we can pick and choose, we also integrate with others like Perplexity, and we've built our own models off of Meta's platforms and so forth. So that agnosticity, I think, is a differentiation. Number two is our platform diversity. We don't only provide cloud solutions, which make sense in certain environments where the customer wants an efficiency play in terms of where the hosting happens. We also deploy edge solutions.
In fact, when we first made our foray and broke through with the technology into the product set, it was because we had cloud differentiation, but over time in the auto space we've been able to permeate and disrupt with edge solutions, which work without internet connectivity. We have, for example, a partnership with Nvidia to bring edge solutions into the automotive space. We have a lot of channel partners. So it's not only the core technology foundation, not only the agnosticity of where we get the responses for each use case, but also the diversity of the products. Those are three factors where, usually, when we go head to head on deals, we win on those elements. And then obviously we're increasingly building up capabilities and deep benches in all of the industry sectors.

Yeah, fantastic. I'm glad you picked up on the edge point. I wanted to quickly get your insight on that. What's the strategic advantage of deploying AI on device versus in the cloud?

Yeah, there are a few of them, and it depends on the customer environment. First of all, the choice is important, because in certain environments, where you need highly secure uses or applications without connectivity, edge rather than cloud is imperative. I'll give you one or two of them. One is in the healthcare space. Oftentimes you have people going around a hospital who maybe don't have perfect connectivity, and you're dealing with patient issues. You need to be able to do things with edge capabilities. Auto is another one, where you're driving and sometimes the connectivity is great and you can get your navigation and so forth. But sometimes you lose connectivity, and you don't want to lose the capability to ask your car questions.
So for example, one of the things we brought with our own small language models was the capability to ingest a car manual, an operating manual, so that if a weird light shows up on your dashboard or you start to hear some squeaking in your brakes, you can just talk to your car. Well, you don't want that available only when the connectivity is fully there, right? You might be driving in the middle of an area that doesn't have great connectivity, and you still want that full capability. Those are the types of examples where edge differentiation is important. And again, the fact that we can provide all these capabilities is a differentiator, because a lot of the peers we work with, or the competitive set we navigate around, oftentimes don't have full-suite solutions.

Yeah, makes a lot of sense. Let's talk about the next evolution of this technology, I suppose, and something that's embedded into one of your core product offerings. With the launch of Amelia 7.0, you're talking about full agentic AI capabilities. What does that mean in real terms for enterprise clients? Perhaps you can give us some examples.

Sure. Agentic is, again, the new breakthrough. It's kind of the software layer on top of LLMs. One of the major breakthroughs we've seen, even just over the last year, is the ability of these models to independently plan, reason, and act. So it's not just immediate responses to what's the weather here or what's going on there. It's the ability to take a more complicated query and think through the logical steps of how to go and act on it. The other real benefit we're seeing with agentic is the speed to realization. For example, a lot of our enterprise customers will say, hey, you're doing great. You're automating a subset of my interactions. I really want to see more and more of this call flow that might be going to a human call center operator. I'd love to get that automated.
Well, in previous generations, that might have taken three to six months. We're seeing those happen in weeks, if not days. So the speed of realization, the speed of automation, is accelerating a lot through agentic. But to use an example, I'll go back to the analogy where the voice AI architecture is like family building and you have a new kid. If you have a new kid, you generally have a lot going on in your head. From a financial standpoint or a health insurance standpoint, you might say, I used to have HMO insurance and now I want a PPO. Can I go and investigate what those choice points are now that I have a new kid? You can actually talk to a system and it might say, you know, maybe you also need to think about your deductible, and about what type of health savings account you may need, or a family savings account, which we have in the US, things you previously didn't need to worry about but now, with the new kid, you do. Maybe on top of that you need to start thinking about investing for their education, and so in the US you put together a 529 plan. All those interconnected systems used to be five different conversations, and now they go through one integrated architecture. That's the type of use case you're seeing more and more ability to handle through agentic solutions. And there are many other applications. If you're in a healthcare setting and you had a checkup, you want to follow up, you want to get the results of a scan, you need to figure out what's going on with your billing, and you need to set up the next appointment, these things can now happen through the more thoughtful planning and reasoning capabilities that agentic solutions are bringing.

Yeah, fantastic. There's probably one point that I think we'll come back to on that.
But before we do, let's take a quick look at commercials and your outlook moving forward. You mentioned how SoundHound has reduced its dependence on any single customer. Perhaps you could flesh that out for us: talk about your current customer concentration and how important further diversification is for the business.

Yeah, over the past two years alone, we have shifted from almost entire dependency on the automotive space to diversification, with great penetration in restaurants. We've now scaled to over 13,000 restaurant locations. We've moved into the enterprise space, where we work with seven of the top 10 financial institutions. We've got great healthcare partners and insurance partners. So again, where a couple of years ago maybe just one industry was 80% of our revenue, we now have five industries that each represent double-digit percentages of the total mix of business. To your question of where we are: we saw a lot of runway even just with the auto business, and we saw a lot of growth opportunity in the restaurant space, but our technology is fundamentally horizontal. So there is a lot of opportunity across different verticals. Again, I mentioned we newly penetrated the energy sector and the utility space, and we see a huge runway of other similar providers that are looking for our types of solutions. So we want to go deeper there. And across industries, there are still some that we don't touch, at least not actively. We've got dedicated sales teams and indirect channel partners that are going after certain vertical sectors, but I think our mix across auto, restaurant, healthcare, financial services, insurance, and now energy is a pretty diverse set.
Obviously, if there's great growth opportunity and there's an adjacency that makes a lot of sense, we'll welcome entry into new verticals, but I think we've got really good coverage at this point and we just want to go deeper in each.

Yeah, OK, I was going to follow up on that point, because I guess there might be a conflict in your strategic direction. You might think that, to properly service a particular industry, you need to home in and deepen your level of service to that industry, versus trying to cover a broad set of industries. For you and the rest of the execs, is it a case of trying to balance those two things: having enough diversification to give the business longevity, but also giving each industry a deep level of service?

Yeah, definitely. As a smaller, early-stage public company, managing limited resources is really important for us. And I'd say it's also a differentiator, because sometimes having limited resources forces hard choices and prioritization, and we're having to make those, which I feel is again a healthy thing. I'll get to your question in a second, if I can go through this journey. Number one, as I said, we're native, core voice AI. That's what we do, that's what we know, that's what we're going to continue to deliver. We do believe the next major horizon, again possibly as big as the internet boom, maybe bigger, is generative AI and what it's enabling in terms of expansion of use cases. Again, any human anywhere can, through natural conversation, quickly adopt technology and get things done, whereas if you needed access to other technologies, or a keyboard or all that, that's one incremental hurdle. So natural conversation is maybe the biggest inflection in where human-technology interaction is going. And we're just at the early, early stages of that.
Within that, our vision ultimately is to voice-enable the world with conversational intelligence. That's not unique to any one industry. Ultimately, the vision is to do all these things. And with that voice commerce pillar I talked about, we're trying to enable transactions, commerce, and discovery through all the ways consumers want to discover. I mentioned the example of coffee on the way to work, but you can also imagine you're stuck in traffic and you need to book an upcoming vacation, or you want to make reservations, or you need to reorder your contact lenses. There are so many transactions that ultimately can happen through voice. And so we're trying to build the infrastructure for that to scale and generate that flywheel. So from one standpoint, again, the platform is horizontal. It's agnostic. And we work, by the way, in dozens of languages and over 100 acoustic variations. We're a global company that has businesses on several continents around the world. So we want to keep generating. And we know that voice AI's moment is now. We're seeing that massive inflection. So we want to stay fully on the accelerator. Then to your question: OK, how do we know how to prioritize and pick and choose where the greatest opportunity is? We let a lot of our customer conversations dictate that. If I go deeper into restaurants as an example, it's an area we've been very excited about, where we're seeing a lot of great traction. We're at, again, 13,000 locations, partnering with many of the largest QSRs, quick-service restaurants, and we're working at their pace. In the phone ordering space, this technology scales very rapidly. We've integrated with point-of-sale partners like Square, Toast, Olo, and Oracle MICROS Simphony, and we're working with the bigger brands that have their own custom point-of-sale systems. Generally, what we do is integrate our voice AI into the point of sale.
That gives us the latest updates on menu structures and pricing. And then all we're doing is this: you can call in and order your pizza with half mushrooms, half pepperoni, and a Coke, and get that delivered to you. That can scale, and we can do that rapidly, as long as we're getting partnerships at the corporate level and with the franchises. Drive-through has different hardware requirements. Sometimes certain restaurants need to retrofit their drive-through, get the large confirmation display board, or maybe update some headset technology. So we have to work at that pace. It's the customer pacing that dictates how fast we go and how we allocate resources. With enterprise, we have great solutions, and there's sometimes higher rigor in terms of security and privacy controls. That's another differentiator for us, because we put custom controls on that. A lot of our customers require sensitivity to who owns the data and what data is shared. And we work fully in partnership with our customers to build those types of customizable privacy controls, for example. A lot of our competitors don't do that; it's a one-size-fits-all type of idea. So we will continue to penetrate across these verticals. We'll keep going at the pace of the customer. We want to grow aggressively because we know there are massive tailwinds in this space. But we're constrained by the size of the company and the resources we have. So we always want to make sure that we have focus. But we also want to be thoughtful about moving fast.

Yeah, absolutely. And to that point, you've seen very significant, impressive growth in recent times. Perhaps you can give us a top-level view on whether you expect that momentum to continue throughout the year, and to what extent.

Yeah, we have guided this year, our outlook for 2025, to 157 million to 177 million of top-line
revenue, and also to get to break-even by the time we end the year. That'll be an important milestone for us: we've been investing in research and development for many years, so getting to that threshold is important. That is continued, very healthy growth, nearly doubling the company. Historically, up until the beginning of last year, we were growing 40 to 60 percent consistently, fully organically; we had never bought any companies. Last year we went out and acquired companies, and that has inflected our growth to over 100%, which is what we've been reporting the last couple of quarters. Again, it's early days. If you look at our penetration, whether that's in auto or restaurants or even in the healthcare space, we have only a small penetration of the total addressable market. So we see massive opportunity to continue to grow at very healthy levels, and for us to consistently grow north of 50% for the foreseeable future is something we don't even consider a high bar internally, because of the opportunity, the competitive differentiation, the technological differentiation, and the massive tailwinds in this space. And we'll be thoughtful about acquisition opportunities that make sense for us and can help catalyze even greater levels of growth, like we've been seeing the last few quarters. Certainly we'll take advantage of those as long as the return on capital makes sense relative to the risk-adjusted cost of capital. That's the framework we use. But yeah, ultimately, because of the tailwinds, because this is the way the world is moving, and because we think we have differentiation, continuing excellent growth rates is something we certainly expect.

Yeah, fascinating, and really exciting times for the business. I think let's leave it there on that fascinating insight.
I think that just leaves me to say thank you very much for joining us on the podcast, Nitesh. It's been a real pleasure.