This is the podcast for digital creators, by digital creators, with me, your host, Dylan Schmidt. Let's dive into insights on content entrepreneurship, creator strategies, and the occasional random topic, because why not? Whether you're on day 1 or day 1,000 of being a digital creator, let's learn and grow together. You're listening to Digital Creator. Coming to you this week with an episode about AI voice cloning. Now, if you don't know what AI voice cloning is, it's when your voice is cloned and then used by artificial intelligence to recreate what you might say. It's typically done as speech-to-speech or text-to-speech, so you type something out and then your voice says it. There are quite a few ways of doing this out there, but I'll share the two or three that I've been experimenting with that you might find interesting. And what got me going down this rabbit hole was Arnold Schwarzenegger. Obviously. No, he launched a podcast, and it's a daily podcast, all about health and fitness, of course. And one of the key things about it is that it's not actually Arnold speaking in the podcast. It's a clone of his voice, which I found super fascinating because, you know, how does that work? How does it sound? If you wanna check out Arnold's podcast, let me look up the name right here. It is Arnold's Pump Club. It's the official email, podcast, and merchandise store of Arnold Schwarzenegger: Arnold's Pump Club. And it sounds really good. I played it for my wife, and I was like, listen to this, Arnold has a podcast. I didn't say anything about it being AI. And then I was like, you know, it's AI. What do you think about that? Now, she's not some big Arnold Schwarzenegger fan, and she was making breakfast, so it wasn't a hard-hitting reaction, as one would expect when I just pop out of my office and say, listen to this. 
Arnold Schwarzenegger's AI voice. You know? But that's probably typical in this household. But it got me thinking about the capabilities we have available without spending a bunch of money. So I looked up ElevenLabs, which is the most highly regarded consumer-level AI voice cloning company. And you can mess around with this. They have free options, and then they have paid options. I'll look up the different options they have here so you can get a better understanding of this. So ElevenLabs is free for people who just wanna try it out. It says you can create up to 3 custom voices and use 10,000 characters per month, which I am assuming means 1 character would be 1 word. And you can create random voices using Voice Design, access shared voices in the Voice Library, get API access, automatically dub your content from 57 languages, and, awesome, that's all free, so you can mess around with it. Then they have Starter, which is $5 a month and includes everything in Free, plus 30,000 characters per month. I don't have any affiliation with ElevenLabs, by the way. I'm just sharing this because this is interesting stuff. On the Starter plan, which, again, is $5 a month, you also get a commercial license, up to 10 custom voices, and access to instant voice cloning. And then the Creator level, which is what I signed up for just to try out, is $22 a month, and that includes everything before plus 100,000 total characters per month. You can create about 30 custom voices, you get access to Projects, which is their new long-form speech synthesis editor, and you can do professional voice cloning of your own voice. What interested me the most was the professional voice cloning, because I wanted to do a professional clone of my voice. 
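One note on the quota math: a character is normally a single letter, space, or punctuation mark rather than a whole word, so 10,000 characters works out to very roughly 1,500 to 2,000 English words. The Python sketch below is only an illustration of that arithmetic. It uses the plan prices and monthly character limits quoted in this episode, which may have changed since, and it assumes every character in a script counts against the quota. The names `PLANS`, `count_characters`, and `scripts_per_month` are made up for this example and are not part of any ElevenLabs tool.

```python
# Monthly character quotas and prices as quoted in this episode.
# Actual plans may differ or change over time.
PLANS = {
    "Free": {"price_usd": 0, "chars_per_month": 10_000},
    "Starter": {"price_usd": 5, "chars_per_month": 30_000},
    "Creator": {"price_usd": 22, "chars_per_month": 100_000},
}


def count_characters(text: str) -> int:
    """Count every character, including spaces and punctuation.

    Assumption: the quota is metered on all characters in the text.
    """
    return len(text)


def scripts_per_month(text: str, quota: int) -> int:
    """Return how many scripts of this length fit in a monthly quota."""
    chars = count_characters(text)
    if chars == 0:
        raise ValueError("script is empty")
    return quota // chars


if __name__ == "__main__":
    sample = "Are you a New Year's resolution type of person? Personally, I'm not."
    words = len(sample.split())
    print(f"{words} words, {count_characters(sample)} characters")
    for name, plan in PLANS.items():
        fits = scripts_per_month(sample, plan["chars_per_month"])
        print(f"{name} (${plan['price_usd']}/mo): about {fits} scripts this length")
```

Pasting a full newsletter or episode script into `sample` gives a quick sense of how far each plan would stretch before signing up.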
Now, you're gonna have to do your own research in terms of what you're signing up for when you do something like this, because, you know, the terms and conditions. Right? Are you signing your voice away? Can they use it in all these ways? Those are real questions you'll have to research yourself to see if you're comfortable doing it with your voice. I'm not here to talk about that part. I'm here to talk about what you can do, I guess, and the different options here, which I find interesting, just at a higher level. So I signed up for the Creator plan because the professional voice cloning really interested me. And I didn't know this, because it doesn't say it on the sales page, or it might be deep in the FAQs, but it says it takes about 4 weeks after you submit your voice samples to get your professional voice clone back, which, to my understanding, is a higher-quality AI voice clone than the instant voice cloning they offer, which I'll talk about in a second. For the professional voice cloning, the samples they needed were good-quality recordings, just like you would find on this podcast, which is what I used. And it said 3 hours was optimal: just you speaking, no music, high-quality audio, and in the style that you would want. So, for example, if you're doing an audiobook, it would be best to have 3 hours of you reading the way you want your audiobook to sound. If you're doing podcasts, you would want it to sound like how you wanna sound on a podcast, or whatever you're doing. Right? So I went back through my last 20 episodes or so, cut out any music, and made sure that there weren't interview episodes. 
For this podcast, all I had to do was chop out the beginning and end and skip over any interview episodes. And then I uploaded it. It was like, great. It also asked a little bit about how I would describe the voice, and I didn't know exactly what to say. So I put: 34-year-old American male living in Los Angeles, talking into a Shure SM7B plugged into a RØDE RØDECaster Duo, and it's a podcast. I didn't know what to say, so that's what I put. There wasn't a whole bunch of instruction there. Besides getting the files ready to upload, because I didn't have those ready and had to do some light editing, I would say the actual process took 10 minutes or less, because I already had the hours of audio. And we'll wait to see what it sounds like. Will I be using it for this podcast or anything? I would definitely let you know if I was going to use it. And I think I want to share it. I don't know exactly how I'll share it. I've already started a discussion about this in the creator club, but I also would just like to play around with it. I guess I just wanna see. I sign up for a lot of this AI stuff for at least a month just to play around and kind of push the limits of what's out there. I don't know if I'll actually use it. It would have to be in a way where you're like, oh, yeah, this is cool. Not like, this is phoning it in and it's just a professional voice clone. That's not cool. Which leads me to talk more about Arnold for a second here. So he's doing this daily podcast, and this is highly speculative. I don't know what goes on behind the scenes. In the show notes, the episode description, of every episode he's done, there's a link to, I think, the people or the company that produces it. This is just from an outsider looking at what he's doing. 
To me, it kind of has this Ryan Holiday, Daily Stoic, Daily Dad feel to it, where it's just one topic. His episodes are 5 to 9 minutes on average, and it's just a simple, one-topic little chunk of motivation to get you going through your day, and you hear it in Arnold's voice. It sounds great, but what's a little off about it is the topics. It just seems a little weird to me because Arnold is, what? Let's look up how old Arnold is. Not that age is the whole story here, of course, but he is 76 years old, and the topics are very current. It kinda has this Andrew Huberman vibe of nutrition and science. He's talking about, like, the original Ozempic. And I say he's talking about it, but it's other people who are writing this. So from a workflow standpoint, you have a person at a computer basically typing something like a blog post that's then read in Arnold's voice. Meanwhile, Arnold doesn't have much to do with it other than it's his name and likeness, which to me is one of the most interesting parts about it. It's just using his name and likeness, and he built that up. Now, will we see this from more celebrities, like The Rock or Kevin Hart? Could you imagine a comedian doing jokes using something like AI voice clones? I don't know how comedy will work, at least for a little while. But the scariest part about all this, and I say scary with a mix of excitement, is that we're still in the super early stages of this stuff, and it's already sounding really good. So I thought I would play a clip. I did the instant voice cloning, where the file has to be under 10 megabytes, so basically it doesn't get too much information, but it's kind of crazy. The other one, the professional voice cloning, again, took about 3 hours of audio. 
The instant voice cloning took about 10 minutes of my voice, and this is what it sounded like. This is the ElevenLabs AI instant voice cloner reading my newsletter. "Are you a New Year's resolution type of person? Personally, I'm not. Yesterday, I posted a video on Instagram that I think points to why a lot of resolutions fail, but I do love streaks, challenges, and rituals. For example, I've been writing in my daily journal for 1,072 days. I've been releasing a podcast episode every week since September 2021, and I've been writing 3 pages stream of consciousness after I wake up every morning for the past 7 days, which I will continue for the next 90 days. When I start something, it's easy for me to keep going, as you can probably tell from this newsletter. Which part of the doing do you struggle with the most? Is it the starting, the keeping going, or finding the balance in between?" How great was the reading of those numbers? At first, I feel like the first sentence or two was solid, and then it just kinda fell off a cliff towards the middle and the end, like, this does not sound right. It almost sounds psychopathic, the way it asks questions, like, "As you can probably tell from this newsletter. Which part of the doing do you struggle with most?" You know? And I just have a certain way of talking. I don't know if it's the right way of talking, but it's gotta be hard to duplicate the way I talk: the cadence, the inflection I use. It's very me, I feel like, and it's probably not orthodox. It's not super predictable. And it will be interesting to see what this voice clone sounds like with 3 hours of audio because, again, that was 10 minutes of audio. So we'll see, and I'll keep you updated on that. Also, Descript, which I'm using as I record this episode, has an AI voice cloner that you can use. Overdub is what it's called. 
And, basically, if I messed up something I was saying, I could just type it out and then it would say it. So, normally, instead of typing out the whole text for text-to-speech, like a whole podcast episode, you would just go, oh, I wish I had said this word here. You would type it in, and it would overdub with my AI voice in just that spot, or you could do a longer piece. You set that one up right in the Descript app. You can upload your own files or you can read what Descript gives you, and that just takes a couple of minutes. Also, as far as time goes, that ElevenLabs instant voice clone took about 3 minutes, I think, from start to download, maybe even a little less. That's with dragging a podcast file of my voice in there, copying and pasting the newsletter, having it generate, and then downloading it. It would take about 30 seconds now if I had the text ready to go, since my voice is already in there. So it's very interesting stuff. And I posted about this on my Instagram Threads, but I've been messing with Midjourney. They had an upgrade. They're on version 6 now, and all of this stuff is just getting better and better. I'm creating these realistic photos in there just to see how realistic I can make them. And then I'm upscaling the images with, I don't know how to say the word, Magnific AI, I guess that's what it'd be. It adds a little bit more clarity to the photos even though they're already really clear. It just brings these portraits to life, and I would say in a lot of use cases they're probably indistinguishable from real photos. They don't look like general AI photos, which is just crazy. So you could scroll back and look on my Instagram Threads. 
You'd probably have to scroll back a little bit to see those. But I think in 2024 we're just gonna see this stuff get crazier and crazier, and we definitely are not equipped to deal with what's coming our way. Because with the Arnold podcast, a lot of people who tune in won't know that it's AI, even though I believe on the first episode he said, basically, I'm too busy, so this is made by the team, or whatever. He kind of signs off on it, and he isn't trying to hide that it's AI, from my understanding. I don't know that every episode makes it extra clear, because it kind of hops right in. But most people don't know, and they just kinda go with it. And I don't think most people care. Although when I was looking up the tech behind his podcast, I did see comments like, oh, it is AI, I thought something was off. Or people were asking, is Arnold's podcast AI and not actually him? So some people are aware of it, and then other people were like, yeah, on the first episode he talks about it. So you have people who don't know that it's AI and are listening to it thinking it's one thing, and then you have people who are just listening to it, I guess. And it's also a blurry line in podcasting in general, with reviews and how podcasts work, because there's not a lot of transparency. On YouTube, by contrast, you can see comments and engagement. You can see how many people liked it, and you can see views. There are these indicators of social proof and engagement for videos and that type of content. Same on social media. Podcasts don't have that cue of how good things are except for the reviews, which you can see if you're on Apple Podcasts or Spotify. In the case of Arnold's podcast, there are a few thousand reviews, and it's 4.9 or 5 stars. And, you know, you assume those are all in good faith. 
Again, this is nothing against Arnold or the company. You assume they're all in good faith. There's no reason to believe that those wouldn't be real, but I come across a lot of fake engagement out there. And I'm not accusing this company or Arnold. I'm just saying that, in general, the space is shady, because I've seen and worked with podcasters who are less than transparent and truthful about the reviews they're getting for their podcast. People are buying reviews. They're buying these things like they're gonna help, and it's because there's not a lot of transparency going on. People do the same for YouTube. People do the same, of course, for Instagram and TikTok. Anything out there that has reviews built into it, or any type of engagement. Even celebrities are still guilty of faking engagement. So I don't know. I have absolutely no clue if the engagement's real. I didn't read any of the reviews, but the fact remains that some of what's out there is less than truthful. So I don't know how engaged of a podcast audience it brings, and I think it's super interesting to look at, because with this podcast that you're listening to right now, I thrive off authenticity and doing things from the heart. I have the marketing brain, but I'm also an artist at heart, and creativity is very important to me. So the idea of phoning in something like that is absolutely not appealing. But in certain use cases, I could see it being beneficial. I was just getting my hair cut. Now it sounds like I'm all over the place: Arnold Schwarzenegger, my haircut, AI, Midjourney. And my barber was saying that he watches videos on YouTube where the narrator is just an AI voice. It's a murder-mystery type of podcast, and there'll be AI-generated photos on there that don't always match up with it. 
And I was like, do you think this person edits them? And he's like, no, there's no way, because they don't match up with what it's saying. Now, I don't know what program that is. I asked him to send it to me because this sounds interesting to look at. You know, these have a big following. I just wonder, in the current iteration of AI voice cloning and AI voices, how deeply can you engage an audience with AI voices? My bet is that, with the current state of things, probably not that deeply. People will probably enjoy the content. I'm not saying people out there wouldn't listen to it. But where does that take you next? If you're doing it for a company and it's just to get it out there, I don't know. I can't say that I'm really against it. For someone who's building a personal brand, it doesn't make a lot of sense. In, like, 99% of, well, probably less than 99%. In a lot of situations, it doesn't make sense. But then you look at someone like Ryan Holiday, for example, who publishes The Daily Stoic, The Daily Dad, and he might have some other ones I'm not thinking of off the top of my head. He publishes these super short podcasts. Someone, and I'm forgetting who to credit here, called them blogcasts, where it's essentially like you read off your blog. It feels like he does a similar thing. It feels like he's reading what he wrote. And in that case, do we really care if it's him reading what he wrote, or if it's an AI voice that sounds the same as him reading what he wrote? In fact, he might even be using AI. I don't know. So I just kind of wonder how much it matters. And I guess it comes down to your style and what you're doing. I'm not the end-all, be-all in terms of how you go about your own process. How you go about it is how you go about it. 
It's not how other people say you should go about it. But I think it's worth exploring the impact it has on your audience, and being aware of that impact, rather than just blasting out more content for the sake of blasting it out. Because I think it's our duty as humans to curate what we're putting out into the world, even if we're getting help from technology like AI voice cloning, or AI image generation, or ChatGPT, or these AI text tools that help you write podcast show notes and titles. Those are all great, but we still have a duty as humans to deliver our best content. If it helps us do that, then I don't see any problem with it. But if it's a crutch, or if it's just duct tape to get anything out there, something that has no substance and is just more noise, I can't really get behind that. And I just think, as creators, we have to really analyze what the mission is behind what we're doing and look at it through a critical lens. Does this need to be me doing it? Can I get help with this? And if I get help with this, can it be AI? Only you know the answer to that. With this podcast, I don't want to outsource everything to AI. I wanna mess around with it. I wanna experiment with it. I wanna see what it does, but I'm not looking to outsource my thinking or my creativity. And I'm not looking to just pass this off to somebody, because I enjoy it. Those are my thoughts about AI voice cloning. And, yeah, if you have published anything with your own AI voice, I'd be curious to know. So hit me up either on social media or in the creator club, and I appreciate you listening to this human-made podcast. I just got really inspired today, and I was like, you know what? 
I gotta record this now, because we're living in interesting times, and 2024 is gonna be even more interesting in terms of content generation as we get access to more tools and as this stuff gets cheaper and cheaper. How much would AI voice cloning have cost a year or two ago? It would have been, for sure, a good amount of money. We're talking hundreds and hundreds, possibly thousands upon thousands, of dollars, and now you can get instant voice cloning for $5 a month or professional voice cloning for $22 a month, which is just crazy. Prices will only come down, and more competition will only come in, which will then drive the innovation for us consumers, because they'll want us to sign up for it, because there's a demand. Right? Supply and demand. Right now, ElevenLabs is at the top, but it's not gonna be like that forever. And how long until Apple releases something that's just amazing and feels super natural? It's not out of this world to say that within a couple of years this is gonna be unrecognizable, and we're gonna be laughing at how silly this seemed at this moment in time. So I'm not saying all that to overwhelm you, but as a clear check-in on where we're at. Let's keep talking about this stuff. Alright. Catch you in the next episode.