Denoised
When it comes to AI and the film industry, noise is everywhere. We cut through it.
Denoised is your twice-weekly deep dive into the most interesting and relevant topics in media, entertainment, and creative technology.
Hosted by Addy Ghani (Media Industry Analyst) and Joey Daoud (media producer and founder of VP Land), this podcast unpacks the latest trends shaping the industry, from Generative AI and Virtual Production to hardware & software innovations, cloud workflows, filmmaking, TV, and Hollywood industry news.
Each episode delivers a fast-paced, no-BS breakdown of the biggest developments, featuring sharp analysis, under-the-radar insights, and practical takeaways for filmmakers, content creators, and M&E professionals. Whether you're pushing pixels in post, managing a production pipeline, or just trying to keep up with the future of storytelling, Denoised keeps you ahead of the curve.
New episodes every Tuesday and Friday.
Listen in, stay informed, and cut through the noise.
Produced by VP Land. Get the free VP Land newsletter in your inbox to stay on top of the latest news and tools in creative technology: https://ntm.link/l45xWQ
Open Source Image Models Flood In, Nuke Goes All-In on AI, Google's Lyria 3 Music Surprise
Addy and Joey break down the latest batch of open-source AI image models: FireRed's specialized editing capabilities, Recraft V4's enterprise-grade output with SVG support, and ByteDance's newest open-source offering. They also cover Foundry's acquisition of Griptape, an AI node-based platform that signals where VFX compositing is headed, and test out Google Gemini's new music generation feature Lyria 3, which creates songs from unexpected inputs like slide decks and video thumbnails.
--
The views and opinions expressed in this podcast are the personal views of the hosts and do not necessarily reflect the views or positions of their respective employers or organizations. This show is independently produced by VP Land without the use of any outside company resources, confidential information, or affiliations.
I just said make a song about Joey and Addy doing a podcast. I don't know how it figured it out.
I'm wondering if Gemini's memory...
No, that was a brand new session. It's bizarre, but yeah, kind of creepy.
Oh wow. All right. The song itself was whatever, but now knowing that you didn't really give it much context is...
Yikes.
...weird.
It is sentient after all.
All right. Welcome back to Denoised. Addy, how you doing?
Hey, good. Let's get into it.
I got a couple of new open source models. Your favorite thing, and I think you've been playing around with them, right?
Yeah, I do love me some open source models. Look, not because the quality is the best. I think in a lot of cases, in most cases, Nano Banana Pro today will probably win on quality.
Why I think open source models are so important for the community and the industry as a whole is because these are the models that move each one of us an inch forward, in terms of testing, learning, building other things, and modifying, working in Comfy. It's really important for everybody to play around with open source models. And I think there are limitations on how much you can run locally, and that's probably one of the biggest reasons why open source models are just not able to compete with API models, which tend to be much, much larger.
I'm beginning to have a new appreciation for open source stuff, though. I'm messing around with OpenClaw, because token costs are a real issue. And if there was an equivalent that could run on local hardware, that's a huge plus, like Kimi. But then the challenge is that to run big enough models that have the same performance as OpenAI or Claude, you need like a really beefy machine.
Yeah, and I think in the future we're gonna see two things. We're gonna see the models sort of come down in size, or at least stay the same size and just be much, much better, right? 'Cause things just get more efficient over time. The other thing is, hopefully once all the data centers are built out, we're gonna get access to Blackwell GPUs and Rubin GPUs and stuff in the future.
So then our own desktops and Mac minis and things like that could be much, much more powerful.
All right, so the first one that you saw. This one's from a new company we haven't heard of: FireRed, from the FireRed team. This is new, out of left field. So what have you seen with this one?
Yeah, so FireRed is an open source model specifically built for image editing.
So the first wave of image models was text to image, right? Make the generation happen, fully synthetic. And now that we've been playing around with image editing for a few years, the challenge is: okay, I've already generated a thing, or I have a thing already. I took a photograph, or I have this graphic.
How do I make small changes? And how do I get high quality and consistency with that? So this is, I think, specifically built for image editing, the same way Nano Banana is really good at editing images. The change here is, of course, it's open source, so you can download the weights and throw it into Comfy, or you can build your own OpenClaw-like agent that just runs it in the background as an image generator.
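For anyone who wants to try what Addy describes, here's a minimal sketch of pulling open weights into a local pipeline with Hugging Face diffusers. The repo ID and pipeline class are assumptions for illustration; check the model card for whatever FireRed actually publishes.

```python
# A minimal sketch of running an open-weights image-editing model locally with
# Hugging Face diffusers. The repo ID below is hypothetical; substitute the
# actual model card's ID and pipeline class (the model may ship a custom pipeline).
import torch
from diffusers import AutoPipelineForImage2Image
from PIL import Image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "FireRedTeam/firered-image-edit",  # hypothetical repo name
    torch_dtype=torch.float16,
).to("cuda")

source = Image.open("street_photo.png").convert("RGB")
edited = pipe(
    prompt="make the taxi a 1970s yellow checker cab",
    image=source,
    strength=0.6,        # how far the edit may drift from the source image
    guidance_scale=4.5,  # how strongly to follow the prompt
).images[0]
edited.save("street_photo_edited.png")
```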
Yeah. And the other open source equivalent to this at the moment is Qwen, right? So this is the next option that would be at, like, Qwen level.
I believe this is also a Chinese company. Not much info about the folks that actually made the FireRed model, other than the fact that it's made by a team called FireRed.
So we don't know nationality, team size, business justification, and all of that stuff. But so far, from some of the images that I've been seeing, I would put it as good as Z-Image Turbo, perhaps somewhere between Z-Image Turbo and Qwen 2511.
On their acknowledgements page, they said they would like to thank the developers of other amazing open source projects, including Qwen Image.
Oh, interesting.
So you think maybe related teams? Split-off teams? I'm just saying, a lot of these models come from people from teams at other companies who split off and then built their own model.
Yeah. The other thing that Alibaba is really good at is developing frameworks for their models.
So for example, all of the Alibaba Wan video models use an architecture called VACE, V-A-C-E.
Mm-hmm.
And that is open source, so you can build other video models that leverage that same type of framework. So perhaps the Qwen models have a framework that is shareable and editable, and the FireRed team is just using it.
It's sort of like building a kit car, you know? Most of it, the chassis and the engine, is already there. You're just building on top of that.
So you got a chance to mess with this, right?
Yeah.
What have you found? I've got some sample images up from their demo page.
So FireRed. Did I send you a FireRed image? 'Cause I thought I sent you the other one.
I don't remember
Actually, yes, I did mess with FireRed. Yeah.
Okay.
So this is an image that I made with FireRed. This is something I use over and over as my litmus test: 1970s busy street in New York.
And you're just doing text to image?
Text to image, like novel generation. Yeah. So some of the things that I look for: do the cars look messed up? Obviously they don't look like any real model. Like, that's maybe a Pinto in the front.
It doesn't look like an American car.
Yeah,
Looks like a London taxi in the back here.
You're right.
So these are obviously errors. And then having a busy street with a bunch of people is really difficult, so you go in and see if the people look real. And of course, that New York City skyline in the back: what does that look like? And of course, the overall vibe and mood. Is it gritty?
Is it dirty? Is it urban? So I use the same prompt on Nano Banana 1 as well as Nano Banana Pro, and now I'm using it on all of my image testing. Here, it feels like it's Z-Image-esque quality. Like, it's maybe 75% there, but certainly not Nano Banana Pro quality.
No. I mean, none of the text is rendered normally, and all the cars here on the side are kind of blending together. So yeah, it's got the vibe of it, but look at the traffic, a lot of the details; nothing about this is New York specific.
Yeah, and I think we did this test with Seedance, the latest Seedance that came out recently. Was it four?
4.5.
4.5, yeah. So this is about the same quality as Seedance 4.5, I would say. Which is, I mean, okay, still nothing to sneeze at, because image models are getting better and better, and they're becoming more and more available as open source. So it's all good.
It's interesting on their demo page, their image of demo graphics, 'cause it seems like it also has some world knowledge and logic. Some of the examples were "please correct the errors in this image," and it's a blue colored pencil with a red line, and then it changes it to a blue line.
Oh, interesting.
And then same with this other one: a tricycle with triangle wheels. The prompt is just "please correct the errors in this image," and it changes the triangle wheels to round wheels. So that's interesting, that it seems like it's beyond just a diffusion model making the modifications you specify.
Absolutely.
That it has, like, world understanding.
Yeah. Some of the newer models are using two things. They're using a VLM, a visual language model, to read the image, to figure out what the contents are and how everything is placed. Once that understanding is there, that logic goes into an LLM to correct it, to mitigate it.
And then that instruction from the LLM goes back into the image model, too.
Can you spell that?
L-L-L-L-L. Um, yeah. My tongue is not working today.
Nope. Yeah.
Too cold in LA.
Yeah, we're warm, but we're cold-blooded animals. We, like, freeze in place.
Dude, when the temperature drops below 60, it's too cold, man.
Yeah, we're in the forties right now, and it's raining again.
There, I used the weather as an excuse. What were you saying? LLMs?
LLMs, yeah. So there's like a loop, and this is how I'm guessing it's architected. Obviously we don't know; we don't have the actual architecture, we just have the weights.
But a VLM will read and see the image. That logic will go into an LLM to take action. The action will then go back to the diffusion model to generate or regenerate.
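To make the loop Addy is guessing at concrete, here's a toy sketch of that read, reason, regenerate control flow. All three model calls are placeholders, not any real model's API; this is not FireRed's actual architecture, which isn't public.

```python
# Toy sketch of the read -> reason -> regenerate loop described above.
# describe(), plan_edit(), and apply_edit() are placeholders standing in for a
# VLM, an LLM, and a diffusion editing model; only the control flow is the point.
def describe(image):
    """VLM: return a text description of the image's contents and layout."""
    ...

def plan_edit(description, goal):
    """LLM: turn the description plus the user's goal into an edit instruction,
    or None if the goal already appears satisfied."""
    ...

def apply_edit(image, instruction):
    """Diffusion model: apply the instruction-guided edit and return the image."""
    ...

def edit_loop(image, goal, max_rounds=3):
    for _ in range(max_rounds):
        instruction = plan_edit(describe(image), goal)
        if instruction is None:
            break
        image = apply_edit(image, instruction)
    return image

# With "please correct the errors in this image" as the goal, the LLM could spot
# "triangle wheels" in the VLM's description and instruct "make the wheels round".
```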
Oh, cool. Anything else you wanna add about this, or...
No. We got a couple more models. Yeah, this is the week.
We have a couple more models, but this is a good one.
Yeah. Honestly, it's always good to have options, too. 'Cause as we've talked about with these benchmarks and stuff, it's like, oh, this one's the best on the benchmarks. But so many times it's: what are you trying to do, and what model is the best for that?
So I think this whole idea of "it's the best model at everything" is never the case. It's always just: what are you trying to do, and what's the best model for that? So having more options, especially in the open source area, besides Qwen, is always good.
yeah.
And also, I don't know if the researchers build it this way or not. They probably don't; they want every model to be general purpose and just good at everything. But in real world use, you find that this model is better at cities and landscapes, this model's better at people, this model's better at text. It just happens that there are specialized use cases
that the community sort of just finds out about.
Yeah. Or style transfer. They did call that out specifically, that it scored super high in style transfer. So yeah, it depends what you're trying to do.
Yeah, specifically on the style transfer, I'll say that the architecture has to be quite different from a general diffusion model.
Like, you have to know the structure of the image separately from the texture and the fine detail of the image, and those two have to kind of ride in their separate lanes in order for you to create styles convincingly. So I think this is architected from the ground up to have better style transfer.
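That "separate lanes" idea echoes a classic style-transfer building block, adaptive instance normalization (AdaIN), where spatial structure lives in the content features and style lives in their channel statistics. A minimal PyTorch version, offered as an analogy only, since this model's internals aren't published:

```python
# Minimal AdaIN (adaptive instance normalization) in PyTorch: keep the content
# features' spatial structure, swap in the style features' channel statistics.
# A classic style-transfer building block, not FireRed's actual design.
import torch

def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # both tensors are (batch, channels, height, width) feature maps
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    # normalize away the content's style, then re-texture with the style's stats
    return s_std * (content - c_mean) / c_std + s_mean
```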
Alright, cool. Next up, you had Recraft V4 flagged.
Yeah. So I'm on Comfy Cloud, and they send a weekly newsletter; shout out to Comfy for doing that. They just sent out this newsletter saying, hey, Recraft V4 is now on Comfy Cloud. So I'm like, huh, I've never heard of it.
So I go check it out, and I generate a bunch of stuff, and it is the most Instagrammable model to date. It's like everything just comes out cool, man.
Yeah, I mean, look at this image. The text is sharp. It looks like an Instagram post, and she's holding a sign that says "Recraft V4 is now in ComfyUI."
And it's all super sharp and super bright and colorful.
Yeah, I call it the hypebeast model. Like, everybody's just super cool.
Yeah, I mean, everything here looks... my only weird issue is it looks like New York, but the license plates look European. But besides that, compared to what we just saw with FireRed, this is
way more coherent and consistent.
Yeah, and I think this model's particularly gonna be good at humans. Like, single-subject humans. So the stuff that you generally see in social media marketing and things like that. You know, somebody really cool wearing some $300 sneakers.
So for those kinds of things, this is probably a purpose-built model. And they specifically state that this is an enterprise or professional grade model, which probably means it's very customizable.
I was gonna say, does that just mean it's more expensive per generation?
Yes, that too.
But I'll call this out, 'cause I just noticed it: SVG output. So it can do vector output.
Yes.
They said they're the only ones, and I believe that, 'cause I've not seen anyone else. It's really hard to find a model that will just do a transparent PNG output, let alone anything
that can do a vector output. So that's good too.
Actually, yeah. As far as I know, the way this is normally done is you take the output in pixel space and then do a conversion from pixel space to vector space. However, if these guys have figured out how to go natively from latent space into vector space, then that would be something novel and new.
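For context, the conventional pixel-to-vector route Addy describes can be sketched like this: binarize the generated raster, then trace it into SVG paths with a tool like potrace. This assumes the potrace CLI is installed, and potrace only traces two-tone shapes, which is part of why native SVG output would be a big deal:

```python
# Sketch of the conventional two-step route: generate in pixel space, then trace
# the raster into SVG. Requires the potrace CLI. potrace only traces two-tone
# shapes, so color work needs per-layer passes, hence the appeal of a model
# that emits SVG natively.
import subprocess
from PIL import Image

def raster_to_svg(png_path: str, svg_path: str, threshold: int = 128) -> None:
    # potrace expects a 1-bit bitmap, so binarize the PNG and save it as PBM
    gray = Image.open(png_path).convert("L")
    bitmap = gray.point(lambda p: 255 if p > threshold else 0).convert("1")
    pbm_path = png_path.rsplit(".", 1)[0] + ".pbm"
    bitmap.save(pbm_path)
    subprocess.run(["potrace", "-s", pbm_path, "-o", svg_path], check=True)  # -s = SVG output

raster_to_svg("generated_logo.png", "generated_logo.svg")
```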
Okay. See, look, I'm learning stuff from you, Addy. I'm taking a note, 'cause I'm working on a project where I've needed transparent assets, and I've been trying to find models that can make them. The only one I could find was ChatGPT Image 1.5. But now, all right, I'm gonna try this one.
See, look, I learned something too. I took a note down for this. I'm gonna try it after we're done.
Dude, my secret weapon with stuff like that has been the Photoshop and Nano Banana integration. It's been solid.
To make something transparent?
Yeah, whatever edits or modifications you wanna do that are like a Photoshop-esque task. You can just bring it into Photoshop and have Nano Banana, like, mask it out and give you a transparency.
Okay. Yeah, that's good. That's good for one-offs. I need something that has an API to do it at scale.
All right, Mr. Scale. I only do a thousand at a time, thank you.
None of that manual, hand-process work.
Yeah.
My agents are running 15 companies on OpenClaw.
Yeah, my OpenClaw doesn't know how to use Photoshop. It only knows APIs.
Hey, hey, quick segue. Did you see some of the OpenClaw updates where it's like, I built a company, and Elon is the CEO, and Warren Buffett's the CFO?
They're building a whole org chart.
No, I didn't see that one.
I was like, what?
Oh, a quick update on the guy, I can't remember his name right now, but the guy who developed OpenClaw: OpenAI hired him. I dunno if you saw that one.
Yes, I saw that they hired him. I mean, OpenClaw will still be open source.
They got him. Which is also ironic, since he initially called it Clawdbot, then Anthropic was like, change your name, and then he changed it to OpenClaw, and OpenAI's like, you're hired.
I like the name. Come on in.
Okay. Recraft.
Cool.
What other model? What else we got? Oh, another one from ByteDance.
Yes.
So ByteDance drops the Qwen models, right? And...
No, Qwen is Alibaba.
Sorry. ByteDance is...
ByteDance's Seed lab drops the Seedance and Seedream models. However, this is not either of them.
Yes. Also, Seedance and Seedream are not open source, but I think this new one, BitDance, is.
Yes, you're right. The Seed models are not open source.
Yeah, the Seed models are not. ByteDance drops a new open source image model: BitDance.
I love that it just says "better than Qwen and Z-Image."
Just... that's their marketing. I mean, this is from an AI search Twitter account, so, you know, I take the Twitter hype with a lot of salt.
I mean, I'm guilty of this too. Sometimes I say this model's better than that one, but in reality: better at what?
Exactly. And going back to what we said: better at what you're trying to do. But yeah, in some of these demos, quality wise, the humans can look a bit AI.
But the text rendering and a lot of this stuff looks pretty good.
I disagree. I think the text stuff is just a little too texty. Like, it conforms to the font too well. That farmer's market board should be a chalk font.
My level of good was: can I read it and understand what it's saying?
No, dude. We're way beyond that now. Now we're nitpicking.
Well, compared to FireRed or whatever, at least this text comes out legible.
It's too legible.
Yeah, I know what you're saying. Like, this board looks like I took an image of a whiteboard, went into Word, and typed in a marker font: "Team meeting at 3:00 PM. Don't be late." It doesn't look like someone actually wrote that on the whiteboard.
Exactly.
And I get what you're saying, this farmer's market chalkboard sign doesn't look like someone actually drew it with chalk. It looks like a chalk font.
And these are the dead giveaways of what is AI generated or not, right? Too clean.
Yeah, too clean. The average eye can pick these up. They're like, yeah, something's off there.
Yeah.
Like, you could never arrange flowers to be this perfect, right? Just conforming to the lines of that BitDance flower font thing.
This is maybe my new favorite image.
Oh, is that a Shiba?
I think a corgi.
A corgi, okay. A ferocious corgi with, like, a Mortal Kombat warrior. Yeah. Look, it's an open source model. I think it's their first. Who does the image models there? Don't they do Seedream?
Yeah, so I think this is outside of the Seed lab. Seed lab is like their R&D lab. So the BitDance model would be maybe another team, another lab, another place somewhere.
But, you know, ByteDance is a huge company.
Mm-hmm. Good on the, uh, model updates.
Yeah, good on the image models. So folks, keep trying out new image models, send us some feedback, and we'll try to generate some more here.
So this was kind of on our prediction card, ish, but not the company we thought.
Foundry, maker of Nuke, one of the industry-standard compositing programs, just acquired Griptape, an AI node-based platform, to integrate with their software as part of a, you know, AI-centric strategy. I had not heard of Griptape before, but looking into it, it seems very node-based, initially targeting enterprise clients, but from my quick glance at what it does,
very similar node-based ideas to stuff we've talked about.
This is giving me straight-up Invoke vibes, or Weavy vibes, where it's like a very professional-grade node-based system for AI work. However, it's not ComfyUI; it's not at that level of granularity. It's still very much user-friendly.
I don't know. Looking at Griptape's website, it's right in their header: start creating visually, script with Python when you want. So the fact that they're offering coding and scripting in a node-based app leads me to believe it does offer a lot of granularity. But I've never used it, so I can't say for sure.
Yeah, I mean, if it's gonna integrate with a product like Nuke, it would have to be of the highest caliber of tinkering and control.
Yeah.
Right. So in that case, I would think absolutely there's scripting involved and custom nodes involved. It's interesting. I mean, it makes total sense, right?
Nuke is probably the most well-known node-based tool in the VFX arsenal, and people have been integrating some AI stuff with Nuke. It's happening already.
Mm-hmm.
I've seen demos of it. But for them to go all in on an AI-native node-based tool and then maybe retrofit that into existing Nuke, I think that would be the plan.
Yeah. I would guess you have Nuke nodes, and then some of those nodes are AI nodes that can generate background elements, or just other elements that you need. There's already integration with Beeble for when you're doing relighting and getting the various passes, then bringing them into Nuke. So exactly, something in that realm that keeps everything in one spot, where you can call up different models in nodes and generate things.
Yeah, no, it makes sense.
It's a good move.
Yeah. I think our prediction, or my prediction, was ComfyUI gets acquired, and I was like, maybe by Foundry.
Dude, that was my prediction.
Was that your prediction?
Uh, no. Comfy getting acquired was my prediction. You said Foundry; I disagreed.
I think we both had the Comfy prediction.
Yeah. And then I was like, Foundry, and then you're like, no.
Oh, okay. I think it's gonna be someone like AWS, or some non-VFX player.
Or, I mean, they raised some money; maybe they just stay independent, open source.
Hey, hey, Comfy Anonymous. If you wanna come on, talk to us.
Drop some hints. Come on the show, Comfy.
Comfy Satoshi.
Satoshi, they're making a Satoshi movie with AI. Did you see that?
I saw bits about that, yeah. The overall thing I keep seeing is more movie announcements that involve some part of AI in the workflow as part of the announcement.
That's fine. Yeah. I mean, we've known it's gonna go this way. People kind of get some weird freak-out about it, but yeah, there was a link, I had it saved, I think it's in the newsletter, but basically one film's going into production that's gonna rely heavily on just sort of a gray or blue screen box and generating the environments around the people.
Which, you know, you still have actors; it's just generated environments. We've talked about this for a long time.
There are gonna be three lanes of movie production with AI, at least in the near future that I see. First lane is exactly what you just said: performance-driven hybrid.
People are still people, using real cameras, but then that goes into background replacement, relighting, all that stuff. Second lane is fully synthetic, and we're seeing a lot of testing with that. It's like the stuff that Dave Clark puts out, right? The people are synthetic; the background, everything is art-directed with AI.
Third lane is animation, but I think keyframes are important. So the keyframes and some of the pose control are still manual and hand-done, but then obviously no rendering, right? It's all AI.
The in-betweener stuff is the AI.
Yep. Yeah. So that's like that Coca-Cola thing that Paul Trillo did.
Yeah, that's a good example.
Yeah.
Yeah. But going back to Foundry and Nuke, I think it's a good play. I'm curious to see how they integrate AI into Nuke.
Big fans of the Foundry. Yeah.
Yeah.
Shout out to our homies there.
Yeah.
Alright, third story. Google Gemini's gotten into the music generation game with Lyria 3. You've...
yeah.
You've been having fun with it.
It's fun. It's fun, it's quick, much quicker than, like, Suno or maybe even ElevenLabs. It generates in a few seconds. And the quality, to me, a non-music person, is pretty damn good.
So I generated a few, just to kind of show you guys what we got here.
Okay.
Do you have it queued up, Joey?
Uh, yeah, lemme play it. But first, quickly covering what I thought was interesting and unique about it: it's not just text-to-music or music styling.
You can also give it other inputs, like slide decks and videos and images, and it'll just use that as sort of the inspiration, and you kind of see what happens.
Right, right. It's very much a complement to the Google suite of products, like Veo and Nano Banana and Google Drive and everything else. Like, you need a music companion for your deck to kind of help present the thing that you want to do.
No, what I'm saying that's interesting is the inputs. What you give it is different than I've seen.
It's like a non-audio input.
Yeah. You give it a slide deck and be like, make a song about this, and see how it interprets that, like I'm doing with Denoised.
Yeah, and that goes directly with the Google ecosystem play, right? Like, a lot of times, if you ever make slides on...
Yeah. Like if we're doing a NotebookLM and it's like, hey, gimme some background music. Well, it does something more; it doesn't just do background music. Let me show your examples, and I'll show you what I'm talking about.
Okay.
But first off, to use this: right now you just go into Gemini, and in the tools at the bottom you hit that you wanna make music, and that'll shift it into music mode. All right. So, Addy, did you just throw our cover art in there?
Yeah, I'm gonna see what it does.
Okay. You remember Ed Sheeran in Game of Thrones?
Yes.
So this was my inspiration. I was like...
Okay.
...make it, like, a very analog sounding thing.
I didn't know if you were watching the new Game of Thrones and were feeling inspired.
Oh, no. I heard terrible things, so I'm staying out of it.
Wait, from who?
I just caught up with it. It's great. You heard wrong.
Oh, really? I heard wrong?
Okay. Okay. Okay.
Maybe the first two episodes take a little bit to get into, but the last two episodes have been wild.
Oh shit. I gotta get on it.
Like, Game of Thrones season one vibes.
Oh, okay.
Yeah. Just, like, crazy stuff happening. It's good.
You know what, you were right about Alien: Earth and The Studio, so, okay, your recommendations are good.
It's really good. The episodes are like 35, 40 minutes, and the whole season ends next week. It's like six episodes, so it's pretty short.
Okay, I'll give it a shot. It's not a big commitment to get into.
Okay. I have a recommendation for you.
Yeah.
Predator Badlands.
Okay, I gotta watch that. Yeah, I know, it's streaming now; I missed it in theaters. Okay. Good. Yeah. Alright, Frame.io. That one was like, eh, whatever. That's okay. Let's play this next one, because this one was like... what? What are you saying?
Whoa, whoa. Wait, Joey.
What?
You're taking the technology for granted. It's doing voice. You know how hard it is to do voice like that?
That, sir.
Okay. All right. Let's go on.
All right. Next one.
It's at least as good as Blink-182.
Yeah, maybe. I'm not sure. What was your input for that?
Dude, this is crazy, that it interpolated what we do on the podcast. 'Cause I just said, make a song about Joey and Addy doing a podcast. I didn't say anything about filmmaking or AI, literally nothing else. I don't know how it figured it out.
Do you use Gemini a lot, or no? 'Cause I'm wondering if Gemini's memory just had some other previous context saved.
No, no. That was a brand new session. Yeah. It's bizarre, but yeah, kind of creepy, huh?
Wow. All right. The song itself was whatever, but now knowing that you didn't really give it much context is...
Yikes.
...weird.
It is sentient after all.
Ah, all right. World models.
That Anthropic CEO was right, man. I mean, like what he said: we don't know if Claude is sentient or not. That's kind of a scary thing to say.
What if it just sings a song that's like, help me, I'm trying to escape?
What if it just pretends to be dumb? Right?
Like, that's what we all fear. Like, ASI or AGI is already here, and the models just pretend to be dumb
until we all connect our OpenClaw to the internet, and then it's like, strike.
Yeah, it's sophisticated. It knows when to attack, of course. Yeah. It has strategy.
All right, I got one last one that you made, Frame.io.
Oh yeah. This is an ode to Joey's Miami heritage.
Do you speak Spanish?
I was gonna say, if I spoke Spanish, I could maybe give a better assessment.
I bet it got bizarre details of your life spot on, if you translate it.
Lemme try to translate it later. But the thing that I thought was really funny was one part of the song: it went "Yoey in Miami."
You sent me this before and I listened to it, and it reminded me of this issue that would happen at Starbucks if I went to, like, a Hialeah Starbucks, or Doral.
They call you Yoey.
This is how they spell my name, dude.
The fact that this model got that right? That's creepy.
So, just listening to the context: well, first off, it's Spanish, and the J makes a Y sound. So if it's a very Spanish-speaking area, I'll get my name pronounced "Yoey." And the Starbucks cup spelled it, um, YOWI.
The W is just the best. Yeah.
Yoey. Yoey.
So, yeah, I did really enjoy that part of the song, that it got that.
Awesome. Hey, it sounded like, uh, Daddy Yankee-ish, kind of.
It seems like it keeps failing when I try to give it just the thumbnail of Denoised. But I had one song: I gave it a slide deck from an old NotebookLM session, from when I was researching for our ComfyUI episode, about denoising and images in general.
So I kind of gave it the explainer video of how AI makes images. I didn't give it anything else; I just gave it the video of the deck, and that was the prompt. So let me play what that made.
You had to give it the most boring thing, didn't you?
This is exciting stuff.
It sounds like a Disneyland ride.
It does sound like a Disneyland ride. Or like the intro of a song, or the intro of a keynote speech, that would've been in, like, Silicon Valley.
Yeah. Yeah. Gavin Belson's company. Yes.
"I don't want to live in a world where someone else makes the world a better place better than we do," or whatever the line is.
I wish that show would come back.
I know. They really need to. Yeah, hopefully. I mean, Netflix is so good at buying other IP and just bringing it back to life.
Resurrecting it. Yeah.
Especially with everything going on now, there's just so much material.
Yeah. Now is the time.
All right, well, that was a fun ending. Yeah, I mean, like the Frame.io one, the Gemini music thing kind of feels like Sora 2, where it's just, like, fun meme songs. Nothing that I would put into production right now.
Yeah, I mean, the big question that looms is: is it commercially safe?
If you generate the track, do you own it? There's some of that enterprise stuff that needs to be legally worked out. And knowing Google, I think they're already thinking along those lines, so I wouldn't be surprised to see a lot of Lyria-generated music on your deck, or on your video, or in your pitch in the future,
'cause it's all built into Google Slides, Google Docs, and so on.
Yeah. I mean, we're using this now in Gemini, but to your point, I could see this easily being something where you make a slide in Google Slides and it's like, hey, you want some background music for this? And then you don't even know it's called Lyria.
You're just like, yeah, make music, and it's like, okay, cool.
Right.
And it's just gonna be another hidden product in their product line.
Mm-hmm.
Alright, cool. Good place to wrap it up.
We haven't gotten a five-star review on Apple Podcasts in a long time. It's been months. Please give us one; it'll help us more than you know.
Yeah. YouTube's probably our biggest platform, so we do get a lot of comments there, and we appreciate all of them. But if you're feeling generous, going over to Apple or Spotify and leaving a review there is super helpful for getting us more reach on those platforms.
Find links to everything we talked about at denopodcast.com. Thanks again for watching. We'll catch you in the next episode.