Denoised

Runway Aleph Changes VFX Forever, Adobe Harmonize, Wan 2.2 & More! | This Week in AI for Filmmakers

VP Land Season 4 Episode 49


Runway Aleph allows you to manipulate your videos using simple text prompts—no more tedious VFX workflows. This week, Addy and Joey break down the wave of new AI tools transforming post-production, including Luma AI's Modify with Instructions feature, Wan 2.2's open-source model, Ideogram's one-shot character generator, Adobe's new Harmonize tool, and more. Plus, we explore what these tools mean for VFX artists, virtual production, and the future of filmmaking workflows.


--

The views and opinions expressed in this podcast are the personal views of the hosts and do not necessarily reflect the views or positions of their respective employers or organizations. This show is independently produced by VP Land without the use of any outside company resources, confidential information, or affiliations.

Welcome back to Denoised. Good to see you again, Addy.

Welcome back, Joey.

All right. So we're here to do our weekly roundup of what's new in AI in the video space.

Yeah. I saw your list and I'm like, my God.

And this isn't even counting the stuff from when you were out of town for the last week or so.

All right. The big one, the one that's been overtaking my feed, Runway's Aleph. Aleph. Yeah, this one's pretty wild. So basically now instead of fully generative scenes, you upload an existing video and you just tell it what you want to change and it changes it. This is pretty wild. 

This is the direction AI tools should go.

It's not about generating once and, you know, what is that saying? You set it and you forget it. That's not how a production works. Production is all about iterations, changes, refinement, and this is giving you access into that world.

Have you seen some of the demos and some of the examples?

Yeah. So first of all, any sort of traditional green screen workflow, I think this jumps right into that territory.

Yeah. Relighting, a lot of the demos show relighting, plus a lot of traditional VFX stuff: adding objects, changing clothing, removing cars, moving people, changing weather. I mean, this is

so groundbreaking and so powerful. It's just, we're not covering it to its full extent here. And I don't think we can because we're not like running it and showing it to you.

I'll show some clips. 

I got some clips up here. Yeah, we'll show some clips. Yeah, we've got one of the woman walking and she's just changing her hair, changing her eyeglasses, changing her clothing. 

This covers so many aspects of traditional VFX, right? So you just said relighting. I think it'll do green screen, chroma key, background replacement.

It'll, 

I don't think you need green screen. I think the joke was, I saw one of the posts with a shot of a boat, like a real-life, you know, stock footage shot of a boat. And then he made it a joke where it's like, put the boat on a green screen, and then made it look like it was a fake behind-the-scenes shot.

Like she was shooting the boat, but it was a real-life shot of a boat.

Yeah. Right. It doesn't even need the green screen. Joey, what are your thoughts on, I mean, the inevitable question: job replacement and where people are in VFX?

I mean, yeah, that's where this... well, let me go to some of the technical specs first, because

let me pin that.

That's going to feel like, you know, if you're a professional VFX artist, there are a lot of... you'll see this and be like, that's definitely not usable for us right now. Sure. Right now it only supports five-second output, and I think the output's 1080p, plus all of the other limitations with color space and color science that come with AI-generated outputs.

So that's still a limitation for now, obviously. Like this will 

Like, what are we talking, months? Yeah, I mean,

I have not seen, I don't think I've seen, any AI generative outputs that are like EXR, like of the quality that you would need for professional work.

Yeah, it goes back to the total addressable market, TAM, as it's called in business circles.

If you look at the TAM of, you know, consumer goods, hard goods, I don't know, something like energy drinks, right? The TAMs of things that are on store shelves are so much bigger than the film industry's. You're talking about a few billion dollars versus hundreds of billions of dollars.

And so the AI companies are building things that address those TAMs, right? So our film industry market is so small, and it takes so much work to build like an EXR output or, you know, 16-bit in and out. Yeah. And to have to train on that. Then where's the return on the other end of it? Right. To level that up.

Yeah. 

Yeah. So that said, back to the original question about VFX artists: I mean, you need to get on this. Yes, I feel like it will, obviously, displace a lot of stuff, and one person can do more, but I still think there are going to be a lot of gaps. I mean, the stuff that's coming to mind now is the demos we're seeing.

Well, first off, you know, the five seconds is short, and we're just sort of seeing one-shot demos. I haven't really had a chance to test it much. I think I just got access to it. I made a shot. I was trying to find what clips I randomly had that were uncut, and I had some shots walking around the airport. So I added an elephant in the airport.

It looks kind of...

It's a little blurry. I mean, any VFX artist in their right mind can dismiss it quite easily. Yeah. But I'm saying don't do that, because you can't look at it that way.

Yeah, that's where I was going. I interrupted my own thought, but that's where I was going with this.

If you're a VFX artist, you can see this can be a good first step, but there will still be gaps. Anyway, the thing I was saying with the testing is that we've seen one-shot tests. How does this do with continuity? How does this do when you're in a scene and you need a shot-reverse-shot, or you replace a shirt but you want that shirt to be the same, you know, in every single shot?

Yes. I'm sure there are still limitations there. And I think what you need to figure out is how this tool works, and become a prompting expert at it, so that you can be the one that either gets it to continuity or gets something from Runway to be a good first output. And then you use traditional compositing tools and methodologies to level it up.

But this can save a ton of time at the start.

Yeah. And keep in mind, we're also in a transitional period. These tools are not final, right? Aleph is just the beginning of what Runway is going to release a year or two from now, or even a week or two from now. They're not done.

They're just getting started. So like if you are in on this early enough, you can sort of be in a really good position a year or two from now and be ready for the next wave of tools and innovation rather than sitting it out. I don't think sitting it out is a good option. No, 

you just got to start messing with this stuff.

Yeah. Yeah. Because, I mean, what would have taken days or weeks of compositing work... Even Cristóbal, your favorite, posted a joke a year ago listing out every single thing you'd have to do to erase all the cars in a street shot of New York. And now you can just type in a prompt and be like, erase, remove the cars.

So rethinking, you know, changing the way you think of it: oh, this was a whole process, and now I can just type what I want. You need to start messing with this stuff too.

So my advice is: start doing the research now, start watching Denoised, and get into Runway Aleph yourself, like you're doing.

Yeah. 

No, you got to mess with it. 

Yeah, I mean, also in the VFX world, I know there's going to be pushback of, you know, this still doesn't give you the fine control. Like in a big-budget Hollywood movie where they're going through VFX, they're going through every single frame and giving notes.

And it's like, oh, the shirt needs to change here, this needs to change here. You don't get that yet with the outputs.

But on the flip side of that is something like Netflix's The Eternaut, right? Where they used that AI-inserted shot. We're going to see more and more of that.

Yeah.

And is it just going to change things, where it's like, okay, we do more stuff, and maybe the pickiness level goes down, or the control gets better, or you figure out a way to blend Aleph with Nuke or traditional compositing to get the best of both worlds?

I think productions around the world are not going to wait until AI is ready.

They're already going to start using it and push AI to be ready. If that makes sense. 

Yeah. And figuring out how to put this into the pipeline.

Nobody's going to wait for Aleph 2 and then be like, you know what, Aleph 1 sucks because it's 1080p, blah, blah, blah, I'm going to wait for Aleph 2. No, they're probably already going to use Aleph 1 to some extent and then figure out how to plus it and make it work until Aleph 2 comes out. Yeah.

And it's so much about knowing these tools anyway. It's like, okay, well, where do you need to get to? And then what tool can help you get there? Which, you know, is usually a variety of tools. There's really no one-shot tool that's going to do everything for you. It's knowing all of these tools. Like, we're working on a small project now.

And I went through every single model to figure out, for this type of work, which models work best for this specific thing.

Yeah. 

And these models wouldn't work best for a different type of project, but they work well for this project. But it's just a lot of like, you got to know all these models and tools to figure out what is the right tool for the right job.

And it's going to continue to be that way for a long time, because I think the tools are very specific. You know, you have your screwdriver, your Phillips head, your flathead, you have your drill, and it's not going to become this Swiss army knife anytime soon. Are you finding yourself using different models at a shot level?

Sometimes. I try to avoid that because, you know, like the film stock analogy we've drawn, you get different looks. So as much as I can, the process now has been: let me get the blocking right that I want, and I'm trying to stick with one model for that. And then there's a certain style we're trying to go for, so then I'm style-transferring it with a different model, but trying to keep that all consistent, and then animating it with a different model. But then that hits hiccups, because sometimes, depending on the model, they only support a first frame, and for other shots I'm like, well, I know I want this first frame and this end frame, but this model doesn't support it. So I've got to use a different model. So I've been trying to stick with models, because when I tried a different model, you get a completely different look, and it just kind of takes you out of it. So I've been trying to figure out what model produces the best results on average for this specific project.

Yeah, which is such an interesting challenge when you're making a 90-minute thing, a 60-minute thing, whatever: shot consistency and look and feel and texture, maintaining that on top of all the creative changes. Yeah.

And for longer projects, we're messing with some stuff now, but I haven't gotten there yet. But I've been thinking of breaking that down: at least for a scene, stick with a similar workflow for the scene. Different scenes can use different models or workflows, but at least get the scene continuity and keep things kind of consistent there.

And for those of you who have not caught up on our AI workflow video, we released it, what, a couple of weeks back?

About two weeks ago, or last week.

Joey covers it in detail. We even have a very specific diagram in there. 

Yeah, and now there is a link to the diagram, because people were asking for it. You can check out the link to the diagram. And I opened comments, which I was worried about online. No one's left anything crazy yet, but if you have any comments or any,

yeah,

of your own tools that I did not include on the diagram, you can publicly comment on that workflow to add your own tools that I missed.

One other thing I want to say about Aleph that we didn't really talk about: you can also change the camera angle. So you can give it a shot of a video, and I saw someone post a demo with a car driving down a street and then being like, give me the low angle by the wheel, give me the aerial shot.

How cool. And so you're not even just changing elements in the scene, but changing the perspective. Having a single shot opens up a lot of possibilities and things we're not even thinking of yet; it just kind of makes things crazy. It instantly goes to the lazy cinematographer. It's like,

yeah, just shoot it like this.

We'll get it in post.

It's like, aren't you going to rent a Technocrane for that shot? Why don't you just shoot at head level and we'll just move the...

Yeah, this is good. Yeah. We'll just run it through Aleph after. We got it, man.

It's like, why are you... Dude, this DP is like an Oscar-winning DP.

He's on his iPhone. Like, we don't need an ARRI for this. He's just running around in circles, just capturing everything. What are you going to do with this? He's like a glorified photogrammetry artist at that point.

Yeah, "fix it in post" is going to have a new meaning. That's crazy. All right. Now, not to let Aleph have all the fun, Luma AI released their own update.

I'm loving the competition here. I'm also not surprised because it seemed like 

with Modify Video, which they released a couple of weeks ago, they had the capability. Modify Video was: you would give it a driving video, and then you could update the first frame, or kind of tell it what you wanted, and it would sort of change it.

But now you don't have to have a first frame or a driving modification. Similar to Aleph, you just upload the video. It's now called Modify with Instructions, so you can just chat out what you want to change. And the demos are very similar to Aleph, where you give it a video and just say,

change this.

And it changes it. 

You know what my dream scenario would be? A round table with Amit and Cristóbal and all the big names.

I wake up dreaming of that. This is what you dream about. 

Yeah. Yeah. So, no, shout out to Luma. You and I send each other videos all the time. I sent you a Jon Finger video where he's sitting on a fire hydrant, and then Luma Modify removes that fire hydrant.

I think we're going to show the video here. And it goes to Cristóbal's point: what is the existing cost and effort of something like that with traditional VFX, and how instantaneous is it now? It fundamentally changes the game. And if you're concerned about quality, frame rate, maybe Jon's face not looking right, things like that,

I think you're missing the point, because there's a big seismic shift happening with all of these tools. And I'm just so fortunate that we're living through such a revolutionary time. Jon without an H, sorry, Jon, I forgot it was J-O-N.

Yep. He's posted a lot of Luma stuff. Yes. He just posted another fun one at Muscle Beach because he's near Venice.

He's in Venice, yeah. Shooting in the empty courtyard area, that stadium area, and then running a bunch of different modifications to fill it in with a crowd, to turn it into a sci-fi stage.

I've done crowd replacement and crowd addition work. How easy is that? Oh my god. I mean, one of the challenges was I was using Unreal, and as you know, putting 10,000 humans in Unreal is a challenge in itself, right? Just the real-time processing slows down. And then on top of that, you have a camera match move, variation in crowd animation, variation in crowd ethnicity, body type, costume. This solves that.

Yeah. I will say, in the commentary I've been seeing from demos online between Luma's Modify Video and Aleph, Luma does seem to modify the faces a bit more, and Aleph does the better job right now of doing what you want it to do and leaving the rest unchanged.

So like in some of these demos, you'd see Jon's face kind of warps a little bit. Yeah. 

It doesn't look exactly like him, but I think Jon knows it. But even with that, it's so powerful.

Yeah. They'll keep modifying the models and improving on them. But I mean, you can complain about the face all you want; it still just removed the hydrant below him and made him float in space. Or this other demo:

It removed all the graffiti on a building and just painted it normally. 

Yeah. I mean, on the competition side, we're comparing apples to apples, right? They're so close. But you'd say that Aleph distorts less than Luma?

That's what I've been seeing. I haven't messed around with them as much, but that's what I've been seeing in commentary online from people that have run both and compared.

I just got access to Aleph.

And then how much longer before Veo 3 has all this functionality? Something else.

Right. Also, I didn't want to leave this out, because it sort of went under the radar because they're so new, but Moonvalley also has some features where you can shift the camera angle.

They have features where you can kind of do a style transfer on the video. So they also have some similar features to this. They have pose transfer, motion transfer.

They're newer to the game, and to the social media game, so there's less awareness of it. But yeah, Paul Trillo posted some demos too of how you can do similar stuff in Moonvalley. It is possible.

Yeah. 

You know, you just might not know about it yet, because Runway and Luma have been around for so much longer. Yeah. But yeah, wild. All right. Next update, in the open-source video space. Big update: Wan 2.2 is released. Wan's a sleeper. Yeah. Wan is from Alibaba, and Wan is probably the best open-source video model, like Veo 3-level realistic quality.

But you can run it. If you have a powerful enough computer, you can run it on your computer. It's free. Or yeah, you can download it for free, run it for free on your computer. Or a lot of the API websites like Fal and Replicate offer pretty cheap access to just calling it up and using it on their much faster and more powerful computers, because I ran this locally and it takes a while.

Dude, video generation on a local PC is no joke. 

Yeah, they have three models. One is a text-to-video model with 14 billion parameters, another is an image-to-video model, also 14 billion parameters, and then a unified video generation model that has 5 billion parameters.

Yeah, 5B, 

that's the one you want to roll with for local. 

Yeah, I did roll with the 14B. Oh, shit. And it took 44 minutes to generate a five second video clip. Yeah, 

that's not too bad. Honestly, that's not too bad. And I was running on a beefy card. And it's 720p, right? 

It was 720p. Yeah. Yeah. So yeah, I mean, if I knocked it down to 480, it would have obviously been faster.

Yeah. But it was, yeah, 570. But it's crazy, even at 44 minutes, to have something that can generate that on your local computer for, quote, free. I mean, obviously, there's the energy and the hardware cost.

Yeah. So one of the things that I love to watch on YouTube, I know I'm such a sucker for this stuff, is some of the folks taking historic photographs.

Like they'll take a photo of Cleopatra and then they'll slightly animate it, and I'm like,

oh, that's cool. That's what she looked like. 

So I did some of that with like my family. Like I have childhood pictures of myself, my cousins, whatever. So I did some of that with Wan 2.1 and I sent it to them as a gag. It was fun.

So with Wan 2.2, there is a slight architecture change that I want to cover, and I find it really interesting. The way they broke up the diffusers and the safetensors is as a high-noise and a low-noise model. So you have to download both, or several versions of both. I think it's in pieces because each piece is like five gigabytes, and it adds up to like 30 or 40 gigs.

Sounds right. And the way it denoises is: first, when the noise level is really high in the earlier steps, it uses the high-noise model, and then at some point it switches over to the low-noise model, and supposedly that's what's giving you the superior result. I just find it really fascinating, because is this the trend we're going to see moving forward, more segmented, more broken-out models?

Of how the model works entirely, or...?

Yeah, like architecturally.

Yeah. 

Yeah. I just find it really fascinating. The benefits of... I've seen people comment on the control with high noise versus low noise, but I don't know exactly what difference that makes. Yeah. So...

I have no idea myself. So just from a high-level AI perspective, the high noise is where a lot of the creativity happens; the first formation of the image happens in the high-noise steps. So maybe if you're trying to get more of a creative output, you just let the high-noise model churn for a little bit longer and then let the low-noise stuff refine and give you that fine detail, fine texture, and all that.

Yeah, maybe that's how it works.

Yeah, I'm also seeing in their post that they call it a mixture-of-experts architecture. Yes. Specialized experts handle the diffusion denoising time steps cooperatively, so it also seems like an efficiency gain, to run both and get faster output. Yeah.
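To make that hand-off idea concrete, here's a minimal Python sketch of a denoising loop that switches from a high-noise expert to a low-noise expert partway through sampling. This is only a rough illustration of the behavior described above, not Wan 2.2's actual code; the step count, switch point, and model interfaces are all assumptions.

```python
# Hypothetical sketch of a two-expert denoising loop (not Wan 2.2's real implementation).
def denoise(latents, high_noise_model, low_noise_model, scheduler,
            num_steps=40, switch_fraction=0.5):
    """Run a sampling loop that hands off between two denoising experts."""
    timesteps = scheduler.timesteps[:num_steps]
    switch_index = int(num_steps * switch_fraction)  # step where the hand-off happens

    for i, t in enumerate(timesteps):
        # Early, very noisy steps: the high-noise expert forms the rough image.
        # Later steps: the low-noise expert refines detail and texture.
        model = high_noise_model if i < switch_index else low_noise_model

        noise_pred = model(latents, t)                    # predict the noise at this step
        latents = scheduler.step(noise_pred, t, latents)  # move the latents one step cleaner

    return latents
```

The interesting knob in a setup like this is where the switch happens: push it later and the "creative" high-noise expert shapes more of the image; push it earlier and the refiner takes over sooner.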

The other improvement, too, is more understanding of cinematic language: lighting control, camera movement, composition. And even the demos do a really good job of showing very cinematic-looking and realistic-looking shots.

Yeah, and you can kind of tell from their marketing too that they're really aiming it towards your Luma, your Runway, your Veo with this one.

Even as generic as the Alibaba website is for this, it still has the word cinematic in there, where Wan 2.1 didn't have that.

Oh, 

OK, interesting. So that's an update. So they're shooting for our industry.

Yeah, and I mean, the quality is up there. I mean, look, it doesn't generate audio like Veo 3 does. But realistic quality-wise, really good.

Yeah, yeah. Especially for open source.

I haven't played around with it. I don't have a nice GPU like you. I just have some.

Technically not mine. I'm borrowing it from a friend. 

Yeah. So I just have a 3090, which is what, two, three years old now. And I'm going to try to run this at home in the next week or so.

Yeah. Anything you're going to try to do, offhand?

Yeah. Maybe all the Roman emperors. 

Yeah. That one's been a fun one. All right. Ideogram, which has sort of been forgotten about a little bit. It hasn't been in the news as much. I've always been a big fan because they did text.

They did text really well, better than anyone else, for a long time. Yes.

That's their claim to fame. 

Yeah. And I've liked a lot of their outputs. Sometimes I like their outputs more than Midjourney's. It's usually one of my go-to image generators. Oh, sure. But they released a new feature called Ideogram Character.

And it is basically one-shot: just upload a single image of you, or whoever you're trying to add into your generation, and it puts them in the scene. And it's one of the best-looking character generators that I've seen from a single shot.

I'm looking at their website now. I mean, obviously it's going to be polished because it's on their website, but those are pretty impressive results from just one image. Yeah.

I've tested it too, from a single image, and got really good-looking, impressive shots. I haven't tested it as much for... because a lot of the demos that they show, and the templates they give you for ideas, are kind of portrait or modeling shots, very posed, face toward camera.

I haven't tried it as much where I'm trying to stage a scene and maybe get more obscured or profile angles, looking away, the side of the face. I haven't tested it to see if it handles that well, because all the demos are like, smile at the camera, smile at the camera. Yeah,

it's really the changing angles that test the model.

Yeah, like if the face is obscured, or you just see part of the side of the face, or you want to put a hat on them or something else. But look, overall, from a single image, the results are really, really good-looking.

Yeah, it's right after Flux Kontext, which also has similar capabilities, I believe. So on the video generation side, things are leveling up, but on the image generation side, things are leveling up as well.

Yeah. Flux Kontext I've really been enjoying. That's been a good one.

Yeah. 

Yeah, I recently learned... I was looking for controls, like for doing a style transfer with Flux Kontext, and it doesn't have specific controls to give it.

But then I realized like basically just prompt it. 

Yeah. 

I'll say, don't modify the framing of the original image, but change this. And then it just does it. Oh, wow. Whereas I was trying to look for a denoise slider or something, like a weight for how much it should look at the original image, and it doesn't even need that.

You just tell it, don't mess with it. 

You're trying to be super technical. Yeah. Where's the slider? And it's like, just tell it. I'm a pro, I'm going to use pro words. It's like, dude, just tell it. Did you see on their website, they use the Coldplay meme, the Coldplay concert meme, to do a magic fill? So we're looking at the picture of the Astronomer CEO and...

I guess. I guess that works.

Yeah, look, I'm all for that, for pulling from pop culture references to memeify yourself.

And they also have a selfie with Einstein, some Kodak camera stuff in there. Yeah, that's great. Ideogram understands social media and pop culture. Yeah.

What is this picture with a guy holding a gun and a puppy?

Oh, no. Are we going to get banned from YouTube?

For sure.

We'll have to blur it out. Oh, yeah, you can draw masks. So yeah, if you have an existing image too, you can mask out the image to swap out the face.

Yeah, it's doing regional inpainting and regional correction, which is awesome. Inpainting is one of the hardest problems for models to solve.

It's like, don't change anything else just this little bit, and then blend it in. 

Yeah, so another good update from Ideogram. I'm glad they're back in the game, because I've always enjoyed them as an image generator. And they have good API support as well. Sometimes we forget about Midjourney, because they're talking about rolling out an API, but they don't have one, so if you're in the multi-tool space, they never show up, because you have to use them on their platform.

Yeah, I'm not a big API consumer, but I think you are, right? 

I mean, I've been messing with them more in Comfy, but also using Freepik more. These tools only work if the API exists.

That's true. If the company offers an API.

It plugs into other things. Yeah. 

So you need that for these super-tool, all-in-one things to work, or to build out a Comfy workflow. If there's not a local model and you need to use an API version, it only works with the API. All right. Photoshop also has a super impressive feature; they're back in the game.

Adobe's back.

Don't forget about us offering you very useful features. Big one in Photoshop, in beta: a new feature called Harmonize. This was wild and looks extremely useful. And all the talk I've seen online is like, oh, this is actually extremely powerful and useful.

Yeah. This is what people are actually going to use.

Yeah. 

Basically you take whatever image you've got, you just take a cutout of what you want to place in the scene. So let's say we had, I don't know, a coffee cup, for lack of a better example, and we put it on the table. And then you say Harmonize, and it automatically blends it into the scene, matches the lighting, the shadows, the color.

Perspective.

Well, it's already at your perspective. Like, assuming you have a wide shot of something and you drop it in.

But the object could be off-perspective.

Oh, I see what you're saying. I see what you're saying.

Yeah, 

I think so. Yeah. Yeah, that's awesome. 

Yeah. And it just blends into the scene. Yeah. One click. This is so useful even for our industry.

We're doing conceptualization, previs, and adding a bunch of things into your set. Now you can just do it in Photoshop and then hit Harmonize when you're done.

Yeah, for art design, planning, and yeah, that's a great idea. 

You really have no excuses now, with Ideogram and this and all the other tools, not to have perfect, really good-looking previs.

Yeah, I think back to... I remember a friend worked at one of the companies that would just do the 3D previs stuff for commercials. Sure. Yeah. But the previs was the worst-looking blocky 3D characters, all chopped up just so you could move them. The characters literally just floated across the scene.

And it was like, they would get paid a lot of money to do these things. I mean, that's... that's terrible.

The Third Floor, a big company here in L.A., that was their whole business, right? Delivering that level of previs, because before that there was none. There was no computer-generated previs. Yeah.

Right. There were hand sketches and things, storyboards, if you will. Yeah. So we go from that to this. Yeah, now there's no reason why your previs should not look good.

And that's a big use case too for bigger movies, where the generative AI stuff, sure, maybe some of it doesn't make sense for final pixel, but you can roughly make out a pretty decent version of your film

to see if it works and test it and see if you need to do script rewrites or anything like that, to have an audience test with a rough version. 

Especially with the way our industry is now, there's so much scrutiny at the beginning to green-light a project. Yeah.

You need this more than ever. Right. For 1,500 grand, you can make a decent-looking version of a film to test it before you commit to $20 million, $30 million.

Yeah, go into that Netflix office with this stuff. You'll be way better off. 

Yeah. No reason why this stuff should look bad. This other demo of Harmonize is a picture of a flamingo floating in a pool, and it just adds the reflection and gets the shadow, the reflection, everything right.

Yeah. I mean, sorry, Adobe, I've got to say, they're limited by Firefly itself, I think. Because I believe Harmonize and... well, I mean,

it's interesting. Firefly is not mentioned in this at all. Yeah. 

But I think, at the end of the day, Firefly is doing some of the generative filling and generative... Harmonize, which it says is powered by the Adobe Firefly image model.

Yeah. So it's like, if they swapped out Firefly for Flux or something, the results would be infinitely better. Because I think the Harmonize technology rides on top of their in-house model.

Probably. But I mean, that's probably also because they know their model is commercially safe. I don't know that they would open it up.

I mean, maybe they'll give you the option to use other models in the future. The Firefly web platform now is being more of an all-in-one spot, and it's like, oh, you want to bring another model? Sure, cool, I'll do whatever you want. And now it's trying to be more like a Freepik kind of toolset.

Yeah. 

Yeah, inside Photoshop itself, they're probably also assuming the bulk of Photoshop users maybe aren't deep in the AI space and don't know about the models. So it's just like, oh, this is a feature. You don't want to know what's going on under the hood, but we also don't want to commercially jeopardize you if your company's policies are still kind of up in the air on that.

So we know that the model it's using is safe.

Yeah, it's like the training wheels for AI, right? Anything accessible and clickable in Photoshop is a lot of artists' first foray into using AI for an actual project.

Yeah, but look, this is probably a good indicator too of the future of what AI looks like for mass consumers, where Firefly is not a big showcase thing, and it doesn't say anything about AI. It's just like, oh, hey, this thing does something useful for me. It's Harmonize, a word that I know, and I click a button and it just works. And I don't have to think about how to prompt it or anything like that. I think this is a good indicator of what AI use in the future will look like, where it just does the thing you want to do under the hood, and you don't have to think about it.

For sure. I think this is equally as groundbreaking as when Liquify in Photoshop came out, like, what, 10 years ago now? Do you know what Liquify is? You can nudge points and splines without distorting the image. It's automatic, like, if you're modifying a face.

A warp, but it doesn't distort like that. Yeah.

So like if you want to make a fat face skinny or something like that, right? You'll push the cheeks, and it won't mess up the nose. Okay. Yeah. So it's hyper-aware of the structure of the image, and this is pre-AI, right? And so when they dropped Liquify, it instantly changed Instagram and social media overnight, because everybody was using it to have these absurdly crazy body proportions and things like that.

I think with Harmonize, you're going to see an equal revolution with product photography and advertising and things like that.

Yeah. Yeah. You just drop it in and be like, product. Yeah. You take a picture of your product on an iPhone and drop it into a scene and then be like, OK, looks good. There you go. And then some other updates, they have another beta of generative upscale.

So this is interesting too, because it's kind of going into Topaz's turf. For sure. But it's basically using generative models to upscale an image to a higher resolution, versus... I forget, I don't know the science of traditional upscaling.

Yeah. Traditional upscaling is just using algorithms and math, so it interpolates between the pixels. If there's a red pixel here and a green pixel here, it'll go, you know, red, yellow, yellow, yellow, just making that up in between. Versus here, if there's a red pixel here and a green pixel here, it's letting AI do that work for you. It's using the base image as, like, this is the original source, and then working from that as the noise.
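As a toy illustration of the difference being described, here's a small Python sketch of classical interpolation-based upscaling on a single row of pixels. It's purely illustrative, not how Photoshop, Topaz, or any specific upscaler is implemented; the generative approach instead conditions a model on the low-res image and invents plausible detail.

```python
import numpy as np

# One row of two RGB pixels: pure red next to pure green.
low_res_row = np.array([[255, 0, 0], [0, 255, 0]], dtype=float)

def upscale_row_bilinear(row, factor):
    """Classical upscaling: new pixels are linear blends of their neighbors."""
    n = len(row)
    positions = np.linspace(0, n - 1, n * factor)   # where the new samples land
    left = np.floor(positions).astype(int)
    right = np.minimum(left + 1, n - 1)
    frac = (positions - left)[:, None]
    return (1 - frac) * row[left] + frac * row[right]

print(upscale_row_bilinear(low_res_row, 4).round())
# The in-between pixels come out as yellowish blends of red and green,
# invented by math alone; no new detail, and noise or grain just gets smeared.
```

A generative upscaler skips that math and instead asks a model, conditioned on the low-res source, what the high-res version probably looked like, which is part of why grain and sensor noise tend to vanish.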

Yeah, and you can clearly see it gets rid of that low-resolution grain and the noise from the camera, which, I don't know if that's a good thing or not, because then the image comes out super plasticky.

Yeah, when I've upscaled archival stuff, in my brain it's like, does this look way better? Like, does it just look really good?

Yeah.

Or do I think it's weird because it's so clear? Does this look too weird because of the upscaling?

I have a weird analogy for you on generative upscale, and I'm going to compare it to tall towers. Yeah. Like the Space Needle in Seattle. Okay.

So as you know, I just came back from both Tokyo and Seoul. Okay. So Tokyo has the Tokyo Tower, which is equivalent to the Eiffel Tower. It's not that tall, because it was built a long time ago. But because it was an old-school build, the building sways a little bit, and it's kind of weird. At the top, you feel like you're on a ship.

So the observation deck is at like 250 meters. It's not that tall, you know, and you go up there, but you feel like you're way the hell up there. You're really swaying around, and then you're like, oh my God, I'm on top of the world. Okay. So then we were at the Seoul Sky in Seoul, which is at 550 meters, which is more than twice the height of the Tokyo Tower. But because it's a newer building, built like 10 years ago, it's rock solid. Yeah. And so you're like, oh, I guess I'm really high up, but there's no sensation, right? There's no feeling like you might teeter over. Yeah. So I think that's the same thing with generative upscale.

I know it's a crazy analogy. It's like that little bit of grain, that error, that noise, is what's giving you a sense of realism in your brain. To remove that...

It's the same reason why we add film grain back into digital images, to give it some sort of texture, some feeling, which, I mean, ultimately is your creative choice.

Yeah, I think pristine, sterile, noise-free is not the way to go here. 

Yeah. There's also, I keep coming back to it with Netflix, because they wrote on their technical blog how, for encoding, they'll strip out the grain from the films to encode them, but then they'll add the grain back in during the streaming process.

Something like that; I'm very much paraphrasing what they do. But basically, for the technical encoding process they wanted to get rid of it, but they don't want to play it back super clean because it'll look too weird, so they need to add the grain back.

Well, that's the whole reason why ARRI cameras are preferred. One of the main reasons why some DPs love ARRI cameras is you have infinite control over the grain you insert into the image. You can even pick the grain on the camera.

Oh, yeah. I realize that. Oh, and speaking of ARRI ALEXAs, they just dropped a high-frame-rate camera recently, I think a couple of weeks ago.

So now ARRI ALEXAs can do like 240 fps.

Is this a new camera or a firmware update?

I think a new body.

OK. 

Yeah. So 240 is pretty good.

Yeah, and that's at 4K. It looks insane. Yeah. That's pretty good. Yeah.

Man, I remember you had to buy or rent those extremely expensive Phantom cameras. Yeah,

and they would sound like this.

No, wait, they didn't sound like anything, did they? Yeah, they were quite loud. I remember we rented one once. 

yeah? 

They had massive fans on them. 

I remember you had to have like a whole computer and stuff attached because like they were not built for film, they were built for scientific studies. 

Yeah, when you were at NAB, I remember you covered a product that like has 90% of a Phantom.

Yeah, they say it does 1,000 frames per second at 4K, something like that. Yeah. I mean, it was a beta, and the other cameras in the video... I think I touched one and it was like, no, it's just a model. Yeah, it was like a 3D print of the camera, but they had one working camera there.

I mean, they said it's supposed to ship later this year. I'm curious. I mean, his background was in scientific imaging, so he had the goods; I could believe he could do it. Okay. Yeah. I'd be curious to see when it comes out. Because, yeah, that would also be crazy, because the pricing was, yeah, like Blackmagic pricing.

Yeah, crazy pricing for something like that to compete with the Phantoms, which were like 100 grand, five figures, six figures, easily. That'd be awesome. Yeah. Also, I think it's a good counter, a good buffer against AI-generated stuff, where it's like, oh, okay, well, let's just go shoot really crazy stuff in real life that people are going to ask, is that AI-generated? But it's like, no, we shot it for real.

Well, I think imaging is now more important than ever because it is the input into all this stuff, right? So the better your images are going into an AI system, the better results you can expect. I guess it's the same with VFX too, but in the past, we've had a lot of hands on the frame to get it to a place where you need it to be.

The whole premise of AI is you don't need a lot of hands to get it to the place. Yeah. So then it comes down to like, what is your source image? How good is that? 

Yeah. All right. Next up, Hunyuan 3D World Model 1.0. So they released a new world model product. Well, I think world model... whatever, it's not really a world model.

It's a 3D generation model.

Yeah. So you can generate a 3D space, text or image to 3D. It looks similar to the World Labs demo that I saw a while ago, where it creates like a 3D sphere and you sort of move around it. In the demo it seemed like you could only move around a little bit, but it also generates the meshes and textures, so you could bring it into Unreal or Blender or something else.

Yeah. I'm kind of skeptical right from the beginning, because it's not just about generating the geometry or the mesh, but what quality the mesh is, how segmented or grouped the pieces are. As you know, modeling and world generation is such an art.

For sure. For sure. Yeah, that's above my head. I'll defer to you on it.

Well, I'm interested. First off, this can run locally, so that's cool. Cool. I'm curious about this stuff, not necessarily for a photoreal or highly accurate version, but more for generating a 3D space for those use cases I've talked about before, like maybe filming a scene in a virtual production kind of space where you want to have reverse angles and a consistent background.

And it's like, okay, if you've got something that can quickly generate a 3D environment that looks kind of consistent and gives you some parallax, that can be a good lift.

Yeah, for virtual production, I absolutely see the use case here. However, for games, I think it's far more complex. 

Yeah, to move around, you need that.

Yeah, like game maps are designed to a T, right?

Yes. And that's also like, in a game you're not the director. We can control what angles we see; we only need to worry about what we see. Game users can go anywhere. And also,

compute resources, memory size, all that stuff is so precise, right? You know, in a given game, you're given, I don't know, 100 gigabytes for the whole game.

And then each level comes down to like two gigabytes. Like you have to fit the geometry, the texture, the players, the props, everything in there. And this is not going to give you that level of control. I'm sure. 

Here are some more details: 360 panoramic generation, complete immersive world scenes, explorable 3D generation, interactive, editable, exportable meshes.

So, yeah, you can run it locally.

Hey, off topic, what are your thoughts on where virtual production is colliding with Aleph and tools like Aleph? I guess my question is, do we need LED volumes in the near future? 

In the near future, yeah. I mean, look, Aleph is five seconds a shot. And we talked before about where the consistency is at, because all the demos we're seeing are shot by shot by shot.

But if you're in a scene, and let's say the example you're trying to do is, let's shoot our actors on a white empty stage and we'll add this background in through Aleph. Are we going to be able to consistently get the angle and background shot to shot, setup to setup? I don't know if it's there yet.

It's just, you know, I spent the last four years in the LED volume world, and all of us as an industry, we fought tooth and nail for every inch of taking green screen business and putting it into volume business. I feel like with all this new stuff, it just kind of goes back to green screen. Or white screen.

Yeah, or no screen. Yeah, 

I mean, what are the bigger arguments for VP? There's still the benefit that everyone can see what you're trying to film and what's happening. The actors can see what's happening. Sure. I was going to say the lighting benefits.

It's like relighting. On materials and people and stuff.

But you know, Aleph does its thing, and it seems like it can kind of relight the people too. I mean, yeah, it's still version one of Aleph. I'm trying to think, is there a blend where maybe you don't have to have as great of an LED screen, but you can put something up...

Projectors. Projectors, but also, what if you can kind of gray-box or kitbash the environment? You don't have to have such an awesome, amazing environment. It can save on the pre-pro work that normally has to go into VP. And you can have something up that people can see, but then maybe you run it through something like Aleph.

That turns into the background, but we have the geography and everything consistent. So maybe it's some hybrid of the two. Yeah.

Yeah. No, great point there.

All right. Yeah. Right now? No, I don't think this is at that point. I mean, you're also going to have to do it in five-second chunks, with possibly questionable consistency, and a 1080p output in like 8-bit color.

Yep. All right. Another tool that I saw that I just want to dive into more: it's called Morphic. They said they're introducing 3D motion. So similar to what we've been seeing, you give it an image and it builds out a 3D model of the space, and you can move the camera around in this 3D space. Sure.

And create different actions, movements. Yeah. I mean, just another sort of...

Crazy. It's generative; it's kind of doing what Hunyuan is doing, right? It's making a 3D world, but not for exporting, just to navigate around.

Or just extrapolating some depth. I think this is also something similar to what Moonvalley does.

Like in this image here, like it took the image and kind of extrapolated the woman. And then you could add some sort of camera movement to the woman. And it looks like it's like a pretty slick interface, like easy to use interface. 

Yeah. My guess is this is a lot of depth map generation in the background with generative filling the depth map and so on.

Yeah. But yeah, it popped up on my radar. I thought it was interesting. I'm curious to mess around with it, because I'm also into interesting, useful, and easy-to-use tools that can help speed up the process.

Yeah, I mean there's no end to like tools that we can cover on this show 

No, I had to cut out the stuff we missed. Like, we didn't even... you were gone when Runway Act-2 was updated.

Oh, yeah, well, I feel like it's too late now to talk about it. Yeah. All right, next one: Scenario. This blew my mind: generate in parts. This is not text-to-3D, which is not new.

This is your... your underwhelming...

Underwhelming intro? It's not just text-to-3D. It's text-to-3D in the way CG productions need it: in pieces.

I was trying to set you up to come in with the big drop.

This is mind-blowing. Don't sleep on this one, folks, because one of the challenges with 3D generation is that the mesh sucks. The mesh is inseparable. It's not in pieces. You can't retopo it or, you know, model on top of the generation.

Well, check this out. They're showing an example of a still image of a Lego character, and then the pieces of the Lego character, including the socket where the head connects to the body. It's all modeled. I mean, this is crazy.

They explode the model, and it all looks like accurate Lego parts exploding. It generated the entire character, but it can accurately pull it apart into separate meshes.

In the old days, this is what a modeler did and was paid very handsomely for: not just generating the thing, but making it into modular little pieces that could be modified down the road. So now you can just texture the goggles, just texture the face, and so on, and then give it to another artist who can modify it further, or add it to a rig and rig it into a character.

In the links and posts we're looking at from the co-founder and CEO of Scenario, Emmanuel de Maistre, sorry if I slaughtered that: at the heart of it is PartCrafter, the first open-source AI model that intelligently splits objects into multiple 3D parts, each with its own clean geometry and structure. One image in, structured 3D out.

Yeah, my guess is this is perhaps aimed somewhat at our industry, but really it's aimed at industrial applications where you have CAD level precision needed for 3D object generation. I think this is a really interesting space to watch as 3D generation levels up. Trying to figure out how to print something.

Yeah, exactly. Or even like, if you want to build things in 3D and simulate, like if you want to put a car in a wind tunnel, you would need to build it in pieces so you can move the pieces around, you know, things like that. 

Future of F1. 

Yeah, nuts.

Big leap. Yeah, we don't see as much generative 3D stuff, but that's a big leap.

Absolutely, toward helping move that forward. Very impressed. All right, another update to my favorite Google tool for learning new stuff, NotebookLM, which we've talked about before and gotten mixed up about before. You don't use that, do you? I use that all the time. That's great. That's how I learned Comfy, for the Comfy episode.

But yeah, NotebookLM: you can give it a bunch of documents, YouTube videos, recordings, and then it creates a smart little notebook area where you can ask it questions or have it make mind maps, lesson plans, outlines, or quiz you on the stuff. But my favorite feature of it was that it can generate a fake podcast episode of two hosts talking about whatever the thing was.

That's the best. That's been around since they started, and that's been the best feature: a fake podcast episode breaking down whatever the technical topic is.

Now they've added a new feature: video generation. So, moving the podcast thing forward, it'll create a video presentation explaining whatever is in your notebook.

The videos right now are kind of whatever; it's sort of like a PowerPoint slideshow. It creates fake slides breaking down the topic, so it's sort of like you're watching a PowerPoint presentation. So they're not coming for Addy and Joey just yet. Not yet. But once they connect this with Veo 3, which I imagine would be expensive right now, and someone has already shown the concept, that'll be kind of crazy.

Should we just do this podcast with AI? It'll save you a commute. You don't have to drive over here.

Instead of that, we need the Google hologram table or something, so I can just be in my studio, you can be in yours, and we have a hologram meeting.

Yeah, it just goes to show, no matter what technology comes our way, human-to-human interaction, analysis, color on things: that's never going to be replaced.

Yeah. 

Yeah. I don't think so. 

But when you're in need of a very specific fake podcast episode about a very specific scientific paper...

Yeah. And you just want to listen to it in your car or something. This thing comes in great. 

Yeah. I do that all the time with papers. They have an iPhone app, so you can queue up the audio, and I listen to it on the drive. That's crazy. Yeah, that is the one thing the drive over here helps me with: I can catch up on podcasts. And the last one, this is an open-source tool that someone built, Albert Bozesan. Shout out to Albert. Yeah. He built a cool tool called Shotbuddy and basically open-sourced it.

It runs locally on a computer, and it's a tool that helps because, when you're dealing with AI-generated stuff, there are usually a lot of image generations and versions and chaos, and it can turn into a really big mess in your downloads folder. His tool is basically a quick little shot list generator. Every time you add a new image, it makes a new shot.

It adds a shot ID, and then it has a spot where you can add the image or add the video generation. And when you do, it automatically moves that file to a folder on your computer and renames the file to the shot name. And then if you add a new version, it moves the old version to a versions folder and replaces it in your current folder, so that folder always has your latest generations.

So it's this really nice, handy tool to keep your generations a bit more organized, add some notes, and keep things a little clearer.
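For a sense of the kind of bookkeeping being described, here's a rough Python sketch of that move-rename-version flow. It's a guess based on the description above, not Shotbuddy's actual code; the folder layout and the function name are hypothetical.

```python
import shutil
from pathlib import Path

def add_generation(project_dir: str, shot_id: str, new_file: str) -> Path:
    """Make the new render the shot's current file and archive the previous one."""
    shot_dir = Path(project_dir) / shot_id
    versions_dir = shot_dir / "versions"
    versions_dir.mkdir(parents=True, exist_ok=True)

    suffix = Path(new_file).suffix
    current = shot_dir / f"{shot_id}{suffix}"   # e.g. my_project/SH010/SH010.mp4

    # If the shot already has a current file, move it into the versions folder.
    if current.exists():
        version_number = len(list(versions_dir.iterdir())) + 1
        shutil.move(str(current), versions_dir / f"{shot_id}_v{version_number:03d}{suffix}")

    # The new generation gets renamed to the shot ID and becomes the current file.
    shutil.move(str(new_file), str(current))
    return current

# Hypothetical usage: add_generation("my_project", "SH010", "Downloads/render_final2.mp4")
```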

It's purpose built for AI filmmaking. 

It is targeted at AI filmmaking. He open-sourced it. It's on GitHub; you can download it. We'll have a link. And yeah, he released it and he's like, yeah, I'm sure people have other ideas of how they want to fit this into their workflow.

I'm not a programmer, but it's on my list of things...

You're a programmer. You've got Claude Code.

Well, that's the thing. I only have regular Claude; I haven't installed Claude Code yet. OK, but that's been on my list of things to do. And so I'm thinking, oh, well, let me give it a spin and see if I can connect it.

What was it built with? He said at the bottom it was built with, like, ChatGPT code. Wow. Or OpenAI code or whatever. Nice. And maybe I can modify it, add the extra features that I'm looking for. So that's on my growing list of things to test and explore.

Open Source Joey over here.

Yeah, figuring out coding without blowing up my computer or whatever.

While you're rendering on Wan 2.2? Yeah, with that render going.

Yeah, so just a shout-out to Shotbuddy; we'll have the link to it. But a super cool app, and cool to see.

Yeah, we love seeing underdogs do innovative work. I mean, that's what's so exciting about this AI space: you don't have to be Google, you don't have to be Meta to make a dent here. The field is wide open for anybody.

Yeah, I mean, it's only getting easier and easier to make stuff. I think, to what we've been talking about with VFX artists and artists in general, the taste factor and the quality control are just going to be more and more important: knowing what you want and the direction to steer it.

But the execution part is just getting easier and easier. 

Yeah, and I know a lot of VFX artists personally, and so many of them have a higher level of quality and quite honestly a higher level of creativity than some of the directors they've worked for. I'm sure. Yeah. And they've been frustrated the entire time during a production.

Like, this is your chance to make the thing. Well, now you can make the thing. Right. You don't need to have a massive budget to make the thing. You can make the thing and use your eye. You have the skills. You understand content production, right?

Yeah. 

Yeah. 

Yeah. 100%. All right. Links for everything we talked about, as usual, at denoistpodcast.com.

Thank you for your support. We are waiting on an Apple Podcasts five-star review. If you can give us one... I'm begging here. I'm just kidding. If you feel like giving us one, that would be amazing and highly appreciated.

Thanks for watching. We'll catch you in the next episode.