Denoised

Reference Images vs LoRAs: Which Actually Works?

VP Land Season 5 Episode 4


Addy and Joey put two AI workflows head-to-head: training a LoRA with Z-Image versus using reference images with Nano Banana Pro to replicate film cinematography. In this episode, we test whether LoRAs are still necessary for capturing specific cinematography styles, or if reference images alone can deliver the same results. Using the VistaVision look from the film One Battle After Another as our target, we explore practical workflows for pre-production and previsualization.

--

The views and opinions expressed in this podcast are the personal views of the hosts and do not necessarily reflect the views or positions of their respective employers or organizations. This show is independently produced by VP Land without the use of any outside company resources, confidential information, or affiliations.

Welcome back to Denoised. In this episode, Addy and I are gonna have a bit of a showdown. We're gonna do Z-Image with LoRAs versus Nano Banana Pro with reference images, and see what we get and who has the best workflow. Yeah, Joey, let's keep it friendly here. It's not a real showdown. It's a showdown to the death. Oh my God. The loser gets to shoot a movie in the desert. This is an AI duel. All right, let's go do it. Okay. Look, the reason I wanted to do this is because there is a debate. We've had this debate, and I hear this: do you even need to train a LoRA and do all that stuff? LoRAs are dead. Yeah. LoRAs are dead when, with Nano Banana Pro and Flux Kontext and all these newer models, you can just throw a couple of references in and get the thing that you want. And so this one specifically, we're trying to target the cinematography style of an existing film. Say we're in pre-pro, we're doing some reference images, and we want it to have the vibe of another film. That's what this experiment is. And the film we picked, which, and I'm biased, has been my favorite film of all of 2025, is One Battle After Another. Unsurprisingly. Yeah. I let Joey pick. I was like, Joey, pick a movie. And he picked that. I was like, of course. What'd you think? Spot on. Yeah. So, shot by the amazing Michael Bauman on VistaVision. We used ShotDeck and grabbed a bunch of frames. Initially I grabbed a mix of different frames that just looked cool, a grab bag. Yeah. And then we were like, eh, it's too varied. So then we're like, okay, let's just go after three distinct looks that happen during the film. So one look is the first part of the film, which was the border patrol raid that had this nighttime magenta look. Mm-hmm. The second look I grabbed was when they were running around Sacramento. You didn't know what the city was, but they shot a lot in downtown Sacramento, with this sort of neutral office building look. And then the third one was the desert area, which was a lot more orangey, desert vibes, which is where you went. Which is where I went. There is a video, and another upcoming video once we finish it, of tracking down all the desert locations. The second video also has to do with Gaussian splats and Beeble and a bunch of other random stuff that we're experimenting with. Awesome. I'm excited to see it once it's finally done. Some cool tests. That would be a cool video. Um, and I've just had a mission to find where the rolling hills were, and I did find them. They're like four hours outside of LA. But anyways. You even found the tire marks on the road, which is crazy. Yeah, from the production, when the car screeched to a halt. Yes. They're still there. Yeah. So anyways, we got those shots for these three kind of distinct looks. Can we replicate those looks and generate other frames that have the same vibe, that feel like they might have been part of that scene with that same style of cinematography? Yeah. For the idea of experimenting with previs, or just kind of planning for another shoot. Yes.
Let's take a step back as to why we're doing this in the first place. The power of generative AI in pre-production, before you get into production, is undeniable, right? There's absolutely a way to apply this transformational technology to something that's really cumbersome, where there's a lot of iterations, a lot of decision making, and that saves tons of money down the road, during production and during post. So if you do pre-pro right, the idea is you could be way more efficient in production and post. Yeah. The challenge is, how do you build the look and feel of a film in pre-production? How do you do it consistently? How do you expand on that universe that's still in your head, onto a screen where you can share it with others and get creative notes on it? So this is the inspiration, the motivation behind why we're having the showdown. Yeah. And so Addy went with the LoRA route, with Z-Image. And yeah, let's start with you, Addy. Why don't you walk through how you trained the LoRA? Okay. So we've talked about Z-Image being sort of the sweetheart of the creator community at the moment, right? On the video side you have LTX-2 doing a lot of cool stuff, and on the image side a lot of people are feeling Z-Image the same way they felt about SDXL just a few years ago. It's fully open source. It has strong image-to-image capability. It has ControlNet. You can train LoRAs on it. So you can do a ton of stuff with it. And you can go on marketplaces like Civitai, where you can download LoRAs, or you can train your own. So I wanted to use Z-Image. It's pretty easy to do and it's widely supported. So what I did, instead of doing it in Comfy, which I'll actually do in another episode, coming your way, I wanted to just quickly get this done. So I used fal. And the first thing, for me, was to figure out what the LoRA is actually intended to be used for. For me, it was the look and feel of the movie. Specifically, it's the VistaVision look, right? That film and that camera have a very specific look, even compared to something like IMAX. And also the color grading choices that were used throughout the film, some of the compositional choices, some of the intrinsic landscapes it was shot in. Like the physical location, the fingerprint of that location, would actually come through into the LoRA. So trying to extract all those creative notes out into an AI model, to repeatedly generate in that world. Yeah. So let's quickly go through the training set. Okay. So, as Joey said, the training set is just from ShotDeck. And again, I just wanna highlight that this is for educational purposes only. This is not something I would ever do in a real production. For that, I would actually go out and shoot, you know, a small film, and then use a lot of that as training data for LoRAs, right? This is one way to just kind of shortcut that process. Because if you were doing this for real, you would, like, maybe on the location scout, take photos with your phone, color grade them to the look you're thinking of, mm-hmm, and then train on that and use it as your source. Is that what you're thinking? You could do that, or you can actually
shoot a couple of sequences. Like, a lot of productions just go out and shoot one or two sequences that can be used for pitching, and those could also be used for LoRAs. Okay. Interesting. I'm thinking, look, when you're in that sort of previs phase, I've seen a lot of pitch decks and stuff where you're just trying to get the idea across. So a lot of times they will just grab images of things that already exist, because it's a similar style or tone to what they're trying to get across. Obviously that's never gonna be released or public. But in that realm, I still see a use case where, yeah, you would grab images, maybe not from one film, but from a variety of films that have a certain look, multiple films, and then train a LoRA purely for pitching or reference stuff. 'Cause even if you're like, oh, we might shoot one or two days of test stuff, that's still a big investment. And you would want to, yeah, you would wanna nail down the kind of look or cinematography you're going for before you start spending that kind of money. Yes, absolutely. You're a hundred percent right. I think it's really a budget question, right? Like, what film is it that you're making? Mm-hmm. If you're making a $10 million movie, yes, there is budget for test shots and test sequences. But if you're making a million and under, there is none of that. The best you can do is grab a camera and go to the location. I think that's even more important, because it's like, okay, we can experiment and figure out this stuff now on the computer, when it's cheap and we have time, and really dial it in, so everyone's on the same page when we actually do spend the money, and we're making the best use of that money possible. Yes. And to touch on what you said just now, it's really important to mix a bunch of different mood boards, mm-hmm, into the pre-production phase. A lot of people do that really well. They've turned it into a career. Yeah. It's like, you know, you take two stills from Star Wars, two stills from Guardians of the Galaxy, and now you're trying to visualize the sci-fi movie. Yeah, and shout out to ShotDeck. I mean, they do a great job at that, where you click on an image, a frame, and it gives you the color profile, but then it also gives you similar shots, in the sense of the color, the composition. But it's from all sorts of movies. So it's another good way to do a grab bag, if you're trying to find shots in the same realm but not from the same exact movie. Yeah, exactly. So again, for educational purposes only, we're just grabbing stuff from ShotDeck. I don't recommend you do this; check that you can actually do it and that you have the rights to train a LoRA. Okay, so these are some of the stills from the movie. You can see on this one here, intense halation around the stadium lights. There's really no true black in the frame. Mm-hmm. Even the black of the night is this really nice indigo, dark blue, navy blue color. And then, yeah, the reflections are this light blue color. It's amazing. And then of course you have a lot of the set pieces that are a little bit brutalist, right? You don't have a very clean, sophisticated city or environments.
These environments are also very rugged, grungy, dirty, dusty. And so we want to extract all of that from these training sets. And then once I have selected the images, I have about, I don't know... Did you have 30 images or something? Did you send this whole group in together, or did you do it by group? Because there were like three groups. I put everything into one LoRA, because the bigger your LoRA set is, the more versatile, the more useful it is. In the case of training a LoRA for a character, like if I wanted to rebuild Joey over and over again, I wouldn't need 30 or 40 images, I could get away with 10. But I'm trying to build a universe. Okay, so you just trained one LoRA. You didn't train a LoRA on, like, the nighttime look and the desert look. It's just one LoRA. Yes. You can get away with that with really good captioning. Okay. Because at the end, what you're affecting are the identifiers, right, in latent space. So if you're looking for, like, nighttime, it's just affecting that nighttime vector in latent space. Or if you're going to modify the desert or the highway, you're just modifying those things. And you can do all of that with a single LoRA and really good captioning. Okay, so these are my caption files here. So for example, "medium closeup of a young woman with light brown skin and dark, curly hair pulled back, wearing a white martial arts gi, she looks directly forward." So I did get the help of Gemini with some of the captions, and then I actually went in and corrected some of them, added some color notes into one. I was gonna ask if you wrote these or if you ran them through something. This would be where something like Claude Code would come into play. You could just point it at the folder and have it make all the captions. Yeah. So auto-captioning is a thing, and that's what professional AI companies use. For me, I wanted to have just a little bit more control, because this is a bespoke, mm-hmm, quote-unquote film thing. So yeah, I went into the captions and further modified them. And so your captions are a text file paired with each image that has the same exact file name? That's right. Yep. So now that I have my pairs all sorted out here, you can see the next step is to compress it all into a single zip file. Okay. You don't have to do that, you could just upload them one by one. So, I'm a big fan of fal. I went into the fal Z-Image trainer workflow. Mm-hmm. You can just Google this and it'll take you there. So see here, you can drop the images in one by one or you can just throw in a zip file. Okay. I threw in a zip file and then I kept everything at default. The learning rate is, like, how much you want the training to affect the main model. And there are different types of training. So for example, if you're trying to train on a style, I would probably pick this, or if it's the actual content of the frame, like a person or an object, then it would be that. But I just left it at default, which is balanced. Okay. Then the training takes, I don't know, five, six minutes. It's really quick. And once the training is done, you have these two things that come out of it: a LoRA file, which is a .safetensors file, and this can go right into your Comfy workflow. If you download it, you can see here it's only 82 megabytes, super light. And then the config file. I'm not sure what the config file actually has in it; I just didn't need to use it.
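As a concrete illustration of that image-plus-caption pairing, here's a minimal Python sketch that checks every training image has a matching .txt caption with the same file name and bundles the pairs into one zip for the trainer. The folder name and extensions are placeholder assumptions, not anything from fal.

```python
from pathlib import Path
import zipfile

DATASET = Path("obaa_lora_dataset")  # hypothetical folder of training images
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}

# Verify every image has a caption .txt sharing the exact same file stem.
images = [p for p in DATASET.iterdir() if p.suffix.lower() in IMAGE_EXTS]
missing = [p.name for p in images if not p.with_suffix(".txt").exists()]
if missing:
    raise SystemExit(f"Missing captions for: {missing}")

# Bundle the image/caption pairs into a single zip for upload.
with zipfile.ZipFile("training_set.zip", "w") as zf:
    for img in images:
        zf.write(img, img.name)
        zf.write(img.with_suffix(".txt"), img.with_suffix(".txt").name)

print(f"Zipped {len(images)} image/caption pairs.")
```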
So, in order to do inference with this LoRA, I used fal once again. You could use Comfy as well; I just used fal 'cause I wanted this to go real quick, I was doing this last night. But you could download this file, and if you had the Z-Image LoRA workflow in Comfy, you could just add this file to it and run it locally. Yeah, and I'll actually do exactly that, what you just mentioned, for a future video. So this file lives in the fal file directory, so I just copied the link to it. And then fal also has an inference workflow for Z-Image, which is this one. And this inference workflow takes a LoRA. Mm-hmm. It actually takes multiple LoRAs. I added this one, and then I'm gonna go ahead and paste that in there. The scale is basically the weight, how much you want things to be influenced. So I'm just gonna go ahead. And the litmus test, for me, the first thing I wanted to use to see if it's working, is that desert highway road. So I'm gonna go "desert highway road" as my very first prompt, just to see if the LoRA is working. Because remember, in my training set there was a ton of that environment, more so than any other environment. So if I was going to see the effects of the LoRA, I would see them there first. All right, so then I'm just gonna hit run. And the nice thing about Z-Image is it's a super lightweight model. It's literally like three or four seconds and you get your output. Not like Nano Banana, where I'm just waiting around. Yeah, that is a very nice thing, to see that super fast. Yeah. So Joey, you've been to this exact environment. Does this feel and look like that desert you were at? It does feel and look like it, but now I'm also wondering, because my interpretation of our challenge was that we want to capture the lighting, color, cinematography of, yeah, the place. We do that too, but not necessarily replicate the same exact shots. That's a very fair point, my good friend. And, uh, yeah, the showdown's getting intense. What do you got up your sleeve? But yes, this does look like the Texas dip, the big steep hill. That was the last one. Yeah. And I think the area, you said, is Borrego Springs, right? It is Borrego Springs. Yeah. Okay. Um, so real quick, I'm just gonna change it to a 16:9 aspect. It's just closer to the movie. Technically the movie's VistaVision, which is a 1.5:1 ratio. I know, I know. All right, take it easy, film guy. But however, as we look at ShotDeck, so real quick, yeah, I hear you. The challenge was not just to replicate a desert environment, that seems pretty easy, but to replicate the style of shots from the film. So I actually have examples of new ones. I'm thinking like a newly staged scene that feels like it could be part of the film. Love it. Okay, so then we're gonna go head to head with the same prompt. Okay. How about you give me a few prompts from Nano Banana and I'll run them here, and I'll give you some prompts and you run them there. Okay. So I wanted to test what stock vanilla Z-Image would look like with the same prompt, just to know that the LoRA is really doing its thing. Mm-hmm. So I'm just gonna take this same prompt here, and then I'm gonna go to Freepik, 'cause Freepik has Z-Image for free if you have the right subscription level, so this way I'm not spending any money on inference.
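To show roughly what that fal inference step looks like outside the web UI, here's a sketch using fal's Python client. The endpoint ID, the placeholder LoRA URL, and the exact argument names (a `loras` list of path/scale pairs, `image_size`) are assumptions modeled on how fal's LoRA-capable endpoints are generally shaped; check the actual Z-Image model page before running it.

```python
import fal_client  # pip install fal-client; expects FAL_KEY in the environment

# Link copied from the fal file directory after training (placeholder URL).
LORA_URL = "https://v3.fal.media/files/.../obaa_style.safetensors"

# Endpoint ID and argument schema are assumptions, not confirmed fal names.
result = fal_client.subscribe(
    "fal-ai/z-image/lora",
    arguments={
        "prompt": "desert highway road",
        "image_size": "landscape_16_9",
        # "scale" is the LoRA weight: how strongly it pulls on the base model.
        "loras": [{"path": LORA_URL, "scale": 1.2}],
    },
)
print(result["images"][0]["url"])
```

Looping that call over a few scale values (say 1.0, 1.2, 1.5) is a quick way to find the point where the LoRA starts to break, which Addy demonstrates later in the episode.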
So again, this is what it looks like without a LoRA. Mm-hmm. And this is with the LoRA. So there's definitely a difference: it's grainier, the vegetation is different, the color of the sand is different. Yeah, the other one feels like a default interpretation. When you say desert, it's got sand dunes. Yes. That looks like the first thing you picture when you think desert, like the Sahara Desert. Yeah. Right? Mm-hmm. Versus this being Borrego Springs in California, which is a more brushy, rocky kind of desert. Yeah. I just want to real quickly show you this one, right? Like, you can absolutely go overboard with the LoRA. So if I really intensify the LoRA weight here and run it, you'll see that it actually starts to break. LoRAs are not silver bullets. They're not gonna solve all your problems. You have to know when to use them. So, a creative interpretation of One Battle After Another. Yes, this is the sequel. What? Okay. And so, I was thinking it was gonna start throwing images of, like, Sean Penn or DiCaprio in there if you cranked it up too much. I didn't train it specifically on people. So none of your images had people? I think I had one or two Leonardo DiCaprios in there, just 'cause he was in the frame. But yeah, I didn't, because in order to recreate Leonardo really well, I would need like 15, 20 images of him specifically, and that would be its own LoRA that I would have to add to this LoRA. Not necessarily to replicate him, but more like the way he's lit. And if you wanted to make a shot of another person, but again in the same style of how, yeah, Mike lit those people. Okay. That's a good test to try out. We should definitely generate some of that. Yeah. But I'm gonna hand it over to you. Okay. So, let me just show the high level of my workflow. I went the reference image route with Nano Banana Pro, and I'm doing this in Freepik Spaces. I didn't need as many images as Addy needed to train a LoRA, so I kind of just grabbed four images per style. I grabbed a mix of images of locations and people, if I felt like the people represented the lighting. Sometimes it would go off the rails and just throw, like, Leonardo in there. Actually, there was one that made a really funky, beautiful montage image of One Battle After Another, but I'll load that up later. So I took these images. But before that, I had Claude write a system prompt. Basically I explained what I wanted: a prompt I could append to a text-to-image prompt, where Nano Banana would get the reference images and was supposed to replicate the style and cinematography and color and composition, but not use or recreate the actual reference images. So I had this extra prompt that I would add after my input prompt of what I actually wanted to generate. And the cool thing with Spaces is you can create multiple text boxes. Oh wow, that's really cool. Yeah, so you can create multiple text boxes and you can link them or drag them into the image generator node, and then you can call up both prompts.
So basically, it just makes it a little easier to keep things separate. You can just change the one prompt with the shot that you want. Yeah, it's more modular. Sure, yeah. Instead of changing everything. And then you can call them up here in the order that you want. Yeah. It's probably just combining them at the end anyway. Yeah, it's a concatenator, it concatenates them, but you don't have to really think about it. So, these were the outputs that I had. And so, according to Addy's text, which had the parameters of our challenge, one was "desert border town wide shot." And this was using my simple prompt that just says "nighttime exterior of a small town near Texas border, ultra wide," with four input images as reference. And this is what we got, which, I'd say, that's pretty good. It got the lighting, it got the vibe. This feels like if we were in a border town, yeah, next to the detention center. It feels plausible, like it's in the same universe. Yes. The lighting, the color palette, spot on. Mm-hmm. The actual assets in there, this feels more like a wild west border town than a modern-era border town, but then I'm just nitpicking. Yeah, well, let's try it again. Like, say "small, contemporary." All right. Yeah, "small modern town." Let's just go a little sharper, go to 2K. Ooh. Okay. All right, dude. Let's take it easy. It's not gonna be as fast as Z-Image, so we'll come back to that. Joey's really taking this seriously. Okay, what was our second shot? It was "desert exterior at noon, wide shot of rundown cars on an empty stretch of road." So sort of this post-apocalyptic feel. And these were the images I gave it. Again, I gave this one a mix of some closeups of people and then mostly the desert wide shots. Yeah. The Sean Penn shot really helped set the sunlight, like, at the top of your head. Mm-hmm. Which obviously has big implications for shadows and stuff. And then these are kind of good examples of the color palette that they had in the desert. Yeah, yeah. And then this is, let's see, the output. Yeah, that's not bad either. I mean, you got the noon high sun. I think it's a little bit too vegetative, too much vegetation. Yeah. And I mean, also, same with yours as well, these are pretty crappy prompts. Something much more specific about what we were looking for would improve the quality of the outputs. But that's not bad at all. That's totally usable at a previs, pre-production level, right? Yeah, to convey an idea. Yeah. And to build off that, what you can do in Nano Banana, and sort of how the interface in Freepik works: I have these four input reference images, and the sole purpose of those is just as that style guide reference. But I could add additional images, like if I had a specific location that I know we're gonna film at, or if I had our actor that we might use, I could add them as an input and tag them into the prompt, to give the prompt specific directions, like, "closeup shot of," and tag the person's image, "in a desert" or whatever location we want, and try to get the more specific outputs that I'm looking for. Sure. Awesome, very cool.
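Under the hood, the multi-text-box setup Joey describes is presumably just prompt concatenation. If you wanted the same modular pattern in a script, a sketch might look like this; the style-guard wording below is a paraphrase, not the actual Claude-written prompt from the episode.

```python
# Reusable style-guard text, kept separate from the per-shot prompt so only
# the shot description changes between generations (paraphrased wording).
STYLE_GUARD = (
    "Match the cinematography of the attached reference images: color "
    "palette, lighting, film grain, halation, and composition. Do not copy "
    "or recreate the content of the reference images themselves."
)

def build_prompt(shot_description: str) -> str:
    """Concatenate the shot prompt with the appended style guard."""
    return f"{shot_description}\n\n{STYLE_GUARD}"

print(build_prompt(
    "Nighttime exterior of a small town near the Texas border, ultra wide."
))
```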
And then the last one was: "man in the distance wearing a checkered red shirt, entering a pawn shop in a small city." This one was using the Sacramento look, kind of a more neutral-looking, um, government building. Not government, like an eighties small-town building. Yeah, or a mid-sized-city eighties office building look. Yep. And I mean, that one's probably the least distinct style, but man, Nano Banana nails text. That neon sign is so spot on. I know. The sharpness of this, yeah, that's where Z-Image struggled a little bit, which I'll show you. Yeah. And so, yeah, this is the output from that. Dude, good job. It feels successful. Yeah. Let's see if this one worked. Oh, palm tree. No bueno. I'm just kidding, it's my showdown energy. Yeah, this feels more like, this feels like the town from Eddington a little bit. But you have the border wall there, that's pretty cool. It did put in the border. Yeah. I mean, again, these were not the best prompts. Yeah, but that hazy, grainy, halation light, stuff like that comes through. Yeah. And this was me trying to push it, 'cause I wanted to see what would happen if I wasn't trying to replicate something that already feels like it's from the source images. Right. So this one was a prompt like "a futuristic car driving downtown." That was the base prompt, but this one also felt like it kept the same style and lighting and vibe of the source images with a completely different scenario and location. Yeah. I think the color palette, they nail it, maybe lacking some magenta. Yeah. And the streets aren't perfectly dry, I love the wet streets with a little bit of dirt. Yeah. You got the halation that we see, and the clarity, like the VistaVision stuff, is so clear. Right? And this was the one that went off the rails, which could just be a good poster, the montage one. Yeah. Yeah, that's with the border wall, the chain-link fence and everything. That's awesome. So yeah, that was my reference attempt. Awesome. So let me show you the other two scenarios. What I'll do, I actually did generate some, but we'll do it live. So here we go. F it, we'll do it live. We'll do it live. So I was really fascinated by the nighttime scenario. To me, that was a big moment in the film. All the revolution stuff really happens at night. So that was the part that really struck a chord with me emotionally as I was watching the film. I was like, you know, every revolution is done by people, and people do, like, one thing at a time and they chip away, and over time that little stuff adds up to a big thing happening, right? Mm-hmm. And so, along that same creative thought, it's like, okay, what if two people are just walking around at night and they're planning some operation out? So: "two people in black hoodies walking down the street at night, dimly lit, dramatic lighting with splashes of color." I'm gonna just turn down the weight to, like, I don't know, 1.2. Let's go ahead and generate. So this is what we get. Again, I think I should specify, like, silhouette off the bat.
The lighting feels a bit too neutral. I was able to get a really good output, which I'll show you, I saved it. How many tries did it take you to get that good output? Uh, two. Yeah. Okay. All right, that's good. Yeah. This is the look that I wanted. I wanted a silhouette, you know, just lighting coming from the front. And you can see that the sodium lighting is very orange and yellow. That was the look and feel of the film. And because the film was white balanced to sodium, the other lighting caused significant splashing of color, as you can see there. Um, again, the bokeh is... White balance? We were talking about film stock, sir. Film stock is, correct me if I'm wrong, but the white balance is baked into the film stock, right? The film stock has a set color temperature. That's it? Yes. Yeah. Okay. Damn, you're schooling me hard. That's my film school knowledge coming back. It's coming in handy in the showdown. Yeah. Alright, so this is the LoRA output. And again, I can influence the weight a little bit. So let's just go up to like 1.5 and see how much more of the movie I can squeeze out of the toothpaste tube. Yeah, again, it's kind of starting to get lost here, but it's starting to get more and more colorful, which is really nice. So I'm gonna dial it back to like 1.4. "Two people in black hoodies walking down a street at night, silhouette lighting, dimly lit, dramatic lighting with splashes of color." And then, just to see what the LoRA is doing, I'm gonna run a clean example without a LoRA here on Freepik. I mean, I'll say this is the nicest thing about Z-Image, the speed. And that's important for iteration too, 'cause if we're in this phase where we just want a bunch of ideas and want to brainstorm fast, you know, the 30 seconds for Nano Banana gets to be a pain in the ass after a while. Yeah, especially if you're generating like a hundred images in an hour, right? Yeah. Directors usually don't have patience, right? If you're sitting next to a big director or somebody like that, you better be fast. So this is plain Z-Image without the LoRA. Uh-huh. And this is Z-Image with the LoRA. So there's significant influence on the lighting, on the color palette, for sure. I'm squinting. You're squinting? Come on, man. Look how saturated that yellow sodium lighting is there. Yeah, you got some of the neon stuff coming through over there. And then look how boring this looks. This just looks like a generic city. Oh, I think you had two different ones flipped up. Yes. This one has, no, not the lighting vibe. Yeah, the one with the LoRA is just more dramatic. It just feels more like a movie, where this feels more like a YouTube video. Okay, over to you. All right. So I ran the same prompt with the same exact setup, and here it is. Yeah. This has got some border town vibes to it for sure. From your last generation, to me, this feels more like in the world of the film. Like, we have much more of the halation, the streetlights, the color palette. Yeah, I think you win this round. Okay. Yeah, this is much closer.
I think I won the desert round, and you are perhaps winning the nighttime. Well, gimme a specific, gimme a desert prompt. Oh, okay. Just "desert highway road." What was your highway? Did you have a highway desert image like this? I had a really good image, 'cause I used the white Dodge Charger in mine. See if you can copy this prompt on my side here. And that was one of them. This was another one. Yeah. That's not, like, let me show you here. Oh, because you think there's too much vegetation in these? Not just that, but it lacks the menacing look of that scene. It's hard to describe, but here, let me regenerate it. See, this is where, with the reference images, if you prompt for a person without describing them and you have reference images with people, they'll creep in. So I had just asked for a fruit stand, but it put, like, Sean Penn in the fruit stand. So see, that would be so hard to do with Z-Image and LoRAs. Nano Banana does it effortlessly. Yeah, that's impressive. And the text, the sharpness, yeah, it's good. And if you show this at an actual pitch meeting, you're gonna get laughed out of the room. Yeah. This was the first one that came out. But yeah, I see what you're saying, 'cause there's no cactuses in the reference stuff. And you're going apocalyptic. I mean, you've probably seen the movie more times than I have. I might have prompted it for that. Okay, can you take the apocalyptic stuff out? Like, we don't need wrecked cars, we need regular cars that still run. Just old cars. Oh, okay. Well, rundown cars. Okay, get rid of rundown. So just say old cars, a shot of eighties cars, 'cause they used a lot of eighties cars. Yeah. Are they driving, or are they on the side of the road? You know the scene where Leonardo whistles at the people? Yeah, by the fruit stand. Yeah, yeah. Can we try to recreate that? Why don't you come up with a prompt. "A high angle, wide shot of a rundown eighties car at the intersection of a highway in the desert, by a fruit stand." Mm-hmm. And again, this is the downside of that. I will say, when you were talking about LoRAs before, I was thinking the pro of LoRAs was a different use than what you went with. I feel like they are really good if we were trying to do a complete style transform, like turn these photos into an illustration or a cutout or some other distinct style. Yeah, that is the core, like, bread and butter of what LoRAs do. Yes. Yeah, 'cause then you get consistency every time, usually, if you trained it well, if you build a good enough style transfer, image-to-image model. Yeah. This darn near feels like the movie right here. Yeah, for sure. I would be nitpicking if I had notes. But yeah, you nailed it. So let me take that same prompt, "a high angle, wide shot of a rundown eighties car at the intersection of a highway in the desert, by a fruit stand." So, no, this is vanilla. This is vanilla Z-Image. Oh yeah, that vanilla one did a pretty good job. Exactly, dude. Yeah, don't sleep on Z-Image. Okay. So here it is, high angle, wide shot of a rundown... Oh my god. Another knockout. But come on, dude.
Look at the lighting. Finish him. Look at the color of the sand. Well, look at the vegetation, man. It didn't even get the highway right. It's like a passing zone in the highway. Alibaba, you're letting me down with Z-Image. Okay, so I'm gonna take that same prompt and just throw it into plain Z-Image. Vanilla Z-Image has made a nonexistent car, a car that does not exist unless you take it to a chop shop. Well, it's struggling with the intersection part, 'cause the intersection is structurally like a cross, like perpendicular lines, and that is influencing the car. It has made the car intersect itself. Yes. That's not good, man. That's not good. Um, do you know anybody at Z-Image I could talk to? I don't. You win this time. But yeah, if you have done anything like we have done and had success with either of them, let us know in the comments. All right. 'Cause I'm curious, 'cause this is something we keep going back and forth on. Yeah. I feel even more firm in my belief that reference is all you need, and not spending time on making LoRAs for this use case. Yes. Um, I am gonna caveat what Joey just said with: if you want the VistaVision look and feel, you can absolutely extract that from a LoRA much easier. But as far as structural content, things like what you're doing with building cities and adding border walls, I think Nano Banana does it better. So perhaps it's a mix of both workflows for your creative output, right? Which also, if you're in Comfy, you could build some crazy workflow that combines the best of both worlds of LoRAs, Z-Image, and Nano Banana for specific edits or something. Okay. Or Qwen. I mean, you could try Z-Image with a LoRA going into Qwen to make specific adjustments. Mm-hmm. That's kind of the beauty of ComfyUI. What was this prompt? Uh, so I'm just playing around with it. "A high angle, wide shot of a white Charger driving fast down a two-lane desert highway, extremely ultra wide shot," which this is not. I have found these models are very subjective with camera terms, with how they interpret closeups, mediums, and wides. Yeah. Yeah, that looks like it could be from the movie. Sure. Ish. That's the closest. Yeah. I mean, if you compare that to the Nano Banana one... I'll give it to you. Damn. After demolishing you with references, I guess the loser has to go shoot a film in the desert with actual film. I'll be doing that this weekend. The loser has to go buy the other person film stock. The winner gets to stay in the air conditioned room using AI. The loser has to be in the desert shooting film. I don't know if that's a good punishment. I went to the desert voluntarily. I know, I know, I was just... So this is stock Z-Image again. Yeah, the desert is a very generic desert versus, yeah, the one from the movie. And oh, wow, there's even a little bit of motion blur in there. No, the LoRA definitely got the vegetation and some of the colors from the reference images more accurately than vanilla, which just made it look like a generic desert with sand dunes and stuff. Yeah. So Joey, what are your big takeaways from our showdown? My takeaway is, I feel reassured in my belief that reference images are a pretty good solution if you're trying to replicate some look or element or shot. Nano Banana Pro is still undefeated right now, in my opinion.
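The mix-of-both-workflows idea Addy floats above, a Z-Image-plus-LoRA base frame handed to Qwen for targeted edits, would chain naturally with the same client. This is only a sketch: both endpoint IDs and their argument names are assumptions standing in for whatever Z-Image and Qwen image-edit endpoints you actually use.

```python
import fal_client  # assumes FAL_KEY is set in the environment

# Stage 1: style-heavy base generation with the trained LoRA
# (endpoint ID and argument names are assumed, not confirmed fal names).
base = fal_client.subscribe(
    "fal-ai/z-image/lora",
    arguments={
        "prompt": "high angle wide shot of a rundown eighties car "
                  "on a desert highway",
        "loras": [{"path": "https://v3.fal.media/files/.../obaa_style.safetensors",
                   "scale": 1.2}],
    },
)
base_url = base["images"][0]["url"]

# Stage 2: hand the frame to an image-edit model for a structural change
# the style LoRA can't make on its own (endpoint/schema assumed).
edited = fal_client.subscribe(
    "fal-ai/qwen-image-edit",
    arguments={
        "image_url": base_url,
        "prompt": "add a small fruit stand at the roadside; keep the "
                  "lighting and color grade unchanged",
    },
)
print(edited["images"][0]["url"])
```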
Yeah. I would say, with your reference images, you pushed Nano Banana into a place where you can't really get that output by default. The reference images really did give you the halation and the blue tones and some of the ruggedness of that border town that I don't think vanilla Nano Banana would give you just by prompting. No. Yeah, just by prompting. That'd be a good test too. I could maybe give it an image, have it turn that into a prompt, and see how it does. Yeah. But yeah, I think you're right, the reference images help. You know, I'm trying to think, with your LoRA thing, maybe giving it everything in one LoRA doesn't work. Maybe you need, like, a halation LoRA, a color LoRA, and you kind of stack them together. Yes. But you know, my thought is, is that amount of work worth the output you're gonna get versus Nano Banana? The argument in your case, I would say, is that Z-Image was so fast. If I'm brainstorming and don't have to wait 30 seconds for every new idea, that's a big friction point removed. Yeah. I think having a one-stop LoRA for all three scenes was probably not the right approach. I was hoping the captioning would take care of a lot of that, but turns out, no. So if I were to do it again, and please gimme another chance, Joey, if I were to do it again, I would probably break it out into a LoRA for each of the scenes. And you should have had enough, I split them up into scene folders. I think each scene had at least 15 or so images. Yeah, for sure. You gave me enough. I'll have to try that again. Yeah, maybe I'll try it again. I don't know, but I am still held back by the fundamental limitations of Z-Image. Like, Nano Banana itself is just really good. Just as a model itself? As a model itself, yeah. So maybe that's where, uh, Flux Calvin Klein comes in. I was gonna say, try the new Flux one. Try the new Flux, uh, Zoolander model. Yeah. Or maybe even Qwen. I think you can do LoRAs with Qwen. Yeah, maybe even Qwen 2511. I think so. Scout it out. I just find out about models I hadn't thought of by going to fal, 'cause they have everything, and I run searches and stumble on things I hadn't heard of and try them out. And sometimes I'm like, oh, that actually works pretty good. So I would search fal for LoRAs and see what other models support LoRAs, and maybe they work better. Okay. All right, I will do some homework and get back to you. Alright, well, yeah, thanks everyone for watching. Links for whatever we talked about here, and every other episode, are at denoisedpodcast.com. Give us some more ideas for showdowns. Joey and I would love to throw all of our energy into random stuff like this, and hopefully you get a lot of education and also entertainment out of it. All right, thanks everyone. Catch you in the next episode.