Denoised

We tested JSON prompting in Veo 3

VP Land Season 4 Episode 48

Is JSON prompting a useful technique or just an influencer trend? In this episode, we examine the heated debate around structured prompts in Veo 3, test the claims ourselves, and share the results. Plus, we dive into Higgsfield Steal's controversial marketing approach and explore AlphaGo, the AI system designed to build other AI models that could accelerate the path to artificial superintelligence.


--

The views and opinions expressed in this podcast are the personal views of the hosts and do not necessarily reflect the views or positions of their respective employers or organizations. This show is independently produced by VP Land without the use of any outside company resources, confidential information, or affiliations.

In this episode of Denoised: JSON prompting in Veo 3, useful hack or hype; Higgsfield Steal, a controversial new product with a weird marketing campaign; and AlphaGo, an AI model that can build other AI models. Let's get into it. All right. Welcome back, Addy. Thank you. It's nice to be back. Where have you been? I have been deep in Asia for the last two weeks. That's amazing. I was in Tokyo for about a week and then took a small flight over to South Korea. Mm-hmm. I was there for another week and now I'm back, severely jet lagged, and just really happy to be podcasting again. Good to be back here. Yeah. Well, I'm glad you're awake. You seem awake enough. Yeah. I'm gonna share some photos. I saw a Kodak store, like a full-on branded Kodak store: Kodak sandals, Kodak t-shirts, obviously the film you can buy there. It was so much in demand that I saw another one being built out. Oh, another Kodak store. Yeah, another Kodak store being built out in Tokyo. So is that just more of the licensing play, like that stuff I saw at CES where it's just Kodak everything? It is exactly that. And I got myself this. This is from my childhood, I think yours too. I do remember that. Yeah. I never thought, back when we used these for real, that one day we'd buy one as a prop and be nostalgic for it. Yeah, and it's gonna be on the podcast set here. Wait, this one holds 39 photos. Is that a lot? Yeah, it used to be 24, right? It was 24 or 36 exposures. This says 39 exposures. Dude, they stepped it up. They gotta step it up to compare and compete with a flash card. I don't know where I'm gonna process it if I ever do use it. I think there are some mail-in services that still exist. But also, here in LA at least, there should be some left. Nice to see Kodak making a comeback. You know, like we talked about Jurassic World recently, and then Sinners. Yeah, it's awesome. Film's not dead. Film is very much not dead. All right. So while you were gone, this started boiling up like a week ago, but it's still trending: JSON prompting. Yes. It has become a hot topic in the Twitter sphere around Veo, especially Veo 3 prompting. And I've now been seeing JSON prompting bleed into AI prompting in general, but we'll kind of focus on Veo 3 'cause that's where a lot of the attention was. Mm-hmm. I would say this started, or I don't know if it started, but it came on my radar with Dave Clark. Yep. Who posted a new short film. It looks really good, and he did it all with Veo 3. And then he was saying that he changed his prompting format to JSON. And so JSON is a data format. Yeah, it's plain ASCII text, so it's our alphabet and numerics, so anybody can read it in English, but it wraps up into like a text file, if you will. Yeah. And it's a structure; it basically gives a lot of structure to the prompt. Yep. And you have sort of a field for, um, let me look at this sample one. Yeah, it's somewhere between code and human language. Yeah, it's human readable, but it's got some of the coding formats. I'm gonna pull up an example here. So in this one example, it has a description of your shot, but then it also has a field for style, a field for camera, for lighting, for environment. And the thought behind this was that by having that much structure to the prompt, in JSON format, you'd get better outputs.
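If you want to see what this kind of structured prompt looks like in practice, here's a rough sketch in Python. The field names (description, style, camera, lighting, environment) mirror the layout discussed above, but they're illustrative, not an official Veo 3 schema, and the shot details are made up.

```python
import json

# Illustrative structured shot prompt; field names and values are assumptions,
# not an official Veo 3 schema. The whole object gets pasted into the model
# as a single text prompt.
shot_prompt = {
    "description": "Close-up of a coffee cup on a diner counter; tilt up and "
                   "rack focus to a detective staring out the window",
    "style": "photorealistic, moody neo-noir",
    "camera": {
        "shot_type": "close-up to medium",
        "lens": "50mm",
        "movement": "tilt up, then rack focus",
    },
    "lighting": "warm practicals, soft window light, light haze",
    "environment": "1960s roadside diner, rainy late afternoon",
}

print(json.dumps(shot_prompt, indent=2))
```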
And then also, if you're doing multiple shots and you're trying to have consistent characters and a consistent environment, you keep those factors the same. Right. So you can re-prompt the shot, but you have other elements that stay the same. Yeah. It becomes more modular and easier to scale up. Like, once you have a structure, all you do is, if you're gonna change from a wide shot to a medium shot, you go into the camera section, change the lens, something like that. Similar? Yeah, or just change your prompt, but at least your lens data's the same. Sure. And your lighting data's the same. I was thinking it also leads to, if you're messing around with Claude Code or something, you could build a little interface with dropdowns, write up your prompt in a nice interface, and it spits out a JSON format with the consistent elements. Yeah. I would even go a step further and use a VLM, a vision language model. ChatGPT has one built in, where you can input a frame of something you're trying to create. Right. It could be a hand sketch, it could be something out of Unreal Engine, and then have it give you a JSON file with the prompt that would be required to generate it. Yeah, and that was a test I did, where I started just having Gemini or ChatGPT give me the prompt: tell it the shot I wanted, but have it give it back to me in a JSON format. In the Dave Clark original post, he sort of also worked on some other things, which he didn't quite get into the details of yet. And I'm curious about this, 'cause this also might play into how JSON is a speedier way to build out a short film. He said he wrote an agentic workflow to automate JSON formatting based on the entire script. So it sounded like he gave the entire script to Gemini and then got, not just a single shot in JSON, but the entire sequence broken down shot by shot. Yeah, sure. He sort of posted a little bit about that, but I'm kind of curious how that further develops when he posts or explains more of his workflow. So, JSON prompting in general: the argument is it gives you structure and you get better outputs. But then other people were saying this is a bunch of hype. Jason Zada: "AI peeps, JSON prompting is just influencer hype. Highly structured prompts are way better than all the extra info that I keep seeing in people's JSON prompts." Ouch. So he's basically saying a nice, well-written text prompt is enough. It's funny, I don't agree with this guy's language, like this is not the right way to approach it, but I do understand the sentiment here, and this goes back to what I've been saying for a lot of episodes: movie making is done on a shot-by-shot basis. Mm-hmm. And if you try to automate it with a giant JSON file, you're still working on a shot-by-shot basis, but that entire shot is now inserted into a giant sequence JSON. Right? So why do that when you could be as granular as the shot level? And I think you'll find that as you go higher and higher up the quality ladder in movie making, the more and more granular you have to be, so you go from shot by shot to now tens of frames, or maybe a single frame, and so on. And I don't see JSON formatting and JSON prompting being beneficial for those high-end films. Having said that, for those 10-second, 30-second ad spots, sure, go use it all day long. Yeah.
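To make the modularity argument concrete, here's a minimal sketch of how you might reuse a structured prompt across shots, keeping the character, environment, and lighting blocks fixed and swapping only the camera section. The helper and field names are hypothetical, just to show the idea.

```python
import copy
import json

# Hypothetical base shot: everything except the camera block stays constant
# across the sequence, so the character and environment read the same shot to shot.
base_shot = {
    "character": "world-weary detective, tan trench coat, five o'clock shadow",
    "environment": "1960s roadside diner, rainy late afternoon",
    "lighting": "warm practicals, soft window light",
    "camera": {"shot_type": "wide", "lens": "24mm", "movement": "static"},
}

def build_shot(camera_override: dict) -> str:
    """Return a JSON prompt that reuses the base shot, changing only the camera."""
    shot = copy.deepcopy(base_shot)
    shot["camera"].update(camera_override)
    return json.dumps(shot, indent=2)

wide = build_shot({})
medium = build_shot({"shot_type": "medium", "lens": "50mm", "movement": "slow push-in"})
print(wide)
print(medium)
```

The same idea would scale to the script-level agentic workflow Dave Clark described: a breakdown just becomes a list of these objects, one per shot.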
I mean, I'm thinking, well, first off, look, try this out for yourself. If this helps you structure your prompts better, and you find that you're getting better outputs, cool, have fun with that. JSON, I think, um, I don't find it super easy to read. Especially reading it on a social media platform, you really need to copy it and put it into an actual code editor, like Sublime or something that's free, where it'll format it nicely for you and you can see the hierarchy of the levels. I mean, I think overall just text prompting in general is annoying and kind of dumb. Yeah. It's not the way. We're shifting away from that. Absolutely. That's not the way we're gonna go forward. Text prompting, even if it's 500 characters, or in this case 10,000 characters, whatever, is not gonna be the way of the future. Yeah. We're gonna have a better interface into AI generation. Yeah. Something more visual, something more tangible. Absolutely. Yeah, moving things around with a mouse and keyboard or a timeline or something like that. And there have been tons of attempts at that, and I think it'll only get better. Also, the formatting of JSON is not gonna save you or give you any better results, because all you're doing is essentially adding whitespace and braces and labels and tag names. Yeah. And there are some arguments too saying you're basically adding a bunch of extra stuff that is gonna use more tokens, it's gonna cost you more, and not necessarily add more. Then there were some other suggestions put out. Like, Kris put out that YAML, another formatting language which I wasn't really familiar with but which uses fewer tokens, is another way to format your prompts. Sure, you could put the info in there. End of the day, it still comes down to natural language, right? Yeah. Like, these CLIP encoders that these video and image generation models use are trained on natural language. They're not trained on JSON scripts. Mm-hmm. So whatever JSON file you insert is gonna get decoded anyway. Or the argument was, people were like, oh, they were trained with JSON, but it's like, yeah, they've seen JSON formatting, but they were trained to understand human language, natural language, not JSON. Like, when I say photorealistic, that has a very specific meaning to you and me, and that's what the CLIP encoder is gonna try to generate when I put that word, photorealistic, in my text prompt. Yeah.
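As a quick way to sanity-check the token-overhead argument, here's a small sketch that serializes the same shot content as JSON, YAML, and a plain sentence and compares their sizes. It assumes PyYAML is installed, and character count is only a crude proxy for token count; real tokenizers will differ.

```python
import json
import yaml  # PyYAML: pip install pyyaml

# Same shot content in three formats; compare rough sizes.
shot = {
    "subject": "detective at a diner counter",
    "camera": "close-up on coffee cup, tilt up, rack focus",
    "lighting": "warm practicals, soft window light",
    "style": "photorealistic",
}

as_json = json.dumps(shot, indent=2)
as_yaml = yaml.safe_dump(shot, sort_keys=False)
as_text = (
    "Photorealistic close-up on a coffee cup at a diner counter, tilting up "
    "and rack focusing to a detective, lit by warm practicals and soft window light."
)

for label, text in [("json", as_json), ("yaml", as_yaml), ("plain", as_text)]:
    print(f"{label:5s} {len(text):4d} chars")
# JSON usually comes out longest because of the quotes, braces, and indentation;
# how that translates into tokens depends on the tokenizer and the prompt.
```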
So I wanted to test this. Oh, nice. So I did a couple of tests, and basically I used Gemini, and I had a very basic shot of a person at a diner: we start on a close-up of a coffee cup and then tilt up and rack focus to a detective guy who's looking out the window. Nice. And I just kind of gave it that basic prompt and then said, build out the shot in a JSON format. Okay. And I gave it one of Dave Clark's JSON prompts as a structural base so that it would just change it, modify it, and give me the output. And then I had it rewrite that JSON prompt as just a regular text prompt. So then I ran them both twice. Okay. And I ran them in Veo 3 Fast. Starts on the cup, tilts up, rack focus, guy looks out. Also, it's just crazy how realistic this looks, like the steam coming off the coffee. The shot itself is crazy. Yeah, yeah. Fine, right? I mean, it did what I asked for. Yeah. I also gave it some specific time markers, like zero to two seconds we push in on the cup. Oh, nice. Okay. Two seconds to four seconds we pan up, or tilt up. Yeah. That has a little handheld feel to it. The other one came out really dolly-like. And then this one is the paragraph format. Completely different shot. It hit the points. I mean, it still did what I asked. This one's weird 'cause he sits in, so this one didn't quite nail the specifics, and it's inverted, mirrored. Yeah. I didn't really give it screen direction. I will say, the JSON one hit the shot I had in my mind spot on. But having said that, the second one you showed, with a little bit of a handheld feel, I prefer that one. Yeah. These are the same prompts, I just did two outputs. Right, but seeds, again, it's like a random draw. Yeah, these are just different seeds for sure. And then, let's see, I did another one where I wanted to have like a reverse on the guy. None of them understood what I had in my mind, but I didn't specify screen direction in my initial prompt either. "Can I get you some more coffee?" The only thing I wanted it to say was "gonna get you some more coffee," and then it had some chips in the beginning. I mean, it's so crazy how good that looks, like if you look at the gait on the waitress and the coffee sloshing around. Yeah. Oh my God. "Can I get you some more coffee?" I would do some audio massaging on that dialogue. Yeah. And in my description, I was thinking it would be from the right side of him, but this is more just, I would need to finesse my prompting for sure. Um, and then this is the paragraph version of the same shot. "Milk hot. Can I get you some more coffee?" Yeah, it didn't pour. Yeah, I mean, apples and oranges, it's super simple. Yeah. And then this was the one you sent me. Yes, I just ran the prompt as is, and let me give a shout out. Yeah, shout out to Syed Ali Kazim for sending us this JSON. Yeah. And this is what he generated. I'm more amazed that it's able to do the copyrighted logos and stuff. So this is the JSON prompt, the original one he posted, different seed from whatever he demoed, but the same thing popping outta the box. And then I gave Gemini the same exact JSON prompt, and I said, rewrite this into a paragraph format. It's pretty much exactly the same. The two wheels fly up and never return, and then two new wheels appear. It did this one too, it did that here too. Like, see, these wheels go out of that tire. Yeah. We need to work on that. But yeah, so it's more or less the same. Pretty much the same, yeah. And then there was another one. Yeah, I'm more amazed that it could do the Superman and not give you a copyright error. Correct. Yeah, this was one that I found online that they posted, and then I just had it rewrite it, and it does pretty much the same thing. Yeah. Also, when it comes to that copyright stuff, I wonder, and again, I'm obviously not a lawyer, but there is a certain gray area where you can be doing this and it's either under parody law or under fan fiction. So, yeah, I mean, I think again, going into the outputs and how you use this, there's leeway to that.
I just had issues with Veo 3 where I had a name of something, and it flagged it and said, we can't generate this. It didn't tell me why. I assumed it was because I had a name of something that had some copyright protection, but it seems that's not the case, 'cause this prompt said Superman and it spit out Superman. Like, one of the channels that I watch on YouTube, I think it's called Star Wars, it's like a fan-made Star Wars thing that's fully AI generated. Okay. Oh, like a stormtrooper, some vlog stuff? It's nicer than that. Okay, I'll pull it up. It's basically taking bits of Star Wars that were never included in the original six or seven films. Okay. Like moments where Luke meets General Grievous for the first time, something like that. Okay. And it'll use AI to generate it. And that stuff is up on YouTube. It's making a lot of money, you know, it's getting a lot of views, and they're not taking it down. Yeah, there's no copyright infringement there. I'd be curious if there was something similar to how, if you upload a video with a song, depending on the label, they have choices: they can either block the thing entirely or claim monetization, so the video stays up. Oh, but they take the money you make. Got it. So I'd be curious if, yeah. And if Lucasfilm did something, like, it's Mark Hamill, there's no mistake about it. They used his likeness and everything, probably without his permission. Yeah, but it's still up there. Yeah. That's a separate wild west of stuff, but I'm just more surprised that Veo 3 is putting these outputs out. Yeah. I mean, I think with the JSON stuff, look, try it out. If it helps you make better stuff, if it helps you organize your shots, or you just think better that way, sure, cool, go for it. If you like long paragraphs better, cool, whatever works for you. Yeah. It's like, look, I like my coffee as an Americano, you like your coffee as drip, but end of the day we're getting caffeinated. Come on, guys, however you wanna get caffeinated. Yeah. I mean, whatever works for you. And also I think it's just more something to be aware of, and you can try it in your toolkit and maybe it works for you. Also, you know, maybe if you just had a little bullet point list with all of the same exact things listed out, you know, shot, framing, lighting, you might get the same exact quality output anyways. Exactly. I think the thing they're missing in the argument is still consistency. Like, JSON is not gonna save you from that, from shot to shot to shot. No, no. I mean, maybe it'll help you get a little bit closer, but no, it's not. Um, and since we last spoke, Veo 3 now does have first frame. Yes. Yeah. Which, to the person who commented that Veo 3's had first frame for a while: we recorded that before it did, 'cause it launched literally the day after, I think. Yeah, man. Good luck having to record that. If you have an AI podcast, good luck trying to stay up to date. Yeah, good luck trying to create evergreen content for two weeks from now, 'cause it's all outdated basically. But because they added first frame, there was another thing posted, and I think this is also interesting on a bigger level, where
even the companies putting these models out are still figuring out cool ways to use the model that they did not even intend or know about. So Google posted a tip, a hack they realized, where basically with Veo 3, and I think Veo 2 works too, if you take your first frame, mm-hmm, and then you mark up what you want to happen, like with text or text boxes on the image, and then you give it that image and say, follow the instructions on the image, or something to that effect, it will start with the first frame, with that marked-up image, but then dissolve out the text and make whatever you described happen. That's so cool. So you lose a few seconds in the beginning, 'cause it's gonna be unusable, but you have far greater control over what you're trying to do. And that's how, if you've ever worked on any feature or TV show, you know, that's how directors give notes. That's why we have Frame.io markup. That's why it's visual. Yeah, exactly. Where it's like, oh, I just want to literally draw on the frame, mark it up: move this this way. Frame.io, I mean, Frame.io too, but there are tons of other companies that solely exist for this. SyncSketch is up there. Yeah, ftrack, and then ShotGrid. There's a bunch of revisioning tools. Yes, a bunch of revision tools with very specific features to mark up part of the frame. Yeah. So people have been posting different hacks with this: you can do text boxes; someone even realized you could upload an image, like they put a picture of an image next to a TV screen in the frame, and then it put the image on the TV screen, oh, with static and stuff. Wow. So you can not only do text inputs, but image inputs as well. Dude, I love it. Crazy use case, but it's an example of how everyone's messing around with this stuff and there's no rule book. My brain instantly goes to how difficult that task was pre-AI versus now with AI. Yeah, right. I mean, between that and also Aleph, which we'll talk about at the end of the week. Okay. But Runway's Aleph, and the demos I've been seeing from that, and what it can do with existing video: Cristobal, you did it again, you son of a gun. Addy's open invite to Cristobal still holds, especially now. I'm a big fan of Aleph. We're gonna do a super cut. So, yeah, some cool prompting stuff. Yeah, prompting debates.
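For anyone who wants to try the marked-up first frame trick, here's a minimal sketch using Pillow to stamp an instruction onto a still before handing it to Veo 3 as the first frame. The file names, coordinates, and wording are placeholders, and the actual Veo upload step (plus the "follow the instructions on the image" prompt) is left to whatever UI or API you're using.

```python
from PIL import Image, ImageDraw  # Pillow: pip install pillow

# Load the first frame and draw an instruction on it (placeholder path/coords).
frame = Image.open("first_frame.png").convert("RGB")
draw = ImageDraw.Draw(frame)

# Box around the region you want to change, plus a written note for the model.
draw.rectangle([(820, 430), (1180, 640)], outline="red", width=6)
draw.text((830, 390), "Turn this TV on and fill it with static", fill="red")

frame.save("first_frame_marked_up.png")

# Feed first_frame_marked_up.png to Veo 3 as the first frame with a prompt
# along the lines of "follow the instructions written on the image, then
# dissolve the annotations away" -- per the tip described above.
```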
All right. Higgsfield Steal. Higgsfield Steal. So there have been, you know, debates, and there are kind of fine lines right now with AI and copyright protection and ethical uses. Higgsfield's put out a product that's literally called Higgsfield Steal, and their demos, well, their pitches basically... I mean, honestly it sounds kind of similar to a lot of other products out there, where you can give a couple of source images and kind of create something similar to them or combine elements. A lot of other products have protections in place if you're using copyrighted characters or likenesses of real people. Yeah. I believe indemnification is the legal term. Uh-huh. So as long as you're using that product under the correct license and in the correct way, the company that makes the product essentially has your back. Okay. Yeah. Higgsfield Steal's demo is using, where is it? Some of the demos are, like, Dua Lipa, and I'm trying to think of what other demo I saw. The Daniel Craig one. I saw Daniel Craig as a White Walker. Oh yeah, Daniel Craig as a White Walker from Game of Thrones. Yeah. So basically just using all sorts of real people, probably without their awareness, and IP-protected scenes and stuff. It's not good. It's not good at all. It's a bit icky. But what was weirder was that a lot of the top AI creators on X started posting, promoting Steal, and I believe they're all paid Higgsfield ambassadors. I mean, a lot of the AI creators are ambassadors for these companies, but they were all sort of putting out the same messaging. So, yeah, this was the Daniel Craig White Walker one. Joker. Yeah, it's all right. But the messaging was just weird, you know? It's "Hollywood's in trouble," and "the new Higgsfield Steal AI tool can now steal camera angle, composition, color, lighting." Dude, that "Hollywood's in trouble" dramatic thing is getting old. The "Hollywood's in trouble" one's always kind of funny, but Hollywood is evolving. It's not in trouble. It's in a transitionary period right now. It'll be fine. Adding nuance doesn't work well on the internet. Oh yeah. Here, this one's from TechHalla, who's a big creator and has a lot of good stuff, but then the messaging of "forget about copyright, just grab it" is like... you probably shouldn't forget about copyright. It's kind of important. Yeah. My first immediate thought was, Higgsfield's probably a Chinese company, so they don't have to abide by American laws. That's true. And probably true. I looked it up, and it's not; it was founded by former Snapchat executives. That's right, I forget, we talked about this a while ago. Yeah, and I guess he has funding, according to Google, from Kazakhstan. Okay. So, I mean, it's still an American company. Look, I think they're just gonna ready, fire, aim, and then see what happens. But yeah. And then this other promotion posted a few days ago had the same exact messaging. This one from TechHalla again: "Midjourney stole from artists constantly. Now you can take it back. Use Higgsfield Steal to get what you want from Midjourney, guys." And it's a video. And then literally the same exact messaging from Alex Petrascu: "They stole first. Now it's time to steal better. Take directly from Midjourney with Higgsfield Steal." Same video, similar messaging. It feels like, there's Fox News, um, the Sinclair Broadcast thing. Remember that clip where all of the local TV stations were reading the same exact script? Yeah, it feels like that in the AI space. Totally. Yeah. And in my opinion, this is not the right move, not a great look, especially when there's all this other literal legislation about what AI companies can train on and do. Yeah. And maybe in the influencer space, in the consumer space, sure, I think it could be a cool meme or cool trope to get some eyeballs on your product. Mm. But as you move up into the enterprise space with professional studios, big companies, they're not gonna want anything to do with stealing. No. Absolutely not. I don't think Higgsfield's in that position. Yeah.
I mean, in the enterprise space, just call it style reference, call it something else. Yeah, like every other company does for something similar. I mean, yeah, I guess if you wanna make a name and a splash for yourself and stand out, but I dunno if it's the best long-term play. Yeah, totally. But it's good to get a calibration on where AI is right now. On one end you have a company like Adobe that's being super sensitive about stealing 'cause they've made some mistakes in the past, and on the other end of the spectrum you have Higgsfield, who doesn't give a f about any of this. So, you know, obviously we're somewhere in the middle, where we understand that AI has intrinsic value in bettering our content creation, our lives. At the same time, there are some real ethical issues it's up against. Mm-hmm. And we gotta figure that stuff out quickly. Yeah. I dunno, the whole thing felt icky. Icky is right. I felt icky. In Tokyo it was 95% humidity. Different kind of icky. All right, the last story, this AlphaGo paper. Yeah, interesting paper. I gotta say, a forewarning to our viewers: be prepared to be scared. Okay. Okay. So we've talked before about artificial superintelligence. We know that companies like OpenAI, xAI, and pretty much everybody are working toward artificial general intelligence. So AGI, for lack of a better term, is where a model or a combination of models matches human intelligence. Totally. Like, it's as smart as you or me or whatever. And then ASI, artificial superintelligence, is as smart as several humans or a thousand humans. So it's like infinitely smarter, and it has the ability to get smarter over time. Okay, so how is this actually gonna happen? This is a glimpse into the future, my friends. So this is giving me some Skynet vibes, I gotta say. And I don't know, I'm feeling a little icky about this, but at the same time it's really cool that this research is going on. So AlphaGo is a research paper, and the general understanding is that AI architecture is incredibly difficult, and that's why AI researchers get paid hundreds of millions of dollars, right? Like, to build a new model, just like in software or in hardware, you have to put together a bunch of little blocks in a way that is novel, that's new, and it'll do new things. And up until this point, even though AI has gotten better and better, the only reason it's gotten better is because the architectures and the models were developed by humans, someone coming up with a new one that hadn't existed before. So fundamentally, in order for AI to be superintelligent, you're still limited by humans. Like, if a human can't figure out how to make a certain AI model more computationally efficient, better at quality, whatever, then it's just not gonna happen. To get around that problem, they now have an AI model, or rather an AI system, that can come up with its own AI architectures. AI building AI. And we've talked about this vaguely before, but this is really nailed down. It's really granular. So let me go through this at a level that I understand, and to be honest, most of it's beyond me. Anytime Joey's like, hey dude, can you do the white paper, I'm like, goddammit, you're gonna make me sound like an idiot in front of the viewers. I got through the first two. I'm like, ah, okay. Come on.
So this is... read the last one for me. Yeah, so if any of our viewers have a better understanding of AlphaGo, please definitely shout it out in the comments. To what we understand, it basically breaks down the problem into four agentic roles. So it has a researcher, an engineer, an analyst, and a cognition base. We're gonna put up the diagram from the paper right here so you can kind of see it. And each of these functions is done by a separate AI model. And what it's basically doing is, it's figured out a way to systematically produce innovation. Okay, so it has a database of about 50 core AI model architectures. For example, the U-Net architecture is something that image generation uses, but that's just one type of architecture. Okay. Right. And then LLMs and ChatGPT will use a transformer model or hybrid transformers, which is a whole different thing. So it has a library of 50 architectures that it knows are gold, and then it has variations of those, another 200 or so. And then, given enough room to create, it'll come up with a brand new architecture based on that. So once it has the architecture, and I believe the researcher model comes up with that architecture, then an engineer goes in and actually implements it: uses code, uses CUDA, uses NVIDIA GPUs, and builds it out. Completely automated. Yeah, which is freaky. And then an analyst will come in and test that for performance and test it for how novel and new and cool it is. So it has a quantitative measurement and a qualitative measurement. And based on that, you can either say, this new architecture is amazing, let's use it, here it is, here's how it's built, or go back to the drawing board, do another one, and do the whole process over again. Yeah. Okay. Wow. So four sort of groups of agents: the cognition base, the researcher thinking of this stuff and looking into it. The cognition base, think of it as a giant database. Yeah. Because, see, it has a diagram pointing to an extractor pulling from Hugging Face, arXiv papers, repositories. So remember those 50 or so models? Those models probably all live on Hugging Face, and it's pulling key features out of that. Yeah. Right. And then the researcher... I think the researcher is where all the magic sauce is, 'cause it can figure out completely new architectures, like put models together in a way that's never existed before. Mm-hmm. And this is where humans come in, right? Like, we have our creativity. We can think of things that have never existed before, and the researcher is mimicking that. Yeah. And I see that it can regenerate, kind of loop; it just seems like it could probably loop this stuff thousands of times, way faster than any human could. Yeah. So this is one sliver of the path to ASI, artificial superintelligence: once AI is sufficiently capable of building one new type of architecture, it could then go ahead and exponentially make more and more of those and try to figure out things better, faster, cheaper, running on fewer GPUs, running in higher quality. Yeah, I was just gonna say, you do a DeepSeek-slash-ASI kind of thing.
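Here's a toy sketch of the loop as described above: a researcher proposes an architecture out of known building blocks, an engineer "builds" it, an analyst scores it quantitatively and qualitatively, and anything good enough goes back into the cognition base for the next round. Every function here is a stand-in; the real system in the paper uses LLM agents, generated CUDA code, and actual training runs.

```python
import random

# Seed "cognition base" of known-good building blocks (stand-ins).
cognition_base = ["u-net", "transformer", "hybrid-transformer"]

def researcher(base: list) -> str:
    # Propose a new combination of known blocks (stand-in for the creative step).
    return "+".join(random.sample(base, 2)) + "-variant"

def engineer(arch: str) -> dict:
    # Stand-in for "write the code, run it on GPUs, train the candidate model".
    return {"name": arch, "trained": True}

def analyst(model: dict) -> dict:
    # Stand-in for the quantitative benchmark plus the qualitative novelty check.
    return {"score": random.random(), "novelty": random.random()}

for round_num in range(5):  # the real system can loop far more times than a human team
    candidate = researcher(cognition_base)
    results = analyst(engineer(candidate))
    if results["score"] > 0.7 and results["novelty"] > 0.5:
        cognition_base.append(candidate)  # keep it and build on it next round
        print(round_num, "kept", candidate, results)
    else:
        print(round_num, "back to the drawing board:", candidate)
```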
On a second note, I saw something really interesting. Jensen Huang, CEO of NVIDIA, was on an interview, and I'm fascinated by Jensen, right? I think we all are in the AI world. His vision is like, I don't know, two miles out from where we are today. Where we are today, he already thought of it two years back, right? He's like, yeah, idiots, you're just figuring this out now. So he said something like, all you need is 150 people to change the world. He goes on to say, you know, OpenAI is about 150 people, I think DeepMind is about 150 people, Anthropic, same, about that size. And if you give 150 people a few billion dollars and let them do their thing, we're gonna change the world. And that's gonna happen very soon. It's happening now. Yeah. And you combine that with new technology like this, and you're really accelerating where technology can take us in the next few years. Yeah. Probably not a coincidence. Isn't that Dunbar's number, 150? I think it's called Dunbar's number; 150 was like the number of human connections that someone could consistently maintain on one level before it just gets to be too much. Yeah. And that's one of the main reasons why CEOs of big companies have fewer direct reports than that number. Mm-hmm. Like Mark Zuckerberg, I think, famously said he doesn't talk to anybody else at Meta except for like 30 to 50 people that report to him. 'Cause you just can't; your brain can't comprehend it. You can't do it. I mean, NVIDIA did say that they have a pretty flat hierarchy, right? Jensen, all of the VPs and everything just report to him directly. All the billionaires that report to the super billionaire. Yes. Is that how it works? So on that same interview, Jensen was like, yeah, the executive team at NVIDIA are all billionaires. I don't know if he was flexing, but he was like, yeah, we've done well for ourselves. Yeah. Well, I mean, if you do the math, they passed $4 trillion, so right, they've all got stock options. The math maths. Makes sense. Yep. All right, cool. Yeah, this is kind of a crazy paper, and I'm curious to see what comes out of it. Yeah. And the pace at which this happens is: first the paper drops, then an AI company will build an R&D version of this, then, if there's enough revenue and enough of a business case, they'll build a product around it, and then we have it. Right. So this is the progression of things. So research papers, look out for those first. And I think, I mean, in this case, all the research companies, which are most AI companies, they're essentially research companies, will be all over this. Yeah, for sure. So yeah, we'll see what comes out. All right, wrap it up. Yeah. I want to give a quick shout out to some of our listeners out there. One of our super viewers, David Wag, if I'm pronouncing it right, thank you for your support. Ronan 2079, Carson, dial two Wilson fish bag, and finally Crack pistachio 6,001, we thank you for your support. If you have not commented or engaged with us on YouTube, do so and I'll give you a shout out as well. Yeah, there you go. And also, if you're listening to this on your podcast app, please leave a five-star review over at Apple Podcasts or Spotify. Links to everything we talked about are at denopodcast.com. Thanks for watching, and we'll see you on the next one.
