Denoised

Apple’s WWDC: Liquid Glass, but Where’s the AI?

VP Land Season 4 Episode 37

Apple's WWDC left us with more questions than answers about their AI strategy. Let's unpack the key highlights from the keynote, starting with Apple's new Liquid Glass UI design. Plus, we explore cutting-edge tech beyond Apple's ecosystem: 4D Gaussian splats bringing static scenes to life, innovative AI memory storage in MP4 files, and a breakthrough in faster image generation models.

In this episode of Denoised: a quick recap of WWDC, 4D Gaussian splats are finally here, and a couple of interesting AI papers, including storing AI memory in MP4 files. Let's get into it. What's up, Addy? Good to see you. Good to see you, man. How was your vacation? Good, yeah, I went to Palm Springs. It was very hot, but it was nice. Nice place. We stayed in and were just able to chill, and I love summers here in LA. Decompress. Yeah, it was getting spicy. It's gonna get even spicier there, but it's also crazy 'cause you drive like two hours and there's a temperature change of like 25 degrees. Microclimates in LA, it's a thing. Yeah. But no, that was a lot of fun, and Palm Springs is great. The mid-century vibe, the mid-century modern, the classic cars. It's like a town that's just a time capsule. Yeah, it's got that fun '50s vibe still.

All right, so the WWDC keynote was this morning. Yeah, I checked it out. Interesting. Nothing crazy, 'cause a year ago was when they teased or announced the Blackmagic URSA Cine Immersive, and I was like, oh, we're gonna have a whole dedicated camera for shooting immersive videos. Nothing in the film world that crazy this time. Nothing overall that was really that crazy. The biggest thing was their new UI update across the board, redoing the entire look, and they're calling it Liquid Glass. It's a much more transparent, glassy-looking effect that's going across the board. I feel a bit "mm" about it. You're way nicer to Apple than I'm gonna be for the next few minutes. Oh boy. Did they miss the mark on AI? I mean, I would just love, well, AI separate. We can go on the AI rant later. I'm talking specifically about Liquid Glass and their new look and design. I'm like, bring back skeuomorphism, bring back the physical notepad. It's pretty dope, I like it. iOS needs a facelift, and as newer devices can support better GPUs, better rendering, better graphics, this is something that could fill that space. But Apple just dropped a Liquid Glass video on YouTube right after the conference, and the first thought that went through my mind was: why don't you use that extra GPU to run an AI process in the background, instead of doing fancy graphics that really don't move the needle? Yeah. When they played a promo for their upcoming Apple TV shows, it had this new Liquid Glass typeface, and it was hard to see the text. I remember there were some comments online too, of just like, oh, you know, rest in peace, accessibility. Which, to Apple's credit, they're usually really, really good with accessibility, so I'm sure there are a lot of options to turn this off or make it more legible. But for the general branding and marketing, it took a little squinting to read the titles of the shows they were advertising with this translucent, glass-looking font display. I gotta say, from a creative design perspective, it's pretty novel. I think it's pretty neat that this is all coming together, but the timing couldn't be worse. Just look at it optically, comparing it to Google I/O or Microsoft Build: mountains of new updates and innovative releases.
And with Apple, the biggest thing we're talking about is that they made everything glassy. They did talk about Apple Intelligence, and what did he say? He said something to the effect of: Apple Intelligence with Siri is coming later in the year; it took more work than we thought it would to reach our high quality bar. I mean, sure, I could see that. Apple's strategy is maybe this, and I'm just speculating with you here: look, the iPhone as a platform is in everybody's pocket, billions of them around the world, and that's not going anywhere anytime soon. So even if we don't innovate on AI, other people are gonna put their stuff out on iOS, on iPads, on iPhones, on watches anyway. Yeah. And they are opening up more control to features and stuff on the iPhone. So, I mean, they did have an AI update: the Foundation Models framework. I think it's their version, their model, and it can run locally on the phone, and they're opening it up so apps, so developers, can access it. Yeah. It doesn't really do much. I mean, it's stuff that we've sort of seen already, like some writing suggestions, the Genmoji stuff. It doesn't do a ton. You can open up Apple Intelligence and give it access to ChatGPT; they have a partnership with OpenAI. You have to go into the settings and turn it on, it's not on by default. You give it access to ChatGPT, and then that gives you access to all the better benefits of ChatGPT. Fair enough.

I think the issue here is, if you are one step behind today, you're ten steps behind tomorrow, right? So an example: okay, fine, you don't innovate on AI. You've got this incredible hardware platform. Everybody uses iPhones. Nobody can even imagine you doing anything other than that. A lot of people that are not upgrading their phone this year are gonna upgrade next year. They're gonna keep buying Apple products, you and me included. So what happens when a disruptive new piece of hardware that relies heavily on AI comes out? Whether it's updated AR glasses from Meta, or the Jony Ive and OpenAI collab, whatever they're working on, whatever that wearable device thing is. Or it could be some startup out of nowhere that then gets acquired by Google, and now Google's making this killer hardware, right? You are opening yourself up to major disruption if you're not innovating. Yeah, a bit. I mean, we've talked about this before: Apple's also never really been first. They see what's coming out, and then they usually just do it better. In this case, that was not the case. Yeah. I mean, I still think, look, they won the higher-end smartphone for sure, and it's the aspirational product. A lot of people in less developed countries who have Androids want an iPhone. That's the one to move up to. To your point, where they could get disrupted, and where this other battle is, is in other wearable devices, categories that aren't really developed or that they don't own yet. Glasses are probably gonna be the next big one. Yeah. And also, they have significant investment in silicon now, right? The M3, M4 chips are so good, so above and beyond what you need for day-to-day use on your laptop. Yeah.
Like, even if a new AI application lands on this Apple silicon, they're still good for a while, so they don't have to custom-invent AI hardware anytime soon. You're not looking at Apple deploying Nvidia-like departments just to build AI-specific libraries and hardware. I don't think that's happening anytime soon. Yeah, you're right. I mean, they are building their own model, and it can run on the device, but nothing to the extent of Nvidia. Yeah. I will say, "building your own model" is very vague in that sense. PhD students at universities are building their own models, so it's not really saying much. Right. I mean, I'm curious what more there is. Also, this is the keynote, so it's more for a general audience, I'm sure, 'cause they have hundreds of talks this week, since it's the developer conference. There are probably more details about what that Foundation Models framework is. So I am curious, and we'll try to dig in more in the future, if they have talks about it online. Because I remember when they announced the Apple Vision Pro and spatial video, it was sort of glossed over; it was just "spatial video." But then they had more dedicated talks, like on the MV-HEVC file codec, that came out later in the week. And it was like, okay, we're able to get more details about things when we wanna geek out and learn more about what that actually means.

Yeah. I think, first of all, it's just a major disappointment, in my personal opinion, that one of the most valuable companies on the planet, if not the most valuable, has nothing to really show in the most exciting technology of our lifetime. Yeah. I mean, they did dig themselves into hot water, you know? 'Cause last year they had all of these big announcements, and it was like, oh, this is what you expected: Siri will be smarter, you can talk to Siri, it can communicate and do things on your phone in much more natural language. They built an event around all of that, and they hyped it up with Apple Intelligence: Apple Intelligence built into the iPhone 16. Yeah. And then with the ad, and then it didn't deliver. That's why I bought the iPhone 16. Yeah. Are you part of the class action lawsuit or whatever? Oh, is there one? There is a class action lawsuit, because it's like, you sold all these things, and nothing really delivered. The Apple fanboy in me doesn't want to sue Apple. And as much as I hate to criticize them, I just want them to do better. Yeah. I mean, there was a Bloomberg article from like a month ago that dug into why they had so many issues, and it was basically like: well, there's the part of Siri that does the logical things, the very calculator-like, set-a-timer, if-then kind of stuff, and then there's the ChatGPT kind of part, I'm losing my words, the natural language, natural response part. But that's still a large language model, which is still predictive at its core. It would be way more complicated to connect and sync the two than they anticipated. That totally makes sense.

Siri is like, hey, make me a calendar event for 12:00 PM tomorrow. That stuff was done like 10, 15 years ago. Yeah. And that's very definitive. There's no "probability of 12:30 or 12:50"; you gotta set the clock for this exact time, or this event. And getting that to communicate with, merging it with an LLM, and making it perform the way you expect was where they got stuck. I'd imagine that's not that hard, but again, I'm not sure. The only thing I'll give them is they're deploying this at the biggest scale, right? Billions of phones. You roll this out, it goes out to billions of people. Yeah. Okay, well, if Roblox can do it at that scale and Fortnite can do it at that scale, then surely Apple can too.

So, general updates, just to give you the highlights if you didn't watch it. They're changing all their version numbers, unifying everything, 'cause there was like an iOS phone version and an iPad version; now it's just gonna go by the year number, so the next iOS is iOS 26. For the Mac, it's gonna be macOS Tahoe, picking another California location they haven't used yet. There was a big AI update. Oh yeah, new Genmoji updates. You can now merge two emojis into a brand-new single Genmoji. All right, so should we make an Addy and Joey emoji for the podcast? They're updating the camera app. Honestly, I think it's gonna make it more confusing. They're, like, streamlining it: right now, the camera app has the little slider thing for the different modes, and they're streamlining it into just photo and video, but if you swipe, you can still access the other options. It's a lot more swiping, a lot more hidden features. I think it's gonna confuse people. And it doesn't really add anything. We know we have QuickTime and raw; there wasn't anything new on that front. I guess they usually do roll that out when they announce hardware updates instead of the software. But nothing, yeah, in the actual performance.

I gotta say, this is the iPhone 16 that I got thinking it had Apple Intelligence. They put two new buttons on it: they have this button here on the bottom, and then they have the Action Button, which is from the 14 or 15. Yeah, I have the Action Button. You really don't need that. I use the Action Button to pull up the Blackmagic Camera app. Okay. Yeah. And actually, I recently redid it to load up Perplexity, 'cause Perplexity's voice mode can also control things on the iPhone. So that's what you're doing, see? You're invoking other AI solutions on the iPhone. Because it's way better. Yeah, it's way better. So Apple is counting on that: as long as you keep doing that, they're gonna move new phones. Right. And that's probably why they're like, oh, well, we'll just let the apps have more access to core features. Exactly. Which usually they didn't. Does that also mean that at some point they're gonna have to bring down the privacy wall for these apps? I think so, or with your permission. I mean, similar to ChatGPT, where you have to opt in and know what you're getting into a little bit. But if you use their foundation models, they did keep pushing,
hyping up that it's private: it can stay on device, it can run locally, it can run without a connection. So with the models they're training, it seems like they're pushing the privacy stuff that they've capitalized on, owning that word, privacy. Yeah. You know, in our industry and our friend circle, we have quite a few hardcore Apple fanboys. I don't know, you're still an Apple fanboy? Yeah, I mean, I am too, but I don't know how they're gonna really compare apples to apples with Google and Microsoft and OpenAI. I mean, this is nuts, Joey. They're not seeing the forest for the trees. I don't know, I kind of disagree-ish, because they never really were that kind of company. They're not an enterprise cloud company. Apple is not selling APIs; they're not selling models and cloud services. That's not really their thing. Yeah, Google's developed a lot more AI products, but that's also more Google Cloud and everything around it. They have more enterprise use cases, and that's more their market. There's no Apple cloud service for customers like that. I'll just caveat it with this: the killer use cases haven't even been thought of yet at a consumer level. So it's up to a company like Apple to give them to us, right? They innovate. They say, this is the way to do FaceTime; this is the way to text message now. And we as consumers either love it or hate it. So none of that experimentation is even going on. We're not even at level one of rolling out, do these tools work yet or not. I'll also say Apple still feels late to just cloud workflows in general. They have a lot of good products, and I don't use a lot of them because they're locally based and hard to collaborate on, especially if the other person doesn't have a Mac. I like their Freeform app, which is sort of their whiteboard brainstorming app, but it saves local files. I can share an image of it, but I can't really add collaborators to it. Pages is cool, and they sort of have a web version of Pages, but it's still very, um. No, Notes is really powerful. Notes is very syncable. Yeah.

Some other quality-of-life things that I'm excited about as a habitual phone screener: call screening. So if it's an unknown number, oh, thank God, it's sort of like what Google Voice does. It'll pick up for you and sort of be like, what are you calling about? And then it'll pop up with a notification of what they say, so you can approve it. I love that. It's something I've always wondered: if I don't have your number saved, can you just send this call to voicemail automatically? Yeah, 'cause they're using a bot against you; use a bot on their bot. Yeah. And speaking of bots, they have Hold Assist now too, which is: if you're on hold, you can put Hold Assist on, and the call will stay active, but your phone will hang up. And then when the person comes back on, it'll ring you back. Love that. Which is also great. Yeah.
So now each side is gonna be fighting with a callback agent. Or maybe, and I'm going tinfoil-hat conspiracy theory here, forgive me, people: maybe Apple thinks this AI thing will just blow over. Well, actually, did you see that Apple did release an AI paper? They released a paper that was sort of criticizing all of the other reasoning models, saying they're not actually reasoning; if you give them a challenge that's too hard, they just give up. Yeah, I did see that. So that was Apple's latest innovation: a paper from them saying the other models kind of suck. And they're like, wait till we drop our LLM, guys, check this out.

Okay, some actually interesting-ish stuff. So, integration in macOS Tahoe. Two of the things that stood out that I thought were interesting; I'm a big keyboard-shortcut kind of person. I switched out Spotlight; I use Alfred, which is sort of an app launcher, and you can also do some other kinds of shortcut things with it. They are revamping Spotlight to be able to access files, to be able to access your clipboard history, which basically makes it a lot more functional, to pull things up and do shortcuts. And you can connect it to Shortcuts, which is a separate app where you can build out shortcuts. Yes, I don't use it. On iOS it's pretty handy, and they have one for the Mac as well. Okay. You'll be able to call up shortcuts, and you can load AI features into the shortcuts. Like, this document I'm on: it can look at the document and then suggest rewrites or summaries. So there's some integration with AI shortcuts that you can run on your apps. I feel like, even the way I described it, it's probably way too complicated to set up for normal people. Yeah. I don't use Shortcuts because I never found a killer use case for them, and it always seemed a little bit of a pain to set up and deal with. I don't know if that'll be the cool thing, but it goes to your point that they are opening up the ecosystem, where things can communicate more and access things more than before. Yeah. It's basically like, hey, OpenAI, take over my computer. So, along those lines: it's either run local with their version, which is not that good, or you can pull it up with ChatGPT. But ChatGPT also already has a pretty good desktop app that they built, with its own popup command bar thing, and it can look at your screen and do stuff. Yeah.

You know, Sam Altman was in an interview and said something really interesting. He said the way millennials versus Gen Z use our products is completely different. So you and I, millennials: the way we use it, it's essentially a replacement, in most cases, for Google search. Right. Sometimes I'm like, okay, I don't think Google search can handle this question. Like, give me the coffee shop with the most gourmet coffee with local, you know, it's a very specific thing. You go to ChatGPT for that, and maybe natural language will just do a better job. The way Gen Z uses ChatGPT, supposedly, according to Sam Altman, is as a life advisor. So when you're making major life decisions, like, should I date this guy, or should I go to this college?
It's that level of life advising. So if you think about it, that generation and the generation after are gonna grow up with that level of interactivity and commitment to AI, and they're gonna want their device to have that level of commitment as well. Yeah, I can see that. I mean, the AI chatbots, or the personas, the character-driven chatbots. Yeah. And the AI has to be more responsive. That's a big thing: local inference, or maybe hybrid cloud-plus-local inference, has to happen on device, and then it has to be multimodal, taking in video, audio, depth data, being aware of surroundings, being aware of your location geographically, taking all those things into account, and just being better at advising on life. Yeah. And multimodal, that did remind me: there was another feature where you can screenshot or tap on things you're seeing on the web or on your phone and run searches with it across different apps. Sort of like the built-in Google image search, which we've had for years.

Did they mention Tim Sweeney? Yeah, you know, I was wondering, 'cause they were talking about games and Arcade and stuff, and I was like, it'd be really funny if they just put Fortnite up on that wall behind them. If Fortnite was a demo app, that'd be really crazy. No, Fortnite was not a demo app. Yeah. It would've been funny too if they were like, we're now offering you the option to bring in your own payment system. Oh, they did say that? No. Oh, okay. No, well, they were forced to; they're bleeding billions of dollars. Yeah.

All right. And then visionOS: there were a handful of updates there, and it's cool, and the tech is really impressive, but you need a big, giant headset to take advantage of it. visionOS, that's the Apple Vision operating system. So they have widgets rolling out, and the widgets are cool because you can put them in your physical space and they just stay there. So when you put your headset back on, or after you reboot it, it registers. Say, on your bookshelf, you're like, oh, I wanna put a calendar here, I wanna put the Photos app widget here. Oh, that's cool. And it just stays there. You put the headset on, you take it off, you move around, but when you come back to the room, 'cause it knows the room you're in based on how it mapped it out, it just locks it there. I love that. Yeah, it's wild. It's cool, but it only works with the headset. I mean, that is spatial computing, right? New concepts for how we organize data. I think it's very forward-thinking, and I think it's definitely the way things will go, once whichever company comes first can figure out the physicality of the device: how to have that type of precise locking on the location and mapping with the augmented reality, to be able to store LiDAR data or something, in something that looks like glasses. Yeah, something that's much easier to wear. And then there were a couple updates for Apple Immersive Video. Adobe's coming out with its own app, or a version of Premiere, that can deal with immersive video. And they also have partnerships with GoPro, Insta360, and Canon to do native playback of 180 or 360 video. Yeah, those things have been cooking for a while. Yeah.
And then the other thing was Personas, which is the way it scans you to create your photorealistic-looking avatar. So if you were doing a FaceTime call and you were wearing the Vision Pro, but I'm on my phone, it was the little virtual version of you that would show up. It was very uncanny valley, kind of weird-ish looking. They did a new one. Meta did that a while ago, remember? A year or two ago, with the, like, cartoon-looking ones? No, no, no, it was photorealistic. Oh, that, yeah. So they did an improvement to it, and in the comparison they showed in the demo, the quality was way, way, way better. Very lifelike looking. I'm curious how they're pulling that off, and how much it reacts to facial movement and stuff. There's probably an artist in India somewhere that's painting the face as you go. Or it's just looking at your entire photo library and reconstructing your face. No, they're using ChatGPT. It's like, what does Joey look like? Create a 360 panoramic based on this image.

And then the iPad is just more and more continuing to be like a laptop. They're adding windowing: you can resize windows, you can stack windows, basically how you do it on a laptop, but now you can do it on an iPad. A couple of interesting things actually tie into the podcasting space and kind of go straight at what apps like Riverside do well: more audio and video input controls on the iPad. So if you plug in a microphone, you can have more control, like, yes, use this input microphone for recording. But also, I don't know if it was a separate app or if it's when you're doing a FaceTime call, but now they have local capture. Nice. So if you are recording a FaceTime interview with someone, it will record a local copy of your video and your screen on the iPad and put it in your Files app, which is exactly what Riverside does, and why I would use that all the time for remote recording: because it saves the local files for everyone. Yeah, we've used it. And that's the big selling point, that you're not dependent on the connection. Oof, not good news for Riverside. Yeah, I mean, ish. But it just said it was iPad, which surprised me. I'm like, well, why wouldn't it also be on a MacBook? MacBook, yeah. And then again, this only works if you're on an Apple device. Yeah.

Apple has the same problem that Porsche does when it comes to product segmentation. Porsche has the 911, which is the flagship car, and then they have the 718, which is called the Boxster, or the Cayman in some cases. That car could be as expensive and as fast as the 911; it gets really close. Apple has the MacBook Pro, and then if you go down to the MacBook Air and then the iPad Pro, you're looking at the same price points and things that are more or less the same, just running a different OS. And what Porsche did was they just canceled the 718. They're like, we're just gonna sell the more expensive thing from now on. Yeah. I don't see them killing the iPad Pro. Yeah. I mean, 'cause if you deck out an iPad Pro with their keyboard case, that's what, like 1,200 bucks? Yeah. Possibly more.
If you get the cellular version and you max out the internal storage, yeah. And a MacBook Air you could get for a thousand bucks. So that's a hard sell. Yeah. But probably, if you are someone who has an iPad Pro, you probably also have a MacBook Pro. I think it's more like, what ecosystem are you in? Are you in the pro, high-budget ecosystem, or are you in the consumer and basic lines, the Air and the iPad Air line? Sure. If you wanna spend more money, go for it; we have all the stuff for you. Yeah. I mean, I have looked at the iPad Pro, but I have an iPad Air, the base model, 'cause it's cool for reading and pulling up some stuff on the airplane. But it's like, oh, do I need to dish out for the iPad Pro? I don't. I think it's for, like, a graphic designer. Yeah, if I was a UX/UI developer, or if I was a big stylus user, I could see the advantages. I feel like the debate would be more like, do you get an iPad Pro or do you get a Wacom tablet? Yeah.

So yeah, that's mostly the roundup. Nice. We spent a long time on that one. Oh, we did? Yeah. I kinda wanted to give a recap if you didn't watch it, but nothing mind-blowing. There you have it, folks. Gradual updates. And all that stuff rolls out later this year; nothing is out right now. That's what they said about Apple Intelligence. All this stuff is supposed to roll out later this year. There you go. I'm sure nothing has rolled out now, I'm positive about that, but it should roll out later this year. The Liquid Glass thing is pretty neat, I'll give 'em that. We'll see how that goes. I'm curious just about how legible it becomes. You're gonna see Liquid Glass ripoffs everywhere: across automotive, true, across hospitals. That's true. Liquid Glass, yeah. We didn't even talk about Apple CarPlay, but Liquid Glass is coming to Apple CarPlay too. You're gonna see the Tesla UI mimic it, and, you know, Apple has such a ripple effect when it comes to design. Yeah, that is very true. Now everything's gonna be translucent glass for 10 years of our lives. Yeah, because the last redesign was 10 years ago, with, like, iOS 7 or whatever it was, or 9.

Okay, next story. Yeah. So look, Gaussian splats have been around for a couple of years. It's basically AI-powered photogrammetry. It's somewhere in the middle between machine vision, traditional photogrammetry, and AI inference, like if you combine all these three things together. What Gaussian splats are is, it's literally a splat of paint, a splat of color, as a point cloud. And then that point cloud, if you have enough color splats, starts to represent a 3D object. The splat holds data: it's a point in space with color information, and it has a Gaussian falloff. So it's not a hard splotch of color, but a nice smooth gradient of color. And that's key, because it supports an alpha channel, and then you can start to mix and match point clouds very close together.
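To put that description in code, here's a minimal sketch of what one splat holds and how nearby splats blend. The names are ours, and real 3D Gaussian splatting uses anisotropic covariances and spherical-harmonics color rather than this isotropic simplification:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Splat3D:
    """One Gaussian splat: a point in space with color, opacity, and soft falloff."""
    center: np.ndarray   # (3,) position in world space
    color: np.ndarray    # (3,) RGB in [0, 1]
    opacity: float       # peak alpha at the splat's center
    scale: float         # std dev of the Gaussian falloff (isotropic here)

    def alpha_at(self, p: np.ndarray) -> float:
        """Smooth Gaussian falloff instead of a hard-edged splotch of color."""
        d2 = float(np.sum((p - self.center) ** 2))
        return self.opacity * float(np.exp(-0.5 * d2 / self.scale ** 2))

def blend(splats: list[Splat3D], p: np.ndarray) -> np.ndarray:
    """Alpha-blend splats at a sample point, front to back.

    Real renderers do this per pixel along camera rays after depth-sorting
    the splats (that's the novel view synthesis part); accumulating at one
    point is enough to show why the alpha channel and soft falloff matter.
    """
    rgb, transmittance = np.zeros(3), 1.0
    for s in splats:  # assumed sorted near-to-far
        a = s.alpha_at(p)
        rgb += transmittance * a * s.color
        transmittance *= 1.0 - a
    return rgb
```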
The neat thing is, Gaussian splats are really good at being 3D without 3D, if that makes sense. So you don't need a traditional 3D renderer or game engine or anything like that. You can just, on the fly, do what's called novel view synthesis: anywhere you place the camera, if it has enough data, it'll figure out what the scene should look like from that point of view. And usually the Gaussian splats we've been seeing are capturing a static environment, like a photo, instead of a video, of a 3D space. It's computationally very expensive, just like photogrammetry, so it was limited to static stuff. And we knew that sooner or later, 4D, four dimensions, the last dimension being time, was gonna come out. And here we are today; there's a bunch of announcements to cover on the 4D GS front.

Yes. So the big announcement came from a Chinese university that dropped a paper called FreeTimeGS: Free Gaussian Primitives at Anytime and Anywhere for Dynamic Scene Reconstruction. We'll link the paper on the video here. Basically, it takes into account that you have past and present frames, or point clouds, in a given capture, and it goes back and forth in time to finesse the point cloud solve. Which is kind of the same technique that a lot of video compression uses: if it's not real time, when you're compressing an MP4, it'll go backwards, it'll go forwards, and then it'll come back to the middle frame and figure out the best version of that frame that saves the most data. So, you know, I'm just layman-terming it here, because I don't understand the mathematics behind it; I'm not a Gaussian splat expert. But in the paper it says it can go basically back and forth, anytime, anywhere, which means it can reference Gaussian splats from the past in another space as you're moving. Basically building out a video timeline like we're used to, a point cloud video timeline, and then optimizing the heck out of it to make it as efficient as possible and as high quality as possible. Yeah.

Obviously, there are cool applications of this. One would be, if you have the video and you want to reframe, you can move your camera around in a 3D space and reframe your shot with the video happening. Another use case, and they have demos of this, is loading it on an Apple Vision Pro or a headset. So you can physically move while the video is playing; you're in the scene as it's happening, and you can move around and see what's happening. Yeah. I mean, the big promise for Gaussian splats, from my virtual production days, and even in VFX, is that you don't have to build complex real-world things if you can just go capture them well enough. Right. So let's say you're setting your movie in downtown LA, and you have 10, 15 different cameras capturing one street: the cars are going by, people are walking, the weather is changing, and the daylight is changing. You capture all that in time. Now you don't have to build an exact digital twin of that city with MetaHumans or digital humans or whatever; that's just way too complex. You can just use 4D GS and composite on top of it. Put that in the background layer, in Nuke or whatever, and then just do your thing and save tons of money, tons of time. Yeah, for sure. Yeah.
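A hedged sketch of our layman's read of the 4D part, building on the splat above: in FreeTimeGS, each Gaussian primitive gets its own timestamp, lifetime, and velocity, so its opacity peaks around its own moment and its center drifts with motion instead of being re-solved every frame. The field names here are ours, not the paper's:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Splat4D:
    """A time-aware Gaussian splat (illustrative, per our read of FreeTimeGS)."""
    center: np.ndarray    # (3,) position at the splat's own timestamp
    color: np.ndarray     # (3,) RGB
    opacity: float        # peak alpha
    scale: float          # spatial Gaussian falloff (std dev)
    velocity: np.ndarray  # (3,) linear motion of the center
    t_center: float       # the moment this splat is most present
    t_scale: float        # temporal falloff: how long the splat "lives"

    def alpha_at(self, p: np.ndarray, t: float) -> float:
        # Temporal Gaussian: the splat fades in and out around its own moment,
        # so rendering "anytime, anywhere" is just evaluating at a chosen t.
        temporal = float(np.exp(-0.5 * ((t - self.t_center) / self.t_scale) ** 2))
        # The center drifts with velocity, so a moving object can reuse the
        # same primitive across nearby frames instead of spawning new ones.
        c = self.center + self.velocity * (t - self.t_center)
        d2 = float(np.sum((p - c) ** 2))
        return self.opacity * temporal * float(np.exp(-0.5 * d2 / self.scale ** 2))
```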
Once the quality gets there. But yeah, this is a good step in the right direction. Yeah. One thing to clear up, though, because I saw the clip of the demos going around a lot, and some of the misinterpretation was that you could do this from a 2D video and extract the 3D information. That is not accurate. It's very much a traditional volumetric setup. Yes. One of the videos was a guy talking to camera, a nice interview setup; he's a tech YouTuber, a Chinese tech YouTuber. His video is from like a month ago, and you can see the behind-the-scenes shots in it: an array of 20 or so cameras. And the paper says you need about 20 cameras in order to capture this, to have enough data. Yeah. The technology is in its infancy. I think what they're really after is what happens after you capture the data: how you can process it, and then how you solve it. So as that gets better, I think there'll be less need for that many cameras in specific locations. Also, right now, from what you saw in the paper, you still have to process this in order to play it back like this. Even if you had the 20 cameras, you can't, at a sporting event, have a real-time move around the space. You have to capture it, process it, and then you get this 4D Gaussian splat that you can move around. Yeah.

I gotta say, live broadcast of volumetric data has been done at small scale. Companies like Intel and Verizon worked on it five, ten years ago. The problem was you're dealing with terabytes of data, from all the broadcast cameras and all the other special cameras; there's just so much content. So you need a data center to process all that stuff, and then you need a big fat pipe on the network to ship it. Right. And that's something where you need it to happen in 30 seconds, a minute, for your instant replay, 'cause the game keeps going. Once it keeps going, it's like, well, whatever, who cares? Exactly. You can maybe build that out for every NFL stadium or whatever, but it's not gonna go down to a consumer level. Our phones can't run real-time volumetric stuff just yet. So that's why this is so key: Gaussian splat technology is not a traditional volumetric solve. It's something entirely different, and something efficient enough, I think, to get down to a level where consumer devices are able to run it at some point in the near future. Right. Run it, not create it. Receive it. Yeah. Even in a headset. That makes a lot of sense. Yeah. I think a lot of this, because it's AI, still depends on Nvidia architecture, like CUDA and RTX cards and stuff like that. So we're gonna see it on desktops before it gets down to our phones. Yeah. I'm curious how they would run it on a headset, if it's running through some cloud-based version or option that just pixel-streams down to the headset. Yeah. I'm gonna check out Augmented World Expo tomorrow. Nice. So I'll see if there's anything in the 4D video space happening there. I went last year; it wasn't as much as I thought. Where is it? Long Beach. Okay. Yeah, Long Beach Convention Center. Yeah.
It's a nice location. Oh yeah, go to the Aquarium of the Pacific across the street. Oh yeah, that's cool. I've never been; I've wanted to go there. Yeah. And then, last up, a couple of interesting papers. One was Memvid, video-based AI memory. This is so brilliant. This is such outside-the-box thinking, and I think this is the type of stuff Apple should be getting into instead of Liquid Glass. Look, you have a video, which is a really efficient form of storing images, right? If you think about the number of frames that you squeeze into an MP4 file: 24 frames a second, times 60 seconds, times 60 minutes. You're talking about thousands and thousands of frames. What if, in one of those frames, you could throw text on there, and that text could be readable by a machine? So instead of storing text as ASCII, as binary data, you just pack as much text as you want into video, and it's a far more lightweight, far more efficient way to store text. Yeah, that's so weird. Isn't that wild? Yeah. So the idea, from their website: unlike traditional vector databases that consume massive amounts of RAM and storage, Memvid compresses your knowledge base into compact video files while maintaining instant access to any piece of information. Right. What are your thoughts on this? I mean, I'm trying to understand the benefit of storing it as an MP4 file, what the file structure looks like. Are you able to fit more text, and make it more accessible, in an MP4 file versus a database? I don't know enough about how the file structure is built to see where this makes sense. I don't know enough about it either, but I know enough to be dangerous. I like the innovation around this and the thinking differently. The novelty is pretty cool. Yeah.

My mind immediately went to secret military stuff, where they have to transmit important text without text, so they throw everything into a video file. You know, it's like North Korea gives Iran a video file or something: why doesn't this file play? Or it's like Enemy of the State, when he stores the video on the video game cartridge, and the kid takes it, and the kid's like, the game's broken, it doesn't work. It's the opposite of that: it's like, wait, this video's not playing, and it's like, oh, that's because it's the entire Wikipedia in the video file. My guess is this has to do entirely with how AI leverages VRAM and short-term memory on a system. If you have tokens, let's say billions of tokens, which could be text, you're taking up a ton of RAM. Versus if you're doing this supposed video, you're not really even loading it; it's just going through the video as needed. So it's like creating a pseudo-RAM with the video, if you will. Okay, that's my guess on why this is far more efficient. Of course, time will tell. We're the first ones to break it to you, so you've heard about Memvid here first. Yeah. If this does make it into the training of a model, I think that'll be pretty cool. Or you could also use this as the memory a model accesses, like if it could access more memory.
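For the curious, here's the trick as we understand it from the Memvid project's description: text chunks are encoded as QR-code frames in an ordinary MP4, with a small index mapping each chunk to a frame number, so retrieval is "seek to frame, decode QR" rather than holding a vector database in RAM. Below is our own rough sketch of that loop using the qrcode and OpenCV libraries; it is not Memvid's actual API, and the real project also builds embeddings for semantic search and leans on QR error correction to survive video compression:

```python
import cv2
import numpy as np
import qrcode
from PIL import Image

def encode_chunks_to_video(chunks: list[str], path: str, size: int = 512) -> dict:
    """Pack each text chunk into one QR-code frame of an MP4.

    Returns an index mapping chunk id -> frame number; a real system would
    persist this (plus embeddings for semantic lookup) alongside the video.
    """
    writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"mp4v"), 24, (size, size))
    index = {}
    for i, chunk in enumerate(chunks):
        pil = qrcode.make(chunk).get_image()  # QR code as a PIL image
        pil = pil.convert("RGB").resize((size, size), Image.NEAREST)
        writer.write(cv2.cvtColor(np.array(pil), cv2.COLOR_RGB2BGR))
        index[i] = i  # chunk i lives at frame i
    writer.release()
    return index

def read_chunk(path: str, frame_no: int) -> str:
    """Random access: seek straight to one frame and decode the QR back to text."""
    cap = cv2.VideoCapture(path)
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_no)
    ok, frame = cap.read()
    cap.release()
    text, _, _ = cv2.QRCodeDetector().detectAndDecode(frame)
    return text
```

If that picture is right, it also explains the RAM point in the discussion: nothing is resident except the frame you actually seek to, and the video codec does the compression for free.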
Yeah, because that's an issue too: the knowledge window. Like, if you're trying to give something to an LLM, give it background knowledge, there's a max of how much you can give it. Yeah. So if this solves that too, that would also be, for sure, a big win. One of the things that researchers in AI always talk about is how heavy a model is in size. So, you know, if I give you the latest image generation model, it needs 22 gigabytes of VRAM to run locally. You better have a GPU that exceeds that, and something like an RTX 3090 or above will do it. But if I give you a video model that's 200 gigabytes, you don't really have a single machine that can run it. So it's really hard for you to do research on it, 'cause now you need multiple servers, you need to run it on AWS or wherever, and now the cost goes up and it just becomes more cumbersome. So I think this is one way to just squeeze a ton of stuff into, you know, 20 gigabytes and under. Yeah. Maybe in The Matrix, the code they were looking at was really just an MP4 file. That's how they compressed the Matrix data, 'cause it's so much data that you can't look at it in real time. Yeah. Okay. And maybe the Matrix is Gaussian splats. Yeah, 4D GS. Maybe we're just looking at Gaussian splats now; we just don't know any better. Oh, shit. This is the after-dark Denoised. Yeah, I think we're done here, folks.

One last one. This one's way more practical: contrastive flow matching. Yeah. So, image generation models are fairly well understood in terms of how they need to be trained. You have the neural network, which is broken up into layers. Let's say you'll have five attention layers, and then five diffusion layers, and then five alignment layers or whatever, right? So it's pretty well understood that in order for you to train an image generation model, this is more or less how you structure it, and then you just throw a ton of GPU at it and go away for a couple of days. It trains itself, and then you have, you know, 10, 15 billion nodes, each with its specific weights, and then you take that 20-gigabyte file or whatever, you load it into ComfyUI, and you do your magic. Right? That's how it's done now. So this paper comes out and says: but wait, you can train an image generation model nine times faster, with five times fewer steps, with contrastive flow matching. What is contrastive flow matching? Okay, to the best of my understanding: when you noise and denoise an image, you do it in steps. So you can typically dial it in, like, 40 different levels of noising will happen, and then 40 different levels of denoising will happen. Traditionally, all of that kind of funnels into one neural network. Here, they're adding something called contrast and keeping each of those lanes separate, leading to far more efficient training. So you add contrast in your training? Contrast like in how the image is processed, or contrast like what you and I think of as contrast? I believe so, like, boost the contrast; you don't have to do much else to change it. Yeah. But it creates a far more efficient training process, because you're not mushing all that stuff together. So, like, step one's noise/denoise is separate from step two's noise/denoise, and so on.
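Since the hosts are layman-terming it, here's a hedged sketch of the objective as we read the paper. Standard flow matching regresses the model's predicted velocity toward the straight-line flow from noise to image; the contrastive term additionally pushes that prediction away from the flow target of a mismatched sample in the same batch, keeping different samples' flows separated. Variable names and the lambda weight below are ours:

```python
import torch

def contrastive_flow_matching_loss(model, x1, cond, lam: float = 0.05):
    """Flow-matching regression plus a contrastive push-away term (our sketch).

    x1: a batch of clean images (B, C, H, W); cond: their conditioning inputs.
    """
    x0 = torch.randn_like(x1)                 # noise endpoint of the path
    t = torch.rand(x1.size(0), 1, 1, 1, device=x1.device)
    xt = (1 - t) * x0 + t * x1                # point on the straight-line path
    target = x1 - x0                          # velocity the model should predict
    pred = model(xt, t, cond)

    neg = target.roll(shifts=1, dims=0)       # a mismatched sample's target flow
    positive = (pred - target).pow(2).mean()  # match your own flow...
    negative = (pred - neg).pow(2).mean()     # ...and stay away from others'
    return positive - lam * negative          # lam weights the contrastive term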
I think it's cool just from the fact of, hey, you could still do what you're doing, but if you do it a slightly different way, you get better outputs, or more efficient outputs, right? Like when you think you need Nvidia's AI factory and 10,000 5090s, but wait, you can do it five times faster, nine times better, and you can do it with a handful of stuff. Now, in defense of the large training models, there is the argument that you can't get to these more efficient routes, or these smaller models, without having the big models first. Yes, to shrink them. Yeah. That also came up in an interview with, I forgot his name, but the guy who runs Google DeepMind. Where they're like, well, we got the smaller models, but you can't get the minis without the big models first, to train the minis. I agree. So, yeah. But everything has its use case, right? The mini models are better for running on an Apple device. Yeah, when Apple gets around to making the product, right, you need them. Yeah. But you can't just create that from scratch. No, you need the big one. You need the big one to shrink it down, to understand exactly what it's doing. Yeah.

I mean, we're gonna cover this in another episode, but the watershed moment for image generation was something called ImageNet, or AlexNet, in 2012. Fascinating history on the first time image generation worked from a neural network. Yeah. Okay, cool. We'll break that down in the future. Yes. All right, good place to wrap up. Hopefully we covered a wide base of interesting things. Links, as usual, for everything we talk about, at denopodcast.com. Thanks to Via Hero Web for our five-star review on Apple Podcasts. We thank you. All right, thanks, everyone. We'll catch you in the next episode.
