Adversarial Input
The best way to stay on top of technology is to socialize with friends and chat about various topics. AI has been a hot topic lately, and ippsec, odie and struggs believe this field is driven too much by hot takes and not actual discussion. While having chats about others' hot takes, they decided to record themselves chatting to share their own hot takes.
Fable Fallout: Guardrails, IDs, and the AI Coding Race
Episode 2 of Adversarial Input dives into the fallout from Fable 5 access, Mythos restrictions, and the growing tension between frontier AI capability and cybersecurity guardrails. ippsec, odie and struggs compare hands-on impressions, debate whether "unjailbreakable" models are realistic, and weigh the privacy tradeoffs of identity verification for advanced model access. They also dig into Anthropic’s developer workflow advantage, xAI/Cursor speculation, the rise of open-source GLM models, possible GPT-5.6 rumors, and whether AI cyber benchmarks are being distorted by hype.
Intro: Fable 5 First Impressions and Guardrails
SPEAKER_01What is going on, everyone? This is ippsec. I'm joined again by Odie and Struggs. This is Adversarial Input episode number two. It's been about two weeks since our last episode, and a lot has changed. So what's going on, Ryan or Odie?
SPEAKER_02Yep. A lot is going on. Obviously, we got Fable access. Some people got Mythos 5 access again. We had it for a couple of days before the administration put that export control on it. Before we get into all the drama that happened, did you guys get to use Fable 5? Um, what were your thoughts on it? Did you get to use it with any projects, or were you still waiting on kind of conserving your tokens?
SPEAKER_01Um, I used it quite a bit. Um, it both impressed me and also wasn't that big of a shock. Uh, the thing that impressed me the most is just it was very much faster than Opus. I don't think it really did anything like my old harness couldn't do, or um, that was out of the reach of Opus. It's just like it was five times faster, probably ten times the cost, and it was just more enjoyable to use for me.
SPEAKER_00Yeah, I would I'd I'd kind of say the same thing. Um, I would say it's like a beefier uh Opus. Uh I will say that any of the cybersecurity-related topics I would try to use, it would quickly downgrade to Opus and warn me that that was kind of a restricted topic. Um, but any of the other topics I was doing, like web app refactoring, etc., it felt really good.
SPEAKER_01Yeah, it was definitely odd. Um, I've never really seen the guardrail just downgrade you to a different model and then keep working. Um, it was a weird experience. I didn't hate it, but I'm I don't think anyone else has really done that, where you just like are doing the most capable model. It's like this is probably not a good topic. Let's downgrade you to Opus kind of silently and just keep working.
SPEAKER_00Yeah, it reminded me a lot of, um, the Opus plan model you could do where you'll get into the planning phase and then it'll downgrade kind of in the background to Sonnet, or you have some kind of environment variable set inside of Claude Code where you can tell the agent to spin off and use different types of models for different tasks. Uh, I'm presuming it was a very similar workflow for them, but yeah, it was pretty cool. If they could figure out a way, when they do online it again, um, to do that invisibly, I don't know if I'll mind too much. Uh, but yeah, that was super cool.
SPEAKER_01Yeah, I the thing that really annoyed me is like if cybersecurity was mentioned, um, it would just instantly downgrade you. Like I was kind of hoping to be able to use it more as an orchestrator and have it create some type of tasks and then pass it over to Opus, but I mean it the Fable couldn't even do like orchestration or anything just because how sensitive it was.
SPEAKER_02Yeah, it's kind of funny that you know what came out later, some researchers, I think it was attributed to Amazon, were able to find a quote unquote bypass around or like a jailbreak around the safeguards. But during my usage, too, right, it was very, very, very sensitive, like to the magnitude multiple times more than anything we saw with Opus 4.8. Anything remotely related to cybersecurity or anything else would immediately flag and downgrade. So it's kind of interesting to see everything else that came after that, and some researchers claiming they're able to get around it, because I think this is quite clearly you know the most enhanced safeguards and the strictest safeguards we've had to deal with yet.
SPEAKER_01Yeah, and I I think the jailbreak, like they're kind of a victim of their own making, or whatever that saying is. I don't think the jailbreak is as big as the jailbreak like people say it was. Like, I don't think it was like a complete jailbreak, you can do anything. Like the thing I read was like more of a second-order injection type of thing where you just like have Fable explain what security is or something, and then say, okay, go apply it to this repo. And it does some tasks that it probably wouldn't do before, but I'm assuming like you hit the guardrail eventually. Um, and by eventually I mean like relatively quick. So maybe instead of just getting turned away right off the gate, you get turned off on like conversation number two or input number two or something like that. So I don't think the jailbreak was as big as it is, but of course, like you find any vulnerability, no matter how big it is or small it is, you're going to like scream it from the rooftops when it's the biggest model, and the model that the company's been saying is a cyber weapon.
SPEAKER_00And yeah, I don't know. Like I would say the jailbreak stuff, I kind of agree with a lot of your points. Uh, from what I can read, it is attributed to Amazon. And the thing that they attributed was that they were asking the model to read a code base and fix its flaws, yeah. Um, which seems like stuff that vulnerability researchers are already doing, patch diffing, etc. Um, it's reproducible on all types of models. It's very possible that the jailbreak can't be fixed, which is kind of awkward, right? You're asking to make a model that fixes your code, but can't see what's broken. Um, I don't know, man. It doesn't seem like much of a guardrail, it seems kind of like a contradiction. So it's going to be interesting to see what Anthropic does here in the coming weeks.
Mythos, Export Controls, and “Un-Jailbreakable” Models
SPEAKER_01And the craziest thing to me is like it got Mythos kind of pulled because like Fable is advertised as Mythos with guardrails, and like Glasswing has been operating for probably close to a month, and like no one's really cared. And then all of a sudden, um, because Fable was released, then Mythos gets backtracked. And the thing that no one's really said that I'm kind of curious about because like the government letter got leaked, and I'm curious if that also means like Anthropic employees also lost access to their internal like Mythos-level models.
SPEAKER_02Well, apparently, right, and this is something I've been reading about recently as well. They lost access potentially to Mythos 5, but that doesn't hold true for the Mythos preview model. Or just reading today, there's apparently another follow-on already to this newest model, this latest model, which is even like another step above it. So it has that export control list and the whole US, uh, citizen validation or verification doesn't apply to those yet either. So a very odd, like kind of middle ground spot that you're talking about.
SPEAKER_00I think too, which is, uh, I think another piece to this, right? Is, you know, the White House released that executive order and they kind of talked about this, uh, validation board. At first, it was you ship us a model. I think it was like CISA or DOD, uh, and IC partners would kind of look at and come up with benchmarking to kind of green light a model. It was 90 days, it's shrunk to 30 days. Who knows? Maybe that's the way they go. They kind of give them access to the models and then they green light it. Maybe that's when we'll see Mythos back, or sorry, Fable back. Um, I don't know, it'll be kind of interesting.
SPEAKER_02Yep. So to kind of close the loop on that, right? The administration came back and said, hey, cannot re-release this until it is quote unquote like unjailbreakable, right? We all know that's most likely never ever really possible. So it's gonna be very interesting to see when they do re-release Fable to see what kind of enhanced guardrails are gonna be on what they already had. When are you have you guys heard anything? Do you expect it to be back this week, next week? What are your what are your bets on that?
SPEAKER_00Um, I haven't heard much. Um, I've heard rumors like everyone has. Um, I saw you posted some stuff about potentially adding, uh, identification deeper into those model, uh, Anthropic usage. So say you had to like provide a driver's license or something like that, which I've heard in OpenAI land that's a normal thing they do, uh, when interacting with those levels of models, and that may be enough to solve the issue, right? Who knows?
SPEAKER_01I do think it's funny the whole like unhackable mandate. It reminds me a lot about like just contracts in general in the early 2000s, where like your deliverable was an unhackable network. Everyone knew that was silly, and they just kind of like hand-waved and it's like, you know what, no one's gonna hack us. Um, and then like 2015 came around and all that like ransomware and stuff started spreading, and like it moved to assume breach. So I'm assuming the model's gonna go kind of the same way. As far as prediction of like when Fable and Mythos will be back, um, I'm not sure. Uh, my main guess would be maybe July 8th. And that's only because that's when like all the rumors of identity verification is going to be coming July 8th, which I've never actually seen. I see them point to a help article, and I don't see July 8th in that help article. So I don't know where they actually got the name, or got the date. I think the weirdest thing to me is all these help articles keep getting like updated just constantly because Claude is probably the one doing all these updates. So people see them like making small tweaks to it, it says updated yesterday, and then they're like, Oh my god, they put in this identity verification thing. But, um, back in April, people got spun up because they started asking about identity verification in terms of age and other things. So, like those articles have existed for a while, it's not new. Um, I think them just saying, you know what, we're gonna use this thing that we previously used for age verification, now we'll use it for citizenship.
SPEAKER_00Yeah, how do you guys feel about that? Yeah, it's gonna be tough L to get past.
Identity Verification and Privacy Tradeoffs
SPEAKER_02How do you feel about the identity verification? I know I think uh Matt mentioned it, but I did it and submitted it for that very similar process, I think, that Anthropic's gonna use. I did that for OpenAI for part of their TAC program for their their cyber verification to get access to like GPT 5.5 cyber, right? You take in uh you take a picture of your you know your government-issued driver's license or passport or whatever, and then also do a quick kind of um selfie, right, where they see your face. Do you think that's kind of a step too far? Is it gonna be worth it for the fable access? Like, what's that balance there? Because I know Anthropic is kind of pushing themselves as you know, more open for everyone type of company and organization.
SPEAKER_00I'm not a huge fan of it. I don't know. Discord had this thing to verify and they backed off of it, right? Everyone got pissed. Very different beast. Um, I think a lot of people just don't like that. Um, yeah, I'm not a huge fan of it. Um, anything super technical I do, I think I would just use Mythos anyways. Um, but, uh, I guess we'll have to see. I'll have to see how they implement it, who they go with for ID verification, the scaffolding they build around it. Uh, I'd imagine their smaller plans, to soften the blow if they move in that direction, will probably not require that. Like if you're just, uh, Anthropic model usage or Claude usage in a free plan or like a Pro plan, I'd imagine they wouldn't use it. I could imagine it'll be all around Fable and maybe Claude Code usage. I don't know.
SPEAKER_01I think it was around, um, all models, but not on enterprise. I think it's just like your Max plan and stuff. So I'm not positive by that. Okay. They said the companies, I forget the name. Um, I want to say it was Persona that did identity verification. Um, I don't like it. A lot of other places have done it. Um, it's not one of those things where I'm going to protest and not give up my ID. Like I just assume my identity is already in so many databases that I don't see the harm of just putting it in one more. But that's like my take. And I know many people don't agree with it, but I mean I liked Fable. It made me a lot quicker. Uh, I wouldn't see myself delaying over that. And I think Discord still does identity verification, it's still just on, um, dangerous servers, I guess, that just have like not safe for work things and other things. Right, okay. I do remember that. If you want to do some of the voice stuff in Discord on some channels or something, you need to have identity verification. Or I think if you have Nitro and you pay them, they're like, you know what, you had a credit card, we assume you're of age. Maybe that happens. I don't know. Um, Discord's never asked for my ID. They also, like, I want to say used AI. Like if you like stream and then have your cam open, it does like some AI thing to estimate your age or something. I remember reading something vaguely about that. So there's other ways to do it for age.
SPEAKER_02That's interesting. Yeah, I kind of feel similarly, right? Like, um, I'm I'm I don't like it, I don't enjoy it. I don't wish I didn't I don't really want things to go that way, but I feel it's very similar to almost TSA Precheck where I'm gonna submit myself to maybe an enhanced scrutiny to be able to get a benefit that I'll take use of. So I would do it and I'm going to do it most likely, but you know, it's not ideal. But I understand it because these models are significantly different and based on Fable 5 access, right? Like it was a significant step above Opus 4.8. So for those reasons, I feel almost obligated to be able to stay up to date with it.
SPEAKER_00I felt like the underlying theme there was like we're gonna hand it over, but at the end of the day, is it gonna be really effective towards the goal that they want? If that's the export control part, if that's the, uh, no foreign access part? Um, I think, the specific thing that they want to defeat, I think it is going to be very trivial for adversarial people to circumvent what it's meaning to defeat. So at the end of the day, all it's going to be is just another piece of PII that we now give to them, and it's going to be attached to, for me, a lot of, um, like I don't really care about like my prompts and conversations, but the part that just gives me the heebie-jeebies is this is kind of like a stream of consciousness that this company has, probably a lot more, uh, intimate than most people that use it probably would really want, which might be the foundation of why a lot of people don't like it, right? Um, very interesting.
SPEAKER_01I 100% agree. I don't think Anthropic wants it. I I don't see it actually doing anything other than accomplishing the goal of getting their models back online and like checking a government checkbox. Like I think that's all it is. I think the bad people that want access will find ways around it. I know like um I used to do like professional StarCraft 2 and commentating. In order to play on the Blizzard Korean servers, you needed a Korean Social Security number. Um, but no, like you need the Korean Social Security number in order to play on Blizzard servers in Korea. So like that was easily circumventable for me.
SPEAKER_00Yeah, I remember that. That's that's that's actually a really funny uh analogy. Yeah.
SPEAKER_01So like I don't think it's gonna do anything other than get it back online. Um if it helps them get back online, sure. I think like if this was the first like kerfuffle they had with the government, they probably like do more of the legal thing. Like, I think there's probably more legal precedent for them going after and going in court, but I think that's just going to cause more issues. Like at some point, if you just take the same people to court over and over, they're just like, you know what, we're gonna keep upping the ante until you're no longer existing. So I don't think that's a battle they want to fight anymore. So they're just going to like do the bare minimum, get it back online.
SPEAKER_00And they have a good product, man. Like that's gonna help them in the end, right? I think no matter what the administration feels or the higher ranking members of the US government feels, at the end of the day, it's a fantastic product. So the people that actually need to get the job done are looking at it like, man, this is a really good tool and I need it. Um, I think that's gonna be very motivating to them.
Anthropic UX, Cursor, and Coding-Agent Competition
SPEAKER_01Yeah, and I think the coolest thing, like not the coolest, but like the biggest thing with Anthropic is the ease of use. Like, I don't think their models are that superior to like OpenAI's or even like open source GLM. Like, that's probably not as good as Fable or Mythos, but it's probably good enough where you can take it to the next level with orchestration. They're just like really good at making it user-friendly and a, um, safe experience, I guess. And by safe experience, I mean like I'm no longer hearing Claude dropping Terraform databases and things like that. Like, yes, there's a lot less mistakes being made. So, like, I kind of trace it back to, um, when Amazon first came out. Amazon like undercut the market of online purchases by a lot, and now like Amazon's not often the cheapest, but it's just so easy and they've had so much user experience. You don't really look at other online shops anymore. Like, yes, people just have Prime memberships because they trust it's giving them something. I don't really think it gives that much anymore. Like you can get free shipping without any things like that, but it's just such a good user experience. Like you open up the support chat, they're like, Yeah, we'll refund you, just destroy the product, it's fine. And like you don't look at competitors. I think that's kind of what Anthropic does is they just have a really good user experience.
SPEAKER_00And that was the core of it, right? Their goal was to make it inconvenient not to use it. So the developer workflow, yeah, I agree. And Claude and Claude Code and all that stuff is just very top-notch, which I think, if we wanted to pivot a little bit, this is potentially why SpaceX was purchasing Cursor, right? They're trying to break into this dev tooling space. I've never used it, but that seems to be the strategy, right? There.
SPEAKER_01Have you used Cursor much, Odie?
SPEAKER_02No, not very much, but I'm interested to see now that it falls under SpaceX or xAI, right? How that's gonna work with Grok build. I was looking into that a little bit. It's gonna be interesting to see if they're gonna be able to combine with Cursor, their data set, whatever kind of models they have internally, and see if they can push out a potential third. Well, I don't want to speak badly about Gemini, but maybe another user, another competitor in the space with these CLI coding tools, these agentic models, right? Because oddly enough, it seems to be really dominated by Anthropic and OpenAI right now.
SPEAKER_01Yeah, I think X is like the dark horse in this. I don't think like I don't hear anything about Grok or anything. Like occasionally I find the random offensive person is like, oh yeah, I just love Grok because it's not like denying me or doing other things. Like, I don't really know how to use it. I think their user experience is pretty bad. But at the same time, I'm constantly surprised with just how much money they have somehow. Like they've definitely felt like they've done some cheat code in life where they just have unlimited money. Um, seeing Anthropic, like, saying, Oh, yeah, we just bought some of the old X servers or rented the space from X because they upgraded their model and now I can use their older equipment. I'm like, they're that far ahead in terms of just what they have, yeah.
SPEAKER_00There's so many of these companies now, right? I mean, this topic is so hot. Well, what does it take to make a really good AI company? Is it just having more money than the other guy? More money to buy talent, hardware, um, just more employees. This will be tested, I think, uh, with SpaceX Grok.
SPEAKER_01That reminds me of like, did you hear about the story of Allbirds?
SPEAKER_02No, yeah.
SPEAKER_01It's a random shoe company that for some reason just said, you know what, we're pivoting from shoes to AI, and their stock just went crazy. And they've actually just like abandoned shoes and now they're an AI product. I don't understand it.
SPEAKER_00Hey, I got I their their board probably loves it. Probably love it.
SPEAKER_01I just googled it. Um, it's from like CNBC: struggling shoe retailer Allbirds makes bizarre pivot to AI, adding 127 million in value instantly.
SPEAKER_00Interesting, yeah. I mean, we'll probably see that. It's a thing I've been kind of thinking about as these LLM providers, uh, just amass more and more, um, value. You got xAI gobbling everything up, you got largely Elon, right? And just all these other providers, what it's really going to look like when they decide that these developer tooling, uh, largely developer-focused, uh, workflow environments, right? That's kind of their space they're in, when they will decide to just kind of, for products that just do other, I guess, um, different types of, uh, like job types. Like say like you go to Anthropic for like legal counsel or something like that, where they just release a product. I could see something like Anthropic just being like, screw it, we're just going to make, uh, a cybersecurity like product and just use all of this data that we've accumulated and training from these. They say they're not training off of, probably, company data, but I don't know. I think they probably are at some level, they're getting a lot of experience in it. Their, air quotes, forward-deployed engineers, right? Uh, they're learning things and taking it back to their companies. I think in a decade's time, maybe less, these companies are just going to decide to operate in these spaces and just offer products, you know. Anthropic has that design tool. That was a multi-billion dollar buyout of Figma, right? That was, you kind of get in a collaborative space and like design a UI. That was just kind of like an offshoot product that they now offer, you know. I just see them kind of pivoting into these more product level presentations. Um, so it is kind of interesting to see where that goes.
SPEAKER_01Not only do they just offer, I think they did a better job. Um, I somewhat used Figma, like I think by better job, I just mean it wowed me when I was just in my terminal, and then Claude is just like, you know what? I can mock up some web server for you, and then you can choose it. I'm like, wow, this is just like Claude Design, except it tricked me into using Claude Design.
SPEAKER_00Yes. I think they could do this with a lot of stuff, man. I think like, we hear the like stuff about the marketer there just being one guy, right? I'm sure he or she just has a product that he's probably workshopping internally on just like Anthropic running all of your marketing. I think there's a lot of products like this that are going to evolve, and it kind of sucks for all the other businesses out there that are maybe kind of, if those LLM models, or these large language models, are kind of in your tool chain, it sucks, but what value are you kind of providing if it's all trickling into that model? I think you're kind of setting yourself up for failure. You may need to like figure out how to diversify off of that or figure out another niche. Um, yeah.
SPEAKER_01It sucks for all the people that are just like sharing their experience with like making a better, um, OpenCode or Claude or whatever, whatever you call this. Um, maybe the harness is the term, but like all those things, like people like, oh, like the whole sub-agent workflow wasn't originally an Anthropic thing. Like you had skills and other things that kind of use that. I think it was part of OpenCode, and then all of a sudden Anthropic moves in and just kills your whole business model because it took over. Like, I think the number one rule right now is don't do something like don't be an enhancement to an AI product. You have to use AI to build something new, and that new thing just can't depend on AI. Yes.
SPEAKER_00Um
GLM 5, Open-Source Pressure, and GPT-5.6 Rumors
SPEAKER_00I think another thing that, uh, we kind of chatted about loosely was the, uh, GLM 5.2, uh, open source stuff from Z.ai, uh, releasing a not Fable-level but Opus-level model. Um, uh, I think it was dropped two days after the Fable ban. Um, and I looked a little bit into the benchmarking. Uh, I think I can really reiterate that it is near Opus. It does beat GPT-5.5, uh, and it trails Opus a little bit. Uh, the main thing here is that it's one million context, it's one-sixth the cost, and it's just more open source models that are coming from Chinese companies that are doing really well. And I think we're just gonna see more and more of that. Uh, and people that run their own home, uh, hosted lab stuff are probably super excited about this stuff if they can even run it. Um, but it is pretty cool.
SPEAKER_01I don't think anyone can run GLM 5 at home.
SPEAKER_00Yeah, yeah. It would probably be very difficult. I've seen like, uh, I've seen people that had like crypto rigs, uh, crypto mining rigs that have kind of just been like, screw it, I'm just gonna run models now, uh, and have had some success with, uh, like the Kimi, uh, big models. Uh, but yeah, I agree. I think if you're just like your average Joe with a couple of graphics cards, I don't think you're gonna be able to run anything that serious.
SPEAKER_01Well, I think the GLM 5 specifically required a graphics card that you can't really buy as a consumer that costs like at least 50 grand a pop. Yeah, if I have like eight of them. Like I think I did the math, and like to run GLM 5.2, like I would have to have around 800 grand.
SPEAKER_02Yeah. Still pretty gigantic, right? But it is interesting that they released something that's even on the level of Opus 4.8 or GPT 5.5. I haven't seen anything that said it beat, um, 5.5. I'm a big fan of 5.5, but still interesting that they even got in range. And then oddly enough, uh, someone posted on, uh, X or Twitter about, um, when are we gonna see something in open source model that is in the range of Mythos? And I think the CEO of, um, Z.ai was like, give us a couple months, or like don't hold your breath or hold your breath or something. Like literally, they think it's coming soon in the next, I mean the near future, right? So that is gonna be a crazy game changer, right? If we get an open source LLM that's close to that level, Mythos level of performance and everything else, it's gonna be insane to see.
SPEAKER_00Which is nuts to think about, right? If you could get Fable-level performance and all you have to do is wait a couple months, why not just wait? You know, it's gonna be interesting.
SPEAKER_01If the AI is as good as it says it is, then why not like just give up your ID to get it early? The AI is gonna be able to find it anyways. Yeah, yeah, yeah.
SPEAKER_02Yeah, GLM is gonna get it soon anyway, so you might as well just put it up there. It's gonna have whatever it wants soon.
unknownThat's really funny.
SPEAKER_01It's when they stop asking for this information, is when you should be scared. Yeah, that's a really good point.
SPEAKER_00Yeah, I would say the thing I read that talked about it beating, uh, GPT-5.5 was, uh, the terminal benchmark stuff. You have tons of these kind of, uh, model evaluator services and stuff like that that do a wide range of, uh, model like comparisons. I would say they're all over the place, but, um, yeah, I think generally it's a good model that's up there.
SPEAKER_02Yeah, speaking speaking of right rumors around that GPT 5.6 is coming out soon. Some people are thinking this upcoming Thursday, which would be what the 26th, the 25th?
SPEAKER_00Polymarket. Polymarket says 83, 89 percent by June 30th. So, uh, yeah, it could be some employees placing bets, you know.
SPEAKER_02Are we thinking Mythos, Fable level, or is it gonna be another enhancement, improvement on 5.5? Where do we think we're landing with that? Does OpenAI have some type of, you know, pressure on them to be able to get to that Mythos, Fable level of a model, or is like another kind of improvement going to be enough for them?
SPEAKER_01I hope to god that OpenAI says this model is Mythos equivalent before the government like relaxes the restrictions. Like as soon as I see that, I'm gonna start making some popcorn. But yeah, I don't think they're going to, like I think that is a banned word from any marketing now. Like even the GLM 5.2 stuff, which is the open source model. Like, if you went back a month or two ago, they were always saying this is gonna be the next Mythos or whatever. And now they're like, you know what, maybe we shouldn't say this because we don't want to get that type of attention.
SPEAKER_00Or the weirdness of they do say that and they don't get pulled. That's a lot of drama there, huh? Like, uh, Anthropic gets pulled, OpenAI doesn't. Also, if the known jailbreak is publicized and people just do the same thing in the OpenAI space and it just works the same way. Um, you know, GPT-5.5 has built-in cyber safeguards. Who knows if those are going to function just similarly, or, I would imagine the same jailbreak will probably work on, I would hazard a guess that it's going to work on 5.6 if the jailbreak is we asked the model to fix the code and then they did like an elaborate patch diff. I don't foresee that ever, ever, ever not working on any model.
SPEAKER_02So yeah, that's kind of the thing, right? Was it about that? Was it about the history with the administration and, uh, Anthropic specifically, right? There was a lot of backstory and baggage with a lot of these things, but do we think 5.6 or the other models are going to be held to the same standard, or do you think maybe, kind of like, uh, Ip was alluding to, that some of this incredible marketing or this hype or, you know, this doom trolling of that Mythos-level model is what kind of put Anthropic in a sticky situation?
SPEAKER_00Doom trolling, that's a new one. That's good.
SPEAKER_02That's Cal Newport. Um, he termed it that, I don't know.
SPEAKER_00I think, you know, if it is the baggage with the US government, it's clear that OpenAI has a better relationship with the US government. So if they can do this and kind of allude that it has the same capabilities and nothing happens, then we have our answer there, right? We'll know when this releases and it hits, and wherever it all lands with the internet and everyone, we'll get our answer pretty quickly. Um, but supposedly the rumor too is that it's gonna be 1.5 million context. Um, I think some Grok models, did they, uh, possibly have two million context?
SPEAKER_02I don't know about Grok, but I saw the rumor at 1.5 or 2 million context for the potential next.
SPEAKER_00But, uh, that kind of stuff, I don't know if I'm super excited about, but, uh, and this is the he-who-must-not-be-named thought though, if it's Mythos level, uh, that'll be super cool.
Mark Warner, Claims, and AI Cyber Hype
SPEAKER_02Cool. Maybe the last thing, I don't know if we talked about this beforehand, we might want to cut this out, but did you see the, uh, the link of Mark Warner, vice chair of the Senate Intelligence Committee, claiming that the head of the, uh, National Security Agency told Cyber Command that Mythos broke into all of our classified systems, not in weeks but in hours? Like, just continuing with that hype. Do you believe that's true? Taken out of context? Like again, are we just in this crazy hype cycle? I just thought it was pretty crazy. It's got Twitter and X all riled up today.
SPEAKER_00I think if you point these models at any code base or network or, like, software configuration, it is going to shake vulnerabilities out. Um, especially if we're kind of transitioning from, you know, humans are always fallible, but just largely only human-driven design. And then you throw AI into it that can work 24-7 and just enumerate everything. I don't know if I could see any network that could survive if you just gave, uh, agents admin access to your network or configuration and it just came back with nothing. I just don't think that can be true. Um, I don't know. The quote could be maybe taken out of context, or he was relayed information and then he just kind of played the telephone game, and then it landed on him and he's just like, oh, it destroyed our networks in three seconds like Skynet. Um, I don't know. I have just seen no one survive the, the Eye of Sauron, uh, of AI looking at your stuff and not finding things.
SPEAKER_01So yeah, I have two quick thoughts on that. Like I think number one, um, it's kind of the game of telephone and what actually is hacking in. Yeah, it's kind of like last week we talked about, I want to say, like everyone publishes this research, AI is really good at vulnerability development, but is it good at the other aspects of cyber? Like red teamers don't really find 0-days in products, they go and exploit other weaknesses, and that's the information we don't see about cyber. Um, I'm also questioning, not that it could do what he said, but I'm also like, did they actually just launch an AI open-ended at their secure network and just let it, like, go ham? Like I don't think any organization has really done that. Like I can't imagine any Glasswing partner, um, let's say like NVIDIA, like, do you think NVIDIA would just open up Claude on a workstation and say, you know what? Go hack all our stuff and see what happens. Like, I think that's a recipe for disaster. Not like is it going to find vulnerabilities? Of course it is, but I also think it's probably going to take a lot of networks down and do a lot of stupid things, such that a human won't be able to recover that network in any meaningful time. And by that I just mean it's gonna make so many changes to users, things like that, because it's just trying to accomplish its goal. It's working really fast and, like, making so many changes that just take too long to revert. At that point, you just have to hope it didn't touch your backups and say, you know what? We're going to just turn the clock back, restore everything from before we were idiots and just launched the AI at our systems.
SPEAKER_02Yeah, I think that really raised uh eyebrows for anyone that has pen testing, red teaming, any type of hacking experience, right? As soon as we saw that, we're like, okay, I'm not quite sure that's the case. I would love to hear more about that as well, but I agree. It sounds like a game of telephone or someone trying to relay something to maybe someone else or a politician that's not as technical, but you know, definitely a great headline, right? But how true is that?
SPEAKER_00I'm sure on some of these networks, uh, a vulnerability scan would probably have the same effect. So, uh, they're probably like, oh crap, don't hit us with Nessus.
SPEAKER_02Um, but, uh, you know, get taken down by Nmap. Yeah, 100%. One of the other things, maybe wrapping up, this is uh
Off-Topic Wrap: House of the Dragon and TV
SPEAKER_02off topic, but the new season of House of Dragon coming out. You guys think it's gonna be better than the last one? It's gonna be a good one. It seems to be getting some initial really great reviews. How how are we feeling about that? You guys gonna be watching?
SPEAKER_01I'll be hate-watching. Like I was so fed up with all of, like, the Game of Thrones stuff, and then, um, Knight of the Seven Kingdoms came out, and I, like, hate-watched that, and it got me back excited about the series because I really enjoyed that. That was good.
SPEAKER_00Yeah, I will also be excited to watch it, although it is interesting, right? Like, uh, did you see the DOJ approved the Paramount buy, which means that Netflix could potentially get Game of Thrones? That's crazy, right? To think about.
SPEAKER_02No, yeah, that's wild. I think, yeah, that's pretty crazy. I don't know, I think the first season of House of the Dragon was really good, the second one seemed like it was kind of slow. So obviously, like you're saying, uh, Knight of the Seven Kingdoms, pretty good as well. Seems like it went by quickly, but I'm excited to see. Hopefully, this is back up to par with what we experienced during, kind of, the heyday of Game of Thrones, right? During those really amazing seasons.
SPEAKER_01My only wish is Game of Thrones somehow went to Apple. Oh, oh yeah.
SPEAKER_02Yeah, did you finish uh Widow's Bay? Yeah, yeah, that was pretty good. That was a good one. I enjoyed that a lot. That's probably one of the better series for sure this year.
SPEAKER_01And like Maximum Pleasure Guaranteed is also pretty good.
SPEAKER_02That one was hyped. I was trying to start that when I was stressed. I think I had to go give, like, a talk down at a conference or something, and I was already a little bit wired up, and I was trying to watch it, and I literally had to turn it off because it does a really good job of, like, conveying that apprehension and stuff. So I would like to finish that. That looks good.
SPEAKER_01Okay, well, I think that wraps up this week. So uh yeah, take care everyone. We will try to put out an episode in two weeks from now. So hopefully Fable's back and we have a lot of interesting things to talk about.