The Macro AI Podcast
Welcome to "The Macro AI Podcast" - we are your guides through the transformative world of artificial intelligence.
In each episode - we'll explore how AI is reshaping the business landscape, from startups to Fortune 500 companies. Whether you're a seasoned executive, an entrepreneur, or just curious about how AI can supercharge your business, you'll discover actionable insights, hear from industry pioneers and service providers, and learn practical strategies to stay ahead of the curve.
Does Claude Learn from your Code?
The concern is understandable. If your team is building a specialized AI product on Claude — with custom agent logic, refined system prompts, proprietary data pipelines, and hard-won product insight — it is natural to wonder whether that work could somehow make the model smarter and eventually benefit a competitor.
Gary and Scott break down the issue clearly and practically. They explain the difference between three things that are often confused: in-conversation context, Claude’s account-level memory features, and the underlying model weights. The key takeaway: API usage does not update Claude’s model weights, and a competitor does not gain access to what Claude remembers within your account.
The episode also walks through Anthropic’s commercial data protections, including the default policy that commercial API inputs and outputs are not used to train generative models unless a customer opts in. Gary and Scott also discuss API data retention, zero data retention options for enterprise customers, and the practical areas where teams can accidentally create risk — including browser-based prototyping, feedback buttons, and partner program opt-ins.
Most importantly, the conversation turns this into an operational playbook for business leaders:
Use the API for serious development.
Audit whether developers have disabled model training in browser settings.
Avoid feedback buttons on proprietary workflows.
Create a clear approval process before joining partner or beta programs that involve data sharing.
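The four playbook items above can be sketched as a simple team audit. This is a minimal illustration, not a tool from the episode: the `DeveloperSetup` record, its field names, and the `audit_developer` and `audit_team` helpers are all hypothetical, and a real team would fill in the answers by hand or from its own tooling.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class DeveloperSetup:
    """Hypothetical record of how one developer works with Claude."""
    name: str
    develops_via_api: bool           # item 1: serious development goes through the API
    browser_training_disabled: bool  # item 2: model training disabled in browser settings
    uses_feedback_buttons: bool      # item 3: thumbs up/down on proprietary workflows


def audit_developer(dev: DeveloperSetup) -> list[str]:
    """Return the playbook items this developer's setup does not yet meet."""
    findings = []
    if not dev.develops_via_api:
        findings.append("move development to the API")
    if not dev.browser_training_disabled:
        findings.append("disable model training in browser settings")
    if dev.uses_feedback_buttons:
        findings.append("stop using feedback buttons on proprietary workflows")
    return findings


def audit_team(team: list[DeveloperSetup],
               partner_approval_owner: Optional[str]) -> dict[str, list[str]]:
    """Audit every developer, plus the team-level approval process (item 4)."""
    report = {dev.name: audit_developer(dev) for dev in team}
    report["(team)"] = (
        [] if partner_approval_owner
        else ["name an approver for partner and beta programs"]
    )
    return report


# Made-up example data.
team = [
    DeveloperSetup("dev_a", True, True, False),
    DeveloperSetup("dev_b", False, False, True),
]
report = audit_team(team, partner_approval_owner=None)
for who, findings in report.items():
    print(who, "->", findings or "ok")
```

An empty list for a developer means nothing to fix. The `(team)` entry stays flagged until someone is named as the approver for partner and beta programs.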
Gary and Scott close by reframing the strategic question. For most AI products, the durable moat is not the prompt itself. The real competitive advantage comes from proprietary data, customer relationships, execution speed, product insight, and the feedback loops that compound over time.
This is a practical episode for executives, founders, product leaders, developers, and investors who want a clear answer to one of the most important AI business questions: where is the real IP risk, and what should teams actually do about it?
Send a Text to the AI Guides on the show!
About your AI Guides
Gary Sloper
https://www.linkedin.com/in/gsloper/
Scott Bryan
https://www.linkedin.com/in/scottjbryan/
Macro AI Website:
https://www.macroaipodcast.com/
Macro AI LinkedIn Page:
https://www.linkedin.com/company/macro-ai-podcast/
Gary's Free AI Readiness Assessment:
https://macronetservices.com/events/the-comprehensive-guide-to-ai-readiness
Scott's Content & Blog
https://www.macronomics.ai/blog
00:57
Here's a scenario for you. You spent the last four months building a specialized AI product on Claude. Custom agent logic, a system prompt you've refined over hundreds of iterations, a data pipeline that nobody else has thought to wire together the way you did. It actually works. Your users love it. And then one of your investors asks a question that you weren't quite prepared for.
01:27
What's to stop Anthropic from using everything you just built to make Claude smarter and then handing that edge to your competitor down the street? Welcome to the Macro AI Podcast. I'm Gary Sloper. And I'm Scott Bryan. And Gary, that investor question right there is one we've both been hearing fairly frequently lately from founders, CTOs, and product teams who are building
01:54
some pretty sophisticated workflows on top of AI models. And that question, I think it's a fair question. There's definitely real intellectual property on the line. It's a fair question, Scott. And from what I'm seeing and what I find interesting is that the concern is coming from really smart, technical people. It's not just folks who don't understand how the models work. It's not relatives saying AI is scary.
02:24
So that's the interesting part, because on the surface the logic seems sound: artificial intelligence gets better over time, you're feeding it your proprietary inputs, and therefore some version of your cleverness is going into the model, right? Yeah. I mean, that's the intuition that I think we should dig into a little bit today, because it's really based on a mental model of how LLMs work that actually turns out to be wrong in
02:52
really important ways. And in this episode, we can walk through the technical reality, the legal protections, and what you should actually be doing differently, if anything. Right. And for this episode, we'll be as specific and current as our research allows us to be. So that's our caveat for today. And by the end of the episode, hopefully you'll know exactly where the risk levels are and if they're real and where they're not.
03:20
and what steps are worth taking as you master your AI dimension here. Yeah. So let's just jump right in with the mental model that's driving this fear. I think most people, even technical people, intuitively think of AI systems as something like a highly capable employee who gets smarter the more they're exposed to. So if you brief them on your architecture, they absorb it, and
03:49
now that knowledge lives in their head and is potentially available to whoever they work with when they walk out the door. Yeah, and I'll be honest, when I first started working with these models, that was my intuition too. I think most people probably had that feeling. And honestly, it's reinforced by the experience of actually using Claude day to day. So if you haven't, you may not have that same perception or at least opinion.
04:16
Because if you use Claude AI regularly, it does remember things about you. It knows your name, it knows your industry. It will remember how you like to communicate with it. And you can actually customize that if you haven't used Claude before. So when someone says Claude doesn't retain anything from your sessions, a regular user's immediate reaction is going to be that that's really not.
04:40
plainly true. I've seen it remember me and, like I mentioned, it remembers some of my quirks, I think, as I prompt it. So it's not 100% true. Yeah, exactly. And I think that reaction is totally valid because Claude does have memory. But this is where we need to be a little bit more precise about what kind of memory we're actually talking about, because there are three distinct mechanisms at play. And I think conflating them is exactly what leads to the intellectual property concern that we've...
05:10
opened up with. Yeah, that's a point. We should probably jump into those. Yeah, I'll just kick it right off. So the first is in-conversation context. So that's within a single session. Claude has access to everything that's been said in that session, and that's all sitting in what's called the context window. So it can reference anything from earlier in that particular conversation. And that's why it feels coherent and aware, per se. But that window
05:39
is active only for the duration of that session. And then the second is Claude's memory feature. So this is separate. It's a separate explicit system that's built into Claude. This is where Claude stores summaries and facts about you across those conversations. So that'd be your name, like, Gary. So your preferences, your job, your communication style. And when you open up a new conversation, Claude already seems to know you, but...
06:07
And that's this memory feature at work. So it's real, it's intentional, and Anthropic is transparent about it. But the critical thing is that it's a feature that stores structured summaries in a database. So it's not the model actually learning or updating its parameters. And I'll just cover the third one too. The third is what this intellectual property question is actually about. And what it is, it's...
06:37
it's the model weights themselves. So the core trained parameters of Claude are the model weights. And these are what would need to change for Claude to actually learn your proprietary architecture in a way that could benefit a competitor. The model weights are completely static during the inference period. So your API calls and your conversations just don't touch them. Yeah. So when we talk about the IP risk, the fear that your
07:05
clever agent architecture is somehow bleeding into Claude and becoming available to competitors, we're talking exclusively about the third leg that you just mentioned, the model weights. And the answer there is unambiguous. Your API usage does not change them, full stop. Right. Yeah, exactly. The memory feature is a separate system entirely. So it's scoped to your account.
07:35
So another user or a competitor, they don't get any visibility into what Claude has learned about you through that feature. So it's not a shared pool of knowledge about all users. It's a private store tied to your specific account. And the IP concern and the memory feature are just simply two totally different systems doing different things. Right. And I think then, you know, naming that distinction upfront is actually what makes the rest of this conversation credible because
08:05
If we just said Claude doesn't remember anything, without that context, any regular user would rightfully call us out. The right statement is more precise, right? So Claude has memory features that work at the account level, but none of that touches the underlying model. And as you were just talking about, none of it's accessible to anyone else. I think that's an important piece. I understand where the side of caution comes in, especially from a
08:35
governance standpoint. But that's a very, very different thing from your intellectual property leaking into the model and it's complete chaos. Right. Yeah, exactly. And with that distinction that you just made totally clear, let's jump into the architecture a little bit, because I think understanding how the model works at a technical level makes all this a little more concrete for the listeners. Yeah. All right. So we've established that there are different kinds of memory and the IP risk
09:04
question is specifically about the model's core parameters, the weights. So what's actually happening at the model level when a business makes an API call? I'll be brief. When you send a prompt to Claude via the API, you're doing what's called inference, which we've talked about before and a little bit earlier. So you're running a forward pass through a neural network. Think of it that way. Yeah. A forward pass, in tech jargon. So.
09:34
Right. The model takes your input, processes it through its layers, and generates a response. So the model weights, the numerical parameters that encode everything it knows, are completely static during that process. They don't update, they don't store your input, and each API call is simply stateless from the model's perspective. Right. So to be more concrete, the model that processes my 9 a.m.
10:03
API call is functionally identical to the model that processes a competitor's 9:01 a.m. call one minute later. My prompt didn't change the underlying model in any way that affects what the next person receives. Yeah. Good example there. And there's no shared memory store. There's no cross-user database being built from live API conversations that feeds back into the model. Each inference call is isolated.
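The point about static weights and stateless calls can be illustrated with a toy sketch. This is not Claude's architecture or Anthropic's code: the three-number `WEIGHTS` tuple and the `forward` and `fingerprint` helpers are invented for illustration, assuming only that inference reads the weights and never writes to them.

```python
import hashlib
import json

# Hypothetical "model": three frozen weights. A real LLM has vastly more, but
# the idea is the same: inference reads the weights and never writes to them.
WEIGHTS = (0.5, -1.25, 2.0)


def forward(weights, inputs):
    """One forward pass of a toy linear model: a weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def fingerprint(weights):
    """Hash the weights so that any change to them would be detectable."""
    return hashlib.sha256(json.dumps(weights).encode()).hexdigest()


before = fingerprint(WEIGHTS)
my_output = forward(WEIGHTS, (1.0, 2.0, 3.0))          # the 9:00 a.m. call
competitor_output = forward(WEIGHTS, (2.0, 0.0, 1.0))  # the 9:01 a.m. call
after = fingerprint(WEIGHTS)

# The weights are byte-for-byte identical before and after both calls.
print(before == after)
# Repeating the competitor's call on its own gives the same answer,
# so nothing carried over from the earlier call.
print(competitor_output == forward(WEIGHTS, (2.0, 0.0, 1.0)))
```

Changing the weights would take a separate, deliberate step that rewrites `WEIGHTS`, which is the distinction between inference and training.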
10:30
So the architecture as deployed has no mechanism for one user's input to modify the model weights that another user's call actually runs against. And now here's where I want to push a little bit further, because I think this is the real question a sophisticated CTO or general counsel will ask. I could see them saying, okay, fine. The model doesn't learn in real time. But Anthropic is collecting API traffic. Couldn't they...
11:00
scoop that up and use it in the next training run? My prompt chain ends up in the training data set. They train on it, and capabilities coming out of my work get baked into the next version of Claude available to everyone, including my competitors. I could see them asking that. Yep. No, that's the right thing to be concerned about, if you need to be concerned about something. And that's where the distinction between inference and training really matters. So model training is a massive,
11:29
deliberate engineering undertaking. It's not a passive or continuous process that's silently ingesting your API traffic, you know, nine o'clock, nine oh one. Engineers create a specific data set, run enormous compute jobs, and then deploy a new version. It's a separate, totally intentional pipeline, and commercial API data is explicitly excluded from it by default. Right. The other
11:59
area that you have to take into consideration: Anthropic states this clearly in their terms, so their T's and C's. Inputs and outputs from commercial products are not used to train generative models unless you opt in. And that's a big one. So it's worth being clear about the legal weight of that. When you're paying for API access, you're in a commercial agreement. Anthropic using your proprietary prompt chains to train base models without consent would
12:28
not simply be a policy violation. It would, as their own documentation states, constitute a massive breach of contract. And I think that's a very meaningful legal backstop that anyone would want to be aware of. Yeah, certainly. And then I want to add a little bit more technical framing, which I'll state as a general principle of how large language model training works and not a specific claim from Anthropic's
12:58
documentation that you just referenced. So training these models involves processing huge quantities of data across billions of tokens. Any individual prompt is an extremely small signal relative to the total scale of the data set they're training on. So the model learns broad statistical patterns across the entire corpus, which is a pretty typical term.
13:23
It's not encoding and storing specific prompts as retrievable knowledge. So even in a hypothetical world where your prompt ended up in a training run, reproducing your specific architecture in someone else's output is really not how the models actually work. That said, and I want to be clear about this, the right protection is not to rely on that technical argument. The right protection is the contractual one. That's solid ground.
13:52
So what we should probably do now is talk about exactly what those protections look like. Yeah. So Anthropic is explicit in their commercial terms, and the protections are obviously meaningful. For API users, inputs and outputs from commercial products, meaning your prompts, your system instructions, your agent logic, and Claude's responses to those, are not used to train generative models by default, per their commercial terms.
14:22
Right. And to reinforce the legal dimension here, when you're paying for API access, you have a binding commercial agreement. This isn't a privacy policy that can be quietly updated. Secretly ingesting proprietary prompt chains to train base models would essentially, in Anthropic's own words, constitute a massive breach of contract: real liability, real enforceability. And quite honestly, I don't think that they would want that anyway, because it would just open up the floodgates. Yeah.
14:52
It could be huge. And then, beyond the training policy, there's the data retention piece. And this is where the picture gets a little more nuanced, because "nothing persists," quote unquote, is not quite the right way to say it. So to be precise, for API users, Anthropic retains input and output data on their infrastructure for seven days after a session. And that brief window exists
15:18
really solely to monitor for malicious abuse if they need to. So detecting cyber attacks, policy violations, or whatnot. It's not for model improvement. And then after seven days, it's deleted. Right. So really the accurate statement is the model has zero memory of your session the moment it ends, right? The data sits briefly on Anthropic's infrastructure for abuse monitoring, and then it's gone.
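The retention timeline can be made concrete with a small sketch. The figures are the ones quoted in this episode (seven days for the API, 30 days for the web interface once model training is disabled, and zero data retention for qualifying enterprise agreements); treat them as assumptions to verify against Anthropic's current terms. The `RETENTION_DAYS` table, the channel names, and the `deletion_deadline` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Retention windows in days, as quoted in this episode. These are assumptions
# to check against Anthropic's current terms, not authoritative values.
RETENTION_DAYS = {
    "api": 7,
    "web_training_disabled": 30,
    "zero_data_retention": 0,  # simplification: nothing is written to disk
}


def deletion_deadline(channel: str, session_end: datetime) -> datetime:
    """Latest time the session data should still exist for a given channel."""
    return session_end + timedelta(days=RETENTION_DAYS[channel])


# Made-up session end time for illustration.
session_end = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
for channel in RETENTION_DAYS:
    print(channel, deletion_deadline(channel, session_end).date())
```

None of these windows says anything about training. In the hosts' framing, retention for abuse monitoring and use for model improvement are separate questions.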
15:47
There are two separate timelines, the way I look at it, and both of them are short and purposeful. Neither one involves your intellectual property being used to train anything, based on what we've seen here. Yep. Yeah. And then for enterprise customers who need even stronger guarantees, there's zero data retention, or ZDR. So for qualifying large-scale enterprise customers,
16:13
Anthropic offers agreements where data is never written to disk at all. So it's processed entirely in memory, and it's gone the moment the response is delivered. So if you're operating in a highly regulated industry or handling genuinely sensitive IP at scale, that's a tier that you should probably be looking into. Right. So strong default protections and even stronger enterprise options are available. Yeah. So
16:42
Let's walk through the actual exceptions to balance it out: the specific scenarios where data can end up being used for training. Some of these are not as obvious to many users, and getting caught up in one of them accidentally is a real risk for most teams. Yeah. There are essentially three. So first is explicit opt-in. So if someone on your team accepts an invitation to join a developer partner program or agrees to share data as part of a specialized beta,
17:12
arrangement, that may be granting permissions for that data to be used in training. Those types of programs can offer real value, but that trade-off needs to be a conscious, documented decision, not something a developer just clicks through to get early access to a newer, pending feature. Yeah. Good point. And from a governance standpoint, who at your organization has authority to...
17:39
opt into partner programs that involve data sharing? That answer should exist well before you receive the invitation, not after. It's a short policy conversation that I think really closes a meaningful gap in your business, and something that you need to be mindful of and probably have to raise as a topic as you build out your center of excellence for AI. Yeah, definitely. And then second, and this one typically surprises people.
18:09
the feedback buttons. So the thumbs-up and thumbs-down interactions in API consoles or developer dashboards. Clicking on those in response to a proprietary prompt grants explicit permission for that specific transcript to be submitted for review. So we're clicking on the up and down thumbs. The source documentation describes it as manually submitting feedback to report bugs. It feels like you're just, you know, rating a response's quality.
18:39
But what you're actually doing is surfacing that conversation directly to Anthropic's team. Right. So simple guidance here: for anything involving proprietary workflows, don't use those buttons. Just don't use them. It's not like you're getting points. There's no gamification here, I don't think, yet. Exactly. You know, and your product development team doesn't depend on them. I don't think they do. And it's not a risk worth taking. And,
19:08
I'll take us to the third, which is probably the most common pitfall for fast-moving teams. And that's using the Claude AI web interface for development and prototyping. So unlike the API, with the consumer product, Claude Pro, session data may be used to improve models unless you explicitly navigate to privacy settings and disable model training. So if developers on your team are iterating on system
19:35
prompts or testing agent logic in the browser without that setting turned off, your data does not have the same protections as your API traffic does. So that's a really important piece. It's not being lazy. It's just something to double check. Yep. Yeah, that's huge. And also importantly, when model training is disabled in the web interface, data is retained for 30 days before deletion. So that's a longer window than the seven days on the API side.
20:05
It only applies once you've toggled that setting to off. So the 30-day figure is conditional on the team member actually having changed their privacy settings. It's not a universal default for all web interface users. So again, that's why it needs to be part of team onboarding and not something that you just forget about or leave to chance. Right. Right. So maybe we...
20:34
change gears and list out our four things to act on that we came up with before the show, Scott, and maybe you kick it off. Yeah, sure. So, number one: all serious development should go through the API. So your prompt testing, agent chaining, and system prompt iteration should go through the Anthropic API or through enterprise cloud platforms like AWS Bedrock
21:02
or Google Cloud Vertex AI, which Anthropic explicitly recommends for commercial development. So the API is where the zero-training protections apply by default. The web chat interface is not that environment, as we just mentioned. Yeah. And treat this as a team policy, not just a personal practice. Every developer building on Claude needs to understand this distinction that you just listed out, Scott.
21:28
That's a short onboarding conversation, or maybe even something that goes into your LMS, you know, for new-hire training, or even just annual training. And that helps remove a significant category of risk. It's the single most important structural decision your team can make here for reducing that risk. Yeah. I'll just take number two again. So audit your team's browser habits, like, today. So find out whether developers who are
21:56
prototyping in Claude have disabled model training in their privacy settings. If they haven't, have them do it now. It takes 30 seconds and it's obviously an important protection. Yeah. Good point. And I'd say number three, establish a clear policy on the feedback buttons we just talked about. So you're going to keep them or not. It doesn't need to be complicated. Just: we don't submit thumbs up or thumbs down on prompts that touch proprietary workflows. For example, that's the whole policy, and that's what it could be.
22:25
The people on your team who need to know this, you know, thumbs up, thumbs down, are probably a short message away, a quick Slack or something like that. So again, keep it in mind, because that is a little nuanced component of the feedback loop. Yeah. And constant reminders until it's just a habit. Yeah. And then the fourth one was: create a decision process for partner program invitations before one arrives.
22:55
So when your team receives any early access offer that involves data sharing, route it to whoever owns IP policy, not whoever received the email. Early access can be genuinely valuable, but it should be a deliberate decision with eyes wide open and not an accidental opt-in to some partner program. Right. Good point. So for the listeners, those are the four actions we just went through, which was API-first development,
23:24
audit browser settings, no feedback buttons on sensitive prompts, clear partner program gate. Those are the four moves that address the actual risk surface here. None of them are technically complex. They just require awareness and team alignment and probably get back into, you know, modifying that privacy engineering team that you have. All right. Yeah. Let's just talk a little bit about the moat, protecting your moat. And I think.
23:54
I think this is a strategic question that's always worth chewing on. So even if we imagine the most pessimistic scenario, your system prompt somehow made it into a future training run despite all the protections you described, how much would that actually cost you competitively? That's worth thinking through. Yeah. And this is something I've thought a lot about, both as someone who builds these systems and someone who advises teams on AI product strategy. My honest view is that
24:24
for most well-built AI products, the prompt is not where the durable competitive advantage lives. What's harder to replicate is the market insight that told you which problem to solve, the customer relationships that gave you feedback. Nobody else has that. The proprietary data your users generate inside your platform over time. Your team's speed of execution. A prompt in isolation,
24:52
can be reconstructed by a determined competitor through experimentation often faster than people assume. But don't forget what you have in your back pocket that helped you get there. Yeah, no, I just had this conversation with somebody. People are developing all these things. You might think you have something unique in the marketplace, but is somebody already working on it? They could be. So, but from a technical standpoint, the AI products that I see building real defensibility are the ones that are
25:21
creating data flywheels. And we've talked about those in previous episodes. That's where every user interaction generates the signal that improves the product, and that data belongs to them, to your organization alone. And then that compounds in a way that a prompt never will. A competitor starting from scratch doesn't have two years of your users' behavioral data. That asymmetry is really the moat. And I'd add.
25:49
Misplaced worry about Claude learning your architecture can actually distract you from the IP risks that are genuinely real. 100%. Yeah. Key people leaving with product knowledge. That's a big one. And that's not new to AI. That's been there for years. Competitors using your product and reverse engineering the experience. That's a concern. Insufficient internal access controls over who can see your prompt infrastructure. That's a concern.
26:18
And those are the vectors worth spending real security budgets on, the ones you really should be worried about versus, like I said, the misplaced worry about Claude learning your architecture. Yeah. And I think the overall takeaway isn't "don't think about IP protection." That's not it. It's: understand what the actual attack surface is, apply the right protections to the right risks, and invest in building the advantages that compound over time.
26:48
Compounding is the key. The data flywheel. Good point. Well, hopefully this episode helped clarify something you or your team have been wrestling with. If so, share it with whoever might be asking the question. Please continue to pass our podcast along. If you have any questions, please continue to send them in. We've been interweaving them into some of the episodes, and feel free to connect with both Scott and myself on LinkedIn. Thanks for listening and we'll see you next time.