Navigating AI & Data Privacy in Marketing

Vera Shafiq Podcast

Real and relevant discussions on business, marketing, technology and digital. Vera Shafiq talks with a diverse community of thinkers and doers, including industry thought leaders and grassroots professionals who are striving for the same thing - doing marketing the RIGHT way and being PROUD of the work they do. Now with a focus on franchise marketing!

Navigating AI & Data Privacy in Marketing

January 28, 2025 • Vera Shafiq • Season 3 • Episode 2

In this episode, host Vera Shafiq delves into the critical issue of using AI tools in marketing while safeguarding proprietary and sensitive data. Highlighting popular AI tools like Copilot, ChatGPT, Gemini, and Perplexity, Vera addresses the concerns surrounding data privacy and security. The episode also explores DeepSeek, a new AI model from China, discussing its potential and inherent risks. Vera shares best practices for utilizing AI tools safely, emphasizing the importance of anonymizing data, and introducing solutions like private AI models and enterprise AI systems with strict data governance controls. The episode concludes with a look into synthetic data as a method to train AI models without compromising privacy.

00:00 Introduction to the Podcast

00:29 The Importance of AI and Data Privacy

01:51 DeepSeek: The New AI Model from China

05:39 AI Data Landscape and Best Practices

07:00 Using AI Tools Safely

11:38 Advanced AI Solutions for Enterprises

15:01 Conclusion and Key Takeaways

Hey everyone. Welcome to the podcast. This is the podcast where we have real and relevant discussions on business and marketing in the franchise space and talk all things marketing and AI. And I'm your host, Vera Shafiq.

Today we're diving into a subject that's top of mind for a lot of marketing professionals and CMOs: how do we use AI tools safely while being mindful of our proprietary and sensitive data? As we know, we've all been using AI tools such as Copilot, ChatGPT, Gemini, Perplexity, and the list goes on and on. They've been giving us the opportunity to drive efficiency, insight, and research, and to do our jobs a lot better. But the nagging question at the back of a lot of our minds is: what about data privacy? What about when I start to share my company's proprietary data with the AI models? For example, I'm uploading marketing performance data, or some graphs or charts, even some competitive stuff. What is safe to upload, and what should I be worried about? These are the kinds of questions that I'm hearing and that I myself have had. Researching and really understanding how these platforms work has given me much better clarity on what I should and should not be using AI for.

On top of that, very topical this week is the subject of DeepSeek, the new AI model that came out of China and almost seemed to come out of nowhere. It's a Chinese artificial intelligence startup that recently gained significant attention for its open source AI models. Of particular interest is a model named R1, which is supposed to rival AI systems such as OpenAI's ChatGPT, offering comparable performance in areas such as reasoning, mathematics, and coding. DeepSeek has achieved these advancements with significantly lower investment and development costs.
It's been widely reported in the media that this company spent 5 million dollars developing its AI model, whereas OpenAI spent a hundred million dollars to do something of the same caliber. So questions around the security of DeepSeek also come up. We wonder how safe it is to use DeepSeek if we want to test it out, especially if we're using proprietary data. So today we'll break down exactly where and how you should be sharing your proprietary data and where you should exercise caution.

So let's start with DeepSeek, since it's very topical right now. It's the talk of the town. DeepSeek has focused on foundational AI technologies and committed to open-sourcing all of its models. Regarding safety, there are several considerations.

Censorship and bias is one of them. This is all over the internet if you read about the experiences people have had testing it out, and I also tested it out and noticed similar attributes: the DeepSeek models have been observed to exhibit censorship, particularly on topics sensitive to the Chinese government. For instance, some of the examples I've heard about are that the AI may refuse to discuss subjects like the Tiananmen Square protests or human rights in China. So it is censored.

Then there are the security concerns. I was reading an article in the Guardian that says the company recently faced a significant cyber attack, leading to a temporary suspension of new user registrations. So while DeepSeek has already addressed the issue, it does highlight that there are potential security vulnerabilities with DeepSeek.

And then there's data privacy. As it's a Chinese company, and we all know what is going on with TikTok and the whole fear of the Chinese holding our data somehow, there is the fear that DeepSeek may be subject to local data regulations, which could raise concerns about data privacy and potential government access to our information.

So I'm not going to sway one way or the other, but personally I'm going to remain wary of DeepSeek's content limitations, recent security incidents, and the data privacy considerations. Right now, for me, I don't feel the need to use DeepSeek. I am sure that might change in the future, but right now I'm going to stick with the foundational models that are presented to us through OpenAI and Anthropic, and we have Meta's models and Google's models. We have models coming out the yin-yang. So I'm going to stay away from DeepSeek for now. That's my personal choice, but again, these are the considerations that we do need to start to think about as these AI models start to come out.

So let's talk about the AI data landscape for a little bit before we get into best practices. Let's talk about how AI tools interact with your data. Many of the publicly available AI models, including ChatGPT and others, collect and store user inputs to improve their model performance and for training purposes. If you look at the privacy statements for all of these models, you'll see that they state that very clearly. The ChatGPT Plus subscription has a toggle where you can toggle off using your data for training, so that's something that you can do. For franchise corporations specifically, the question of whether this data is being used and how it's being used can raise significant concerns around things like confidential brand strategy and marketing data. We could be using competitive positioning and pricing models in our queries and our prompts. Then there's the topic of localized, franchisee-specific data. And then the big one, which is CRM data and PII.
Many brands in the franchise industry have been highly scrutinized in verticals such as health and finance, and they are certainly on the front lines when it comes to being accountable and highly responsible for their customer data. So let's talk a little bit about how to use these tools and where we should hold back and maybe think about using a different solution.

Tools such as ChatGPT, Perplexity, Claude, and Gemini are great for handling publicly available insights. If we want to do research on industry trends, general marketing strategies, or market research, this is all publicly accessible data that can be shared with the models without risk. We can go back and forth with the models on all of this stuff that is already publicly accessible.

The same thing applies to anonymized data sets. If we want to feed the models customer data or operational data, we can use anonymization techniques to anonymize the data. Definitely make sure we're removing all PII: no emails, phone numbers, addresses, names, all of that good stuff. We can use techniques like data masking or tokenization, or we can aggregate data so that it's not individual-level data, to remove that PII and use anonymized data sets. That's one way you can upload your data to things like ChatGPT without worrying about it.

Another thing you can do: when you're looking for market research, competitive analysis, industry shifts, things of that nature, you can definitely use ChatGPT for that, and for audience behaviors. Just make sure that you're not inputting proprietary insights when crafting your prompts. Stay away from very highly sensitive, proprietary, secret-sauce type of stuff.

And then finally, content creation and optimization. This is a big one. We know we can do this all day long, till the cows come home, with no worries, because we can use ChatGPT and Copilot to generate copy, creative, and SEO strategies. This is all above board. Obviously we need to be careful in terms of copyright, plagiarism, and things like that. But at the end of the day, the copy and creative that AI generates is essentially unique, and we can use it and not be worried about getting into any trouble there. There is an asterisk behind that: we do need to be careful with creative, especially when we're telling it to write things in the style of someone else, or in the style of a photographer or an artist. But for general purposes, marketing content creation and optimization is fine to use ChatGPT or Claude, et cetera, for.

So where do we need to exercise caution? Well, I would say, back to that proprietary marketing and brand data: any internal marketing strategies, proprietary budgets, performance data, or anything that includes passwords to any of your systems should definitely not be shared with AI models such as ChatGPT, because that does not ever guarantee full data privacy. Just be careful about explicitly going in and giving ChatGPT carte blanche with everything you have that is proprietary. As I mentioned earlier, you need to clean that data, aggregate that data, or anonymize that data before uploading it.

Customer data and CRM insights. We talked about this. Never input raw customer data into AI tools unless you're using an enterprise AI solution with strict privacy controls, and we'll talk about that in a little bit.

Localized franchisee-level insights. When we're talking about individual franchise location data, whether that be marketing performance, financials, or anything proprietary, and again, anything that's a secret sauce, it should be secured within internal systems and not be exposed to public AI tools.
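The advice to clean, aggregate, or anonymize data before uploading it can be sketched in a few lines of Python. This is a minimal illustration, not a complete PII scrubber: the field names, sample rows, regular expressions, and `SECRET_KEY` are all assumptions made for the example, and simple patterns like these will not catch names or addresses in free text. Note also that keyed-hash tokenization is pseudonymization, so the key itself has to stay private.

```python
import hashlib
import hmac
import re
from collections import defaultdict

# Hypothetical secret for tokenization; in practice it would be stored securely.
SECRET_KEY = b"replace-with-a-real-secret"

# Deliberately simple patterns (assumptions); real PII detection needs more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_text(text: str) -> str:
    """Data masking: replace emails and phone numbers found in free text."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def tokenize(value: str) -> str:
    """Tokenization: swap an identifier for a stable keyed hash."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return "tok_" + digest.hexdigest()[:12]


def aggregate_by_location(rows):
    """Aggregation: order count and total spend per location, no individuals."""
    totals = defaultdict(lambda: {"orders": 0, "spend": 0.0})
    for row in rows:
        bucket = totals[row["location"]]
        bucket["orders"] += 1
        bucket["spend"] += row["spend"]
    return dict(totals)


# Made-up sample rows standing in for a CRM export.
rows = [
    {"email": "ana@example.com", "location": "Austin", "spend": 40.0},
    {"email": "bo@example.com", "location": "Austin", "spend": 60.0},
    {"email": "cy@example.com", "location": "Denver", "spend": 25.0},
]

# Row-level data with the email replaced by a token.
safe_rows = [
    {"customer": tokenize(r["email"]), "location": r["location"], "spend": r["spend"]}
    for r in rows
]

# Location-level summary with no per-customer rows at all.
summary = aggregate_by_location(rows)

# Free-text note with contact details masked (the name is NOT caught).
note = mask_text("Call Ana at 512-555-0100 or ana@example.com")
```

Of the three outputs, `summary` is the safest to share because it contains no individual-level rows; `safe_rows` and `note` still deserve a manual review before they go anywhere near a public tool.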
And then finally, I would say predictive modeling for business growth. If we want to create our own proprietary predictive models, that's fine, but we need to build those on internal secured frameworks, not on public AI platforms, because obviously that's a secret sauce, right? We're trying to create a very custom predictive model for our business and to grow our business. So we don't really want to share that with a ChatGPT or a model whose vendor is potentially using that data for training.

So, if you are looking to leverage AI at scale without compromising your proprietary data, there are some solutions. I would say the most advanced solution is using private or on-prem (that is, on-premises) AI models. This is probably the most expensive and involved solution. Instead of using publicly available AI, some franchise corporations are investing in private AI models deployed on premises or within secure cloud environments. Some of these types of platforms include IBM Watson for enterprise AI and Google Vertex AI. These are setups where the models run in your own environment rather than on a public tool, and you are actually creating your own proprietary models within your own company's four walls or within a secure cloud environment.

The second level would be an enterprise AI system that has data governance controls. This includes things like Microsoft Copilot, Salesforce Einstein, and Adobe Sensei. These are enterprise-grade AI with strict data governance, and they ensure that your proprietary data remains confidential. For example, Microsoft Copilot enterprise stays within the Microsoft 365 suite. So all of your data, which sits in Microsoft 365, stays within the four walls of your enterprise, and there's no leakage of that data outside into the public domain.

Then there is AI-powered marketing automation.
There are platforms for that right now that have their own AI models, which are also contained within your own data sets so that you don't have to worry about any data leakage. Things like HubSpot, Marketo, and Sprinklr are now incorporating AI-powered marketing automation that keeps your proprietary data within secured systems. So you could be free and safe to use those.

And then finally, something which is a little bit different: using synthetic data for AI training. I think synthetic data is something that we're going to start hearing a lot more about. It's something we can use to artificially generate data sets that mimic the real data, but without exposing actual business insights. This really ensures security while maintaining the model accuracy. For example, say I want to use a set of data for my franchise to predict customer purchasing behavior. Due to privacy concerns, using actual customer transactional data is not feasible. So instead, we can generate synthetic data that reflects the patterns and trends of the original data set, and then the synthetic data can be used to train the AI model. Therefore, the predictions that come out of the AI model will be accurate, but they will not compromise our customer privacy. So I think synthetic data is another option for us to get over that hurdle of sharing our actual proprietary data with a public AI model such as ChatGPT. Definitely look into synthetic data if that's something that you are interested in doing.

So yes, we really summed up a lot of different things here, but I think the key takeaway is: make sure you do your research, and know that it's not safe to share very proprietary and sensitive data with things like ChatGPT. I think a lot of us are able to use these platforms quite successfully without having to share proprietary data.
We definitely want to make sure we're not including company names, people's names, or anything like that when sharing insights with these platforms. But I think the next level of AI implementation is really to get onto one of these enterprise-level platforms, start to develop and create your own data sets, and get a little bit more sophisticated and advanced with your use of AI.

Well, that's it for today's episode. Thank you for listening. If you enjoyed what you heard, please feel free to give me a review and tune in to next week's episode. Have a great week, everyone.
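The synthetic data idea from the episode, generating records that reflect the patterns of the original data set without copying real transactions, can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the sample transactions are made up, and fitting one normal distribution per category is a deliberately simple stand-in. Production synthetic data tools model relationships between fields and add privacy checks that this sketch does not attempt.

```python
import random
import statistics

# Made-up "real" transactions standing in for data that cannot be shared.
real = [
    {"category": "coffee", "amount": 4.50},
    {"category": "coffee", "amount": 5.25},
    {"category": "coffee", "amount": 3.75},
    {"category": "coffee", "amount": 4.00},
    {"category": "pastry", "amount": 3.00},
    {"category": "pastry", "amount": 3.50},
]


def fit(transactions):
    """Learn per-category frequency, mean, and spread from the real data."""
    by_category = {}
    for t in transactions:
        by_category.setdefault(t["category"], []).append(t["amount"])
    counts = {c: len(v) for c, v in by_category.items()}
    stats = {
        c: (statistics.mean(v), statistics.stdev(v) if len(v) > 1 else 0.0)
        for c, v in by_category.items()
    }
    return counts, stats


def sample(counts, stats, n, seed=0):
    """Draw n synthetic transactions that follow the fitted patterns."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    categories = list(counts)
    weights = [counts[c] for c in categories]
    synthetic = []
    for _ in range(n):
        category = rng.choices(categories, weights=weights, k=1)[0]
        mean, spread = stats[category]
        amount = max(0.01, rng.gauss(mean, spread))  # keep amounts positive
        synthetic.append({"category": category, "amount": round(amount, 2)})
    return synthetic


counts, stats = fit(real)
synthetic = sample(counts, stats, n=1000)
coffee_share = sum(t["category"] == "coffee" for t in synthetic) / len(synthetic)
```

The synthetic rows keep roughly the same category mix and spend levels as the originals, which is what lets a model trained on them learn similar patterns. Whether the result is private enough still depends on how closely the generator can reproduce rare, identifying records.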