Streaming Audio: Apache Kafka® & Real-Time Data

Apache Kafka Networking with Confluent Cloud

July 28, 2022 Confluent, founded by the original creators of Apache Kafka® Season 1 Episode 226

Setting up reliable cloud networking for your Apache Kafka® infrastructure can be complex. There are many factors to consider: cost, security, scalability, and availability. With immense experience building cloud-native Kafka solutions on Confluent Cloud, Justin Lee (Principal Solutions Engineer, Enterprise Solutions Engineering, Confluent) and Dennis Wittekind (Customer Success Technical Architect, Customer Success Engineering, Confluent) talk about the different networking options on Confluent Cloud, including AWS Transit Gateway and AWS and Azure Private Link, and discuss when and why you might choose one over the other.

In order to build a secure cloud-native Kafka network, you need to consider information security and compliance requirements. These requirements may vary depending on your industry, location, and regulatory environment. For example, in financial organizations, transaction data or personally identifiable information (PII) may not be allowed to be accessible over the internet. In this case, your network architecture may require private networking, which means you have to choose between private endpoints and a peering connection between your infrastructure and your Kafka clusters in the cloud.

What are the differences between different networking solutions? Dennis and Justin talk about some of the benefits and drawbacks of different network architectures. For example, Transit Gateways offered by AWS are often a good fit for organizations with large, disparate network architectures, while Private Link is sometimes preferred for its security benefits. We also discuss the management overhead involved in administering different network architectures.

Dennis and Justin also highlight their recently launched course on Confluent Developer, the Confluent Cloud Networking course. This hands-on course covers basic networking and cloud computing concepts that will help you get a clearer picture of the configurations and collaborate with your networking teams.

Kris Jenkins: (00:00)
In this week's Streaming Audio, we're talking to Justin Lee and Dennis Wittekind about cloud networking, specifically cloud networking for Kafka. It's one of those topics where you'd like it to be straightforward. Here's the URL, off you go. But if you've ever worked with a company that's large enough to have its own networking team, then you'll know it can often be a lot more complicated than that. There are often restrictions on what you can or can't do, policies you need to comply with. And when that kind of stuff comes up, you really need to know what your networking options are. That's something that Dennis and Justin have been dealing with for a long time for a lot of different companies. So they recently filmed a course all about it, and I wanted to get them in, get the cheat sheet from them, figure out what are the options, what strategies work for which kinds of company, and maybe get some soft tips from them on how to get everyone to agree on the right solution. Before we get started, this podcast is brought to you by Confluent Developer, which is our Kafka tutorial site. Head to developer.confluent.io for Justin and Dennis's course, and lots more courses besides. But before you do, I'm your host, Kris Jenkins. This is Streaming Audio. Let's get into it.

Kris Jenkins: (01:20)
My guests today are Justin Lee and Dennis Wittekind. Hi gentlemen.

Justin Lee: (01:24)
Hello? How's it going?

Kris Jenkins: (01:26)
Good to have you here. Now let me see if I've got this right. Justin, you're a solutions engineer.

Justin Lee: (01:32)
That's correct.

Kris Jenkins: (01:33)
That means fixing people's problems generally, right?

Justin Lee: (01:37)
Yeah, sort of right. Part of it's helping customers get up and started [inaudible 00:01:41] technically and then Dennis does more fixing the problems.

Kris Jenkins: (01:43)
Yeah. And, Dennis, you are, I love this one. You're a customer success technical architect.

Dennis Wittekind: (01:50)
You nailed it. There it is.

Kris Jenkins: (01:51)
Which is a CSTA, and I'm told some people pronounce that "siesta."

Dennis Wittekind: (01:55)
Correct.

Kris Jenkins: (01:56)
Which I love, that's very casual.

Dennis Wittekind: (01:58)
Our goal is to make customers be able to take a nap. Right.

Kris Jenkins: (02:03)
And not because you're boring.

Dennis Wittekind: (02:04)
No, no. That's all their problems and they don't have to stay up all night and they're-

Kris Jenkins: (02:09)
Stay relaxed. Cool. So we brought you in to talk about networking, because you've just released a course, like a nitty-gritty, nuts and bolts course on the cloud networking. Justin, I think you were the face for much of that. All of that.

Justin Lee: (02:24)
Yeah. I wrote some of the content and then I recorded the actual lecture components, and then Dennis did the labs and the demos. And you can talk a little bit about that.

Kris Jenkins: (02:34)
Oh, okay. You juggle it that way. Yeah. How was that? How was the filming?

Justin Lee: (02:39)
It was interesting. I mean, it was actually my first time out at headquarters in California. So that was nice. And we did the whole camera thing and the screen thing. So it was an interesting experience. It's a little bit more stressful than I expected.

Kris Jenkins: (02:52)
It's more that we don't often get that kind of Hollywood moment in the programming industry. Must have been fun.

Justin Lee: (02:58)
Yeah, it was good.

Kris Jenkins: (03:00)
So the topic that we're talking about today is cloud networking. I'm going to start with a naive question. You can tell me why I'm wrong. Right. Cloud networking, it's all in the cloud. So you just open network connection and that's it. Right. Life is easy. Why would there be any problems with cloud networking?

Justin Lee: (03:19)
Yeah. You just explained the whole podcast. We don't need the podcast, or we don't need the course anymore. No. Yeah. Getting networking up and running can be pretty easy to do. Right. We have the shared, secure public endpoints, which is what we call them in the course. But basically, Kafka clusters that are available publicly should be the default option for you. But depending on who the customer is, and depending on what their InfoSec and security and compliance requirements are, they may have a number of things that make that a little bit or a lot more complex. Right. And that's what we're looking to solve with the course.

Kris Jenkins: (03:53)
Okay. Well give me a first example. What's the first thing that makes it more complex?

Justin Lee: (03:58)
We work with a lot of customers that have InfoSec or compliance requirements where their data can't transit the internet. Right. Functionally they say, okay, because we're handling financial transaction data or PII, PHI, or something along those lines, they're not allowed to have their data accessible over the internet or even transit the internet, which means we need some form of private networking option between where their Kafka clients are running and the Kafka cluster that we're running for them in Confluent Cloud. Right. And so that requires some additional configuration, requires some additional forethought around how you're actually building the architecture. And then it requires configuring everything, which is where it gets a little bit more complex.

Kris Jenkins: (04:41)
Is that like, so are they saying you basically got to be co-located in the same building or has it got to be private connection between separate data centers?

Justin Lee: (04:50)
It depends on the organization. Right. At its core, the physical location of it doesn't traditionally or often matter as much as the network security. Right. So while under the hood they may be running at the same data center or set of data centers, what really matters more is that it's transiting over a private network connection between different clouds or between different networks or whatever the case may be in your environment.

Dennis Wittekind: (05:17)
Yeah. And sometimes, yeah. Sometimes it comes down to even like firewall rules, right? When you think about it, right? Like it's much easier to write a firewall rule for like three IPs that are inside of the same IP address space as your existing network, as opposed to having to basically open up your clients to an entire CSP (cloud service provider) IP range, or to allow egress traffic to the internet from your internal applications. Right.

Kris Jenkins: (05:44)
Yeah. Cause you often find that cloud providers, they're expecting you to be pretty flexible on which IP addresses you're actually going to connect to. Right.

Dennis Wittekind: (05:54)
Yeah. Usually, IPs in cloud providers are generally ephemeral, right. It actually costs additional money to have kind of a static IP address most times in cloud providers, especially a public-facing one. So yeah, cloud providers kind of operate on this idea that the hostname's relatively static, but the IPs are ephemeral and can change at any time. So that complicates things, especially for legacy organizations. Maybe they have some on-premise workloads, and they have to open up firewalls that only allow for IP-address-based rules out to cloud applications.
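To make the firewall point concrete, here is a minimal Python sketch, assuming nothing beyond the standard library. The function name `blocked_by_ip_rules` and every address in it are hypothetical (private and documentation ranges), not values from Confluent Cloud or any real network. It contrasts a rule written for a few private endpoint IPs inside your own address space with a rule written against a public hostname's IPs, which may later change.

```python
import ipaddress


def blocked_by_ip_rules(resolved_ips, allowed_cidrs):
    """Return the resolved IPs that an IP-based allowlist would NOT permit."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    return [
        ip for ip in resolved_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# All addresses below are hypothetical illustrations.
# Case 1: private endpoints landed inside your own address space.
private_rule = ["10.0.1.0/28"]
private_endpoints = ["10.0.1.4", "10.0.1.5", "10.0.1.6"]

# Case 2: a public hostname whose IPs changed after the rule was written.
public_rule = ["203.0.113.10/32", "203.0.113.11/32"]
resolved_today = ["203.0.113.10", "203.0.113.11"]
resolved_later = ["198.51.100.7", "203.0.113.11"]

if __name__ == "__main__":
    print(blocked_by_ip_rules(private_endpoints, private_rule))
    print(blocked_by_ip_rules(resolved_today, public_rule))
    print(blocked_by_ip_rules(resolved_later, public_rule))
```

Only the last call reports a blocked address: the hostname stayed the same, but one IP behind it moved, so the IP-based rule silently stopped matching.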

Kris Jenkins: (06:29)
Yeah. So I'm guessing we are talking about often like companies like banks, right. Those would be the classic, we have special rules and they're not negotiable. Have you worked with any banks directly on this and how's it going?

Justin Lee: (06:45)
Dennis, you want to go first?

Kris Jenkins: (06:46)
Dennis?

Dennis Wittekind: (06:48)
Yeah. So yeah, so generally like in the financial services industry, there's very specific compliance requirements, either from an internal organizational level or just from a government entity. We want to make sure that everybody's money is safe, right? So when working inside of a cloud at a bank, you have to be very prescriptive on where traffic is entering from and where it's leaving to. And so usually that necessitates having some sort of private networking, whether it be a direct connection or peering. Generally, it's more of a unidirectional connection. So that's why banks normally prefer something like a Private Link connection, where they can only egress out to Confluent Cloud, but Confluent Cloud and other CSP products can't reach back into their networks.

Kris Jenkins: (07:46)
Right. So your, hang on, how is this actually working? There you are, you're running a cloud service. Are you having to set up on the side VPNs between certain banks or what's going on under the hood?

Dennis Wittekind: (08:01)
Yeah. Kind of under the hood, it kind of works like a VPN. You effectively create a logical private network connection. You land endpoints inside of, in this case, the bank's VPC, and then you allocate some IP address space to those endpoints. And then all of the services and applications inside of the bank's network can connect to this external service just by hitting those IP address endpoints that exist inside of their network.

Kris Jenkins: (08:32)
Yeah. Okay. Is that usually a technical problem or kind of negotiating with the various teams and the bank problem?

Justin Lee: (08:42)
Usually from what I've seen, it's usually the various teams within the bank that say you have to do this, right. Their requirement from an InfoSec perspective is that you have to use Private Link or the equivalent service in whatever the cloud provider is. And then once they've made that determination, then it's a technical problem or a technical action item to configure that. Right. And one of the nice things with Confluent Cloud is that we provide those capabilities self-service, so that the customer can effectively set it up entirely just by interacting with our UI and API and configuring everything, versus having to talk with us and work with us to do it. So we provide a lot of the mechanisms so that customers can set it up on their own.

Kris Jenkins: (09:22)
Okay. Because I've never actually gone looking for these kinds of features in Confluent Cloud myself not being a large bank, but it is self-service?

Justin Lee: (09:31)
Yes, most. Most everything is self-service nowadays.

Kris Jenkins: (09:35)
So where do you come in? What are the exceptions? Cause I know sometimes you actually get your hands dirty on this. What's the...

Justin Lee: (09:43)
So I mean, networking is a fairly complex problem, right? So getting the basic functional, get the cluster up and running, get the network up and running, connect it to my infrastructure. That part's pretty straightforward. Right? The parts that become a little bit more complex are where you want to do things like multi-region or multi-cloud, right? So for example, one of my commercial customers a year back or so said, we're going to run part of our infrastructure in Google and we're going to run the rest of it in Azure. And we had to figure out how to connect those while still maintaining those private networking capabilities or still meeting those private networking requirements.

Kris Jenkins: (10:21)
Right. What, give me some details. What was that like?

Justin Lee: (10:27)
So in our case, we were lucky because the customer already had a third-party vendor that they were using to connect the clouds, right? So they had a third-party vendor that connected to Azure, and the same third-party vendor connected to Google. And we were able to set up and basically piggyback on top of their existing connection between the clouds, right? So they say, some of our clients are running in Azure. Some of our clients are running in Google. We have a Confluent Cloud cluster running in each one.

Justin Lee: (10:56)
They wanted to replicate data between them, so that if one of the clouds went down, they would continue to maintain their business, right. The business continues to run. And so we architected some self-managed components, primarily Confluent Replicator, that would run in their environment and could connect to both ends, right. And if you go into the course, we talk a little bit about the differences between Private Link and peering and so forth. And one of the requirements that we had is, if you're doing a peering connection, you can't transit through an intermediate network to get to a third network, right. So it's kind of hard to diagram with my hands, but you have network A, network B, network C. If A is connected to B, and B is connected to C, then A can't talk to C directly.

Kris Jenkins: (11:40)
Oh, okay.

Justin Lee: (11:41)
So we had to set up things like an [inaudible 00:11:44] HAProxy or an [inaudible 00:11:45] instance in this case. And we have some documentation about how to set that up as well.
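The non-transitive peering rule, and the proxy workaround Justin mentions, can be sketched as a small reachability check. This is an illustrative model only, assuming that a proxy terminates the incoming connection and opens a new one over the next peering link; the function `can_reach` and the network names A, B, and C are hypothetical.

```python
def can_reach(src, dst, peerings, proxies=frozenset()):
    """Return True if src can open a connection that ends up at dst.

    peerings: set of frozenset({a, b}) direct peering links.
    proxies:  networks that host a proxy relaying traffic onward.

    Peering is non-transitive: traffic crosses exactly one peering link,
    unless it lands on a proxy, which starts a fresh connection from there.
    """
    if src == dst:
        return True
    seen = {src}
    frontier = [src]
    while frontier:
        current = frontier.pop()
        for link in peerings:
            if current not in link:
                continue
            (other,) = link - {current}
            if other == dst:
                return True
            # Only a network running a proxy can forward traffic further.
            if other in proxies and other not in seen:
                seen.add(other)
                frontier.append(other)
    return False


# Hypothetical topology: A is peered with B, and B is peered with C.
PEERINGS = {frozenset({"A", "B"}), frozenset({"B", "C"})}

if __name__ == "__main__":
    print(can_reach("A", "B", PEERINGS))
    print(can_reach("A", "C", PEERINGS))
    print(can_reach("A", "C", PEERINGS, proxies={"B"}))
```

Without a proxy, A reaches B but not C. Placing a proxy in B is what lets a client in A get to C, which matches the shape of the workaround described in the conversation.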

Kris Jenkins: (11:52)
Okay. So, and you cover things like that in the course, you go into the details of NGINX and that stuff.

Dennis Wittekind: (12:00)
Yeah. So in the hands-on exercises, we actually use NGINX for a slightly different purpose, primarily to allow UI access. So one of the things when you're dealing with private networking is that the actual UI that you use to interact with Confluent Cloud needs to access that cluster as well. Obviously there's metadata about the cluster that's available on the internet, but then once you want to start looking at things like topics or reading messages on the topics, right.

Dennis Wittekind: (12:35)
All that data is stored on the cluster. If that was exposed over the public internet through the UI, that would kind of defeat the purpose of the private networking. So, in order to make the UI work, you have to have some sort of proxy in the middle that allows your computer, which may not be on that private network, to access those resources. So in the course, when you create a Private Link cluster or a VPC-peered cluster, we set up an NGINX proxy that allows you to basically forward traffic from your local laptop, which lives on the internet and your home network, through the VPC in AWS and across to Confluent Cloud.

Kris Jenkins: (13:20)
This is sounding a little bit like as well as being a Kafka networking course, a little bit of a how to set up your own VPC course.

Dennis Wittekind: (13:28)
Oh yeah. Yeah. So, that was one of the things that was kind of challenging: what is the easiest and quickest way to kind of create the environment? Because when we are creating these exercises, it almost assumes that you already kind of have all of this enterprise cloud infrastructure set up, with all of your networking already kind of predefined. In the course we had to start from scratch, like assuming you had nothing. So yeah. Yeah. You know, I would say a significant portion of the course is actually maybe education on AWS networking constructs and how to create those resources and [inaudible 00:14:08] elastic IPs and things like that. So yeah, if you're completely green to AWS networking, the course is step by step enough that you should be able to get through it. No problem.

Kris Jenkins: (14:19)
And you go all the way down into details like what's DNS and what are IP ranges. Right. So it's really getting down in the weeds.

Dennis Wittekind: (14:27)
Yeah. Yeah. We create, you create everything from a VPC to peering connections, Private Link connections, you create DNS private hosted zones inside of AWS. So it kind of runs the gamut of those services.

Kris Jenkins: (14:41)
Okay. Yeah. I can see that. And I know another thing you were talking about was like Telco companies, which have a different set of requirements for networking.

Justin Lee: (14:53)
Yeah. One of the customer verticals that I work with is the Telco space, right? So I cover a number of Telcos as a solutions engineer, and they have different requirements than financial services and other businesses. Right? One of the big things is that they have millions of endpoints, right? So, a traditional financial service, they have a bank, they have a data center. A customer like a Telco, they have antennas everywhere. They have towers everywhere. You have to be able to connect hundreds or thousands or tens of thousands or millions of devices that may or may not all live in a data center. And so there's a larger number of integration points. And based on that, depending on who you are and which cloud you're in and things like that, there's different options.

Justin Lee: (15:38)
Right? So for example, AWS supports something called a Transit Gateway, right? It's basically kind of like a network router or cloud router. It can connect to multiple AWS [inaudible 00:15:50] and other network endpoints. And it allows you to say, rather than having point-to-point connections between everything that's running, you have everything connected to a single Transit Gateway or to some set of Transit Gateways. And then that greatly simplifies your architecture. That's one of the Confluent Cloud networking options that we support. Right. So we talk about that in the course as well.
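As a rough illustration of why a hub simplifies things, the following Python sketch compares how many connections you would manage if every network were linked point to point with every other, versus attaching each network once to a single hub. The function names are hypothetical, and the full-mesh figure is a worst-case assumption, since real environments rarely link every pair.

```python
def point_to_point_links(networks: int) -> int:
    """Links needed if every network connects directly to every other one."""
    return networks * (networks - 1) // 2


def hub_attachments(networks: int) -> int:
    """Attachments needed if every network connects once to a central hub."""
    return networks


if __name__ == "__main__":
    for n in (4, 100, 10_000):
        print(n, point_to_point_links(n), hub_attachments(n))
```

The point-to-point count grows with the square of the number of networks, while the hub count grows linearly, which is the scaling argument behind the Telco use case discussed here.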

Kris Jenkins: (16:11)
Okay. That sounds structurally like very similar to the idea of using Kafka as a backbone, right. Rather than having different systems connected to have one core channel, you put everything through. Whether it be this router or a Kafka topic, it's the same conceptual shape, right?

Justin Lee: (16:28)
Yeah. It's one of the things we're seeing, right. More and more businesses are moving more towards like a consolidation pattern, right. So they're going to continue to have the disparate infrastructure everywhere, but with a central place that connects to everything. Whether that's, like you said, the network component, where it's a single cloud router or Transit Gateway that connects to all of your different components, or a single Kafka cluster that acts as the central nervous system of your business. Right. It's the same pattern that we see a lot of businesses moving kind of directionally toward.

Kris Jenkins: (17:00)
Yeah. Yeah. I can see that would be like the tension, the constant swinging pendulum between we move away from monoliths to microservices. And now we need to find a way to move back to some of the advantages of having a single place to talk. Right.

Justin Lee: (17:13)
Yeah, exactly.

Dennis Wittekind: (17:15)
Yeah. And I'll also add the distinction of Transit Gateway versus something like Private Link or peering. Right. You know, I think it's important to realize you can still have a central Kafka cluster, like a single Kafka cluster that's peered to many different networks or that has a Private Link connection to many different accounts and networks. The benefit of Transit Gateway is that it's basically a single place for connection to all, right. So let's say you're deploying some sort of cellular architecture, where you're basically creating an instance of a network every time you stand up a new instance of your application. Having a Transit Gateway where you can just hook the app up to the Transit Gateway, it doesn't require you to make any changes on the Confluent Cloud side. Whereas if you're using Private Link, you would have to land a Private Link endpoint in that new VPC, right. That you create could be [inaudible 00:18:10] less dense.

Kris Jenkins: (18:11)
There's no way that would scale to tens of thousands of connections. Right.

Dennis Wittekind: (18:14)
And that's why Telcos use it mostly, because they don't want to have to create these configurations every time they spin up a new instance or a new tower, right?

Kris Jenkins: (18:23)
Yeah. Dennis, I wanted to get from you because you are more on the kind of, you're on the technical side, but you're also on the dealing with negotiations with people side. Right? How are they characteristically different, Telcos to banks, on these topics?

Dennis Wittekind: (18:41)
It's interesting because they're equally as secure from a "we want private networking" standpoint, but they're less concerned about the communication back and forth. Right? So we talked about how banks want that unidirectional connection. That's why Private Link is usually a good fit for banks. Transit Gateway is a good fit for Telcos because it allows communication back and forth. And Telcos generally have a lot of hardware vendors, let's say, right? So there's the Nokias of the world, right? All these different companies that make Telco equipment, you want to be able to integrate that with Confluent Cloud relatively easily. And one of the benefits of Confluent Cloud is we have a huge managed connector library. And when you use Transit Gateway, because it's a bidirectional connection, it can facilitate being able to use more of those fully managed connectors over that private networking tunnel, so to speak, to integrate with those external vendors more easily. So that's a big benefit that I've seen personally working with Telcos in Confluent Cloud, specifically with Transit Gateway.

Kris Jenkins: (19:59)
Does that mean for banks you end up having to do something extra for the connectors they want to use?

Dennis Wittekind: (20:06)
Yeah. Generally, and it obviously depends on what exactly you're connecting to. Right. If it's something that's maybe not sensitive, right, you can usually use a fully managed connector. The thing to keep in mind is, when you're using something like Private Link, where it's unidirectional, on the Confluent Cloud side we have no way to egress other than out through the internet. And additionally, we can only resolve DNS names that are publicly resolvable. Yeah. So, generally with banks, right, they don't have a whole lot of stuff that's poking outside the firewall that we can resolve to a public IP address. So generally you end up having to run your own connectors locally. So that's actually one of the big problems that our product and engineering teams are trying to figure out: how do we allow customers that are using this unidirectional private networking solution to still have some secure way to use connectors? Right. And it's a difficult problem to solve. I'm not entirely sure how they'll do it, but we've got some good engineers on it.

Justin Lee: (21:13)
You repeat that, Kris?

Kris Jenkins: (21:14)
Justin, do you have any thoughts on how to solve it?

Justin Lee: (21:16)
I mean, I've seen some of the roadmap conversations and I think there's a lot of really interesting ideas out there. I don't think they've decided, right. There's a couple different schools of thought around like, oh, we should do pattern X versus pattern Y. And so I don't want to speak too much to it as to what they're discussing internally because it's possible that they choose something else later on.

Kris Jenkins: (21:36)
Fair enough. Yeah. Never commit to things on the podcast. You'll just be expected to deliver.

Justin Lee: (21:42)
We're not allowed to talk about roadmap because we're just the field folks.

Kris Jenkins: (21:45)
Yeah. We're just humble programmers, mate.

Justin Lee: (21:49)
Yeah.

Kris Jenkins: (21:51)
So another thing I wanted to ask you was, are there cost implications to this? What's the overhead of needing special networking requirements?

Dennis Wittekind: (22:07)
Yeah. So this is one of those things where, probably my least favorite part of this subject is how much it costs. Right. But it's really important. And I would say it throws a couple wrenches in there, especially for really high scale use cases. Right? So things like telemetry ingestion, where you're pulling in potentially hundreds of terabytes of data in an hour: that half a cent or quarter cent per gigabyte AWS ingress charge over a Private Link starts becoming really significant, right. Or the cross [inaudible 00:22:45] charges. And depending on where the traffic's coming from and where it's going, irrespective of the networking type, sometimes that can add complexity. So peering, for AWS specifically, is very inexpensive. So I was working with a customer where using something like Private Link was cost prohibitive to their business, right. And with peering in AWS, there's actually no charge across the peering connection.

Kris Jenkins: (23:17)
As long as you're within the same region, it's free to do data.

Dennis Wittekind: (23:19)
Yeah. And there's some charges related to if you're going across availability zones, but in general it's much less expensive. So whenever you're choosing the networking type that you want to use, if you're super high scale, that's a huge consideration, right: where am I getting charged from the cloud service provider side for getting my traffic in and out of these fully managed cloud services?
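The cost point above comes down to simple arithmetic, sketched here in Python. The per-gigabyte rates are the rough "half a cent or quarter cent" figures quoted in the conversation, not current AWS pricing, and the 100 TB per hour volume is an assumed example at the low end of "hundreds of terabytes"; the function name is hypothetical.

```python
GB_PER_TB = 1000  # assumption: decimal units


def hourly_transfer_cost(tb_per_hour: float, usd_per_gb: float) -> float:
    """Data processing cost in USD for one hour of traffic at a per-GB rate."""
    return tb_per_hour * GB_PER_TB * usd_per_gb


# Illustrative inputs only; check your provider's price list for real rates.
ASSUMED_TB_PER_HOUR = 100
RATES_USD_PER_GB = {
    "quarter cent per GB": 0.0025,
    "half cent per GB": 0.005,
    "no per-GB charge": 0.0,
}

if __name__ == "__main__":
    for label, rate in RATES_USD_PER_GB.items():
        cost = hourly_transfer_cost(ASSUMED_TB_PER_HOUR, rate)
        print(f"{label}: ${cost:,.2f} per hour")
```

Under these assumptions, the per-gigabyte charge alone works out to roughly 250 to 500 dollars per hour, which is why a networking type with no per-gigabyte charge on the connection can change the economics at high volume.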

Kris Jenkins: (23:47)
So that comes on their AWS bill. But it's seen as a cost of carry for Kafka in the cloud.

Justin Lee: (23:54)
It's not something like if you're running Confluent Cloud, we charge you for the Kafka stuff that you use. We don't charge you necessarily for the AWS bill, right. That shows up as a separate bill. So while we, when we do like the, here is the pitch, this is like how much Kafka or Confluent Cloud costs. We say, this is the component that actually comes on your Confluent bill. And then by the way, these are the things you should be aware of that aren't going to be on your Confluent bill, but are still going to be a cost to your business.

Kris Jenkins: (24:19)
Right. Yeah. Is it, so is the pitch basically, you're going to spend a bit more in networking costs, but you're going to spend a lot less in Kafka maintenance costs?

Justin Lee: (24:33)
I think the pitch-

Kris Jenkins: (24:35)
[crosstalk 00:24:35] we're making versus, what's the trade we're making versus running it on-prem.

Justin Lee: (24:39)
I think the pitch here and we've seen this played out multiple times is that running Kafka yourself can be both complex and expensive, right? So from a cost perspective, there's the hard costs of, oh, we need to maintain all of this infrastructure, including the monitoring infrastructure, including the scaling infrastructure, we need to put in the engineering to build it, all that kind of goes there. And then the second part is the much softer, but still very tangible cost of saying, Hey, in order to run and maintain this Kafka infrastructure, we need to pay three full-time engineers. And there is a cost to that, right. And we see, especially with those customers that are in the business of doing things for their business, it really doesn't make sense to spend lots of engineering or technical resources on building a solution that other companies have done for you.

Justin Lee: (25:32)
Right. So in our case, we'll run Kafka for you. We'll manage it, we'll monitor it. We have alerts and we have an entire SRE team that is responsible for making sure that your Kafka cluster is performing to some SLAs. And then you, as a customer or as a business, can focus your engineering resources on building things that actually build business value. Right? So you can say, Hey, rather than spending those three full-time engineers building, running, maintaining in a Kafka cluster, they can go build applications that are valuable to the business. Right. And that's the high level value proposition of Confluent Cloud. The other thing is we have, I don't know if we're allowed to say the specific internal metrics, but we have tons and tons of expertise running both Kafka and the infrastructure that Kafka is running on.

Justin Lee: (26:17)
Right. So we have several X thousands of [inaudible 00:26:20] clusters running several X thousands of Kafka brokers. And we have lots of experience just from a management and monitoring perspective, making sure that this thing is running. We also have a lot of the engineers that contribute to the open source [inaudible 00:26:34] Kafka project. So when something goes wrong, we are really, really good at troubleshooting and fixing it, and we know how to address those various issues. So it's a whole story picture, right? You can run a small Kafka cluster, right. It's not conceptually hard to do, but when you're running in production, you need that to continue to run and operate. You don't want to spend time troubleshooting that. And that's where we come in. We'll take care of that burden for you and you can focus on your business.

Kris Jenkins: (27:02)
Right. So it is worth the effort of jumping through all these networking hoops. You're saying?

Justin Lee: (27:07)
I think so. Absolutely.

Dennis Wittekind: (27:09)
Yeah. And I generally say the networking stuff is kind of something you figure out at the beginning, when you're deploying your first use case, and then maybe as your use cases expand. Okay, you know, first we were just connecting between an AWS VPC and Confluent Cloud. Oh, now we have some workloads on-prem. Okay, now we have to add some new networking types to the cluster. Right. Those things happen much less frequently than you would think, right. Generally you kind of plan out and try to set up a networking plan and then just deploy it. And then every time you deploy a new Kafka cluster, you just kind of copy and paste the network [inaudible 00:27:51].

Dennis Wittekind: (27:52)
That being said, right, the one thing I want to emphasize is you don't necessarily have to get the networking type right the first time. So as I mentioned earlier, right, maybe your MVP is just to connect a couple of apps in AWS to Confluent Cloud. And maybe the best networking type or option for that at the moment, for your requirements, is peering. And then, six months down the road or a year down the road, you're like, oh, okay, we actually need to also connect some on-prem applications. Now we need to move to Private Link. I've seen a lot of customers get into this analysis paralysis mode when they're trying to select their networking type, because they're afraid. They're like, oh, we haven't accounted for all of the possibilities at the enterprise. And the thing is, (a) you can in some cases combine some of these networking types.

Dennis Wittekind: (28:39)
In other cases, it's really easy to migrate, right, from one cluster to the other. We have tools like Replicator, like Justin mentioned earlier, and we also have Cluster Linking. These things make migrating between Kafka clusters and Confluent Cloud clusters easier. And so choosing the right networking type upfront isn't necessarily the most important thing, right. It's making it work for your initial use case and the requirements at hand. So...

Kris Jenkins: (29:08)
So have you got any tips for like making that first best approximation other than go and watch the course?

Dennis Wittekind: (29:16)
Yeah. Yeah. So I think it's important to kind of get your stakeholders in a room. Right. And I think we cover this a little bit in the course, right. There's kind of two kinds of people, right? There's the Kafka people in the room that know a little bit about networking. And then there's the networking people in the room that know a little bit about Kafka. And if those two groups of people are kind of operating in their own little silos during this exercise of figuring out what networking to choose, you're probably not going to make the right choice. Right. So it's important to get the networking people in the room, the security and compliance people in the room, and then the Kafka engineers in the room that know how the Kafka protocol works and the fact that there are advertised listeners and things in addition to the bootstrap that you need to be open to. And then kind of start at the top, start at the secure public endpoints, the simplest option.

Dennis Wittekind: (30:11)
And then the security guy will probably say, oh no, actually, we have this private networking requirement. Okay. Now we need to look at either peering, Private Link, or Transit Gateway in AWS, for example. And then start to understand from your networking people, do they prefer one of these over the other? Is there an archetype that they can copy and paste from some other project? Right. But the collaboration's important between those teams. Right. 'Cause if you're just the Kafka team working in isolation, you may not know that there are archetypes that the networking team's already leveraged that you guys can leverage for your Confluent Cloud deployment. So...
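
The point about advertised listeners can be made concrete with a small sketch. A Kafka client first contacts the bootstrap address, receives cluster metadata, and then connects directly to each broker at its advertised address, so every one of those addresses has to be reachable over whichever network path is chosen. The hostnames, ports, and helper names below are hypothetical and only illustrate the idea; they are not Confluent Cloud's actual endpoints or an official tool.

```python
from typing import Iterable, List, NamedTuple


class Endpoint(NamedTuple):
    host: str
    port: int


def parse_endpoint(address: str) -> Endpoint:
    """Split a 'host:port' string into an Endpoint."""
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    return Endpoint(host, int(port))


def required_endpoints(bootstrap: str, advertised: Iterable[str]) -> List[Endpoint]:
    """Everything a client must reach: the bootstrap address plus every
    advertised broker address, de-duplicated in first-seen order."""
    seen = {}
    for address in [bootstrap, *advertised]:
        seen[parse_endpoint(address)] = None  # dict keys keep insertion order
    return list(seen)


def unreachable(required: Iterable[Endpoint], allowed: Iterable[str]) -> List[Endpoint]:
    """Return the required endpoints that the (hypothetical) allow-list misses."""
    allowed_set = {parse_endpoint(a) for a in allowed}
    return [e for e in required if e not in allowed_set]


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    needed = required_endpoints(
        "bootstrap.example.internal:9092",
        ["broker-0.example.internal:9092", "broker-1.example.internal:9092"],
    )
    # A rule set that only opens the bootstrap address leaves the brokers blocked.
    for endpoint in unreachable(needed, ["bootstrap.example.internal:9092"]):
        print(f"blocked: {endpoint.host}:{endpoint.port}")
```

A check like this is one way for the Kafka and networking teams to agree on the full list of addresses before any firewall, peering, or Private Link rules are written.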

Kris Jenkins: (30:52)
You've got to fight against Conway's law, reduce down to one team of people solving the problem and you'll get one solution. So is that the motivation for the course then? Have you had a lot of these conversations and finally thought, okay, we're going to write this stuff down.

Justin Lee: (31:10)
Yeah. I think so. Right. I mean, part of my job, part of Dennis' job is to have these conversations on a day-to-day basis with customers. Right. And sometimes it becomes the same conversation with every single stakeholder multiple times. Right. So having it in a single consolidated place where they could say, okay, go watch this course, learn a little bit. The networking people can learn a little bit about Kafka. The Kafka people can learn a little bit about the networking requirements. They can get everybody roughly on the same page. Then we come to the table, we can have the conversation, and it'll be a much more productive conversation, I think.

Kris Jenkins: (31:43)
Yeah. Yeah. Give people the resources to get the back story before you enter into the room, right? Yeah.

Dennis Wittekind: (31:49)
Yeah. And for the Kafka people that know a little bit about networking, the good thing about the exercises is, you read our documentation, and quite frankly, our documentation is actually pretty good on some of this stuff. I love to rag on docs all the time, but the networking stuff's actually pretty good. But it's still really dense. Right. And if you're not somebody that's familiar with cloud networking concepts, or the specific way things are named, reading those docs is like reading Greek, right. You don't understand it. So having the videos and the exercises where we actually walk through and do the clicks and actually configure all the things, I think, for the people who aren't super networking experts that are looking at this course, it helps kind of bring everything together. And I think it'll help in those discussions with the networking team, because you'll have a better idea of exactly what all these things are that we're configuring and what they do.

Kris Jenkins: (32:45)
Right. Yeah. Sometimes you can have the best docs in the world, but they end up being too dense and you want a kind of boiled-down version that puts things in the right sequence. Right?

Justin Lee: (32:53)
Yeah. I think the other way to think about it is some people are visual learners. Some are auditory learners. Some like to read the instructions and some like to like watch a video. So having the course or the content in multiple formats is always a good thing. Right. Just depending on who the audience is.

Kris Jenkins: (33:08)
Yeah. So maybe we should wrap up then with, let's start with Justin. What do you think the most important thing to take away from the course is?

Justin Lee: (33:19)
I think the big thing is, like Dennis said, get everybody in the room, make sure you have all their requirements available ahead of time so that you can start this conversation early. Right. There's nothing worse than saying, okay, we're going to get to production. Oh, we need to go to production. Oh, we need to re-architect our entire thing. Right. So start the conversation early, make sure you get the requirements in place, and start talking with everybody who needs to give input so that you can start making some of those decisions earlier.

Kris Jenkins: (33:50)
Oh, Dennis, anything to add to that?

Dennis Wittekind: (33:53)
Yeah. That was such a good response. Actually it covered a lot of things. I guess the thing I will add to that is, kind of going back to my earlier point, don't be afraid to just jump into it and learn. It's one of those things where, if you're not super primed on cloud networking specifically in whichever cloud service provider you use, AWS, GCP, Azure, right, jumping in and working on this kind of stuff, if you're a Kafka engineer, will kind of open up your eyes to some of the larger challenges and problems with just integrating applications in general inside of a cloud provider. So I think it's an important skill to have just kind of in general, as an IT professional or programmer in this world. So yeah, I think the course will be super valuable to anybody who's interested in the topic.

Kris Jenkins: (34:48)
That reminds me, I heard about a guy who taught people rock climbing, and he used to give them the bare minimum information to be safe and then send them up a rock, and they would do disastrously, but they learned so much about what they needed to pay attention to when he was then trying to teach them.

Dennis Wittekind: (35:05)
Yeah, exactly.

Kris Jenkins: (35:06)
So maybe they can go through the exercises and do a safe disastrous first run and then really start learning.

Dennis Wittekind: (35:15)
Exactly.

Justin Lee: (35:15)
That's a good way to [crosstalk 00:35:16].

Kris Jenkins: (35:17)
Something like that. Yeah. Well, cool. That course is available now. Right. It's been published. So we will put links in the show notes, but for now, thank you very much for joining us.

Dennis Wittekind: (35:27)
Thanks very much, Kris.

Kris Jenkins: (35:27)
Cheers.

Justin Lee: (35:27)
Have a good one, bye.

Kris Jenkins: (35:30)
So that's the executive summary. Should we call it the executive, let's call it the geek summary. I feel more comfortable with that. If you want more technical details, like really gory technical details, go and check out their course. It really starts at like IP ranges and what they are and works all the way up the stack. So regardless of your level of expertise, you'd be able to pick it up at the right point and run from there. There is a link to it in the show notes. So that's there waiting for you. Also around there waiting for you, you'll find things like the like button and the rating button and the comment box. So now's a great moment to click, leave us a message, leave other people a message, all that kind of feedback stuff helps us to know which episodes you found most useful, and it helps other like-minded people to find us. So give it a second.

Kris Jenkins: (36:20)
Meanwhile, if your problem is that you have cloud and networking, but no Kafka then take a look at Confluent Cloud. You can use it to get a Kafka cluster up and running in minutes for any size of project from a little side hustle thing through to an enterprise grade installation. So sign up at Confluent Cloud. And if you add the code PODCAST100 to your account, you'll get $100 of extra free credit to use. And with all that said, it just remains for me to thank Justin Lee and Dennis Wittekind for joining us and you for listening. I've been your host, Kris Jenkins, and I will catch you next time.

Intro
Cloud networking requirements
Transit gateway vs. Private link
Special networking requirements
Deployment tips
Cloud networking 101 course
It's a wrap