Imagine you can process and analyze real-time event streams for intelligence to mitigate cyber threats or keep soldiers constantly alerted to risks and precautions they should take based on events. In this episode, Jeffrey Needham (Senior Solutions Engineer, Advanced Technology Group, Confluent) shares use cases on how Apache Kafka® can be used for real-time signal processing to mitigate risk before it arises. He also explains the classic Kafka transactional processing defaults and the distinction between transactional and analytic processing. 

Jeffrey is part of the customer solutions and innovations division (CSID), which involves designing event streaming platforms and innovations to improve productivity for organizations by pushing the envelope of Kafka for real-time signal processing. 

What is signal intelligence? Jeffrey explains that it's not always affiliated with the military. Signal processing improves your operational or situational awareness by understanding the petabyte datasets of clickstream data, or the telemetry coming in from sensors, which could be the satellite or sensor arrays along a water pipeline. That is, bringing in event data from external sources to analyze, and then finding the pattern in the series of events to make informed decisions. 

Conventional On-Line Analytical Processing (OLAP) or data warehouse platforms evolved out of the transaction processing model. However, when analytics or even AI processing is applied to any data set, these algorithms never look at a single column or row, but look for patterns within millions of rows of transactionally derived data. Transaction-centric solutions are designed to update and delete specific rows and columns in an “ACID” compliant manner, which makes them inefficient and usually unaffordable at scale because this capability is less critical when the analytic goal is to look for a pattern within millions or even billions of these rows.

Kafka was designed as a step forward from classic transaction processing technologies, and it can also be configured in a way that’s optimized for signal processing of high-velocity, noisy or jittery data streams, in order to make sense, in real time, of a dynamic, non-transactional environment.

With its immutable, write-append commit logs, Kafka functions as a flight data recorder, which remains resilient even when network communications, or COMMs, are poor or nonexistent. Jeffrey shares the disconnected edge project he has been working on—smart soldier, which runs Kafka on a Raspberry Pi and x64-based handhelds. These devices are ergonomically integrated on each squad member to provide real-time visibility into the soldiers’ activities or situations. COMMs permitting, the topic data is then mirrored upstream and aggregated at multiple tiers—mobile command post, battalion, HQ—to provide ever-increasing views of the entire battlefield, or whatever the sensor array is monitoring, including the all important supply chain. Jeffrey also shares a couple of other use cases on how Kafka can be used for signal intelligence, including cybersecurity and protecting national critical infrastructure.


Hello, you're listening to the Streaming Audio podcast. And let me start with a bit of science fiction. Indulge me for a second. The year is 2059. A soldier checks his battlefield computer and an analysis of seismic activity declares there are no active threats. So he heads back to the Jeep. And all that battlefield data gets synchronized to the Jeep's computing cluster, which then analyzes it and says the patrol has covered the whole area. So the squad leader says it's time to head back to base. And when they head back to the base, all the Jeeps coming back synchronize their data to HQ, which then analyzes the entire platoon's data to get a complete situation report.

In that scenario, you're storing event data at multiple tiers. You're processing it at multiple tiers for different reasons, and you're synchronizing it whenever connectivity allows. Well, while the military example is a nice dramatic one for storytelling, there are dozens of situations where you might want to do real-time stream processing out in the field and back at HQ, and maybe several places along the way.

My guest today is Jeff Needham. Hi, Jeff. Welcome to the show.

Hello there, Kris. How are you?

I'm very well, good to have you here. Let me see if I've got this right. You're a senior solutions engineer at the advanced technology group. It's a great title.

What's it mean? What's an advanced technology group?

So ATG is a very unusual collection of characters and we do kind of crazy hair on fire things. We try and push the envelope of what you can do with Kafka, where Kafka can operate. At least in terms of my role, I sort of do the hair on fire, explorer kind of activities. Just to give you a sampling of the people who work in my group, Jeremy Custenborder, who a lot of people know both within Confluent and within the community. He was the first SE ever in the history of the company. We have the mighty Kai Waehner, the machine who flies around the world and never sleeps. As far as I can tell he never sleeps. Peter Gustafsson and Sudarshan Pathalam are our cloud networking geniuses. So that's a lot of horsepower when we get into complicated, especially, cloud resident deals or hybrid clouds.

My background is I worked in Yahoo Global Engineering about 15 years ago. So I definitely appreciate the cloud is not always much fun. And the networking componentry side, especially in hybrid secure encrypted environments, there's a lot of wiring. And so Sudarshan and Peter sort of handle when people need that kind of horsepower. And then we work for Nick Dearden, and Nick ran ksqlDB engineering for quite a while. And then came out to be part of this group. And the last member, Heinz Schaffner, up in Toronto, he came from Solace, as did Hans Jespersen, who we started out being part of this kind of SWAT team set of SEs. And now we're in the PMAX CSID organization, which is an innovation group. And so we're kind of within this CSID innovation group and that's where we live, but we're very closely tied to the field. I started out as an SE in federal, as I was at Hortonworks. And so I still have-

So just for those who aren't U.S. natives, federal is working for the U.S. government.

Mostly the U.S. federal government, which includes public civilian, which would be commerce and interior, and then the DOD, defense department, and then the intelligence community. Those are the three buckets that make up the U.S. public sector. I also do some work with Canberra, Australia Defense, and I also do some work in the UK space as well. Piece of trivia, I'm a dual citizen. So I actually carry a Commonwealth passport as well as a U.S. passport, being a Canadian citizen as well. So that sometimes helps. It sometimes deters me from getting into very classified meetings because I hold two passports and you're not supposed to do that. But an interesting tidbit is that Five Eyes is an intelligence community term you might have heard of. It's five of the English-speaking intelligence communities, the primary one being the U.S. National Security Agency. The other four are all Commonwealth nations: UK, Canada, Australia, New Zealand.

Oh, is it eyes as in ocular eyes, looking at you, right.

Signal intelligence.

I thought this is going to be a really long acronym. So the five spies.

Yes, so to speak. And so they're all signal intelligence organizations that are loosely coupled into what's called the Five Eyes, and four out of the five of them are Commonwealth agencies.

So I [inaudible 00:06:24] public sector, global SE sort of, kind of informally.

Probably not as glamorous as the stuff you get in spy films.

Absolutely non-glamorous because it's all signal intelligence. But it's a really good thing because when we get into some of this stuff in the podcast, when you say signal intelligence, everybody assumes that the word that goes in front of it is military. If you think about Caltrans or Cal FIRE in California, or parts of the UK infrastructure, National Rail, British Gas, signal intelligence is signal intelligence. It's not always the military variety like it is in the Five Eyes world. And so a lot of what I do in terms of doing platform design for customers actually bleeds into the commercial side because situational or operational awareness is a form of signal intelligence. It's just not the military kind. It's the kind that Cal FIRE or Caltrans needs.

I think I need a definition here. What are you classing as signal intelligence?

That's a really good question. Signal intelligence is being able to essentially improve your operational or situational awareness by understanding the telemetry coming in from the sensors. The sensor array. Now that sensor array could be satellites and fancy pants James Bond, MI6 kind of stuff, or it can be sensor arrays along a water pipeline. Like in California where monitoring water is the most important thing we do aside from trying to put out fires. Signal intelligence is making sense of what's coming in. And when I talk to customers, whether it's cyber or oil pipelines or water pipelines or classic DOD stuff, IC stuff, what it really helps them understand is there really isn't a needle in a haystack. And so there's usually an aggregation of signals, which I call a signature. And so you don't usually find an aha needle going, ah, that's the thing that's really bad. It's a combination of telemetry that produces this signature. And that really plays into the sort of Kafka backbone, Uber, Netflix, LinkedIn, telemetry back plane story is from taking-

So are you saying you've got to convert it into streaming audio terms, right? So signal intelligence is this bringing in of event data from external sources analyzing it?

Have I got this right? We're going to look at the log stream and say, okay, these addresses are hammering our system. And they have these telltale IP headers, which say, "Okay, this is a particular kind of attack." And the graph says it's coming perhaps from these hostile regions around the world.

Right, because not only do you get a source implementation. And then a lot of hackers use Tor, so it's very difficult to trace the IP, but you can trace the pattern of traffic because then you can at least trace it back to where the Tor traffic originates. Now, that's where it gets a little more complicated, but that's a good example where you take three sources of telemetry in the cyber threat world, and then combine that to make sure you see a clear enough picture, and it's not just the log entries.

That's where Kafka has a big advantage over a classic SIEM, because NetFlow and PCAP are very high velocity, so high volume, it would be impractical to sink all that into a classic forensic database. Also, it's a forensic database, it's running a little behind, it's not real time. So being able to detect these patterns in addition to curating the topic of interest, the signature topic of interest, now you can do some of this because now you can have streaming analytics run right on top of this destination topic of interest.
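
To make the "signature" idea concrete, here is a minimal Python sketch of combining three telemetry sources for the same address inside a short window. It is an illustration only: the event shape, the one-second window, and the function name `find_signatures` are assumptions, not anything described in the episode, and a real deployment would do this with stream processing over Kafka topics rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical event shape: (timestamp_seconds, source_ip, telemetry_kind).
# The three kinds mirror the sources named in the conversation.
REQUIRED_KINDS = {"log", "netflow", "pcap"}

def find_signatures(events, window_seconds=1.0):
    """Return source IPs seen in all three telemetry kinds within one window."""
    last_seen = defaultdict(dict)  # ip -> {kind: latest timestamp}
    flagged = []
    for ts, ip, kind in sorted(events):
        last_seen[ip][kind] = ts
        # Kinds whose most recent sighting for this IP is still inside the window.
        recent = {k for k, t in last_seen[ip].items() if ts - t <= window_seconds}
        if REQUIRED_KINDS <= recent and ip not in flagged:
            flagged.append(ip)
    return flagged

events = [
    (0.00, "10.0.0.5", "log"),
    (0.10, "10.0.0.5", "netflow"),
    (0.20, "10.0.0.9", "log"),
    (0.30, "10.0.0.5", "pcap"),     # all three kinds within 1 s: a signature
    (5.00, "10.0.0.9", "netflow"),  # too late to pair with the 0.20 log entry
]
print(find_signatures(events))
```

No single event here is a needle in a haystack; the address is flagged only because the three kinds of telemetry line up in time.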

Okay. Yeah, I can absolutely see that somewhere in the world there's some system that is batch loading these ETL into a relational database, and trying to join the latest tables overnight to get that data by the morning. And that's not...

My history is I spent about a decade working in the Oracle kernel group focusing on the performance and scalability of the Oracle database. So I understand the hardest thing to do on a relational, on any database, is to do essentially random writing, index update, deletes, right?

Mm-hmm (affirmative).

Most SIEMs are index driven, so if you have to ingest a crapload of material, to use a technical term, what you have to do is hammer on the database cluster and ask it to do the hardest thing imaginable. When it runs forensic queries, it's just read-only, that's good. Queries are easy, because they're fun, they're read-only. Especially in version-based databases like Oracle, but there are other version-based database kernels, right?

When you have to write, it hurts. And so then you're constantly ingesting NetFlow, PCAP, and logs into a database and asking it to ingest ETL, so ingest, filter, then write it into the index before you can even run the query, so then if you're doing these three random write hard steps before you even can run the query, it just adds time. And it assumes you've tuned the living daylights out of your database, so it has an enormous amount of random small block write headroom. In my experience of doing this for quite a while, hardly any database is designed to have that kind of write IOPS headroom. And to do periodic ETL filtering at high velocity, you need write IOPS headroom. And almost invariably the hardware platform isn't set up for that, so not only are you adding a bunch of steps, you're adding the steps that hurt the most. Where on our side, we're clean, we're not 500 microseconds away from the data but we're a hundred milliseconds away, so we're pretty close.

And the amount of... There's no question that we can handle the scale and velocity, because Uber and Netflix and LinkedIn prove that every day. So then you have this capability that is we can do the ingest and the analytics faster, because we're not batch and our write layer is so lean and it's just commit logs, and they're append only. This makes a huge difference, because again, if you're looking at the write overhead... We still have to write persistently to disk, we don't have to write rep three because it's analytic data, it's not transactional data, it's not your payroll check. But that's something you trade off and so that even gives us a little more write ingest headroom, while we're doing the KSQL streams, read from topic, write back to topic loop. Now, if you configure this quickly, this thing is... Again, you could figure out a signature in the 100 to 500 millisecond range. You can't do that with a classic batch forensic ETL pipeline.

Yeah. Yeah. Because you're loading that kind of volume of data into a relational database let's say, and you know that you're writing it once, you know that effectively it's immutable. But that doesn't help you because the database has to cope with the fact that you might be concurrently writing, even if you're not, the whole thing is architected for concurrent writes. It has to give you the guarantee that would work.

This is so cool about Kafka when I first started to really understand it. I had a good conversation with Jun Rao about this. I know how the Oracle transaction layer works, and you look at Kafka and it's like, "I get a hand tunable customized transaction layer for every single topic," and you're like, "What?" It just blew my head off, because as a designer it means if I need that transactional integrity, I need classic ACID mode, sure you can get that, here's your classic ACID mode topic. I don't want to do classic ACID mode on NetFlow or PCAP headers because I'm going to just shred the brokers, so I'm going to need more of them.

Because again, in order to do that kind of writing and be ACID compliant on rep three with ISRs and acks=all, that takes time because that's your paycheck. But it's a PCAP header, so I can loosen that requirement, and what I get back in exchange for loosening the ACID semantics is headroom, write headroom. And that means less time spent, and therefore less time spent understanding whether or not I've actually found the signature of interest.

Yeah. Because you're saying you can tune it so you say, "Every event is precious, you must not lose a single event." And that's fine. But you can also say, "You know what? If it gives me more throughput, I can lose a few events because I don't even care about individual events. I'm looking for a pattern." Yeah.
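
As a hedged illustration of that per-topic tuning, the Python sketch below writes down two durability profiles side by side. The keys under `topic` and `producer` are standard Kafka configuration names (`acks`, `enable.idempotence`, `min.insync.replicas`, `linger.ms`, `compression.type`); the profile names, the `replication_factor` field, the helper `profile_for`, and the specific "analytic" values are assumptions chosen for the example, not settings recommended in the episode.

```python
# Two illustrative durability profiles. The config keys are standard Kafka
# producer/topic settings; the values and profile names are assumptions.
PROFILES = {
    "transactional": {          # "that's your paycheck"
        "replication_factor": 3,
        "topic": {"min.insync.replicas": "2"},
        "producer": {"acks": "all", "enable.idempotence": "true"},
    },
    "analytic": {               # "it's a PCAP header"
        "replication_factor": 1,
        "topic": {"min.insync.replicas": "1"},
        "producer": {
            "acks": "1",                  # leader-only acknowledgement
            "enable.idempotence": "false",
            "linger.ms": "20",            # batch a little longer for throughput
            "compression.type": "lz4",
        },
    },
}

def profile_for(every_event_is_precious: bool) -> dict:
    """Pick a profile based on whether losing individual events is acceptable."""
    return PROFILES["transactional" if every_event_is_precious else "analytic"]

print(profile_for(False)["producer"])
```

The point of the sketch is only that the choice is made per topic: a payroll topic and a PCAP-header topic can live on the same cluster with very different guarantees.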

This is very important because this is the difference between transaction processing with Kafka and analytic processing with Kafka. I'm in the signal processing business, and any data scientist I talk to, they're in the signal processing business, they're not looking at an individual column or row because they're not doing the ACID semantics transaction thing. They're going to look at 300, 400 million rows looking for a pattern. So if you dropped 10,000 or 30,000 or 140,000, people are like, "Oh my God, you dropped 140,000 rows. What's wrong with you?" You're like, "There's 190 million of them." And what the data scientist is looking for is a pattern.
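
The arithmetic behind that exchange is easy to check: 140,000 dropped rows out of 190 million is about 0.07% of the data. The Python sketch below computes that share and then, on a much smaller toy stream, compares a pattern rate measured with and without random loss at the same fraction. The 200,000-row stream, the 5% pattern rate, and the fixed seed are assumptions for the example, and the comparison only holds when the loss is unrelated to the pattern being measured.

```python
import random

# Figures quoted in the conversation.
dropped, total = 140_000, 190_000_000
drop_fraction = dropped / total
print(f"dropped share: {drop_fraction:.4%}")

# Assumed toy stream: 200,000 rows, about 5% of which match the pattern.
rng = random.Random(42)
rows = [rng.random() < 0.05 for _ in range(200_000)]

# Lose rows at random at the same fraction, as a lossy pipeline might.
kept = [r for r in rows if rng.random() >= drop_fraction]

rate_full = sum(rows) / len(rows)
rate_lossy = sum(kept) / len(kept)
print(f"pattern rate, all rows:  {rate_full:.4f}")
print(f"pattern rate, with loss: {rate_lossy:.4f}")
```

The two rates differ only in the later decimal places, which is the sense in which an aggregate pattern survives losing a small share of individual rows.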

So you draw a distinction between transaction processing and analytics processing? Take me through that distinction a little bit.

So transaction processing is what we've understood for a million years, since OLTP was invented in the 60s by IBM. It uses the ACID property, ACID semantics, which is atomic, consistent. And I forgot what the I means.

It's not idempotent, is it? Oh, we're going to fail the job interview now.

We're going to fail the job interview. And the D is durability, which more often than not is underwritten by the [inaudible 00:25:21]

I is isolation.

Isolation, transaction isolation.

There we go.

So you can't update the same row at the same time. And so something like the Oracle database kernel, which is something I know, but they are often all very similar. Microsoft SQL Server is based on the original Sybase kernel. And then there's DB2, and then there's nine zillion flavors of DB2. That ACID semantics is important because that column in a row could be worth $80 billion, so you really don't want to lose the... You want to have ACID semantics around that column and that row [inaudible 00:26:04]

It could easily be like a trade in a bank that is for X million or billion dollars.

Absolutely. And Oracle numbers like IBM DB2 [inaudible 00:26:14] numbers, they're byte encoded numbers, so two or three bytes is easily hundreds of millions of dollars. So those few bytes are very, very, very important, and that's the foundation for OLTP. Over time, these databases accumulated transaction histories and they started to try to do OLAP or analytics on their existing transaction profile histories that were sitting around in an IBM, Microsoft or Oracle database, so that's where data warehousing comes from. But you're running queries to analyze forensic data, but in order to insert more data, like we talked about with the earlier SIEM ETL loop, there's going to be a traditional OLTP ACID-centric ETL loop that's also hard on those databases, so that chews up their bandwidth. So you'll see customers running on Exadata and Teradata and they'll try and run queries.

And I've talked to customers when I was working at Hortonworks where they were running their business on Teradata, which is a pretty good OLTP engine, not quite designed for that but pretty good at it. And they would not be able to run their forensic analytics queries, because they couldn't afford to take the bandwidth away from day to day business, so then they stopped doing it. This was a conversation which spills into the Kafka world because when I was at Hortonworks, Hadoop is a non-ACID semantic driven database. It has durability, but it doesn't support this classic ACID mode because it was designed from the ground up to be an analytics database. So Yahoo engineering reverse engineered Bigtable and GFS from the Google paper, and that's a story for another time. But Hadoop is really... It's designed to be a purpose built analytics processing platform.

So it of course doesn't have ACID semantics, because it's not doing OLTP. [inaudible 00:28:25] Kafka, and Kafka has inherited some of the rep three semantics and capabilities and distributed model that you see a little bit in GFS, Bigtable, Hadoop. There's a straight line between the history of these three products, and a straight line between LinkedIn and Yahoo. Jeff Weiner, who founded LinkedIn, came from Yahoo search engineering when I was there. And LinkedIn was down the street on Mathilda from Yahoo, so it's a small family. But that technology, you can see that, and the beauty about Kafka is I get the option to do classic ACID semantics, and I can also switch it and just let it run fast and loose as an analytic database. So I can trade off the semantics, mostly in latency, for the bandwidth to handle the volume and velocity of something like PCAP, NetFlow and logs coming in the front end, to go back to the cyber intelligence use case.

Okay. Yeah, I see that. Do you think... You'll have a much better idea about this than me because you've dealt with the guts of Oracle and Kafka and stuff. Is this somehow fundamentally related to the idea that we are separating storage and compute? Can we draw a line there?

Yeah, that's a good question because one of the things that storage and compute... In distributed computing systems, the reason why storage and compute are isolated onto a node, like in the classic HDFS data node or in the classic brokers with drives bolted to the broker, is something that's called bi-sectional bandwidth, and it's an HPC term. And so bi-sectional bandwidth from the HPC, the old world, so more history on my side, I started out working at a company called Control Data, which was founded by a guy named Seymour Cray, and I worked a lot on Seymour Cray designed supercomputers. But in the 90s...

Of Cray Supercomputer, yeah. Okay.

Of Cray... Yeah. So I worked on... I did some C-compiler work on basically Seymour Cray designed computers, and that led me to working at the Oracle kernel group because I had this weird history. In the 90s, supercomputers started to become distributed Beowulf clusters on commodity Linux hardware. That's an important lesson or that's an important event in the history of something like Hadoop or Kafka, because it's high performance, distributed, scalable computing on commodity servers. So there's a straight line between classic Cray mainframes, Beowulf clusters and Hadoop. And then that line continues into Kafka, because Kafka is a distributed rep three, horizontally scalable data processing architecture, so in my mind anyway, there's a straight line between all these three things. But the reason I mentioned the HPC world is that the Beowulf cluster, once you had the working set isolated onto a bunch of drives with a computer then that working set...

Let's take classic HPC workloads like finite element analysis, you crank on your corner of the grid because you're modeling weather forecasting or the surface of an F-35 or what have you. So those used to run on big Cray mainframes, and now the little slice of the wing that you're modeling runs on one server and you have a thousand servers and now you get parallelism. So as you add another server, you add disk bandwidth, network bandwidth, memory and compute bandwidth every time you add a component. So that's bi-sectional bandwidth.

In the classic Kafka sense, where you're running CP on-prem and you have a broker with a bunch of drives, most are NVMe SSDs these days so they're pretty quick, then every time you add a broker, you get storage bandwidth, you get memory latency or memory bandwidth, you get computing power and you get network bandwidth because you just add brokers, add brokers, add brokers. And if you architect the app and the partition mapping, it's not magic voodoo out of the box, you still have to do some Kafka architectural design homework, but then the thing essentially scales horizontally. And if you look at what Netflix and Uber do for a living, they've dialed that in correctly, fully exploiting.

Now when you move to cloud computing or elastic computing, in order to be able to have the elastic shrink work, you have to have the data on shared storage. So you trade off a little bit of bi-sectional bandwidth IO for the ability to certainly elastically expand, but also to elastically shrink. So the key to CC or to anything that runs, that can shrink persistent data access, this is very important because this data is persistent, it's stateful data. Normally, if you have stateless data, you shrink, you toss. It's really easy to expand a cluster to 100 nodes, fool around with something, and then the result gets written back to a persistent store. And then the intermediate results that are out on the nodes get tossed, right?

You reduce. You throw away everything that got you there.

Right. In classic MapReduce, once you do the reduction and then you finally have the result, then you can actually... If you're not that interested in keeping the original working set around, you toss it and you shrink your nodes.

Yeah, yeah, yeah.

In transaction ACID semantics processing, shrinking data, it's got to be on shared storage or you lose it. We used to say this, and I still kind of make this point, which is data is not elastic, but the fine print is persistent. Stateful data is not elastic. It's got to be around. It's got to still be ACID compliant, especially in the transaction shrinking world. In analytics, again, as a trade-off, we get to shrink that. Because if those hundreds of millions of rows aren't interesting to us because that's yesterday's data set or this morning's data set, right? You do the analysis. This is important to understand. Because it's not the data that's important, it's the analytic output, right? What the data science algorithms result.

That result, that tells you that it is a signature of interest and that's the money. It's kind of somewhat more... It's a little beyond sort of the data is important. It's the intelligence gathered from the data. With my customers, whether it's Cal Fire, Caltrans, or the DOD, those customers, they're interested in what the patterns are, right? Whatever those patterns are. The pipeline relay pumps in Northern California are about to fail. Okay, that's bad. We need to... It's not so much the raw data.

So then you have this ability to, again, take the difference between classic ACID semantic transactions and signal processing analytics, and the notion of elasticity, where you don't really have a stringent requirement on the elasticity of ACID semantic data on the analytic side, you definitely do on this side. But on this side, now you have a little more rope to work with. As a platform designer, that allows me to build a more flexible or cost effective or efficient or, at the edge, more resilient platform so that the analysts can get their job done.

I'm trying to piece this together in my head with technology. I am saying, okay, I want a system that has transactional processing, but I also need to be able to bring in a vast quantity of data, process it, and then maybe look at something like topic compaction to throw away the not so significant data on the way through. Now I found the signal. Now I've brought it into one place and joined it and analyzed it. I can start to compact and throw away the old stuff.

This is a really good point because one of the most important tools of handling like super high volume velocities is you get two very important levers. One is compaction and one is retention. Because you might want to set the retention just a little outside of the KSQL window, for example.

Yeah, right.

Now, in the ACID world, people would look at you like you're a crazy person. But again, when you're in the signal world using compaction, because keeping a flight data recorder history is important forensically, so it's not useless, but it's probably going to go into a batched warehouse to do long haul, large working set forensic analysis, right? So then some topics need to be compacted, because it's like, "I really don't need the old data. I need to know what's going on right now. And what went on 190 milliseconds ago isn't really that interesting." But the windowing is interesting because then you get a little more of a signal tail, right?
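
The two levers map onto topic-level settings. As a small sketch, the Python helper below builds a topic configuration that either keeps only the latest value per key (`cleanup.policy=compact`) or deletes data shortly after it falls outside the processing window (`cleanup.policy=delete` with `retention.ms`). Those two keys are real Kafka topic configs; the helper name `topic_config`, the five-second margin, and the 30-second window are assumptions for illustration. Kafka applies retention per log segment, so actual deletion lags the configured value.

```python
def topic_config(window_ms: int, margin_ms: int = 5_000,
                 keep_latest_per_key: bool = False) -> dict:
    """Build illustrative topic settings for a high-velocity signal topic."""
    if keep_latest_per_key:
        # Compaction lever: retain only the most recent record for each key.
        return {"cleanup.policy": "compact"}
    # Retention lever: keep data just a little longer than the query window.
    return {
        "cleanup.policy": "delete",
        "retention.ms": str(window_ms + margin_ms),
    }

# Assumed 30-second processing window.
print(topic_config(window_ms=30_000))
print(topic_config(window_ms=30_000, keep_latest_per_key=True))
```

A flight-data-recorder topic that must keep full history would use neither lever aggressively; these settings suit topics where only the recent signal matters.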

And that's important to data analysts. Because if they can see the last 34 samples, then their algorithms can start to understand, especially if they start to do what I call machine doing, right?

Define that one for me.

Well, everybody knows what machine learning is, right? You have big ass working sets sitting out on some S3 set of buckets and you're running Databricks, classic stuff, right? Machine doing is actually deploying the trained algorithms out front right on the live streams. To me, that's machine doing, not just machine learning, because now it's actually doing its day job. I like that term because it helps you understand that we're in the doing business, right? Because we're about real-time stream processing. We don't do machine learning within a topic. But again, with KSQL user defined functions, UDFs, there's your hook into the model. The model can actually incrementally train if it's that sophisticated, but your trained model is sitting there probably in a jar.

That's just called out through the KSQL UDF interface. And now all of a sudden, you're watching, you're using your machine learning algorithm, not only to look for things in a more sophisticated way, but to learn from what it's seeing. And that's a big difference. I think that from my perspective, machine learning and AI, which are kind of buzzwords, but from a data processing point of view, what they really represent from my chair is they're dynamically adaptive algorithms. Where if you write a piece of code and you throw it in a jar and off you go and you're like, "Ah, the algorithm is crap. We got to write a new one." Okay, you go to change code, re-sling the jar, blah, blah, blah.

What ML and AI really represent is a category of dynamically adaptive algorithms. They can dynamically adapt to the data stream that's coming in. Whether you use the hipster terms or not, if you can get those algorithms hung off live traffic right out on the stream's interface, then now that's the best chance that they have to certainly help you understand your awareness, but also learn as fast as possible because you're not going back into the batch forensic working set warehouse to retrain. This is kind of an evolving concept. A lot of data scientists who do machine learning are still doing the classic big working set training model.

Dynamic machine learning and dynamic machine doing are kind of a cutting edge topic, but I've run into a few customers and systems integrators, SI, that are definitely looking at this, but it's kind of state of the art for the data science world. But when they're ready, we're an ideal back plane for that, because it's like you hang your jar, we'll call it. As soon as the traffic hits the topic, you can do your job. Think about the latency in terms of milliseconds that that cuts out of the time you become aware.
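
A rough Python sketch of the "machine doing" idea follows: a model is applied to each event as it arrives and keeps adapting to the most recent samples. Everything here is a stand-in. The class `RollingAnomalyModel` is a simple z-score over a rolling window, not a trained model; the 34-sample window echoes the figure mentioned earlier; and the function `machine_doing` plays the role that a KSQL UDF hook would play on a real stream.

```python
from collections import deque
from statistics import mean, pstdev

class RollingAnomalyModel:
    """Stand-in for a trained model: a z-score over the most recent samples."""

    def __init__(self, window=34, threshold=3.0, warmup=5):
        self.samples = deque(maxlen=window)
        self.threshold = threshold
        self.warmup = warmup

    def score_and_learn(self, value):
        flagged = False
        if len(self.samples) >= self.warmup:
            mu, sigma = mean(self.samples), pstdev(self.samples)
            flagged = sigma > 0 and abs(value - mu) / sigma > self.threshold
        self.samples.append(value)  # adapt to the stream as it arrives
        return flagged

def machine_doing(stream, model):
    """Apply the model to each live event and pass along only the flagged ones."""
    for event in stream:
        if model.score_and_learn(event["value"]):
            yield event

readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 9.9, 50.0, 10.0]
stream = ({"seq": i, "value": v} for i, v in enumerate(readings))
flagged = list(machine_doing(stream, RollingAnomalyModel()))
print(flagged)
```

The scoring happens as each event arrives, so there is no round trip to a batch warehouse between seeing the data and acting on it.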

Signal processing and adaptive signal processing.

It's adaptive signal processing.

Yeah, cool. To try and to get this more grounded, where else in the world is this being used? I mean, what are the actual use cases. We've got intrusion detection.

Classic cyber stuff. Yep.

Give me some more.

We're looking at being able to do some of this in the smart soldier use case category, which is probably better understood as intelligent disconnected edge. Because if you can run Kafka on something as small as a Raspberry Pi... We've been doing that on and off, or I've been doing that on and off for a few years. Officially, Confluent doesn't support ARM64 yet, but everything I've done seems to work. It runs very fast on a pi. Now you have the capability of everything we've talked about in terms of being able to actually handle a reasonable amount of velocity on a Raspberry Pi, which might sound crazy, but it handles a lot of velocity and it does flight data record.

If you decide not to compact or not to set really short retention windows that are KSQL window based, then what you get is intelligent, disconnected, edge processing. You get everything we were talking about, but now you get it where you're out in a ditch somewhere, where there's no communication channel. No comms is a term you might hear on that side. But there's lots of opportunity where you're nowhere near a cell tower: water pipelines, utilities, infrastructure, anything. There's lots of places where you don't have 4K streaming 5G cell towers. Like in the T-Mobile world, right? Their towers are near freeways. If you live near a freeway, you can get really awesome 5G 4K streaming.

If you live in the middle of Northern Saskatchewan, probably not. That capability... Or offshore in the North Sea, in the Baltics, anywhere. If you look at those circumstances and you're collecting telemetry, we're working on projects with NOAA, for their ocean buoys. The Australia project is weather stations that are spread all over Australia, which is a sizable continent, which is not...

Yes, indeed.

...clad with cell towers, right, like Canada.

Yeah, yeah, yeah.

Especially when you get out away from sort of large urban areas where cell coverage is really good, comms are really good. For any kind of operational technology modernization efforts, which is kind of the larger topic, what I'm kind of skirting up against, is now you can help customers collect the data, analyze it, certainly forensically record it for later when you do have some degree of upstream comms, but also, again, you use KSQL and a select list and a little bit of predicate. What you send upstream over your precious 3G, five-minutes-a-day kind of connection has a really high signal-to-noise ratio. There's hardly any noise and it's really high quality.
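The "select list and a little bit of predicate" idea can be sketched in plain Python. This is not ksqlDB syntax and not code from the episode: the function name, the field names, and the thresholds are hypothetical, chosen only to show how projecting a few fields and filtering on a condition shrinks what goes upstream.

```python
def select_upstream(events, fields, predicate):
    """Keep only events matching the predicate, and only the listed fields."""
    return [
        {name: event[name] for name in fields}
        for event in events
        if predicate(event)
    ]


# Hypothetical weather-station readings collected at the edge.
readings = [
    {"station": "ws-01", "ts": 1, "temp_c": 21.5, "wind_kph": 12.0, "battery_v": 4.9},
    {"station": "ws-01", "ts": 2, "temp_c": 47.2, "wind_kph": 15.0, "battery_v": 4.9},
    {"station": "ws-01", "ts": 3, "temp_c": 22.0, "wind_kph": 95.0, "battery_v": 4.8},
]

# Select list: three fields. Predicate: extreme heat or extreme wind (assumed thresholds).
upstream = select_upstream(
    readings,
    fields=("station", "ts", "temp_c"),
    predicate=lambda e: e["temp_c"] > 45.0 or e["wind_kph"] > 90.0,
)
```

Here two of the three readings pass the predicate and each is cut from five fields to three, while the full readings stay in the local log for later forensic use.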

You're saying something like: I stick Kafka onto a Raspberry Pi with a battery pack and I've got this tiny little thing that's running Kafka in the field, perhaps attached to a soldier's backpack or a weather station out in the back of beyond in Australia. That is doing real-time signal analysis and sending up the important stuff over a precious connection, but also keeping the history of all the, like you're saying, flight data recording, a complete history if we need it.

This is really because, again, the way I've pitched Kafka's core values is it's a flight data recorder. It is the ability to be resilient around comms, which is where we really take full advantage of the cluster linking feature. Yeah, that's huge, because cluster linking is just a service layer on top of the broker kernel, right? It's very super clean, right? It's built on top of the replica code. All they had to do was add a thin service layer on top of code that's been there for a million years, so you're like, this is going to be pretty resilient out of the box. And then the third thing is the stream processing.

You have intelligent disconnected edge analytics with the flight data recorder forensic capability on a topic by topic basis, and you have the ability to do analytics, including machine doing if you really want to go that far down. Lots of organizations aren't ready to go that far down, but those three things, that's what we do out of the box. That's what Kafka has done out of the box all along. It's just sort of a really amazing combination that gives designers the ability to sort of twist it into the shape they need, whether they're in the classic big data center, or if they're in a ditch with a bag full of Pis slung into the back of an F-150 pickup truck. We don't care. We really don't care.
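One way to picture "flight data recorder on a topic by topic basis" is as per-topic retention settings. The sketch below is an assumption about how that might be expressed, not configuration from the episode: the helper function and topic names are hypothetical, while `cleanup.policy`, `retention.ms`, and `retention.bytes` are standard Kafka topic-level settings (a value of `-1` means no time or size limit).

```python
def topic_configs(flight_recorder_topics, short_lived_topics, short_retention_ms=60_000):
    """Build per-topic settings: keep everything, or keep only a short window."""
    configs = {}
    for name in flight_recorder_topics:
        # Not compacted, no time or size limit: the full history is retained.
        configs[name] = {
            "cleanup.policy": "delete",
            "retention.ms": "-1",
            "retention.bytes": "-1",
        }
    for name in short_lived_topics:
        # Only a short window of recent data is kept (window length is an assumption).
        configs[name] = {
            "cleanup.policy": "delete",
            "retention.ms": str(short_retention_ms),
        }
    return configs


# Hypothetical topic names, for illustration only.
configs = topic_configs(["vitals.raw"], ["vitals.alerts.scratch"])
```

Unlimited retention on a small device is bounded by its disk, so in practice a size cap via `retention.bytes` may be the safer choice.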

And then it's really up to sort of the imagination of what the customer wants to do and how aware they want to be, but we're like, we're good. We can do this. The smart soldier project that I've been working on for the last year or so is putting Raspberry Pis on squad members, but with a mobile command post as their base of operations using cluster linking. When the squad members come back into the command post, they kind of just walk back in and the topics that are set up as mirrored topics sync. Not everything's mirrored, but for the ones that are mirrored, they do nothing. Noninvasive.

You're syncing your mobile field Kafka topics back in whenever you get the chance into the main command hub.
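The sync-when-you-can pattern can be illustrated with a small in-memory simulation. This is a conceptual sketch only: it does not use or imitate the Cluster Linking API, and the class and method names are hypothetical. It shows the core idea of an append-only edge log plus a mirror that remembers the next offset to copy.

```python
class MirrorLink:
    """Copies records from an edge log to a hub log, resuming by offset."""

    def __init__(self, edge_log, hub_log):
        self.edge_log = edge_log    # append-only list written in the field
        self.hub_log = hub_log      # mirrored copy at the command post
        self.next_offset = 0        # first edge offset not yet mirrored

    def sync(self):
        """Copy everything written since the last sync; return how many records moved."""
        pending = self.edge_log[self.next_offset:]
        self.hub_log.extend(pending)
        self.next_offset += len(pending)
        return len(pending)


edge, hub = [], []
link = MirrorLink(edge, hub)

edge.extend(["hr=72", "hr=75", "hr=90"])   # recorded while disconnected
first = link.sync()                        # back in range: 3 records copied

edge.extend(["hr=88", "hr=74"])            # disconnected again
second = link.sync()                       # next contact: only the 2 new records
third = link.sync()                        # nothing new, nothing copied
```

Because the link tracks an offset rather than re-sending the whole log, each reconnect transfers only what was recorded since the previous one.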

Right, yeah. That's amazing.

And I'm saturating ... There's only a GigE interface. So I was saturating the GigE interface on the Pi, but it's ... You can produce and consume. This is where actually the extra six gig of page cache helps, right? Because you produce at like, it's about 60 megabytes a second, right? So yeah. It's around 50 to 60,000 messages per second. And then if they're hanging in the page cache then you can consume at the same speed on the other side of the duplex of the GigE link. So now you have like 50K in and 50K out. And if you hit the page cache window, then there's really no IO going on anyway. It's all SSD, so for the most part, the network interface is the bottleneck, not the storage or the processing.
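The figures quoted above imply a message size, which a few lines of Python make explicit. The inputs are the round numbers from the conversation (about 60 megabytes per second, 50,000 to 60,000 messages per second); treating a megabyte as 10^6 bytes is an assumption, and the function name is illustrative.

```python
def implied_message_bytes(bytes_per_sec, msgs_per_sec):
    """Average message size implied by a byte rate and a message rate."""
    return bytes_per_sec / msgs_per_sec


PRODUCE_BYTES_PER_SEC = 60_000_000  # "about 60 megabytes", taken as 60 * 10^6 bytes/s

size_at_60k = implied_message_bytes(PRODUCE_BYTES_PER_SEC, 60_000)  # 1000.0 bytes
size_at_50k = implied_message_bytes(PRODUCE_BYTES_PER_SEC, 50_000)  # 1200.0 bytes
```

So the quoted rates correspond to messages of roughly 1 to 1.2 KB each.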

Yeah. Yeah, of course.

You could grind it up if you got ambitious with KSQL. I'm not saying that there isn't code path there, but if you just wanted to go in and out really fast, that's an example. If you're not consuming and you're just probably producing at a more sane rate, which is probably more like a thousand a second, like, how many vital sign messages are coming off your body? Well, that's a function of the sample rate, right? And that capability, like a thousand messages a second, that's a lot of telemetry at the disconnected edge. And then now you have all that headroom back in the Pi to run some pretty interesting KSQL scripts.

Yeah, absolutely. Well, I think we should leave it there. I look forward to seeing where else you take this in the real world, but now I mostly want to run this in my own little private world.
