The Embedded Frontier

#007 - Unlocking the Potential of AI in Embedded Systems with Daniel Situnayake

Jacob Beningo



Summary

In this conversation, Jacob and Daniel Situnayake discuss the future of AI and machine learning in embedded software development. They explore the challenges and opportunities of implementing AI and machine learning at the edge, and how tools like TensorFlow Lite for Microcontrollers and Edge Impulse are making it easier for developers to deploy models on resource-constrained devices. They also discuss the importance of balancing model accuracy with resource constraints and the potential for AI-generated models in the future. Overall, the conversation highlights the growing interest and potential of AI and machine learning in the embedded space.

Keywords

AI, machine learning, embedded software development, TensorFlow Lite, Edge Impulse, resource constraints, model accuracy, AI-generated models

Takeaways

  • AI and machine learning are being increasingly applied to embedded software development, opening up new possibilities for edge devices.
  • Tools like TensorFlow Lite for Microcontrollers and Edge Impulse are making it easier for developers to implement AI and machine learning on resource-constrained devices.
  • Balancing model accuracy with resource constraints is a key consideration in embedded AI development.
  • The future of embedded AI and machine learning holds the potential for AI-generated models and more sophisticated applications at the edge.

Jacob (00:02.277)
Hello, Daniel. Thank you for joining us on the Embedded Frontier podcast today. How is everything going with you?

Daniel Situnayake (00:08.109)
Hey, really good, Jacob. Thank you very much for having me.

Jacob (00:11.629)
Excellent, absolutely. Yeah, we appreciate you joining us. Today we are going to be talking about AI and machine learning and where the future of that lies with embedded software development, and maybe just software in general. There's so many fun things that are coming along. And I know a lot of the audience struggles with figuring out where to put AI and machine learning, how to implement it, where it's going. And so we really appreciate an expert like you being able to join today to chat with us.

Daniel Situnayake (00:37.612)
Yeah, I'm excited to talk. I mean, this is a really new field, the kind of application of AI and machine learning to the edge. So there's a lot of stuff that is still getting figured out. So hopefully, you know, we can do a little bit to spread the word.

Jacob (00:52.559)
Perfect. Now, just because, you know, we've had several conversations, I know about your background, but the audience may or may not. Hopefully they do, but would you mind giving us maybe a little bit of a background about your journey? You know, kind of how you got into AI, what your background is, and even how you got started with Edge Impulse?

Daniel Situnayake (01:13.952)
Yeah, absolutely. So I'm the director of machine learning at Edge Impulse. I run our research team that's looking at how we apply AI and machine learning to the edge. The company's been around for about five years and I was lucky enough to get involved right at the beginning. But prior to that, I'd been working at Google on the TensorFlow team and was involved with launching the product called TensorFlow Lite for Microcontrollers. And if you're not familiar,

TensorFlow is basically a set of tools for training deep learning models, a certain type of machine learning model, the big, famous ones that are behind all of the big recent advances in AI. And TensorFlow is Google's in-house developed framework for training these models. But then there's a big difference between training and deployment. And with traditional server-side AI, you're running models on

beefy GPUs in the cloud. But if you're thinking about deploying to other places like cell phones, for example, or IoT devices, you need a different solution. So TensorFlow Lite was Google's solution for getting models deployed down onto cell phones and embedded Linux type targets. And then Pete Warden, who I worked with at Google, came up with this idea of creating a framework that's basically like highly portable C++ for

bringing models down to really constrained devices like microcontrollers. And so I worked on that product at Google and ended up writing a book about it called TinyML. And this book was supposed to be, you know, basically a high-level introduction to everything you need to know to get started. And it ended up being, you know, an inch or two thick.

While I was working at Google, I met this guy, Zach Shelby, who was a VP at Arm. He ended up leaving to found the company Edge Impulse. And Zach managed to do a demo where he pretty much reproduced what required this two-inch-thick book in about a five-minute demo of Edge Impulse, with the prototype of the product at the time. So that kind of blew me away.

Daniel Situnayake (03:38.926)
I decided I needed to jump on board and see where this was going to go. Prior to that, I have a varied background. I've worked all over engineering, on everything from AI call center agents back 15 years ago through to front-end web development and social media stuff. So pretty much everything, but I've always gravitated

towards AI and machine learning, and ended up just kind of learning a lot about it on the side and finding opportunities to work on it at my day job until finally, at Google, I started doing it full time. So it's been an interesting journey. You know, I've made that journey from not really knowing much about it at all to feeling confident enough to write a couple of books on the subject. It is a long journey, and part of what I hope to do with the work that we do at Edge Impulse is to

make it easier to make that journey so that if you're an engineer building real world applications, you don't have to spend a ton of time learning all the ins and outs of how machine learning works in order to make use of it as a technology.

Jacob (04:51.173)
Excellent. Now, you mentioned there a little bit about the book and TensorFlow Lite for Microcontrollers, along with the Edge Impulse tools and things like that. What is the place of TensorFlow Lite for Microcontrollers today? Is that still something that embedded developers should be looking at for their machine learning endeavors? Has it stayed up to date with the times, I guess?

You mentioned being able to do something in five minutes versus maybe what you're doing with TensorFlow Lite.

Daniel Situnayake (05:21.293)
Yeah, it's a

Daniel Situnayake (05:26.508)
Yeah, exactly. I mean, it's been a really important sort of infrastructure piece for all of this stuff, because what Google did essentially was make a standard for how you represent these fundamental low level operations in the execution graph of a deep learning model. And then TensorFlow Lite Micro provides a kind of set of those implementations that are able to run on embedded hardware. And

They worked with ARM to create optimized versions of some of these operators and with other vendors and other Silicon IP companies to create optimized versions of those ops. So it's sort of a nice layer to translate between the high level languages or high level frameworks where you're defining these kind of execution graphs and then actually what's running on the hardware.

Companies like ours have sort of built on top of that and used it as a foundation to create our own technologies. We have something called Eon Compiler, which essentially takes some of those underlying operator implementations, but then runs them in a more efficient manner and does all sorts of optimizations on the compute graph itself to give you different trade-offs between latency and memory use, for example. But it's really cool to have this sort of Rosetta Stone of

underlying op implementations that are portable to a lot of different devices. Of course, when this stuff all got started, we were talking about running deep learning models on general purpose compute. And there are things like CMSIS-NN, which gives you these sort of vector extensions that allow you to accelerate some operations on particular Arm devices.

But now, you know, six, seven years later, there's been this huge explosion of variety in all sorts of different accelerators, which are designed from the ground up to run deep learning models quickly. So all of those vendors that are creating those typically have their own tool chain, which takes a graph, does whatever kind of fancy things to optimize it, and then turns it into whatever

Daniel Situnayake (07:53.494)
set of instructions are necessary for this particular accelerator. And all accelerators are going to have a slightly different approach: maybe this one is trying to accelerate certain convolutional architectures for computer vision, maybe another one is focused on accelerating certain fully connected deep networks.

Each of those is going to be good for certain applications and not for others. So now there's this giant universe of all sorts of devices that all have different trade-offs and benefits. And it can be a little bit hard to navigate if you're trying to figure out what hardware to use in a project. So that's another thing that we try to do at Edge Impulse: give you some engineering tools to help you explore that space. Because if you have a model and you're wanting to know,

you know, what hardware do you need to make sure the model runs quickly enough and is energy efficient enough and so on? That's a lot of trial and error, and even going through the workflow of compiling a model for a certain device and getting the toolchain set up can be a huge amount of work. So it's a more complicated world now, but we also have some tools that are able to help you navigate that complexity.

Jacob (09:18.029)
OK, excellent. So there is some stuff there that can help us identify whether we should be using a Raspberry Pi Pico or whether we should be using something that's more like a Cortex-A processor that's got a little bit more juice to it, or a GPU, for example.

Daniel Situnayake (09:29.794)
Mm-hmm.

Yeah, exactly. We try to... it's interesting, because it's a two-way problem, right? You don't just start with a set model and then put it on some hardware. You can do that, right? But it really should be a design process where there's this kind of iterative feedback loop of trying to find a model architecture that works and then profiling it on the hardware that you think you might need, or the hardware that you want to run it on,

and seeing how it performs. Is the latency good enough? Is the model small enough to actually fit in the available memory? Does it fit alongside your application as well? And OK, maybe it doesn't fit. So then you go back to the architecture of the model and tweak it slightly to make it fit a bit better. And then you try it on the device again, or maybe you try it on some other devices. And it's just this iterative development process. So the faster you can go with that loop,

the better, because you're going to get to a solution more quickly. So we provide some tools that help with that: for example, estimating latency and memory use on specific targets, so that you can just upload a model or create a model in our platform, and we can tell you immediately which targets it's going to have acceptable latency on. But then the next step beyond that is to actually do an automated search of the space of possible models

to find models that fit on the hardware and give you the best results. You can do that by hand, trial and error, but it's much nicer if you can use a cluster of GPU-enabled machines in the cloud to do that for you. And you just sit back and get a nice ranked list of results telling you the performance of the model and how well it's doing the task that you've trained it for.
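That filter-and-rank loop can be sketched in a few lines. This is a toy illustration, not Edge Impulse's actual algorithm: the `estimate_*` cost formulas, the budgets, and the accuracy numbers are all invented stand-ins, where a real tool profiles compiled models on actual targets.

```python
# Toy sketch of an automated model search: enumerate candidate model
# configurations, estimate their footprint, keep only those that fit the
# device budget, and rank the survivors. All numbers here are made up.

RAM_BUDGET_KB = 256      # hypothetical MCU RAM available for the model
LATENCY_BUDGET_MS = 100  # hypothetical real-time deadline

def estimate_ram_kb(filters, layers):
    # Pretend RAM grows with layer width and depth.
    return 8 * filters + 12 * layers

def estimate_latency_ms(filters, layers):
    # Pretend latency grows with total compute.
    return 0.5 * filters * layers

def search(candidates):
    viable = []
    for cand in candidates:
        ram = estimate_ram_kb(cand["filters"], cand["layers"])
        latency = estimate_latency_ms(cand["filters"], cand["layers"])
        if ram <= RAM_BUDGET_KB and latency <= LATENCY_BUDGET_MS:
            viable.append({**cand, "ram_kb": ram, "latency_ms": latency})
    # Rank the models that fit by their (assumed, pre-measured) accuracy.
    return sorted(viable, key=lambda c: c["accuracy"], reverse=True)

candidates = [
    {"filters": 8,  "layers": 4,  "accuracy": 0.89},
    {"filters": 16, "layers": 6,  "accuracy": 0.93},
    {"filters": 64, "layers": 12, "accuracy": 0.97},  # too big for the budget
]

ranked = search(candidates)
for model in ranked:
    print(model["filters"], model["layers"], model["latency_ms"])
```

The biggest, most accurate candidate is discarded because it blows the RAM budget; of the models that fit, the more accurate one is ranked first, which is the shape of the ranked list described above.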

Daniel Situnayake (11:28.93)
So it's a good time to be coming into this type of development if you're thinking of developing an application in this area. We've just come out of the woods where previously it was a real pain to try and navigate all this stuff. And now there are some pretty solid tools and techniques for exploring all of this chaos.

Jacob (11:49.645)
Excellent. And you've mentioned, you know, if people have an interest in this or in doing an application in this area: what type of applications are people doing? I've talked with a lot of embedded developers who have an interest in AI and machine learning, but it's still something where they're not sure where they should actually apply it, right? It's almost a tool looking for a problem to solve, but not really, because there are problems to solve; maybe the developers just don't know them. So what problems

do you see people applying machine learning to?

Daniel Situnayake (12:22.71)
Yeah, it's a really good question, especially now that, you know, there's been this huge boom in AI and so much hype around AI, and it really begs the question: what is AI, and which parts of it do we actually care about? So what we particularly focus on, in my little corner of the world, is using machine learning algorithms to understand sensor data. And the way to think about this is if you have a device with a sensor, whether it's an

imaging sensor, microphone, any kind of sensor that's giving you a time series, from accelerometers through to RF strength, or pretty much any kind of signal that's coming in. You're typically going to run into this problem where you're building an application that's connected, but it's prohibitive, in terms of either cost or energy efficiency or the

design constraints that you have, to send all of that sensor data off the device. You're going to tear through any energy budget you have if you're constantly sending images to the cloud, or audio, or something like that. So that's meant that you've got a big limitation where you may have sensors on a device collecting data, but you can't really do that much with the data, because to really make any interesting inference on it, you're

going to have to send the data to the cloud, but you can't actually send enough of the data to the cloud to do anything useful. So there's some statistic that basically the vast majority of sensor data that's collected on device is discarded almost immediately, and there's no real benefit or use to it. And previously, you could do some stuff on device; there's pretty sophisticated DSP out there for all sorts of different tasks.

Essentially, though, you're going to be doing something simple like thresholding and throwing away a lot of the data. So the thing that edge machine learning allows you to do is actually make some of these more sophisticated decisions on the device. And the way I think of it is as a form of compression. So what the model is doing is taking the raw sensor data as input, and it could be from one sensor, like an image sensor,

Daniel Situnayake (14:47.488)
Or it could be a combination, so like an image sensor and a microphone, or a microphone and an accelerometer, or anything, pretty much. And it takes that raw input and kind of digests it and turns it into something much more compact that is much more sort of dense with meaning, right? So an example of that could be, imagine you have a production line where there are little widgets

going past, and you want to determine whether the widget is perfectly manufactured or not, or maybe there's some kind of defect. And so the old-school way of doing that would have been having a camera, sending all the data to the cloud, doing some processing in the cloud to figure that out, or even having a person look at it. But with a machine learning model running on the device, when the image comes in from the sensor, the model can actually

make that call: does this widget look like it is perfect, or does it look like it's been mismanufactured? And then the only thing that needs to go to the cloud is that binary result, or maybe a score from 0 to 100: how likely is it that this particular widget has a problem? Or maybe you're not sending it to the cloud at all. Maybe you're using it to control the machine, and halting the line if the number of mismanufactured widgets gets

above a certain threshold or something like that. So you can probably imagine, you know, that's just one little example, but there's so many use cases, whether it's wearables, industrial work sites where there's not good connectivity, places where there's a real need for low-latency operation, like in safety. Or imagine a self-driving car. That's kind of one of my favorite examples of

an edge AI device. They're like the biggest and most powerful edge AI devices, but you couldn't do a self-driving car where the logic is in the cloud. Someone walks in front of your car, but you had bad cell service, so you just keep driving? But there are lots of examples like that on kind of a smaller scale too.
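The widget-inspection pattern can be sketched as a tiny loop. Everything here is hypothetical: `classify_defect` is a stand-in for a real trained model, and the toy "images" are just lists of pixel values. The point is the shape of the data flow, where the raw image never leaves the device and only a few bytes of score, or a local control decision, do.

```python
# Toy sketch of on-device inference as compression: digest a raw image
# into a compact defect score, and only ship the score off the device.

def classify_defect(image):
    # Stand-in for model inference: return a defect score from 0 to 100.
    # A real deployment would run a compiled neural network here.
    dark_pixels = sum(1 for px in image if px < 50)
    return round(100 * dark_pixels / len(image))

def inspect(image, halt_threshold=80):
    score = classify_defect(image)        # runs entirely on-device
    telemetry = {"defect_score": score}   # a few bytes, not a whole image
    halt_line = score >= halt_threshold   # local control decision
    return telemetry, halt_line

good_widget = [200] * 96  # toy "image": 96 bright pixels
bad_widget = [10] * 96    # toy "image": 96 dark pixels

print(inspect(good_widget))
print(inspect(bad_widget))
```

The bandwidth win is the whole argument: instead of streaming 96 pixel values (or a real multi-megapixel frame) to the cloud, the device emits one small number per widget, or acts on it locally.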

Jacob (17:02.873)
All right, excellent. We really appreciate that. Those are some great examples. And hopefully that spurs some ideas for folks on what they can use machine learning for in different applications and situations.

Daniel Situnayake (17:16.512)
Yeah, one rule of thumb I really enjoy for this: there's this acronym BLERP, B-L-E-R-P. And you can Google this if you search for edge AI BLERP. It stands for bandwidth, latency, economics, reliability, and privacy. And those are the five big factors: you know, if your application can benefit in a couple of those factors from deploying a model to the edge, then it's probably

worth exploring. And I won't go through them all individually, but there's a lot of good write-ups online.

Jacob (17:54.789)
Perfect. Now, are there any trade-offs that people should look at for: do I just hand-code an algorithm myself to try to figure out if something, for example, was defective, versus trying to use a machine learning model? Where does the real benefit come from using machine learning?

Daniel Situnayake (18:13.474)
Yeah, it's a cool thing to think about, really. So the idea behind machine learning is it's sort of turning traditional software engineering on its head. In the process of designing an algorithm, usually you'd look at some situation. You have a model of how this situation works, maybe based on physics or based on kind of your domain expertise and understanding of

the situation, and then you create an algorithm that encodes that understanding, write tests for it, and then try it out in the real world and see if it works. And hopefully it does. But it requires, you know, a lot of domain knowledge and a lot of engineering expertise. And in very, very complex situations, like, imagine keyword spotting, if you're trying to understand when a certain word has been said:

You could write some rules that allow you to understand when I said a certain word, but then for someone with a different accent and a different type of voice, it might not work. And you end up with this huge multiplication of all the rules that you need to write. And it can get really, really hard. So machine learning turns this on its head, and it's like, okay, what if you start with the data instead of the rules? And then we use an automated process

that essentially is allowing the computer to discover the rules based on the data. And then you end up with this algorithm and you still have to test it in the same ways, right? You still have to make sure it works in the real world and run a test data set across it and understand the performance. But you didn't have to actually sit there and hand code all of these rules. So it's very effective in situations where it's quite hard to model like the keyword spotting example thing that I gave.

Imagine with vision: like, it's pretty easy for a human to tell the difference between a cat and a dog, but it's not very easy to write a hand-coded algorithm that describes the difference between a cat and a dog and allows the computer to determine those. But that's actually quite a trivial thing to do with machine learning. You just have a ton of pictures of cats, a ton of pictures of dogs, and a certain model architecture that's optimized to work efficiently with

Daniel Situnayake (20:37.536)
image data, and then you can train the model and then test it and make sure it actually works. And you have this cat versus dog detector.
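The "start with the data instead of the rules" inversion can be shown with a deliberately tiny example. This is a nearest-centroid classifier on made-up two-number features, not how a real cat-vs-dog model works (that takes a deep network and real images); what it illustrates is that the decision rule is computed from labeled examples rather than hand-coded.

```python
# Toy data-driven classifier: "training" is just averaging the feature
# vectors per class, so the computer discovers the decision rule from
# labeled data instead of a human writing the rule by hand.

def train(examples):
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    # One centroid (mean feature vector) per class.
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }

def predict(centroids, features):
    # Classify by nearest centroid (squared Euclidean distance).
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Labeled data: (features, label). The feature values are invented
# stand-ins for whatever a real pipeline would extract from images.
data = [
    ([0.9, 0.2], "cat"), ([0.8, 0.3], "cat"),
    ([0.1, 0.9], "dog"), ([0.2, 0.8], "dog"),
]
model = train(data)
print(predict(model, [0.85, 0.25]))  # near the "cat" centroid
print(predict(model, [0.15, 0.85]))  # near the "dog" centroid
```

As the conversation notes, you still test the learned model the same way you would a hand-coded algorithm: hold out data, run it through, and measure how often it's right.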

Jacob (20:47.407)
Perfect. Now, when you do go through and design these models, how do you typically approach the balance between model accuracy and the resource constraints that you might find on an embedded target, to ensure that things work the way you need them to in the field?

Daniel Situnayake (21:05.09)
Yeah, that's a tricky thing sometimes. So it's this kind of balance and negotiation. You know, on the one side, you're trying to have good enough performance at the task that you're actually building an effective application. And on the other hand, you've got all these budget constraints to work within, whether it's not taking up the whole duty cycle on the device, or it could be even fitting the model on the device and being able to run it

fast enough for the application that you are trying to target. So one thing that's kind of interesting is that it's actually less of a common problem than you might think. It's quite rare that we end up with customers who are unable to deploy a model to a suitable device because it's too big. If you're looking at

sophisticated, crazy things like large language models and stuff like that, then yeah, you're probably not going to get that running on a microcontroller, right? But if you're doing things like keyword spotting, it is well within reach to have a model running in real time on an MCU. Vision, you know, you can get pretty quick, doing an inference every couple of seconds or so on even a kind of lower-mid-range microcontroller.

On the faster devices, this stuff is very, very doable. And there's a whole range of hardware targets, from MCUs with a bit of acceleration all the way up to these embedded Linux boards with essentially a small GPU that can run multiple simultaneous high-resolution streams of video and run inference across all of those in real time. So it's usually

possible to find something that's going to work. But the process generally looks like this: you can separate it into pre-processing, or DSP, or feature engineering, depending on which domain you come from; you've got the model itself; and then whatever kind of post-processing you're doing. And then you have the hardware. And there's this process of finding the balance of:

Daniel Situnayake (23:25.048)
How much of the work can I shift onto different parts of this? So can I shift some of the work onto a really good, solid DSP algorithm, which will take the raw data and create this kind of intermediate input? With audio, you might create a spectrogram, for example, and then you feed the spectrogram into the deep learning model. Another approach might be to just feed the raw audio into the deep learning model.

And depending on the target that you're going for, one of those approaches might be better than the other. Because if you have a target that's super fast at running deep learning models, because it's got an accelerator that's designed for that, then maybe it takes more time to do that input transformation with the traditional digital signal processing algorithm than it would to just feed the whole thing in. But if you're working on something a little bit more conventional, maybe an MCU that has the

ability to accelerate FFTs or something like that, maybe you can get the lift from that DSP part and then minimize the amount of work that the deep learning model is doing. So there are unique trade-offs like that for every device that you're targeting. And a really nice way to explore that is with this kind of automated tooling, where it tries a load of different variations, identifies the ones that are most promising, tries more that look like those,

and eventually finds the ideal balance. You can do that by hand, but it's kind of nice to have a system. So we have something built into Edge Impulse called the Eon Tuner, which does that work for you.
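To make the DSP front-end side of that trade-off concrete, here is a naive pure-Python DFT of the kind a spectrogram frame is built from. This is only for illustration: a real pipeline would use an optimized FFT (hardware-accelerated, or a library like Arm's CMSIS-DSP on MCUs), not this O(n²) loop.

```python
import math

# One frame of spectral magnitudes via a naive DFT. A spectrogram is a
# stack of frames like this, and it is the "intermediate input" that a
# DSP front-end hands to the deep learning model.

def dft_magnitudes(frame):
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):  # only the non-redundant bins for real input
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        mags.append(math.hypot(re, im))
    return mags

# A pure tone with exactly 4 cycles per frame: all the energy should
# land in frequency bin 4.
n = 64
tone = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
mags = dft_magnitudes(tone)
print(max(range(len(mags)), key=mags.__getitem__))  # index of the peak bin
```

The point of the trade-off discussed above: this transformation costs compute on the MCU, but it hands the model a far more digestible input than raw samples, so the model itself can be much smaller.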

Jacob (25:02.339)
Excellent. Yeah, speaking about some of these tools: what are some of your favorite tools for doing machine learning on the edge? And you can certainly include Edge Impulse stuff. I've used the tools quite a bit, like in trainings that I've done and things like that, and even a little bit of development work. I think they're fantastic. So you don't have to be shy and not mention your own tools. Feel free to mention what you like.

Daniel Situnayake (25:24.321)
haha

Daniel Situnayake (25:28.076)
Yeah, that's kind of you to say, thank you. You know, I've been really happy with the reception we've had for Edge Impulse over the years. I think there was just such a huge barrier to entry with all of this stuff back in the day. You really had to learn massive amounts of stuff that, as an embedded engineer, you probably wouldn't have that much interest in usually. And it's just getting in the way when you're trying to solve an actual direct problem.

We've tried to build a platform that minimizes that learning you have to do with all these different Python tools, all of the model training tools and things like that. And if you just want to come in and train a model to solve a problem, you can pretty much do it by clicking through. And we have good suggestions for what to do, and default architectures, and really nice DSP algorithms that we've built and optimized. So if you're trying to get

started with this kind of stuff and you haven't got much experience with it yet, I'd recommend some really good tutorials that we have around Edge Impulse on how to go from kind of zero to having something working in literally five minutes. It can be really quick. And then it's a matter of building up your data sets and improving your model to get production ready. But there's a whole world of

really in-depth, highly technical tools out there for people who have a bit more experience with machine learning. So things like TensorFlow, PyTorch, JAX, all these frameworks that are basically designed to help you define deep learning algorithms and build them. And those people who are working directly with those tools are really important to us as well, because that's a lot of the ML experts

who are working within companies to try and solve these problems. But the problem that they often have is they don't have a lot of embedded experience. You know, the crossover of people who have deep ML experience and deep embedded experience is not that big. I would say I'm, you know, a sort of intermediate-level embedded engineer. We have a whole team of absolutely incredible, unbelievable embedded engineers at Edge Impulse, and it's sort of humbling

Daniel Situnayake (27:50.2)
to work with them and see how much I have to learn. But I know a bit more on the ML side. And there are very few people in the world who can do all of it. And so we've built a bunch of tools as well that can help you go from that ML world. So imagine you're training a model in your own scripts in TensorFlow, or in a notebook, which is an interactive computing environment that's very popular with the sort of ML and data science crowd. And you can just import our library and then pass

a model that you've trained to a function in our library and run it. And we will turn it into a zip file containing C++ that you can run on a device. Or we'll turn it into a firmware that you can flash on whatever kind of exotic accelerator you're using. So we try to let you start from either point: either you have a bit more embedded experience and you need a bit of help with the ML stuff, or you have the ML experience, but you need some help with the embedded stuff.

There are different kinds of lines you can take. And it gets even more important once you start working with all of these different crazy accelerators and heterogeneous compute, because there's no one who knows all of the intricacies of all of these different things. So the tooling is really important. And the companies that are creating these devices and this IP for the accelerators

do their best with creating tooling and documentation and that type of thing. But it just becomes such a lot of work to learn all of these individual platforms. So having kind of a unified approach is really time-saving.

Jacob (29:30.447)
Absolutely. Especially since the technology evolves at such a rapid rate that probably by the time you learn a tool, the industry has moved on to the next one, right?

Daniel Situnayake (29:39.264)
Exactly, exactly. And you never know, you kind of make an investment in learning something and it turns out to be a dead end. So being able to operate sort of a level above that is quite nice.

Jacob (29:51.255)
Absolutely. It kind of helps teach you the processes that are required to think through and solve the problem. And then even though the underlying technologies may change, you can still use whatever technology tool comes along to help you solve the problem because you have that high level of experience.

Daniel Situnayake (30:04.458)
Yeah, that's such an important point. I think ML is all about processes and workflows. You know, the technology gets so much attention, but the things I'm talking about today, a lot of these model architectures for doing useful stuff on the edge, this is not like cutting-edge ML research. In some cases it is, and we're doing some of that work at Edge Impulse, but a lot of these are pretty simple machine learning algorithms. But the trick is: how do you

build an effective application that works in the field, and how do you have confidence in the application that you've built? And so you go through that workflow of collecting data, training a model, testing it on device, evaluating how well it works, and having that feedback loop and flywheel that helps your application keep getting better. It's a little bit different than conventional software engineering. So it's important to kind of learn how to think about these tools,

and how to apply them, and then how to evaluate them.
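That collect/train/evaluate flywheel can be sketched as a loop. Every function here is a hypothetical placeholder and the accuracy curve is invented; a real project would plug in its own data pipeline, training code, and on-device profiling.

```python
# Skeletal sketch of the ML development flywheel: collect data, retrain,
# evaluate on the real target, and repeat until the application is good
# enough. All the functions below are stand-ins.

def collect_data(existing):
    # Pretend we gather one more labeled example from the field.
    return existing + [("sample", "label")]

def train_model(dataset):
    # Pretend training returns a model object.
    return {"trained_on": len(dataset)}

def evaluate_on_device(model):
    # Pretend accuracy improves as the dataset grows, saturating at 0.95.
    return min(0.95, 0.5 + 0.05 * model["trained_on"])

dataset, target_accuracy = [], 0.9
model, accuracy = None, 0.0
while accuracy < target_accuracy:   # the feedback loop / flywheel
    dataset = collect_data(dataset)         # gather more field data
    model = train_model(dataset)            # retrain
    accuracy = evaluate_on_device(model)    # measure on the real target

print(len(dataset), round(accuracy, 2))
```

The structural point, as opposed to conventional software engineering, is that the exit condition is a measured evaluation metric rather than a passing test suite, and growing the dataset is as much a part of the loop as changing the code.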

Jacob (31:06.467)
Absolutely. I guess kind of circling back a little bit, I just wanted to share an example with everybody, because there's hand coding and then there's these tools that we can use. Probably five or six years ago, I had a project I worked on: it was basically a touchless sensor that shot infrared light up, and then you had some photodiodes that would detect a hand swiping across. Was something present? Did it move left, right, up, down, that sort of thing, right?

A trivial thing for a machine learning algorithm to determine. But I worked on this project pre-machine-learning, pre-Edge-Impulse. And I remember collecting all this data and spending probably up to six weeks hand coding an algorithm that would very nicely, you know, detect these different gestures. And then just a couple of years ago, I came across all this data and was like, hey, let me use some machine learning tools and see how hard it is to get the same results here. And instead of spending six weeks, it was,

you know, a day or less to get something that had a very high accuracy. And it was like, man, I wish these tools were around five, six years ago. But it kind of gives at least an idea of how much these things can solve some of your problems. But maybe more importantly, it gives you some idea of where the technology might go, which kind of leads to my next question: where do you see all of this going? I mean, we've seen some pretty rapid developments over the last five years. Where do you think

some of this is going into the future for machine learning at the edge and for embedded systems?
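Jacob's photodiode example above can be sketched with a toy classifier. This is a hypothetical reconstruction of that kind of setup, assuming two photodiode channels and using the simplest possible feature (which channel peaks first); the real project's hardware and algorithm are not described in detail here.

```python
# Hypothetical two-photodiode swipe detector: a hand passing over the
# sensor reflects light onto the left diode first (left-to-right swipe)
# or the right diode first (right-to-left swipe). The hand-coded rule
# below keys on which channel peaks earlier -- exactly the kind of
# pattern an ML model can instead learn from labeled recordings.

def peak_time(samples):
    """Index of the maximum reading in one photodiode channel."""
    return max(range(len(samples)), key=lambda i: samples[i])

def classify_swipe(left_channel, right_channel):
    if peak_time(left_channel) < peak_time(right_channel):
        return "left-to-right"
    return "right-to-left"

# Made-up readings: the left diode peaks at index 2, the right at index 5.
left = [0, 3, 9, 4, 1, 0, 0]
right = [0, 0, 0, 2, 6, 8, 3]
print(classify_swipe(left, right))  # left-to-right
```

A rule this simple breaks down once you add more gestures, noise, and varied hand speeds, which is why the hand-coded version took weeks and the learned version took a day.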

Daniel Situnayake (32:41.004)
Yeah, I love that example, by the way. That's just a really cool sort of highlight of how a different approach to building an algorithm can really save a lot of time, which is what we're all about, really. In terms of the future, I like to think in terms of near term, and then further future, where we can kind of have a bit more fun with like weird, weird stuff. But near term,

Jacob (33:06.105)
Yeah.

Daniel Situnayake (33:08.3)
Like the technology is here, right? We've got these amazing, super capable, extremely energy-efficient devices that can run really genuinely useful workloads on the edge for ML. And there are tons of applications already out there that are making use of this stuff, from like medical devices to industrial monitoring, all sorts of different applications.

You know, some people are skeptical about ML on device, fewer and fewer these days, but at the beginning, people were skeptical. But even five years ago, basically every single cell phone already had a keyword spotting model running on it for the wake word detection for voice assistants, like the Google Assistant or Siri.

Now, I mean, obviously that hasn't changed at all, but there are just even more devices that have this stuff built in. And still, we're relatively early in the kind of broader awareness of this technology: how it can be deployed, what kind of applications work well, and what sort of processes organizations need in order to build a big data set, create a solution, and evaluate it. So I think we're going to just see this explosion in

companies that are figuring out how to use this stuff. Like we're starting to see it now where more and more companies are just coming together and you know, the research teams are coming together and thinking, hey, we've identified some applications for machine learning and we'll talk with a customer and they'll have a list of like dozens and dozens of potential applications where they're basically adding this little bit of ML to their existing products in order to create new experiences that weren't possible before.

So it's going to be quite a rapid period of change over the next couple of years as this stuff starts to roll out in bigger organizations that take a little bit more time to get rolling. But once they're rolling, it's on. And then longer term, we've all seen the hype with AI over the last couple of years. And these big, crazy models, the ChatGPTs,

Daniel Situnayake (35:29.038)
DALL-E and things like that, for generating text, generating images and audio. Those will be coming to the edge, right? It's a matter of time, and it's a matter of the hardware getting more capable and more efficient, but more importantly, the models themselves getting more efficient and smaller. Like, some of these models are hundreds of gigabytes in size, right? So you can't really run that anywhere. But people are realizing

part of the reason why these are so big is because the algorithms being used are quite inefficient. And so there's a lot of incentive, in terms of reducing the cost of running these models, to make them more efficient, make them run more quickly on less capable hardware. And we're already starting to see the crossover point where you can run some of these things on very small, efficient devices. Meaning you've got

companies that are making cell phones and software for cell phones that have on-device foundation models and generative AI. So it's only a matter of time. And, you know, a decade from now, I think we're going to be seeing some really crazy stuff, with things like generating video in real time on device. Like, what if you had a little set-top box that's plugged into your TV and it just generates TV shows for you based on, you know, what kind of mood you're in?

All kinds of weird and wonderful things that no one's even thought of yet are going to be possible. But we're still in the very early days of that journey, so it's going to be really exciting to see. It was the same, you know, six or seven years ago, where general-purpose microcontrollers had gotten powerful enough that they could run some of these more efficient, basic machine learning models.

Both these technologies existed side by side for a while, and then some people realized, wait, they actually work very well together. And it takes a while to roll out, but we'll see the same thing with the bigger, crazier stuff.

Jacob (37:38.053)
Excellent. Do you think we'll get to a point where, for example, I can essentially either tell it or list out my requirements, and we'll have AI generate my edge machine learning model that will be deployed to my device? Maybe I say I want to be able to predict the maintenance on this motor from a temperature and humidity combo, give it some requirements, and just let it figure it all out on its own.

Daniel Situnayake (38:07.5)
Yeah, I mean, we're trying this already, right? So one of the biggest hurdles with machine learning in general is just data and the availability of data sets because data is expensive, right? And data is rare and sometimes it's even impossible to get. So imagine you're trying to build a model to detect some kind of failure mode in a system, but the system very, very rarely fails.

So fundamentally, you're going to have a hard time getting data. But even in normal circumstances, like imagine you're building a wearable that can track somebody's, I don't know, athletic abilities or their performance in a particular sport. In order to have something that works across lots of different people, you're going to need to train that model on data from lots of different people. And if you're just developing the system, that means you're going to have to pay

lots of different people to do this sport. And you have to collect data from them while they're doing it. And then you have to label that data, and it can get really costly really quickly. And so that's always a challenge. But with some of these new generative models, you can actually create data synthetically. So what that means is you're telling the model, hey, I want, I don't know, imagine we're doing a vision application and we're trying to identify when there's a dog walking around in the backyard,

or something like that. What if you can have the model generate lots of pictures of dogs walking around in the backyard and create a data set for you? And then you can use that to train the smaller model that runs on device. So effectively, you're taking knowledge that's encoded in this big model, turning it into data, and then using that data to train a small model. And so that's kind of the first step towards having the system where you can just describe what you want and

the model will be created. And as we figure out the workflows around this and what works and what doesn't, we'll gradually get closer and closer to having what you described, where it's just like, OK, I have some domain expertise. I know what I want to build. I know the conditions I want. Maybe I've got a test data set that I want to have good performance on. And then you can just unleash the system to go and explore and figure out a way to solve that problem. It's a little way off, I think.

Daniel Situnayake (40:33.358)
There's always going to be this question of, the most important thing is the domain knowledge and the expertise that the developer has. So how do you make sure that they are integrated into this workflow, and that the system is kind of working with them to make sure that the thing that's being built actually works? That part will take a while to figure out, but I do think, yeah, this stuff is going to continue to get easier and easier with the application of AI.
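The distillation idea Daniel describes, turning a big model's knowledge into synthetic data and training a small on-device model on it, can be sketched as follows. Everything here is a stand-in: `big_model_generate` plays the role of a large generative model (which in practice would be an image or audio generator you prompt, e.g. "a dog in a backyard"), and the "small model" is a toy nearest-centroid classifier.

```python
import random
from statistics import mean

random.seed(1)

def big_model_generate(label, n):
    """Stand-in for a large generative model emitting labeled samples.
    Here it just produces 4-dimensional feature vectors per class."""
    center = 1.0 if label == "dog" else -1.0
    return [([random.gauss(center, 0.5) for _ in range(4)], label)
            for _ in range(n)]

# Step 1: distill the big model's "knowledge" into a synthetic data set.
synthetic = big_model_generate("dog", 200) + big_model_generate("no_dog", 200)

# Step 2: train a small, on-device-sized model on that data.
def train_small_model(data):
    centroids = {}
    for label in {y for _, y in data}:
        vecs = [x for x, y in data if y == label]
        centroids[label] = [mean(col) for col in zip(*vecs)]
    return centroids

def predict(centroids, x):
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

model = train_small_model(synthetic)
print(predict(model, [0.9, 1.1, 1.0, 0.8]))
```

The structure is the point: the expensive knowledge lives in the generator, the cheap artifact that ships is the small model trained on its output.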

Jacob (41:04.151)
Excellent. So if we look at developers and trying to help them start to figure out these processes and tools, and even imagine the applications and where they can apply this in their day-to-day jobs, what do you recommend? Where can people go? What should they be checking out? And that sort of thing.

Daniel Situnayake (41:24.546)
Yeah, cool. I mean, I'm kind of shamelessly first going to plug a book that Jenny Plunkett, one of my coworkers, and I wrote recently, which is called AI at the Edge. And we kind of wrote it with this in mind, right? It's the roadmap and high-level guide to thinking about this space. And, you know, how do you solve problems with Edge AI? How do you make sure that you're doing so effectively and your application really works? What kind of

organization do you need to build? Like, who needs to be on the team working on these kinds of projects? All the sort of stuff that's adjacent to the technical stuff but doesn't necessarily get covered that much, even though it's really the important stuff for making a successful application. So we've got a lot of good resources in there, and you can actually get that book for free from the Edge Impulse website. There's a PDF, or if you want a physical copy, it's

published by O'Reilly, but I'll make sure that you've got the link to that if anyone wants to go and download it. There's also just a ton of resources online. For our platform, there's so much documentation, tutorials, guides, example projects from people in our community who have created projects to share with the world, to show people how to do this stuff. There are Coursera courses. There are some awesome courses from

Harvard University on their online learning platform around TinyML. There are a whole load of talks on YouTube. If you really want to get into the deeply technical stuff, search for TinyML on YouTube. And there's the TinyML Foundation, who basically do a lot of work to promote this field, do regular interviews with people, presentations from people.

And there are live events, and they're also recorded on YouTube. And those tend to get really, really into the nuts and bolts, which can be really fascinating. So yeah, there's just a ton out there if you search for edge AI or TinyML. TinyML tends to refer to the smaller end, you know, very resource-constrained and energy-efficient devices. And then edge AI is broader, just anything on the edge, which could potentially be a self-driving car

Daniel Situnayake (43:47.712)
or something more like cell phone size.

Jacob (43:51.909)
All right, perfect. Any recommendations for conferences or anything like that that people should check out as well?

Daniel Situnayake (43:57.664)
Yeah, so actually I can give a little plug for our own conference, which is coming up in September. It's called Imagine, and it's online and in person at the Computer History Museum in Mountain View, California, on September 24th. And so we're bringing together a bunch of people from industry and from research to talk about

what's happening in Edge AI, you know, what are people actually building in the real world to solve real problems? And how is this technology being deployed in industry, making consumer products, and really, actually making a difference in the world? There's also a really great conference, or set of conferences, put on by the TinyML Foundation. If you just search for TinyML Foundation, they have Europe, Asia, and US conferences.

So if you're interested in the field, that's a great way to go and just meet a ton of people who are really working on solving some of the big problems.

Jacob (45:02.297)
Perfect. Yeah, thanks a lot for joining us and sharing your knowledge with us. We really appreciate it. Do you have any final thoughts or recommendations that you'd like to leave with the audience? Maybe even where they can reach you if there's questions, or follow you on LinkedIn, that sort of thing.

Daniel Situnayake (45:16.536)
Yeah, for sure. Yeah, feel free to follow me on LinkedIn. And if you want to reach out and chat, I'm just daniel@edgeimpulse.com. You know, always happy to talk with people who are doing cool stuff. And I'd just say, if you are interested in this at all, the barrier to entry is very low now. You can really get started and build something in a few minutes that works end to end. Literally, like, you can, if you have a

dev board from one of our dozens of supported vendors, you can plug it in, capture data from the device's sensors, train a model, and deploy it to the device in literally a couple of minutes. And we also have loads of pre-built examples and projects that are already out there that you can look at. So don't feel daunted. This is a big field, but to dip your toes in is very, very straightforward, very easy now. And it's quite

a fun way of building things. I think if you haven't worked with machine learning before and you give it a try, it's like this "wow, this is so cool" kind of moment, which is, I think, what technology is all about, really. So have fun and don't be shy about trying stuff out.

Jacob (46:34.341)
All right, awesome. Thanks again, Daniel. I really appreciate it. It was a pleasure to talk with you, and I look forward to talking with you in the future.

Daniel Situnayake (46:40.534)
Likewise, Jacob, thank you so much for having me.

Jacob (46:44.101)
Thanks.