EDGE AI POD

What happens when AI learns from the fire hose—and tests itself on silicon

EDGE AI FOUNDATION


What if your model pipeline started with a simple goal—your dataset, your target chip, and your latency or energy budget—and ended with measured results on real hardware? We sit down with Model Cat CEO Evan Petritis to explore how AI can build on-device AI through a closed loop that’s grounded in silicon, not estimates or hopeful benchmarks. From a live demo to a tour of their “chip farm,” we dig into how the platform searches architectures, tunes hyperparameters, and validates performance using vendor kernels and compilers across MCUs, MPUs, and specialized accelerators.

We share the story behind the rebrand from Eta Compute to Model Cat and why the shift matters: AI research moves too fast for traditional, component-by-component toolchains. Evan breaks down five pillars for trustworthy, autonomous model creation—closed-loop goals, reality grounding, system-level intent, modular learning from new research, and a single-step, transparent experience. You’ll hear how teams can upload datasets, get automated analytics on splits and distribution shifts, set constraints like sub–5 ms inference or energy per inference, and see success predictions before training even starts.
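The goal-plus-constraints input described above can be sketched in a few lines. A hedged illustration only: these names are not Model Cat's actual API, they just mirror the inputs discussed in the episode, and the energy arithmetic is the standard power-times-time conversion.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative only: NOT Model Cat's real API. Mirrors the inputs from the
# episode: a dataset, a target chip, and optional latency / energy budgets.
@dataclass
class BuildGoal:
    dataset: str
    target_chip: str
    max_latency_ms: Optional[float] = None   # e.g. "sub-5 ms inference"
    max_energy_mj: Optional[float] = None    # energy budget per inference

    def satisfied_by(self, latency_ms: float, avg_power_mw: float) -> bool:
        # Energy per inference = average power x inference time.
        # mW x ms gives microjoules; divide by 1000 for millijoules.
        energy_mj = avg_power_mw * latency_ms / 1000.0
        if self.max_latency_ms is not None and latency_ms > self.max_latency_ms:
            return False
        if self.max_energy_mj is not None and energy_mj > self.max_energy_mj:
            return False
        return True

goal = BuildGoal("pill-detection", "NXP i.MX 8M Plus",
                 max_latency_ms=5.0, max_energy_mj=2.0)
# A measured result of 4.2 ms at 300 mW is ~1.26 mJ per inference: passes.
print(goal.satisfied_by(latency_ms=4.2, avg_power_mw=300.0))
```

The point of the closed loop is that these checks run against measured numbers from the target chip, not estimates.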

The demo highlights the silicon library and how each device is profiled in depth—supported ops, kernel speeds, memory footprints—so accuracy, latency, and energy are measured on the actual target. Results come as clear Pareto trade-offs with downloadable artifacts that reproduce on-device. We also field audience questions on exporting to Keras and TFLite, supporting time-series and audio keyword spotting, integrating labeling partners, onboarding new MCUs and accelerators, and the roadmap toward neuromorphic targets and cost estimation.

If you care about edge AI, embedded ML, and shipping models that meet real-world constraints, this conversation shows a practical path forward: use AI to navigate the fire hose of research, then prove it on silicon. Enjoy the episode—and if it sparks ideas, subscribe, leave a review, and share it with a teammate who lives in notebooks but dreams in devices.


Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org

Conference Recap & Edge AI Vibes

SPEAKER_01

Good morning. Good afternoon.

SPEAKER_00

Yeah, well, yeah.

SPEAKER_01

You live there.

SPEAKER_00

So I live here. And it was great seeing you last week at The Things Conference.

SPEAKER_01

Yeah, that was great. You know, I haven't been to Amsterdam in a few years, and it's such a nice city.

SPEAKER_00

And did you get into the center, though? Because where we were for the conference wasn't really central.

SPEAKER_01

Yeah, I had a little extra time; I went to the museum and saw all the things. And it was a great event, so hats off to Nikki Geisman and the team over there at The Things Conference. It was the first time we had brought a bunch of the edge AI community there, and I know HMP has been a long-time supporter as well.

SPEAKER_00

And the scavenger hunt was awesome. I thought it was such a great idea, because this was actually the year they had double the partners, double the booths. It was huge this year, and it was really fun. So I can't even imagine what they're gonna do next year.

SPEAKER_01

I know, it was good. I was on four panels myself, and there were a bunch of edge AI panels. You're a busy man, Pete. It was a busy week. But, you know, I was telling people the intersection of low-power wireless and low-power AI is a good overlap, so there's cross-pollination there; that's fun. So I'm jet-lagged. And apologies to some of the folks who joined earlier; we had a little scheduling mix-up.

SPEAKER_00

Yeah, I saw the comments.

SPEAKER_01

There were people here; I think we'd originally accidentally scheduled it for 7 a.m., when we always have it Tuesdays at 8 a.m.

SPEAKER_00

And, you know, always blame it on daylight saving time, right?

SPEAKER_01

Okay, cool. So we're here, and we're going to bring on Evan and talk about his topics. They recently rebranded, so some people may have known them as Eta Compute, but now they're called Model Cat, which I think is, yes, I was just gonna say, a much cooler name.

SPEAKER_00

Their logo is also cooler, but it's so different. I was like, wait, who is this? But it's so exciting to have the CEO, Evan, on today.

Rebrands: From Eta Compute to Model Cat

SPEAKER_01

Yes. Oh, let me do a quick PSA before we bring Evan on. If you're in the area of Taiwan in November, or are interested in going, we're gonna have a two-day event in Taipei, November 11th and 12th: Edge AI Taipei 2025.

SPEAKER_00

I might be there.

SPEAKER_01

Cool, cool. Yeah, here, I'll show the little banner. If you go to that URL, there are still some seats available; it's a bit of a limited-seating thing. Jenny might be there with Edge Impulse, and your parent company, Qualcomm, I think will be there too. NXP, ST Micro, Dell Technologies, Advantech, Intel, Climax, and then we're gonna have professors from NTU and NTCU and a bunch of cool presentations on research. We're gonna have a mixer, a panel on startups, and a panel on deploying and commercializing edge AI. So really a packed couple of days to meet who's who in the space and learn what's what. Go to that URL and get some tickets, and I'll see you there. So that's my PSA.

SPEAKER_00

Well, another quick PSA. If you have any questions today, please ask them in the chat and we will answer them live um at the end of Evan's presentation.

SPEAKER_01

Yes, yes. And we may even interrupt Evan, depending on how juicy the question is, or how the questions bundle up. But yeah, definitely: this is interactive, and this is your opportunity to talk to the experts about what they're talking about. So please do that. Why don't we bring Evan on board here? There he is. Good morning, Evan. Excellent. Thanks for joining us. Let me get you into the guest configuration here. Let's see; I believe we lost you for a second. There it is. I want to give you a guest spot. There you go.

SPEAKER_03

Here I am.

SPEAKER_01

Okay, you got the prime time spot. Thanks. So you're the CEO of Model Cat. Yep. Welcome.

SPEAKER_03

As you said, people may have known us as Eta Compute, but we recently rebranded as Model Cat. We wanted to really reflect our new mission, which you'll be hearing about a bit today.

SPEAKER_01

Okay. Is there a backstory to Model Cat? Are you a cat person, or what's the story?

The Problem: AI’s Breakneck Velocity

SPEAKER_03

Uh yeah, you know, like all of these things, we're back-annotating the myth in real time. I am a cat person, and I do have a cat; in fact, we've got quite a few cats in our company. But it was more a thematic thing: the way cats always land on their feet and have this kind of poise, doing stuff that seems difficult without getting their whiskers out of joint. So we had this idea of using the cat for that reason. Got it, got it. And you're based in Sunnyvale, I think you said? Yeah, in Sunnyvale, California, in Silicon Valley.

SPEAKER_01

The heart of Silicon Valley. So that's exciting. That's right. Yeah. Well, I appreciate you joining us today and enlightening the audience. As Jenny said, throw some questions in the chat; we may chime in on occasion if we get a good bundle of questions. But again, I think you have a presentation you were going to go through, and you have a live demo too, right? So yeah, I've got a presentation and a demo. Yeah. We like live demos here, because, yeah, let's see it working. Okay, let's bring that up. Cool. So I think Jenny and I will kind of hang back a little bit. Okay. Speaking of rebranding, I was sharing with Evan before we got started how it's been about a year since we changed our own name, from TinyML to the Edge AI Foundation. So yes.

SPEAKER_00

Weren't you the mastermind of that decision too?

SPEAKER_01

Uh yes, I guess I was. Yeah.

SPEAKER_00

I think it was a good choice because it really does expand what we can talk about and do.

SPEAKER_01

Yeah, yeah. There are all kinds of generative edge AI things happening, and computer vision, and all kinds of goodies. So rebranding is, you know, important, and I appreciate that Evan and the team have made the leap. But yes, we did our own rebranding last year, and we still say "from TinyML to edge AI." We love TinyML; it's a critical part of this whole thing. It's a container. So, okay, cool. We'll step back a little bit. Evan, you have the floor; take it away.


SPEAKER_03

Great. Well, thanks for having me this morning, and welcome to everyone watching; I appreciate your time. I'm Evan Petritis, and I'm the CEO of Model Cat. As you just heard, we were formerly called Eta Compute. If you've been in this game for a while, you may have heard of Eta Compute, because we were originally one of the edge inference silicon companies. But we pivoted away from that; we have a new mission, and the name Model Cat comes along with that new mission. Our goal now is to build tooling that gives you truly autonomous AI model production. That is actually a pure software proposition, but we have a bunch of hardware components that you'll see a little of as we go through today, and you'll probably see echoes of the original origin story in them. So I wanted to spend a few slides, then we'll get to the demo, and I'll leave plenty of time for questions at the end. And I guess they're going to interrupt me if they get a really good one. A few slides first to explain our philosophy and what we're trying to do here, starting from what we see as the key observation: the velocity of change in the AI domain. I've been in the technology industry for a long time. I was very lucky to be part of the internet boom and see the internet grow from an academic network up to what it is today. That was a lot of change, but it never had the velocity of change that AI technology has seen. It's really like we're seeing a new innovation every day.
And the problem development teams have is this: you might read an abstract for a paper, or a press release or a data sheet from a chip vendor, and if you're trying to actually develop products, very often the question in your mind is, is this innovation important enough for me to pay attention to? Will this give me some meaningful business outcome that I need? Examples of these things: every day there are more neural network architectures. The chip guys have been extremely busy with different hardware architectures; we're still in the phase where hardware architectures are proliferating, even today. It could be something like a new approach to a training optimizer, or to training scheduling. So there's basically a fire hose of research and new ideas in the field. The issue is that some of these are probably applicable immediately, but it's very hard to know whether you should invest your time in investigating, even reading the whole paper. "Should I read the whole paper?" is actually a hard question for any of us to answer. So this is a characteristic of a technology with super-high velocity, unlike anything we've seen in the past. And the background environment to this is that if you're trying to build products, there's a pretty well-established toolkit for how you do good product development: good system architecture, decomposition. This is all software-engineering motherhood and apple pie. You build in testability, you stand on the shoulders of giants, you use platforms and mega-components, those platforms that we use all the time.
You maybe don't even think about the operating systems, or the IP protocol stack, the whole internet stack. You do a lot of incremental improvement, trying to reuse, because it's so hard to get stuff to work in the real world that once you've got it working, you don't want to lose that. So there's a lot of building on modules and components that you've already proven out or that have been shown to work. This is all great, but spray a fire hose of AI innovation at it, and none of these mechanisms, none of the processes in traditional product development, are capable of responding to and absorbing those innovations fast enough. It's a pretty tough situation if you're in the game of trying to build real products. We wanted to address this, and of course we aren't the first people to observe that this mismatch is a problem. People have tried many approaches. One is better libraries: maybe if I make my compiler better, or my training setup better, or my cluster management better. This component-by-component improvement is part of the story, but it hasn't really been enough. Another angle is the whole-product angle: the idea that we just need an end-to-end toolchain that gives a development team everything they need. That gets part of the way as well, but has not been dramatically successful. And the one that many players come back to, because it's actually known to work, is the canned solution: here's a canned solution for problem X.
This is known to work, and it does work, but it gives you these spot points in the product space where you've got a solution, and then getting from that solution to something that's useful for your application is not always easy. So none of these have really been broadly successful, and we set out to ask how we can actually solve that problem today. I really like this stock image, or rather don't like it, in the sense that they've put the word AI on a chip. It must be easy, hey? There's the picture. In some ways that's what we want, but it's been proven to be not very easy. But we have one hugely powerful weapon that we have never had in the past to help solve this problem, and that is AI itself. Using AI as the core of the tooling, to help answer "how do I build AI models for a particular application on a particular chip?", is the key to making this a broadly accessible and usable solution across a variety of different domains. And that's what I'm going to show you today; that's what Model Cat is. Now, as usual, AI is really critical and is the linchpin of this, but it's not a magic wand. Anyone who uses any of the coding tools, or even any of the chat AIs today, knows that you need some scaffolding. You can't just throw a general-purpose AI at a problem where you actually want a very deterministic and well-characterized output, which is really what we want if we're trying to build a product. So AI itself is not a magic wand, but it is the central component. And we've built this scaffolding and characterized it as five pillars that you need for AI-created AI. The very first, and I think the most important one (and look, a lot of the chatbots are now getting to this), is that you must have a closed loop around the AI.
That is, the AI is working towards the user's goal in a closed loop until it reaches that goal. Of course, that brings up how you limit how much resource it's going to use, and all of those sorts of things; you have to think about that. But this is really the central architectural characteristic. The second thing is that it must be grounded in reality. We're not in the game of dealing with AI hallucinations or inaccuracies or estimates. We need the AI to be continuously grounded in reality, and I use that term very broadly: the reality of what the chip can do, the reality of how much memory you've got, the reality of how much power it's using. I'll tell you a bit about how we do all these things in a minute. The third thing is that it's got to think in systems. I'm actually a hardware engineer originally, and I've got a systems background. What we need to do here to make this broadly applicable is provide a way for you to tell the tool or the platform, Model Cat in this case, in a natural way, what you want out of it. This is not the same as giving it lots of little micro-instructions about what neural network architecture to use, or exactly which transfer-learning data set to use. It's: here's what I need from you as an input; now you give it to me. The fourth thing is all about that fire hose. We need the system to be modular enough that it can learn from this fire hose of new research, and that learning has to be highly systematized and highly automated. And the last one is that it needs to be a trustworthy single-step process, trustworthy in the sense that the user really trusts it.
And of course, that's something that builds up over time with using any of these kinds of AI systems. Okay, so now let me skip over to a very quick demo walkthrough. Model Cat itself is a SaaS, or cloud-based, system, and I'm running this in my browser. You can see here the data sets page, showing a variety of different data sets. Today I'm just going to concentrate on one particular data set: a public, open-source data set that's been bopping around, pill detection. You can probably see that they've got a variety of somewhat difficult backgrounds and distractions for the system to deal with in trying to find pills within these images. It's a good example simply because it's easy to understand. Within Model Cat, of course, there's a way to upload your data, and this data set was uploaded by one of our engineers. The system provides a bunch of analytics on your data that helps you quickly understand whether you've got the typical sorts of data skews, and our AI will sometimes give you a warning; in this case, it's saying it found some distributional shift between the training and validation splits. So the system automatically analyzes your data set and will give you warnings, sometimes even up to the error level, if there are things wrong with it. It's all a pretty simple push-button process to get that data in. Now, one other core input into our model creation, or as we call it, the model build process, is our silicon library. We have a variety of chips that have been onboarded onto Model Cat. When I say onboarded, that means Model Cat has characterized each of these chips for its ability to run these neural network models.
It knows just about everything you could need to know about the kinds of operations that are supported, how fast they are, how much memory they take, and what parameters they accept. You can think of this as a very detailed, highly automated profiling of the chip and its low-level runtime. Most often there's a runtime that the chip vendor makes; in some cases there are specific kernels the chip provider supplies, and in other cases there are actual compilation tools. All of that is characterized by Model Cat. So you can think of this as the library of chips that you can target for your model inference. Then, when it comes to actually asking Model Cat to build models for you, it's a very simple process. You basically specify the data set here, so I might take this Pills data set that we were looking at. I've just selected that data set, one of the ones we looked at earlier. You might have noticed that the system then auto-populated a bunch of fields. This is the internal AI at work: it knows something about that Pills data set because of all the analytics, and it makes a bunch of decisions about the best way to set up a model build for it. But the one thing I still haven't told it, which is critical, is which chip I want to target. So here I have, again, that same library of chips. Maybe I choose this NXP i.MX 8M Plus chip. And once I choose that chip, this left-hand pane is really everything I need to tell the system, the essentials: I want models built to solve this Pills data set on this chip. If I like, I can optionally specify more.
For example, I'd like the inference time to be less than five milliseconds; or perhaps I've got a battery-powered device and I'm very worried about energy use or power, so I can specify the energy per inference. These are constraints that you can give the system, which are optional, and you can see there's a little AI advisor pane down at the bottom here. I don't know if it's big enough on your screen, but it's saying that the job success prediction for this is high, which means that with this set of constraints, that data set on this particular chip, and with the constraint of, say, a five-millisecond inference time, it's expecting a high chance of success. Before we leave this page, I just want to show you the advanced options. When I click the advanced options button, a huge number of panes pop up, and you can get inside and start giving the thing much more detailed directions. You can specify the input color space, if you think you know better than the AI; you can specify resolutions for the inputs; you can specify target values for accuracy and how much memory to give it; and then there's a whole bunch of options for how it does the training and optimization. Let's turn that off. Now, when you hit this create button, the system goes off and does a lot of things, and I'm just going to skip back to a slide for a moment to tell you what those things are. Behind that button, the system is taking its understanding of your data set and everything it knows about the chip you specified as a target, and it's starting to ask itself questions.
It's asking itself what a good model architecture for this would be; this is neural architecture search. What would a good transfer-learning data set be, if one exists? And then it does some testing of those hypotheses, and this is really important for the grounding thing I mentioned. The system has access to what we call the chip farm, which is literally a hardware array of all of the chips we've onboarded. In that case, I was using the NXP i.MX 8M Plus, so within our chip farm we have a bunch of those chips, the real hardware. We have the NXP conversion tools and the latest version of the NXP kernels. So the system is able, if it needs to, to go and test its hypothesis: "I think this model would be a good candidate architecture for this problem." It can test that on the real hardware if it hasn't tested it before. And what it gets back are things like the accuracy, the inference time, and the energy per inference. All of that is rooted in real measured data. Then, once the system has some neural network architectures that it thinks are likely to be successful and come in under the constraints, it starts to expend more computation on training and hyperparameter optimization of those architectures. And it's continuously learning: every time it tries out a new architecture on a particular chip, it learns all of that and essentially keeps a big knowledge store inside it. Okay, so now, going back: I've created the job, and I come back some time later. During this time, the system has done all of those steps, spun up a cluster of computation for training and other compute-intensive work, and then it starts to give you results. And this is a pretty typical example of a results page.
It says here that this is the pill detection data set, and we've collapsed all the complexity of the data science results down into a single page, which starts off with a graph of prediction time versus accuracy. The x-axis is the prediction time, or inference time; the y-axis is the accuracy, and the particular metric it's chosen there is whichever one the original job chose as the primary metric. You might have seen scatter plots like this before: every dot is a model, and up and to the left is better. The black line is the Pareto frontier, so the most interesting models are generally on the Pareto frontier. If I click on one of these models, I get a whole bunch of detailed parameters, all the different metrics. The system understands and reports measured accuracy metrics, or data-science-type metrics, as well as things like energy per inference and prediction time, which are very hardware-specific, plus RAM usage and ROM usage. So you get a complete picture of the performance of that particular model. You may be satisfied, and at that point you can download the model. And if you download and run that model on that same chip (remember, this particular run was actually targeting an ST chip), you'll get the exact same performance. This is guaranteed, because these are measured performances. So this avoids the problem you often get where the tools have to estimate, or where you do your data science in a more abstracted environment, building and measuring the models there, and then you've got a porting or deployment process to get it working on the actual chip.
In this case, the numbers are measured on that chip. Then, within each of these models, there's actually gruesomely detailed reporting, and it's all normalized, which is nice for the end user, because the vendors describe this stuff in different ways; the different SDKs report it differently at the low level. We normalize all of that and provide very detailed reports that are directly comparable, one chip to another. This is our model benchmarking report, which, as you can see, is mainly about layer timing and resource usage on chip. And then we also have the model scoring report, which is all the data science metrics in gruesome detail. Just going back here, I wanted to show one more thing: within this graph, you can actually access all of those different metrics. So if you decide that COCO mAP at 0.5 isn't what you're interested in, and you want it at 0.75, it's a simple matter of selecting that on this axis. A lot of the day-to-day work that you might otherwise be doing by writing a lot of Python code is done by Model Cat in a very easy and accessible way. Okay, so with that, I'm going to finish up now so that we have time for questions. One more thing I wanted to mention: Model Cat has a partnership with NXP, and we co-developed a version of Model Cat with them called eIQ Model Creator, which is their cloud-based ML enablement platform for all of their chips, all the way from the MCX A series, a relatively inexpensive, low-powered microcontroller with no acceleration, up through the RT series and multiple series of i.MX.
So we've got Model Cat, which is our own brand and our own product, and then we've got eIQ Model Creator, which is co-developed with NXP. All right, and I invite you all to take a test drive. Give us a yell, and we'd be happy to give out test-drive licenses. Okay, that's it from me. I'll leave that up there for now. Questions?
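For listeners who want to poke at the "up and to the left" idea from the results page, here is a small, generic sketch of computing a Pareto frontier over (latency, accuracy) candidates. This is not Model Cat code, just the standard dominance rule: a model is on the frontier if no other model is at least as fast and at least as accurate.

```python
# Each candidate model is (latency_ms, accuracy); lower latency and higher
# accuracy are better ("up and to the left" on the demo's scatter plot).
def pareto_frontier(models):
    """Return the candidates not dominated by any other candidate."""
    frontier = []
    for m in models:
        lat, acc = m
        # m is dominated if some other model is no slower and no less accurate.
        dominated = any(o != m and o[0] <= lat and o[1] >= acc for o in models)
        if not dominated:
            frontier.append(m)
    # Sort by latency so the frontier reads left to right.
    return sorted(frontier)

candidates = [(3.1, 0.91), (4.8, 0.94), (2.0, 0.88), (6.5, 0.93), (5.2, 0.95)]
print(pareto_frontier(candidates))
# (6.5, 0.93) drops out: (4.8, 0.94) is both faster and more accurate.
```

Plotting tools and AutoML platforms apply the same rule; the axes just change depending on which metric (accuracy, energy, RAM) you care about.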

SPEAKER_01

Well, thank you, Evan, for running through all that. That was pretty cool. And we have quite a few questions queued up here in the chat. Jenny, do you want to do the honors? You had some good questions of your own too.

SPEAKER_00

So I got some questions in the chat. However, we just got one from Dimitri in the comments, so maybe we'll do that one first; throw it up there. Dimitri asked: can you export your models to PyTorch or Keras? Or is it only to the hardware that's supported by Model Cat?

SPEAKER_03

It is. We're kind of more in the Keras universe, so we have a direct export that will get you into Keras, but not PyTorch today. And on the chip side, the chip guys have made it relatively simple for us: TFLite is still the dominant interchange format, and we support it. For almost all the direct-to-chip interactions, we output TFLite.

SPEAKER_01

Sounds good. What are some of the other questions we've got here? Jenny, you had some good ones you were covering.

SPEAKER_00

Yeah, I just asked some questions chronologically throughout your whole presentation. I guess the first one I had was: I know you showed an image classification, maybe even a bounding-box object detection example, but what are the other types of models you can generate with Model Cat? Is it just computer vision, or can you do time-series or frequency-domain data?


SPEAKER_03

So it's mainly computer vision, the usual computer vision use cases. We have done a bunch of single-dimensional, time-series data type stuff; people use it for that and have used it for that. And we're also working on other modes, so obviously the language models and so on are coming. They're probably not going to reach these little microcontrollers, but we already have chips that are large enough to run language models, for example.

SPEAKER_02

Cool.

SPEAKER_01

Here's one that came in from Dale: will you be supporting more MCUs?

SPEAKER_03

Yes, we keep adding various types: some of them are MCUs, some are really microprocessors, some are these special-purpose inference chips. We keep adding more. And the nice thing is that we've now got a kind of factory for adding them; it's not a highly artisanal process to onboard those chips. You can see that we work with the major vendors to date, but we've also got some very interesting discussions going on with a bunch of the startups at the moment, which will supplement that.

SPEAKER_01

So vendors could kind of create some libraries to plug into your stuff?

SPEAKER_03

Yeah, yes, we've got a roadmap bullet point for self-onboarding, and we're working towards it, but it's not there yet. Right now there's still a bit of engineering work that one of our engineers has to do.

SPEAKER_00

So I guess my follow-up question would be: do your embedded engineers use AI to generate the support for a new platform? AI generating the AI, you know what I'm saying?

SPEAKER_03

No, not yet. The deep embedded stuff is, of course, the last bit that the embedded engineers on our side are doing, as you can imagine. We've got some great ideas about how to do that, but today we haven't done it.

SPEAKER_00

Awesome. Well, we've got a couple more questions in the chat.

SPEAKER_01

Yeah. Should we do the Qualcomm one next? Why not? Of course, the 6490.

SPEAKER_03

Yeah, so we've got, not that Qualcomm chip, but another Qualcomm chip, whose name I've immediately forgotten, on there at the moment. We're happy to work with whichever vendors, and we tend to be driven by the business opportunities in terms of sequencing chips on. I believe that 6490 uses the same MPU; it doesn't really matter to us as long as the SDK is the same, and I think it does use the same SDK. They've changed the name of their SDK a couple of times.

SPEAKER_01

It's part of the Dragonwing world there. Yeah, cool. Actually, I had a quick question about architecture types. One of the panels we had in Amsterdam was called The Farthest Edge, talking about spiking neural networks and neuromorphics and things like that. Have you done any experiments with that, in terms of portability to chips supporting that kind of spiking neural network approach?

SPEAKER_03

Yeah, so we're talking to a few of the startups in that space, and a couple of those do these kinds of spiking neural network approaches. The general architecture that we have is very applicable to that, but at the lower level, jumping the abstraction gap between where they are, with their runtime or the chip vendor's runtime, and where we are, which is pretty standard neural network architectures and interchange formats, is a bigger step. That said, I'm hoping that in the next six months or so one or other of those gets mature enough that we get it up into production.

SPEAKER_01

Cool. Here's a question from Dmitry Godofsky: do you have KWS or WWD models? Maybe you can enlighten the audience as to what those acronyms mean. I don't know what those mean.

SPEAKER_00

KWS is keyword spotting.

SPEAKER_03

Yeah, we've done audio keywords, yeah. WWD, though...

SPEAKER_00

WWD I'd need to look up.

SPEAKER_03

Yeah, okay. That's not one I'm familiar with, sorry.

SPEAKER_01

All right, Dimitri, if you can enlighten us in the chat: what is WWD? What else do we have here? There's a question from Katya about learning from the latest research papers, as well as NAS.

SPEAKER_00

Oh, wake word detection. Sorry.

SPEAKER_01

Wake word detection. Oh yeah, okay. So the question is: is the model-building AI closer to NAS, or does it indeed manage to learn from the latest research?

SPEAKER_03

Yeah. So our general approach is a superset of both of those, and this is really important, actually. Think about how a human learns: if you're someone working in research in that field, how do you learn about new ways of doing things? That sounds like a very high-level description, but you have to jump up to that level to get this point. You might read a paper, or you might scribble on the back of a bit of paper and go: if I take a MobileNetV3 and replace this block with something else, then I've just done some neural architecture research, or at least ideation, and I've got a new neural network architecture. So "both" is really the answer to the question.
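[Editor's note: the "replace this block and see what happens" ideation described here is essentially the inner loop of neural architecture search. A toy sketch, with a made-up scoring table standing in for real train-and-benchmark runs — every block name and score below is hypothetical, not from Model Cat.]

```python
import itertools

# A hypothetical backbone as an ordered list of block types.
base_arch = ["conv_stem", "inverted_residual", "inverted_residual", "classifier"]

# Candidate replacements for the two middle blocks (made-up names).
candidates = ["inverted_residual", "depthwise_sep", "attention_lite"]

def score(arch):
    """Stand-in for train-and-measure-on-silicon: a real NAS loop would
    train each candidate and benchmark it on the target chip instead."""
    table = {"conv_stem": 1.0, "inverted_residual": 2.0,
             "depthwise_sep": 2.5, "attention_lite": 1.5, "classifier": 0.5}
    return sum(table[b] for b in arch)

# Exhaustively try every combination of middle blocks, keep the best scorer.
best = max(
    (base_arch[:1] + list(mid) + base_arch[-1:]
     for mid in itertools.product(candidates, repeat=2)),
    key=score,
)
print(best)
```

Reading a paper and swapping one block in by hand is one iteration of this loop; an automated search just runs many of them, scored against real measurements.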

SPEAKER_00

Good. Well, I had a quick question. You showed the pill detection model in the demo, for example. Do you have to upload your own data? Did someone take images of all those pills, or were they synthetically generated using some sort of Python script?

SPEAKER_03

That's an open source dataset. Look, I don't actually know the answer, but just looking at it, it looks like they're actual real images and they've done background replacement or something, because it's got a lot of visual noise in it.

SPEAKER_00

For sure.

SPEAKER_03

It's pretty hard.

SPEAKER_00

Yeah, some sort of image manipulation script, for sure. Okay, but so you do have to upload your own datasets?

SPEAKER_03

Yeah, to come back to your question: we provide a bunch of the open source datasets that we've already uploaded, and people are always uploading other open source datasets that we've never heard of. Then, for real production-quality models, when people are serious about building something for a product, they typically have to either create their own data or gather it in some way.

unknown

Yeah.

SPEAKER_00

Okay.

SPEAKER_03

Cool.

SPEAKER_00

Do you have to bring it in already labeled as well?

SPEAKER_03

Yeah, we have a labeling service that we partner with, and that's within the platform. But most of the time, yeah, it's already labeled.

SPEAKER_00

Sure, sure, sure.

SPEAKER_03

Good.

SPEAKER_01

What else we got here?

SPEAKER_03

Any more goodies?

SPEAKER_02

Oh, also oh, sorry, go ahead.

SPEAKER_03

I just saw one question there that I thought would be a good one: who is the audience for Model Cat? Do you need to have strong knowledge of ML already to use it?

SPEAKER_02

Yeah.

SPEAKER_03

You might have been able to see in the user interface that we've really worked hard to make it so you can think functionally and not have to be an expert on the minutiae of data science and ML to use the thing. We've come a hugely long way, and we're continuing to squeeze that. It's actually a very hard thing to do, because it pushes a lot of the functionality lower down into the stack; the AI further down has to be more and more responsible, if you like. And the last thing any of us wants is to be arguing with the thing about what it wants to do or what its ideas are. That's not its job; its job is to do what you ask. So our aspiration is to make it so that a product manager can use it, which is analogous to what some of the code development tools in traditional software are aiming at.

SPEAKER_00

Cool.

unknown

Cool.

SPEAKER_00

Yeah, I was looking at the demo: you showed there were a lot of options in advanced mode, and the page got so big. I think that would be a little bit intimidating. I'm assuming you have some way to help people ramp up: is there some sort of partner program or package, or do you just have to jump straight in on your own with your team?

SPEAKER_03

What we suggest, if you're interested, or if you're asking yourself that question, and a user could well be asking themselves that question, is: do a test drive. It costs nothing, and it takes all of a few hours to get a flavor for what you need to do. It really is that easy. Most users, once they've done a test drive, get that you have to recalibrate your mental model. If you're used to Python notebooks, or you're maybe a traditional software developer looking at all the libraries and wondering which of them you really need for developing your AI model, this is a hundred or a thousand times less complex. So yes, we have the advanced mode because we need to straddle both audiences. The reality is that many of the people listening here are data scientists or ML people in some capacity; they need control, they want control, and they probably will be able to contribute beyond what our AI can do. So we need to enable that as well.

SPEAKER_01

Right, right. So when you say qualified buyers: is this available to folks in academia as well, or do you have some sort of program for them?

SPEAKER_03

The test drives are, yes.

SPEAKER_01

Okay, so qualified academics.

SPEAKER_03

It's just "qualified" in the sense that it's not a completely automated process. There's an email address and a little form, and yep, okay.

SPEAKER_01

We encourage people to click on that.

SPEAKER_00

Yeah, and this sort of plays along with Katya's question earlier: how long does it take to get from start to finish? How long would it take an average user to get from uploading their data and clicking start to deploying onto the device?

SPEAKER_03

Yeah, okay, it's an excellent question; many people ask it. Unfortunately, it's a how-long-is-a-piece-of-string question. Katya, your range of estimates is about right: hours up to a week is what I would say. A week for the large datasets, like some people who've been doing automotive, safety-related work with high-resolution images, large datasets, and very large transformer models in training. And that example I showed was probably a few hours, I would guess.

SPEAKER_01

Got it. And what's your back end? Are you running this on something like AWS?

SPEAKER_03

Yeah, it's AWS. And Katya, you might be wondering how we do that in a few hours: it spins up a cluster of machines in parallel for the particular job. When you hit that build or create button, it spins up a cluster for you; it's not just one machine.

SPEAKER_00

Oh, you also mentioned, sorry, Pete, that there was a, I can't remember what you called it, a farm of devices that you already support. Is that a room you physically have, with the devices it's deploying to, to check the accuracy and the performance?

SPEAKER_02

Yep, yep.

SPEAKER_03

Yep.

SPEAKER_00

I don't think I've seen a picture of that.

SPEAKER_03

Yes, yes, there is a picture floating around. I think our hardware guy is going to do a beauty shot soon. He said he was going to.

SPEAKER_00

Very cool.

SPEAKER_01

Good, good.

SPEAKER_00

Oh, Dale has a question, Pete.

SPEAKER_01

Dale has a question. How much does it cost?

SPEAKER_03

Yeah, unfortunately, that's the piece of string again, Dale. But we have a little one-pager which gives some guidelines: small, medium, and that sort of production quality, which we could share. So if you just ping me afterwards, I'll dig that out and send it to you. Again, it gives you a flavor. It's kind of like a tool where you could do a cost estimate on your website. One of our engineers has promised me the cost estimation model, and it's coming; we've got enough data now. The hard thing with that is the data; it's actually not that hard a model, it's just yet another roadmap item. So that little meter that said what your probability of success was will also say: here's your cost estimate.
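[Editor's note: a cost-estimation model like the one described, at its simplest, is a regression from job characteristics to observed compute cost. A minimal sketch with fabricated data (dataset size in GB vs. hypothetical cost units — none of these numbers come from Model Cat), using an ordinary least-squares fit by hand:]

```python
# Fabricated (dataset_gb, observed_cost) pairs from past jobs.
points = [(1, 12.0), (5, 40.0), (10, 75.0), (20, 145.0)]

# Ordinary least-squares fit of cost = intercept + slope * dataset_gb.
n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / \
        sum((x - mean_x) ** 2 for x, _ in points)
intercept = mean_y - slope * mean_x

def estimate_cost(dataset_gb):
    """Predict cost for a new job size from the fitted line."""
    return intercept + slope * dataset_gb

print(round(estimate_cost(8), 1))  # → 61.0
```

As noted in the conversation, the hard part of such a model is collecting enough real job data, not the fitting itself; a production version would presumably use more features than dataset size alone.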

SPEAKER_01

Right, okay. So you do have human engineers; it's not all AI-generated, and you're not the only engineer?

SPEAKER_03

No. You know that canonical slide of how development works? For sure, for a large platform development like this, we look like that.

SPEAKER_01

No. Okay, awesome. We're coming into the last couple of minutes, so I want to remind folks that this show will be on our YouTube channel right after we finish today, so if you didn't catch the beginning, you can watch it there. Upcoming live streams are at jfoundation.org; we have one of these every two weeks. You can subscribe on our YouTube channel or whatever your mechanism is; it's on LinkedIn and Twitch as well. But yeah, Evan, this was fantastic. Really appreciate your time. And Jenny, thanks for co-hosting. Thanks, everyone in the audience, for accommodating our timing. Any closing words of wisdom, Evan, for the world here?

SPEAKER_03

All I'd say is: ping us. I noticed there were a bunch of questions about the chips, and we're very open to partnering with a variety of different chip vendors. This proposition works for all of them, including the really innovative stuff that people are doing at the hardware level.

SPEAKER_01

Yeah, and this is exactly the kind of innovation that you almost don't even think of happening in this space: actually using AI to help with model portability, development, and deployment of solutions. So it's fantastic that you're on the cutting edge of this. And again, kudos on the rebranding to Model Cat. Very cool. Thanks very much.

SPEAKER_02

We need some Model Cat merch or something.

SPEAKER_01

I don't know. Cool. All right, well, thanks, everybody. Jenny, thank you. And again, Evan, thank you.

SPEAKER_03

Thank you very much, both, for hosting, and thanks to the audience for listening today.

SPEAKER_02

Thanks again. Take care.

SPEAKER_03

Okay.