
EDGE AI POD
Discover the cutting-edge world of energy-efficient machine learning, edge AI, hardware accelerators, software algorithms, and real-world use cases with this podcast feed covering all things from the world's largest edge AI community.
It features shows like EDGE AI TALKS and EDGE AI BLUEPRINTS, as well as EDGE AI FOUNDATION event talks on a range of research, product, and business topics.
Join us to stay informed and inspired!
Panel Discussion - EDGE AI TAIPEI - Revolutionizing Edge Computing with AI-Driven Innovations
Discover the cutting-edge world of AI deployment on edge devices with insights from top experts, including Dr. KC Liu. This episode promises to unravel the complexities of optimizing AI models for devices where memory and computing power are limited. We explore the critical role of model compilers and the innovative strides being made in AIoT sensors, as Matteo Maravita from STMicroelectronics offers an exciting glimpse into the future of machine learning integration into MEMS sensors.
Join us as we tackle the pressing need for standardization in the fragmented IoT development landscape. Our esteemed panel delves into the challenges developers face with diverse proprietary technologies from giants like ST, NXP, and Renesas. Hear about potential convergence through model zoos and frameworks, and the unique role of MLPerf Tiny in benchmarking AI applications specifically designed for edge devices. Our conversation shines a light on the balance between utilizing common tools and proprietary compilers for optimized performance on specific hardware.
Lastly, explore the promising avenues of AI in the realm of robotics and the innovative strategies shaping the future of AI systems. Learn about the layered architecture approach dividing AI systems into sensor network, edge AI, and cloud computing layers, and the potential for a sustainable AI ecosystem through collaboration. With a focus on benchmarking advancements and MPU design strategies, discover how AI integration with numerous sensors could redefine possibilities in the robotics field. This episode is a compelling journey through the landscape of AI technologies, emphasizing collaboration and innovation for next-generation AI products.
Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org
Okay, we're going to get started with our panel, kind of our closing panel. So let's see.
Speaker 2:Dr. KC Liu here. Good morning. I'm very honored to be the moderator for this session. I think some people here are from the Taiwan community, and we have Professor Yeh, Jack, Ricky, Ali, Matteo, and others. Okay, later in the session I will let you briefly introduce yourselves first. Okay, so, Ali, your turn.
Speaker 3:Okay check.
Speaker 2:Yeah, yes, okay.
Speaker 1:Okay so.
Speaker 5:Ali Ors with NXP Semiconductors, Global Director of AI Strategy and Technologies. I'm Matteo Maravita, responsible for the APAC AI Competence Center, STMicroelectronics.
Speaker 6:Hello everyone. I'm Samuel from Himax. I'm responsible for AI processor chip design.
Speaker 7:Hello, I'm Ricky from Tomofun, and I'm responsible for AI system design. Thank you.
Speaker 8:This is Zhongtai Yeh, and I'm from National Chiao Tung University.
Speaker 4:Okay, hi everyone, I'm Jack. I'm from the Edge AI Taiwan group. I focus on MCU AI. Okay.
Speaker 3:Hello everyone. I'm part of the AI CoE team at Renesas. I mainly work on AI tools and compilers.
Speaker 2:Okay, good. Some of the questions I originally planned to raise here have already been answered this morning and this afternoon by the speakers, so I might change my mind and randomly ask some questions. Okay, just a free chat. Okay, the first question I'd like to ask is a research question, for Professor Yeh first. Okay, so I know you have papers appearing in HPCA last year and this year, right? So I'd like to know, from the cloud to the edge, even down to the tiny, what's your opinion on the research gap, how to utilize, how to leverage the cloud down to the tiny. Or even, today we went from tinyML to the EDGE AI FOUNDATION. So I think you have some ideas from doing research; if there's a gap, I'd like to address this issue. Maybe later other panelists can follow the same question and give answers. Okay, Professor Yeh.
Speaker 8:Yes. So this is a good question. Basically, we find more and more applications are used on these edge devices; we call it tinyML. We try to achieve green computing, or just use less energy or less cost to deploy these AI applications on the edge. So I think the biggest challenge is when we apply or deploy the AI models on the edge devices; there are two challenges. The first is the memory. Okay, we do not have a large memory space within our edge devices. The second one is the computing power. Okay, we do not have a large GPU on our edge, so this challenge is driving us to see how to squeeze our AI models onto these edge devices to make it possible.
Speaker 8:So I think there are some possible solutions to enable us to enhance the programmability and also enjoy the performance and power efficiency of this AI or TinyML.
Speaker 8:So what I want to say is, the first one is the model compiler. The model compiler plays a very key role in applying or deploying these TinyML models on the edge devices, because the model compiler can facilitate the flexibility of programmability and also optimize the language models or the CNN models for our edge devices. So the model compiler is the key one. The second one is the promise of reconfigurable computing. Okay, for example, we have the FPGA, or we have new promising hardware, for example the coarse-grained reconfigurable array, the CGRA architecture, and then we also try to see if this new hardware platform can also apply within these edge devices and enable more of these TinyML or AI applications to become possible or become promising within the edge devices. Okay, so we have the challenges, and we have some solutions, by using the model compiler or using this promising hardware to advance the progress of these future AI or future TinyML applications.
Speaker 2:Okay, yes, yeah, thank you, Professor Yeh. So, does anyone here want to give some additional notes?
Speaker 1:I mean, we agree with everything that has been discussed. What we see is that that's very applicable for some AI models that belong in, or start off in, the cloud and trickle down to the edge, or get quantized, pruned, and optimized to be able to run on the edge. But there's also a class of AI models that belong in the edge or are created in the edge, and that's part of what we're also seeing, what used to be more the tinyML view on microcontrollers. So they never belonged in the cloud or high-performance compute; they always belonged in an edge deployment. So you have these two classes. But I think it's very applicable, in terms of that trickle-down from the cloud, the model compiler, the efficient compilation, because you have memory limitations, you have compute limitations. I fully agree.
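As a rough illustration of the quantize-and-optimize step mentioned here, below is a minimal, hedged sketch of generic post-training int8 quantization with the TensorFlow Lite converter; the model and calibration batch are placeholders, and this is not any particular vendor's flow.

```python
import numpy as np
import tensorflow as tf

def quantize_for_edge(model: tf.keras.Model, calibration: np.ndarray) -> bytes:
    """Convert a float Keras model to a fully int8 TFLite flatbuffer."""
    def representative_data():
        # A small calibration set is enough to choose quantization ranges.
        for sample in calibration[:100]:
            yield [sample[np.newaxis, ...].astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data
    # Integer-only kernels, so the result suits int8 MCUs and NPUs.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    return converter.convert()

# Hypothetical usage:
# tflite_bytes = quantize_for_edge(my_model, calibration_batch)
# print(f"{len(tflite_bytes) / 1024:.1f} KiB after quantization")
```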
Speaker 2:Yeah, yeah. So I think "model compiler" is the keyword here. Can we focus on model compilers? Okay, yeah, okay. And maybe the next question I will ask is about products, because the people here this morning and this afternoon already showed some of your products. So the question is about the roadmap, because you have some MCU-based parts, MCUs with an NPU, but also AIoT sensors. So I'd like to know, from your point of view, how sensing plus AI will evolve, and what you will do, okay, with your products, the next generation of this kind of integrated sensor, for high-speed, low-power data from the sensor.
Speaker 5:Okay, matthew okay, thank you for the question. Yeah, sst, we actually we already released in the market a few years ago a first sensor with the first what we call the machine learning core. That time was a small core that was able to run only decision tree model. And recently, one year ago more or less, we introduced the first MEMS with so-called ISPU. That is basically a small DSP inside the sensor itself that can run, is limited in term of memory. It has an internal RAM memory of 20 kilobyte but is far enough to run a lot of not only machine learning models but also smaller neural networks inside the sensor itself. And I was very quick on that slide today. But you can see that we can run the same kind of model that before we were running on the microcontroller three years ago, like gym activity recognition or human activity recognition to understand if I'm walking running, like what we have on the smart watches. Now, moving to the sensor, you are moving from five to 10 milliampere power consumption to five, 10 microampere, so it's one thousandth less and we see that this is a new trend.
Speaker 5:At the beginning, when we launched the product, the customers were still looking into it, because we were the only vendor releasing this kind of product. So they were a little puzzled about whether they needed to go only with a proprietary solution like an ST MEMS. But now they recognize the value and it's getting more and more popular. So we also see small AI cores being integrated in the sensor itself. But obviously the key point of this solution is the power consumption. This is why it is feasible only for small models. If you are going to enlarge the model, like to 100 kilobytes of memory and so on, and then a more performant DSP, then you are going to use more power, and then perhaps it becomes less obvious that you want to run on the sensor instead of the microcontroller. But when you can differentiate, you have a very big difference in power consumption. We see a trend there.
Speaker 2:Yes, Matteo. So I'd like to follow up on the question in more depth. Because the sensor is a time-variant and very real-time signal, right? And tools like TensorFlow or other frameworks for AI models are traditional tools. So, specifically for sensors, for your company, do you have any specific tools for sensor sampling for the AI?
Speaker 5:We have basically two choices. So we can use standard tools like TensorFlow Lite, any kind of deep learning framework in the end, or also scikit-learn; we can use scikit-learn just to train the model. So then, if the model is small enough and if it can fit in 20 kilobytes, you can use any kind of deep learning framework. We also have another tool, NanoEdge AI Studio, which is an AutoML tool.
Speaker 5:The difference is just in the entry point, whether you want an AutoML tool or whether you want to craft the model by yourself, but the result will be similar. Both tools will create a model that, if it is small enough, can fit into the sensor. So you don't need a specific tool for the sensor. Then obviously what we have is an additional piece of tooling or software that converts this model into a specific library for the sensor itself. Like you have for the microcontroller, where you have a specific optimized compiler for a specific architecture, in the same way we have a specific compiler for the sensor, but all the toolchain before that is the same. So the engineer can move the same algorithm to the microcontroller or to the sensor seamlessly, because the compiler is just the last step. Okay.
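For readers who want to see what that generic front-end might look like, here is a hedged sketch of training a tiny scikit-learn decision tree, the kind of small model mentioned for the machine-learning core, with a rough size check against an illustrative 20 KB budget. The features, tree depth, and per-node byte estimate are assumptions, and the vendor-specific conversion step is deliberately left out.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

SENSOR_BUDGET_BYTES = 20 * 1024          # illustrative 20 KB budget

rng = np.random.default_rng(0)
# Assumed features per window: mean/std/energy for each accelerometer axis.
X = rng.normal(size=(2000, 9))
y = rng.integers(0, 4, size=2000)        # e.g. still / walk / run / other

clf = DecisionTreeClassifier(max_depth=6)  # depth-limited so the tree stays tiny
clf.fit(X, y)

# Very rough footprint estimate: a few dozen bytes per node (feature index,
# threshold, child links) is an assumption, not a vendor figure.
approx_bytes = clf.tree_.node_count * 32
print(f"{clf.tree_.node_count} nodes, ~{approx_bytes} B, "
      f"fits budget: {approx_bytes <= SENSOR_BUDGET_BYTES}")
```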
Speaker 2:So, Ali and Samuel, do you agree? Because you're also product companies. Okay, okay, yes, Ali.
Speaker 1:So of course we have different products doing similar things from our side.
Speaker 1:So what I presented today, the eIQ Time Series Studio, is about that: creating models for sensor-based systems, taking all sorts of sensor inputs from different sensors. We focus more on the processor side, but the sensor input can be any modality as far as we're concerned. And, as Matteo mentioned, it's the step afterwards: you can do AutoML, or you can handcraft models with the eIQ software stack that we have. So the Time Series Studio can create time-series-based models automatically for you, or you can handcraft your own model within the eIQ Toolkit. So we support that same flow as well.
Speaker 6:Okay, yeah, I'm Samuel from Himax. From Himax's point of view, because we adopt some use cases in computer vision using the MCU, for computer vision the memory usage is very critical, just as Dr. Yeh mentioned. So with current tools we find the memory utilization is commonly not perfect, and it impacts the model size or the inference speed. So if in the future there is a good customized model compiler, or something that can improve the memory throughput or usage, it will benefit computer vision TinyML applications. Yeah, okay.
Speaker 2:So Elder.
Speaker 3:Okay, okay, it works now. Okay, yeah, good. So, mainly, in the presentation today, as Ali also mentioned, we also concentrate a lot on microcontrollers and microprocessors. But we do have application-specific sensors made for specific applications; for example, some of our kits come with Renesas sensors for gas detection. Traditionally it's very difficult to detect what kinds of gases are present, so we provide some sensors, I forgot the name of it, I think it was a CMOD sensor, and it's able to detect what kind of gas you have in the environment. And, as we know, we generate a lot of data around us, and it's not reasonable to send everything to the processors themselves. So if we can do machine learning directly on the sensor, it alleviates a lot of the bus constraints. So we do have specific sensors that do ML, but I'm not sure if you can choose what kind of ML model you have; it's already pre-provided for you. Okay.
Speaker 2:Yeah, thanks, ok. So next question I will about the Taiwan community. Ok, and before that I'd like to thank Odin, because originally Odin is a panelist. He'd like to invite the community in Taiwan to encourage our HAI. So, jack, as I know you, you are the founder of the HDI group. Okay to is empowered by leveraging open source. Could you share your experience that maybe about the EVB or the multi vendors chips and the pro and the cons because of the surprise over here? Okay, just take your opinion, please check okay community to maintain.
Speaker 4:Okay. A community is very difficult to maintain. My community is called Edge AI Taiwan, with about 10,000-plus members. Every day I share industry news, open source code, and tutorial videos for everyone in the community. First, we need to be able to easily buy a development board and get a free toolchain. But what is very confusing now is, for example, ST has Cube.AI and NanoEdge, NXP has eIQ, Renesas has Reality AI. They are very different and not compatible. We could use something like Arm's CMSIS-NN or CMSIS-DSP library, but different companies have different solutions. So developers need a standard. Like MLPerf, maybe with a different group for tiny. For example, if you use LLM or GenAI models you can download them from a hub, but on the MCU we cannot do that.
Speaker 4:It's very difficult to compare for every MCU provider.
Speaker 2:So that means, for the TinyML world, there's no such camp that can unify everything with tooling, like NVIDIA over there. So for this I'd like to turn to product companies like yours: is it possible you guys form an alliance to make the tools more similar or easier? Because IoT is multi-segment, the tools and the ecosystem can vary, and then it's hard on the programmer. Later I will ask the programmer, Ricky here. So I'd like to check your opinion, because it's different from our ecosystem. So, do you also suffer from that in your house?
Speaker 5:Yeah, no, in my house I don't use NXP microcontroller.
Speaker 3:I don't suffer, no. That's part of the joke.
Speaker 5:I understand the request. I think, sincerely speaking, it is going to be extremely difficult in the next few years for the vendors to have exactly the same tool in terms of compiling, let me say, compiling the model for the specific microcontroller, for many different reasons. First of all, we spent several years developing our tools, and each of us has specific functions that we treasure and that we think are a differentiating factor giving value to our customers. So this is the first point. The second point is that in the future we are going more and more toward NPU-accelerated devices that are going to be different. So the hardware will be different and will need a different compiler. We are not going to converge on only one NPU architecture from Arm, for example. So obviously we will need different compilers.
Speaker 5:Where I can see there can be a convergence, which I think is still very useful for developers, is on the model zoo portfolio.
Speaker 5:So when you need to deal with computer vision, or today, if you need to develop an LLM solution, you go to Hugging Face and you can find a lot of standard models; everybody in the community can use the same few-billion-parameter Llama model or another model. So you have a model zoo that is the same, and then you can run the same model on your PC, on a Raspberry Pi, on an application processor, but inside the same community you are using the same model. This is perhaps where there can be a convergence, if there will eventually be a place where you have a set of models that are specific to edge AI. So you say, okay, instead of having ten variations of YOLO in all the different versions, NPUs and so on, we have a place where we have all the models for computer vision, LLMs and so on, and then you can make your project with the same models, and you just use a different compiler for the specific hardware. This is where I can see there can eventually be a convergence.
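A small sketch of the split described here, a shared model zoo up front and a vendor-specific compiler at the end, might look like the following; the Hugging Face repo id and the compile_for_target() helper are illustrative placeholders, not real vendor tooling.

```python
from pathlib import Path
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

def fetch_shared_model(repo_id: str, filename: str) -> Path:
    """Everyone in the community starts from the same published model file."""
    return Path(hf_hub_download(repo_id=repo_id, filename=filename))

def compile_for_target(model_path: Path, target: str) -> Path:
    """Placeholder for each vendor's proprietary compiler (Cube.AI, eIQ, ...)."""
    out = model_path.with_suffix(f".{target}.bin")
    raise NotImplementedError(f"Run the {target} toolchain here -> {out}")

# Hypothetical usage: same model, different last-step compiler per target.
# model = fetch_shared_model("example-org/tiny-yolo-int8", "model.tflite")
# compile_for_target(model, target="vendor-a-mcu")
# compile_for_target(model, target="vendor-b-mcu")
```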
Speaker 1:And maybe I'll add to that. There are a few points in that development flow where you can get to commonality. I think, as Matteo mentioned, it's difficult when you get lower into the hardware architecture or some of the benefits, because standardization limits differentiation, and we're all trying to differentiate with what is critical and what is better on NXP versus somebody else, or better on somebody else versus NXP. But where you have commonality is in the model frameworks. This started off with a lot more: there was Keras, Caffe, TensorFlow.
Speaker 1:Now we're seeing a lot more focus on PyTorch-type models. That's where the focus is from both academic and general development, like industry and research development. So there are a few model frameworks. So as long as the models, like in a model zoo, are coming from a specific framework or a few of them, there's that commonality. Then you also have commonality in the types of operators supported by the hardware. So there you can also create the opportunity to make sure that a model that runs on NXP will also run on Himax, on ST, on Renesas, and the end user, the end customer, can benefit from that, knowing that the model will run if they pick a different architecture, but it will run differently and there will be different things on each of them. I think that's critical to us as semiconductor vendors differentiating.
Speaker 8:So when we mention this standardization of TinyML applications, I also want to talk about benchmarks. Okay, benchmarking. Basically, we have MLPerf Tiny; this is the benchmark, or benchmark suite. However, I'm wondering what the special applications are that are differentiated from the general-purpose AI applications shown in cloud computing or data centers. So, basically, in my opinion, the benchmarks in MLPerf Tiny just include some CNN models so far, and the size of these CNN models is smaller than the ones shown in the MLPerf benchmark. So my question is: what are some distinguishing or unique applications for our TinyML communities? Any opinions?
Speaker 2:Oh, thanks for raising this question. Okay, Himax here? So the question is benchmarking. Oh, okay, Samuel can answer first.
Speaker 6:So let me answer about the uniform toolchain first. Because TinyML basically runs on the MCU, the software and the toolchain are for bare metal, not like an SoC where there is a Linux layer. So the upper-layer toolchain maybe can be used on other platforms, but for the bare-metal chain we need to consider the hardware architecture and the configuration. For example, different chips have different SRAM sizes or different accelerators, so it may not be easy to use a common toolchain and get the best performance. For the second question from Dr. Yeh, about the applications for TinyML: from our product point of view, we found TinyML is a very good wake-up trigger in the system, because it can maintain an always-on scenario under very low power consumption. So that kind of always-on wake-up trigger is very suitable for TinyML applications.
Speaker 2:Yeah, okay. So, okay, others?
Speaker 3:No, I don't. Hello. Yeah, for the first question, about the commonality: even though all of us probably use Arm, Renesas also has its own proprietary cores. But from a hardware perspective, we also add additional things to the Arm architecture that we leverage within our compilers. For example, from the Renesas heritage, we have many DMAs that we add to optimize the performance of the hardware itself. So having a similar toolchain across each vendor would kind of remove that advantage each vendor has. And also, even at a higher level, the compilers we use: we have GCC, LLVM, and even Arm has its own compiler, and they all have their advantages and disadvantages.
Speaker 3:It's because on a real-time operating system you want to squeeze as much as possible out of your architecture. From an MPU perspective, you don't care as much about being as real-time as with MCUs. But going to a higher level, from an AI compiler perspective, there is a possibility to unify that realm even if you have different MPUs. When you start from a higher framework, such as PyTorch or TensorFlow, you can produce an intermediate representation, which is going to be a commonality between the framework and your hardware, and then you can start lowering it down to your hardware itself, so you can have hardware optimizations, and graph-level optimizations that don't depend on the hardware, as well.
Speaker 3:So it is somewhat possible, but from a hardware perspective, because we all have different hardware, even though we use Arm, it's not very easy to have the same toolchain.
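As a minimal sketch of the lowering flow just described, assuming ONNX as the shared intermediate representation, the example below exports a toy PyTorch model and leaves the hardware-specific lowering as a placeholder; none of this is a specific vendor's toolchain.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Toy model standing in for whatever the framework produced."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(1, 8, 3), nn.ReLU(),
                                  nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(8, 2))
    def forward(self, x):
        return self.body(x)

model = TinyNet().eval()
example = torch.randn(1, 1, 32, 32)

# Common step: framework graph -> intermediate representation (ONNX here).
torch.onnx.export(model, example, "tinynet.onnx", opset_version=13)

# Vendor-specific step (illustrative placeholder, not a real API): each
# toolchain lowers the same IR to its own hardware, with or without
# graph-level hardware optimizations.
def lower_to_target(onnx_path: str, target: str) -> None:
    raise NotImplementedError(f"invoke the {target} compiler on {onnx_path}")
```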
Speaker 5:Yeah, sorry, let me add a comment. It's true, exactly what he's saying; I missed one detail. So actually it's possible to run common software, a TFLite interpreter, on the STM32, okay, and we can also run the Arm CMSIS-NN libraries, okay. So it's actually possible to use a tool that can be common with other vendors that are based on an Arm core. But, exactly as Renesas is doing, in our case we also have specific hardware functionality in the DMA and the memory controller; I think all of us have that. So if you use a common tool like TFLite, you can do it, but you are losing performance, while if you use a proprietary compiler tool like those from Renesas, ST, and NXP, you get the best out of the hardware and you get the best performance. So it's a trade-off: do you want to have the same tool, which is okay for everybody but not as performant, or do you want the best? So, how important is the compiler for you?
Speaker 2:Okay, thanks, Matteo. Okay, okay. So the next question I'll raise to Ricky. Being a programmer and a user, you are on the front line, developing the code and meeting the customers. As I know, your company serves Amazon; you are a supplier for the pet camera, right? So could you share your experience in productizing the design, from the chip, the system, the software, from the user-experience side, and give some suggestions to all the other people here? Okay, yes, okay.
Speaker 7:So, based on our overall experience in product-level AI system development, our core strategy is to separate the architecture into three layers. The first one is the sensor network layer, like the MCU-level processors; the second one is the edge AI, edge computing layer, like the smartphone or the pet camera; and the third layer is the cloud computing layer. The first one, the sensor network layer, like wearables or smart glasses that run on an MCU, can provide fast, real-world, physical raw data and act as a noise reducer and also an event handler.
Speaker 7:And we believe that's why TinyML can play this role, and in this way we can reduce unnecessary data transmission to reduce power consumption. The second layer, the edge AI layer, like the smartphone or the camera, can act as a more powerful event filter; maybe we can use more MPUs in this layer to provide a more powerful filter that cuts more of the unnecessary data transmitted to the cloud. And in the cloud layer, we can use more powerful models to ensure the performance of our AI service. So we believe that if we combine these three together, we can provide a more sustainable, more cost-saving, and more reliable AI service to our users. And so I would like to know more about how: is there any roadmap or any strategy to combine these three together? Because, as I know, at the MCU level and also the edge level, I think most companies have covered both of these two areas.
Speaker 2:So it sounds like it requires multi-level solutions, not just the edge or tiny device alone. So I believe we have the products, but products should be considered from the whole-system perspective, right? So maybe some companies like yours can give some ideas.
Speaker 1:So maybe I'll talk about the layers the way we see it, touching on how it was explained. For us, the sensor level and the edge, in our view, is all edge, in the way we treat it at NXP at least.
Speaker 1:So microcontrollers and applications processors in our portfolio, we see all of that as edge, and the tie-in to cloud for us is more at a higher level of management, where you need that cloud augmentation or cloud engagement. So a lot of what we're trying to do is to move more off the cloud onto the edge, to create higher efficiency in terms of both cost and compute. But we still leverage the cloud, or some of our customers still leverage the cloud, in terms of fleet management, in terms of MLOps management. So there is that added layer, a higher-level application layer, that happens in the cloud to manage many devices efficiently. But the actual application, where it interfaces with the user, is all edge, and we don't really differentiate too much between the sensor and the mobile-device type of edge application.
Speaker 2:Yes, Okay and okay. Yes, Samuel, please.
Speaker 6:Yeah, we also see the trend of a multi-layer AI combination for the whole system build-up. In the first layer, the MCU, the TinyML's purpose is to run always-on in a very ultra-low-power condition, and when an event is detected, we wake up the second stage, the edge processor, and output the meaningful metadata to the second stage, and also to the third stage, the cloud. So we think in the future we will see more and more TinyML combined with LLM applications, because the LLM can run in the cloud or on the big edge, and TinyML runs on the front-end MCU device.
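To make the cascade concrete, here is a hedged, framework-agnostic sketch of the three-layer wake-up flow described above; the thresholds, toy models, and data shapes are illustrative assumptions, not any vendor's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Event:
    label: str
    score: float

def cascade(sensor_window,
            frame,
            tiny_model: Callable[[object], float],
            edge_model: Callable[[object], Event],
            cloud_model: Callable[[Event, object], dict],
            wake_threshold: float = 0.3,
            confirm_threshold: float = 0.7) -> Optional[dict]:
    """Run the three-layer cascade; return None if nothing is escalated."""
    # Layer 1: always-on tiny trigger on the MCU (cheap, high recall).
    if tiny_model(sensor_window) < wake_threshold:
        return None                      # no wakeup: nothing transmitted
    # Layer 2: stronger edge model filters false wakeups before any upload.
    event = edge_model(frame)
    if event.score < confirm_threshold:
        return None                      # rejected locally, cloud never sees it
    # Layer 3: only confirmed events reach the large cloud model.
    return cloud_model(event, frame)

# Usage with toy stand-ins for the three models:
result = cascade(
    sensor_window=[0.9, 0.8, 0.7],
    frame="jpeg-bytes-placeholder",
    tiny_model=lambda w: max(w),                   # e.g. motion energy
    edge_model=lambda f: Event("dog_bark", 0.92),  # e.g. NPU classifier
    cloud_model=lambda e, f: {"label": e.label, "action": "notify_owner"},
)
print(result)
```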
Speaker 2:Yeah, okay. One question to Professor Yeh about education. I wonder whether students can access TinyML EVBs more easily, even if they can already access some cloud machines, because they want to do some hands-on work. What do you suggest for them? I believe some audience members here are students, so do you have any suggestion for their careers if they want to devote themselves to TinyML? Okay, not only writing papers, okay.
Speaker 8:I think, basically, TinyML has several unique challenges compared to the AI models shown in data centers. Basically, on the hardware side we have lots of constraints. For example, on the microcontroller the memory is constrained and the compute power is constrained, so how to design or create a very low-power or energy-efficient microprocessor, okay, this is one thing the students, or the people who want to get into this TinyML area, should know.
Speaker 8:The other one is applications, and if you want to know about the related TinyML applications, basically you can get the models from a lot of these vendors.
Speaker 8:For example, Arm has its own model zoo, ST also has its own model zoo, or even NXP, yeah. So basically you can get these models from these vendors to understand how these model architectures differ; the architecture is quite different from ResNet or some of the Google Inception networks. So you can learn the differences in these model architectures to understand more deeply what is unique, or what the challenges are, in using these AI models in edge computing. So, in summary, you can start from the hardware side, to understand the limitations of the hardware, and secondly, you can start from the software part, even the model part, to understand the limitations and how to apply these model architectures to resource-constrained devices. Okay, so you can get into TinyML from these two directions.
Speaker 2:Yeah, thanks. Okay, do any panelists have answers they want to echo to the students here?
Speaker 7:Okay, yeah, so, if I can add, I would like to emphasize one thing. Actually, I think in Taiwan, not only is the hardware manufacturing and design very strong, the software development and the academic research are also very strong. So I think if we can combine these three together, maybe we can produce more sustainable, more powerful AI services or AI products for the world. Yeah, and I would like to add one thing about the communities.
Speaker 3:Yeah.
Speaker 7:And I think last year Eldin held a TinyML session at the largest open source conference in Taiwan, COSCUP. Yeah, I believe Jack was also involved in that event. So actually we have a lot of open source communities in Taiwan formed by these software developers. So I believe, if we can put these three things together, the industry, the community, and also the academic research, maybe we can have more powerful AI products. So, yeah, that's my two cents.
Speaker 2:Okay, thanks, Ricky. Then the following will be tough questions from the audience. Okay, does anybody here want to raise a tough question to these tough guys? Anyone? Any questions? Because this morning the time was limited, right, so I'd like to reserve some space for the speakers here. If you'd like to ask some questions, we'll give answers here. Oh, yes, thanks, Odin.
Speaker 10:Hi, hi, it's Odin. So actually it's quite good, I learned a lot. I just want to thank everyone for being open-minded about using some of the technology from our companies, or taking advantage of some of the toolchains and technology to work together. So I think we try to create a platform everyone can work with. So it doesn't matter if you are the software engineer, the hardware engineer, or the solution provider; we want to build some kind of organic ecosystem. So, yeah, I just want to echo what a lot of panelists said: there is a lot of opportunity, so we can each find some space for our business opportunities. That's just my thought, thanks.
Speaker 2:Okay, thanks, Odin. And Peter, would you like to ask a question about compilers or benchmarks? Okay, I'd like to introduce Peter from Skymizer here. He is also active in MLCommons as MLPerf Tiny co-chair. So anybody who wants to do some benchmark research, you can come to Skymizer and talk to Peter. Peter, do you have a question? Maybe you can raise one question to the panelists here.
Speaker 9:Okay, thank you, KC. Yeah, I'm a volunteer in the MLCommons Tiny working group as a co-chair, and I also join the semi-monthly working meetings of the tinyML Foundation; I know there's also another benchmark working group here, now the EDGE AI FOUNDATION. Also, talking about benchmarks, I'm wondering, because we know some limitations right now: in MLPerf Tiny there are only two image models and two audio models. If you had a chance to add a new model or suggest a new model, what kind of model, what kind of application, would you like the community to add? That's my question.
Speaker 2:Yeah, good question, okay. Would you like to contribute something to MLPerf?
Speaker 1:I mean, we're definitely always interested in benchmarking, because that creates a method of comparing performance and making sure that what we're building is efficient and able to meet our customers' needs. Sometimes, even as a vendor, when we're having conversations with our customers, the customers are not disclosing exactly what they're running. So benchmarks like those from MLPerf become extremely critical for the end customer to understand the capability of the hardware without disclosing what they're doing to the hardware vendor or to any other service provider in between. So they act as a proxy. And I do agree that right now, in MLPerf Tiny, or most of the MLPerf categories, the models being used are getting old; they get old very quickly, but you still need some level of commonality to be able to compare. I think we'd be very interested in seeing newer LSTM models, RNN-type models, to see how that has progressed and to showcase capability.
Speaker 1:I think on the vision side there is a lot of content available that maybe used to be big but can be run on smaller devices now. So making those part of the tiny category could cover the vision side. But outside of vision, there need to be a few more models added that can better showcase sensor-type modalities or audio modalities. Yeah, okay, okay.
Speaker 8:Yeah, okay, basically we need to explore more tiny applications using TinyML.
Speaker 8:So in my mind, I would like to say, we could have more applications in signal processing, or use different kinds of models, for example a graph neural network, okay, for pose estimation, because the model architecture of a GNN, a graph neural network, is quite different from the traditional convolutional neural network. Okay, if we can encompass these different kinds of neural network architectures in our benchmarking, I would say it will be helpful for the advance of the TinyML communities. In addition, I would like to say that tiny also has streaming computing, in signal processing or in audio applications, okay. So if we can also take these kinds of signal-input datasets into account, I would say this is also an interesting part we can try to consider, to accommodate these kinds of applications in the benchmark.
Speaker 2:Yes, okay, yeah, thanks, okay Okay.
Speaker 3:Yeah, to add one thing about benchmarks themselves: it's a bit challenging to have the same models running on all MCUs, because many MCUs have very limited RAM and ROM. So if you start targeting larger models, it becomes challenging to target all MCUs fairly. But to extend the MLPerf Tiny benchmark, right now it covers, I think, one ResNet, one MobileNet, one autoencoder, and one DS-CNN; it would make more sense to add maybe an object detection model, of which there are currently a few that run on MCUs. Before, they were like YOLO, which was pretty big, and now there are kinds of nano-YOLO that run. And from a time-series or real-time-analytics perspective, it makes sense to add something in the realm of SVMs; SVMs are usually small, but it would still be good for customers to see how they run. So extending it to different models, and maybe stress-testing the hardware for edge cases that were not explored before, such as different filter sizes for convolutions.
Speaker 2:Yeah, thanks. Okay. Any more questions from the audience? Okay? So my last question is about robots, because, you know, a robot is just like a human, and humans have the cerebrum and the cerebellum, okay. So I think, for real time, for robots, there is demand for TinyML; maybe you have some opinions, right? Okay, what's the difference when we do things for robots based on AI, the chips? What will you do? What can we do with the existing AI tools or chips? Because, as I know, MCUs were originally for consumer things, but the next-generation consumer device will be robots doing something. So I'd like to have your answers on the robot industry.
Speaker 5:Just to understand the question better, sorry, I didn't get it before. You're asking about the roadmap?
Speaker 2:You know, robots have so many sensors, okay, and the sensors need to collaborate together, right? Okay, so the AI for that kind of thing will be another perspective, I believe. So for a company like yours, a big company making MCUs, maybe your next step for robots, what is the chip, what does it look like? Do you have any plan? Because people here always say MCU, or we say MPU. Let's think about robots specifically, robots specifically. Okay, interesting question, yeah.
Speaker 5:Now there is also a little hype around humanoid robots. A lot of companies, also Chinese companies, are releasing more and more humanoid robots. If you look at the basic architecture of a humanoid robot now, basically you have just one centralized unit, okay, and you have a few sensors. You have many fewer sensors than what you would expect, very few sensors in the humanoid robots, and basically you have a big Intel PC-based or NVIDIA GPU-based unit in the humanoid robot, and basically it's used for computer vision. So it's not so different from, let me say, a camera with a PC, okay? They give it a little of the shape of a humanoid robot, but it is really a PC with a camera as of today.
Speaker 5:In the future, what we see as a trend and what we were discussing is to have a decentralized architecture where, like a human, you may have a decentralized intelligence in a humanoid.
Speaker 5:So instead of having a hand that is just an actuator, you may have intelligence there. There is a lot of research at the university level about touch, okay, to get the sensation. I was just reading an article this morning in the taxi: basically, instead of transmitting all the raw data back to the centralized unit, you can have sensitive elements in the fingers and you can add an edge AI processor there. So, for that, do we need specialized hardware for humanoid robots? So far I don't think so. We can use the devices that we are already using in consumer and industrial, like microcontrollers and sensors with AI acceleration, also for humanoids, but they need to change the architecture. They need to change the architecture to a decentralized one. When you start to put the AI inside the sensor of the finger, inside the motor of the hand, inside the eye itself where the camera is, instead of sending everything to the centralized unit, this is where I can see, let me say, a possible evolution in humanoid robots.
Speaker 2:Okay, thanks.
Speaker 1:So from my perspective, it's also important to define: there are humanoid robots, and there is robotics in terms of mobile robotics, like AMRs, drones, etc.
Speaker 2:Different robots yes.
Speaker 1:Different types of robots, definitely. So in all of those, we're not defining the products or processors specifically because they're going into a robot. We do have devices where robotics is one of the business segments they can be targeted at. The AI piece fits into better sensing, better context awareness, so that can feed in through either vision or different types of sensor modalities and leverage AI for the robot to understand where it is or what the situation is. And all of this requires more and more compute. So you're leveraging higher levels of compute-capable applications processors, you're adding more of both CPU and NPU power and performance to be able to handle not just the motion control or the trajectory planning, for example if it's a mobile robot of sorts, but also the interaction, yeah, as well.
Speaker 1:Interaction both within the space and with other users, but also voice, like leveraging LLMs for voice and conversational interaction. A lot of those add more and more compute needs overall, and the roadmap is built with that additional compute requirement in mind.
Speaker 2:Yeah, so sensor synchronization will be a multi-modality use case, with different sensors running together. I'm just curious, because maybe you have sensors, you have MCUs, and a lot of MCUs running together for multi-modality models will maybe be the next-generation thing, right? Okay?
Speaker 5:Yeah, indeed, multimodal is a very interesting topic. Obviously, the considerations I was giving for humanoid robots can also be applied to service robots. This is also a market that we are serving. Okay, still, I see the same approach in service robots. If you go, I live in Hong Kong, there are a lot of service robots that are cleaning the floors or bringing food to the hotel room, and they are going around; there are a lot of that kind of robot. They have many more sensors; they may have like 20 sensors each, something like that. And at the moment, again, they are basically sending the data to a centralized unit where you are running, let me say, an algorithm.
Speaker 5:So two interesting points can be, first of all, a multi-modal neural network architecture in the future, where you may get input from many different kinds of sensors in order to get a better awareness of the environment. So you can add information from time-of-flight, audio, and vision, and combine them together in the same multimodal model. This kind of multimodal model can obviously run on a centralized unit. At the same time, you can have, let me say, a mixed approach in which you are running a small model, a local model, similar to the approach we saw this morning from the professor from the university. So you can have a small model that is already pre-processing the data and sending the output, and eventually you can have a transformer based on an LLM that is just analyzing the output of the neural network model, without the need to deal with the specific raw data of the sensors.
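A tiny sketch of that mixed approach, small local models per modality feeding only their labels to an LLM-style reasoner, could look like this; the per-sensor models, the prompt format, and the llm callable are illustrative assumptions, not a description of any product.

```python
from typing import Callable, Dict

def summarize_sensors(readings: Dict[str, object],
                      local_models: Dict[str, Callable[[object], str]]) -> str:
    """Run a tiny per-sensor model on each modality and keep only its label."""
    labels = {name: local_models[name](data) for name, data in readings.items()}
    # Compact text summary: a few bytes instead of raw frames or audio buffers.
    return "; ".join(f"{name}={label}" for name, label in labels.items())

def ask_reasoner(summary: str, llm: Callable[[str], str]) -> str:
    """The large model only ever sees the summary, never raw sensor data."""
    prompt = f"Robot sensor summary: {summary}. What should the robot do next?"
    return llm(prompt)

# Usage with toy stand-ins for the local models and the LLM:
summary = summarize_sensors(
    readings={"tof": [1.2, 0.4], "audio": b"...", "vision": "frame"},
    local_models={"tof": lambda d: "obstacle_near",
                  "audio": lambda d: "speech_detected",
                  "vision": lambda d: "person_ahead"},
)
print(ask_reasoner(summary, llm=lambda p: "slow down and greet the person"))
```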
Speaker 2:Yeah, thanks, okay.
Speaker 3:So on the same topic, yes, I agree that humanoids are not the only robotics that we have; robotics is a big area for us as well. But targeting it specifically from the MPU perspective, we have the RZ/V2H, which was released a few months ago and has been quite successful in the robotics field. However, it is power-hungry because it's an MPU, so you can add on to it. We also have something like 16-bit microcontrollers, such as the RL78, and we also have low-power 32-bit microcontrollers that you can add to this very big MPU; you can use them to do inference of machine learning models, and then send that inference to the big MPU for more advanced tasks, to reduce the power consumption and get a faster response, because an MPU still runs Linux. So you'll have a big system working all together.
Speaker 2:Yeah, thanks, okay. Thank you all for being here today. Thank you very much.