
EDGE AI POD
Discover the cutting-edge world of energy-efficient machine learning, edge AI, hardware accelerators, software algorithms, and real-world use cases with this podcast feed covering all things edge AI from the world's largest EDGE AI community.
These are shows like EDGE AI Talks and EDGE AI Blueprints, as well as EDGE AI FOUNDATION event talks on a range of research, product and business topics.
Join us to stay informed and inspired!
Smart Sampling Unlocks Edge AI Capabilities You Never Thought Possible
What if everything we assumed about AI and data was wrong? In a world obsessed with collecting more and more sensor data, Lightscline has discovered something remarkable: we might only need 10% of it.
Drawing inspiration from the human brain's selective attention mechanism, Lightscline co-founders Ankur and Ayush Goel have developed an approach that trains AI models to identify only the most information-rich portions of sensor data streams. The results are staggering: models requiring 400x fewer computational operations while maintaining state-of-the-art accuracy across multiple domains.
This revolutionary approach solves two critical problems facing organizations swimming in sensor data: spiraling infrastructure costs (cloud computing, storage, bandwidth) and mounting human capital expenses, where each additional hour of collected data traditionally requires 40+ hours of analysis. By focusing only on what matters, Lightscline's technology dramatically reduces both.
The real-world impact is already evident. A Fortune 150 company monitoring high-value industrial equipment achieved exceptional accuracy using just 10% of their raw data. Another major software provider saw 381x fewer computational operations and 85x faster training times. Perhaps most impressive is the technology's ability to run on tiny edge devices with as little as 264 KB of RAM, enabling applications previously considered impossible on resource-constrained hardware.
This efficiency breakthrough isn't just incremental; it's transformative. It allows entirely new applications in wearables, industrial monitoring, and distributed fiber optic sensing (which can generate terabytes of daily data from monitoring kilometers of fiber cable). By bringing both training and inference to the edge, Lightscline is redefining what's possible in physical intelligence.
Want to push more intelligence to the edge while dramatically reducing your computational footprint? Discover how Lightscline's approach could transform your sensing applications and unlock entirely new possibilities in edge AI.
Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org
Hi everyone. This is Ankur, co-founder and CEO at Lightscline. With me I also have Ayush Goel sitting here, who is the co-founder and CTO, and we are going to be here for all three days, so we're looking forward to discussing how we can contribute to the community and what use cases we can work on. The founding story of Lightscline is actually based on a very counterintuitive insight which is very relevant for the audience of this conference. The insight is basically a simple question: does more data actually mean more information? That's not necessarily the case. In fact, if we look at the human brain, there is something called the selective attention mechanism, which means that, even in today's presentation, you will keep only the important things in your mind and probably ignore the rest, so the brain is very efficient about what sort of information it parses. We essentially asked: can we take this idea and teach AI models to identify important subsets of data in large streams of sensor data? It turns out we can. We are able to get great test accuracies with just 10% of the raw data. So today we are going to focus on how we have accelerated some sensor AI models using the Lightscline AI SDK.
Speaker 1:So the problem we are trying to solve is that we are generating huge amounts of sensor data across different modalities, anywhere from space, air, land and sea. It involves satellite data analysis, undersea fiber optic cables, the industrial Internet of Things and so on. This leads to two major problems: more infrastructure costs and more human capital costs. Infrastructure costs essentially comprise cloud computing costs, storage costs, transmission bandwidth requirements and so on. On the human capital side, every additional hour of data that we collect leads to 40-plus hours of analysis. So you collect five gigabytes of data, and then data teams need to spend a whole lot of time figuring out how to do manual processing and things of that sort.
Speaker 1:So if we dig a little deeper into this, we again have different sensor modalities generating terabytes of data per day. What this creates is a pipeline that starts at the edge, at the point of data generation. There we are limited by what size of compute we can put at the edge itself, so we are limited by things like the maximum milliampere-hour rating, how much wattage we can supply, what the battery weight is, and so on. Next, we have to deal with the bandwidth: if we need to transmit everything, that has its own cost and complexities.
Speaker 1:Next is the cloud compute. The cost and complexity of cloud compute is becoming more and more obvious, especially with AI training workloads, GPUs and so on. And finally there is the human capital component, which, in our experience, has been of a very similar order of magnitude to the infrastructure cost, because each additional hour of data leads to several additional hours of analysis. And at the end we get to a decision based on this current pipeline. So how does this pipeline behave as we get to terabytes of data per day? How scalable or feasible is it to flow the data through this chain of edge compute, bandwidth, cloud compute, storage, human capital cost and so on? We need to start thinking more about the end-to-end requirements of getting from raw data to a decision. Dealing with more data has actually kept us in the realm of more flops, higher data volumes and the cloud, whereas what we want to do, and especially the theme of this conference, is to push more use cases to the edge. So the entire argument is: I will show how we have taken three use cases, around wearables, the industrial Internet of Things and distributed fiber optic sensing, which is a new emerging capability, and how we have moved those capabilities, both training and inference, from the cloud all the way to the edge.
Speaker 1:So now, imagine a future like this; this could be on Mars. This is essentially visualizing physical intelligence, lightweight intelligence. There's no way we can just add more GPUs and build bigger data centers there. In fact, physical intelligence requires different form factors, because we may want to deploy intelligence on different sites. Robots require a different form factor, wearables require a different form factor, and so on. So how do we build AI models which are specific to those form factor capabilities?
Speaker 1:And this comes back to the original question of how we can do more with less. This is where we can take some inspiration from the brain: how can we do novel algorithmic development to exploit structure that is already present in real-world sensor data? In this respect, this is a snippet from IEEE Spectrum, which recently did a story on our research. The question we ask here is: can we be more selective? Instead of collecting everything, can we start collecting only what is relevant, and can we train new types of neural network architectures to do that? This is essentially our key insight: more data does not necessarily mean more information. You can look at two examples. One is on time series, where a thousand data points can still be represented by just one data point if we take an appropriate transform. Similarly, with images, we know that even if we discard 95% of the FFT coefficients we can still get good inferences, because there's not a lot of loss of information.
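As a rough illustration of the FFT point above (a hedged NumPy sketch on a synthetic signal, not the Lightscline method itself), keeping only the largest few percent of frequency coefficients reconstructs a structured time series with small error:

```python
# Hedged illustration only: synthetic signal, not Lightscline's actual sampling method.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1000, endpoint=False)
# A structured "sensor" signal: a few tones plus mild noise.
x = (np.sin(2 * np.pi * 5 * t)
     + 0.5 * np.sin(2 * np.pi * 42 * t)
     + 0.05 * rng.standard_normal(t.size))

X = np.fft.rfft(x)
keep = int(0.05 * X.size)               # keep only the largest 5% of coefficients
drop = np.argsort(np.abs(X))[:-keep]    # indices of the coefficients we discard
X_sparse = X.copy()
X_sparse[drop] = 0.0
x_hat = np.fft.irfft(X_sparse, n=x.size)

rel_err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
print(f"Relative reconstruction error with 5% of coefficients: {rel_err:.3f}")
```

Because most of the signal's energy sits in a handful of frequencies, discarding 95% of the coefficients barely changes it; that is the kind of structure the talk argues real sensor data tends to have.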
Speaker 1:So the foundational insight, again, is that there's a lot of structure in the data, and if we exploit it well we can unlock 100x efficiencies. In fact, this past year, in a talk at Google DeepMind, three out of the four Nobel laureates mentioned something to the effect that the next big wave of research will be on how we exploit the structure in the data, not just on the models. There is a lot that can be done at the data level, and that can lead to significant efficiency unlocks, and that is pretty much what Lightscline has been building toward so far. To summarize: we are making the argument that if we exploit the structure in the data, we can train models that are more than 100x lighter and faster than current models while still maintaining the prediction accuracies, the performance and so on. This is some of the media coverage on the paper that we had in Nature, which essentially shows a new neural network architecture wherein we got a 435x reduction in the number of flops in comparison to convolutional neural networks on a variety of datasets. We'll also look at the universality of this approach on different modalities like wearables, industrial IoT, distributed fiber optic sensing and so on. So what does Lightscline do?
Speaker 1:I would like to introduce this concept of useful flops. Everyone is talking a lot about flops; we see AI labs reporting that they train models with teraflops and gigaflops and all that. Why are we not thinking about what the useful flops are? To give an oil and gas analogy, it's like saying I have a million gallons of crude oil, but what I actually care about is extracting one gallon of high-grade gasoline. We should have more measures of intelligence that focus on how much intelligence, or how many use cases, we are actually delivering, not on how many flops a model burns or how big we can make it. So that's the foundational argument, and we'll see how this 400x reduction has actually pushed use cases like wearables, the industrial Internet of Things and fiber optics from the cloud to the edge, with training and inference both on the edge.
Speaker 1:To summarize the value prop: we are leveraging AI to automatically train models and run inference on 90 percent less data, which results in lighter and faster models. There are two advantages to this: one is cost-based, one is value-based. The cost-based advantage is that we reduce 90 percent of the AI infrastructure- and people-related time and cost. The value-based advantage is that we enable applications that are currently unobtainable, and we'll walk through a lot of these applications, especially on distributed fiber optics and wearables, where we have deployed on less than 264 kilobytes of RAM, and so on. Now we'll look at three use cases of how the Lightscline AI SDK has accelerated both model development and deployment.
Speaker 1:Throughout the presentation, I'd like to emphasize that this is valid both during training and inference, and that is why this is different from conventional downsampling or dimensionality reduction: it translates to the real-time inference stage as well. On the wearables side, we look at how we have deployed sensor AI inference on less than 264 KB of RAM, on a Raspberry Pi Pico, which is basically a bare-metal device. We'll also look at a Fortune 150 case study wherein we did cloud training with 90% less data. We'll look at a use case with one of the largest software providers, where we achieved 381x fewer flops and 85x faster training on their datasets. And then, finally, distributed fiber optic sensing is very interesting. Recently there were incidents of fiber cable damage in the Baltic Sea, and so this is gaining a lot of traction.
Speaker 1:Passive sensing within distributed fiber optics can be used to extract a lot of information for use cases like traffic monitoring, urban infrastructure monitoring, critical infrastructure monitoring and earthquake prediction. This is also being used by oil and gas companies for exploration, to decide where to dig for oil wells and so on. So here is the first use case. We are looking at a Fortune 150 customer who used our SDK inside their own platform, and they are essentially doing high-capex industrial machine monitoring. As you can see, these are high-capex assets: bearings, gearboxes, centrifugal pumps and so on.
Speaker 1:These have a variety of sampling frequencies, so again very high sampling frequency, large amounts of data collected within a very small amount of time. Now you can see the accuracies here: they achieved pretty remarkable accuracies, but they trained using just 10 percent of the raw data. So think about it: if you had one GB of data per use case, we are actually sampling and training our models on only 10% of that data. We are never training on everything. How we do the sampling is the IP, and that is how the smart sampling happens, because we cannot naively downsample, since then we would lose information per the Nyquist criterion. But this is to show a customer success study: this essentially reduces a whole lot of their cloud computing costs, because we are again training only on the 10 percent of the data that is relevant.
Speaker 1:Another validation they did of how this is better than conventional downsampling: they compared our approach with principal component analysis, which is one of the most common dimensionality reduction techniques. Here you can see the original and Lightscline curves overlap each other, whereas PCA is way off. To summarize what this means: our sampling and model training combined preserve the information, whereas naive downsampling or other dimensionality reduction techniques do not. The other way this is different from dimensionality reduction approaches is that we sample during training, but this also means that at inference time we are only running inference on a few samples. That is how we achieve significant benefits, and the 400x flops reduction also happens during the inference stage, which is a big deal.
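To make the idea of sampling inside the model concrete, here is a minimal PyTorch sketch, my own hedged illustration and not Lightscline's actual architecture or IP: a small scoring network ranks fixed-length segments of the input signal, only the top ~10% of segments are kept, and only those are fed to the classifier, so the reduction applies at both training and inference.

```python
# Hedged sketch only: a generic learned-sampling idea, not Lightscline's SDK or architecture.
import torch
import torch.nn as nn

class SegmentSampler(nn.Module):
    """Scores fixed-length segments of a 1-D signal and keeps only the top-k."""
    def __init__(self, seg_len=100, keep_frac=0.1):
        super().__init__()
        self.seg_len = seg_len
        self.keep_frac = keep_frac
        self.scorer = nn.Sequential(nn.Linear(seg_len, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):                                   # x: (batch, signal_len)
        b, L = x.shape
        segs = x.view(b, L // self.seg_len, self.seg_len)   # (batch, n_seg, seg_len)
        scores = self.scorer(segs).squeeze(-1)              # (batch, n_seg)
        k = max(1, int(self.keep_frac * segs.shape[1]))
        top = scores.topk(k, dim=1).indices                 # hard top-k segment indices
        kept = torch.gather(segs, 1, top.unsqueeze(-1).expand(-1, -1, self.seg_len))
        # Weight kept segments by their sigmoid scores so the scorer still gets gradients.
        w = torch.sigmoid(torch.gather(scores, 1, top)).unsqueeze(-1)
        return kept * w                                     # (batch, k, seg_len)

class TinyClassifier(nn.Module):
    def __init__(self, seg_len=100, keep_frac=0.1, n_seg=10, n_classes=5):
        super().__init__()
        self.sampler = SegmentSampler(seg_len, keep_frac)
        k = max(1, int(keep_frac * n_seg))
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(k * seg_len, 64), nn.ReLU(),
                                  nn.Linear(64, n_classes))

    def forward(self, x):
        return self.head(self.sampler(x))   # classifier only ever sees ~10% of the samples

# Toy usage: random "sensor" batch, one training step.
model = TinyClassifier()
x = torch.randn(8, 1000)                    # 8 signals of 1000 samples each
y = torch.randint(0, 5, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()                             # inference likewise runs on only the kept segments
```

The point of the sketch is simply that the selection step sits inside the model, so the savings carry over to deployment, unlike a one-off preprocessing pass such as PCA.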
Speaker 1:The next case study is with one of the largest privately held software providers. Here you can see that we again took one of their use cases. We compared against traditional ML with feature engineering, traditional ML with raw data, so a lot of existing approaches, and also deep neural networks like CNNs, which are, at least for now, state-of-the-art on these kinds of data. Then we used Lightscline, so both sampling and training combined, and with that we again achieved state-of-the-art accuracies with 381x fewer flops and 85x faster training than current models. So for them, the value prop again is less cloud cost, faster training times, and data scientists not having to spend a lot of time on feature extraction and looking at all of the data streams; a lot of those things can be automated.
Speaker 1:Now this slide talks about the universality of the approach. These results are again from the Nature paper, wherein we showed results on different modalities like wearables and industrials, dealing with a wide variety of sampling frequencies and so on. One of the questions can be: how universally applicable is this, and do we have to do a lot of fine-tuning or anything like that depending on the data? The answer is no. You can use the same four lines of code on a variety of use cases, which is what we have been able to do. These are four different industry-standard datasets. We validated them for both industrial and wearables use cases and achieved different amounts of benefit in terms of how many flops we need, while maintaining the prediction accuracies.
Speaker 1:Let's see if I can play this video... there's a video here... yeah, there it is. So this is the SDK itself. It's basically a wheel file, so you download that onto your laptop.
Speaker 1:Next, this section imports the data. Here we look at 51 channels of wearables time series data, and then we show the dimensionality. Once the data is ingested, we call Lightscline's sampling and training module together. So this is the data loading process; the data itself is big, approximately 1.5 gigabytes in size. The next block ingests that data into the Lightscline compute module, and then this "reduce data and train model" step is the key function, which does the sampling and the model training together. Then you can see the loss going down. This is a standard PyTorch-style training loop, and the SDK is compatible with standard machine learning libraries like PyTorch and Keras, so there is not a lot of work on your end; it's very easy to use. Then, finally, I think the video finished, but we also showed the training accuracies: we were able to get 99% training accuracy. The main benefit here is that this training happens in literally 75 seconds. If we were to use all the raw data and do manual feature extraction, that would take several hours, like 20, 30, 40 hours. But now, because we are training neural networks that can figure out what data to collect based on a loss function, we can significantly accelerate both the training and deployment phases.
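For readers who want a feel for the shape of the workflow being demoed, here is a hypothetical sketch. The function name reduce_data_and_train is invented for illustration and is not Lightscline's actual API, and the variance-based window scoring inside it is only a crude stand-in for the SDK's learned smart sampling.

```python
# Hypothetical workflow sketch. 'reduce_data_and_train' is an invented name, not the real
# Lightscline SDK API, and variance scoring is a crude stand-in for learned smart sampling.
import numpy as np
from sklearn.linear_model import LogisticRegression

def reduce_data_and_train(X, y, win=100, keep_frac=0.1):
    """Segment the stream into windows, keep the 'richest' ~10%, train only on those."""
    n_win = X.shape[0] // win
    Xw = X[: n_win * win].reshape(n_win, win, X.shape[1])        # (windows, win, channels)
    yw = y[: n_win * win].reshape(n_win, win)[:, 0]              # one label per window
    scores = Xw.var(axis=(1, 2))                                 # crude information proxy
    keep = np.argsort(scores)[-max(1, int(keep_frac * n_win)):]  # top 10% of windows
    feats = Xw[keep].reshape(len(keep), -1)                      # flatten kept windows
    model = LogisticRegression(max_iter=1000).fit(feats, yw[keep])
    return model, model.score(feats, yw[keep])

# Synthetic 51-channel wearables-style stream, stand-in for the ~1.5 GB dataset in the demo.
X = np.random.randn(10_000, 51).astype(np.float32)   # (time steps, channels)
y = np.random.randint(0, 4, size=10_000)             # activity labels
model, train_acc = reduce_data_and_train(X, y)
print(f"trained on 10% of windows, train accuracy = {train_acc:.2f}")
```

The shape of the call, one combined "sample and train" step over a large multichannel array, is what the demo shows; how the informative windows are actually chosen is the part described in the talk as proprietary.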
Speaker 1:Next we will look at the distributed fiber optic sensing modality. This data is actually very large. These are almost like images, but each one is essentially a time series per channel, and you have several channels, because fiber optic cables can be several hundred kilometers in length. We are working on some datasets with more than 50 kilometers of fiber optic data. These can typically generate a few hundred gigabytes of data per day, depending on what length of cable we are monitoring, and we are dealing with this volume of data on a daily basis. In fact, these are often in very remote environments; think about offshore environments or traffic monitoring scenarios.
Speaker 1:So we don't even have the capability to transmit all of that data, and we don't have a very powerful server located on site to process it. Right now, literally the biggest challenge for customers is even sending us data: how do we send this to you? Do we send it on a hard disk, do we send it on a NAS server, and so on, because this is something that cannot even be sent via a OneDrive link. One customer literally had 8 terabytes of data from just a few days of fiber optic sensing, so even sending that data raises the question of what sort of infrastructure is needed; that is a challenge. But this is an emerging application and it's growing very rapidly. Here we were able to train by sampling both in time and across channels, and we trained these models in under eight minutes on an Intel i7 processor. Again, by exploiting the structure in the data, we don't necessarily have to build very complex models; we can reduce the size of the models and make training and inference much faster.
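As a rough, hedged illustration of what "sampling both in time and across channels" can mean for distributed-fiber-style data (a generic NumPy sketch with assumed channel counts and sampling rate, not the actual Lightscline pipeline), selecting a subset of channels and time windows shrinks the array dramatically before any model sees it:

```python
# Generic illustration of 2-D subsampling for DAS-style data; not the Lightscline pipeline.
import numpy as np

fs = 1000                                    # samples per second per channel (assumed)
channels, seconds = 1000, 30                 # assumed interrogator channel count and capture length
data = np.random.randn(channels, seconds * fs).astype(np.float32)

# Keep ~10% of channels and ~10% of time windows, scored here by short-window energy,
# a crude proxy; in practice the selection itself would be learned.
win = fs                                     # 1-second windows
windows = data.reshape(channels, seconds, win)
energy = (windows ** 2).mean(axis=2)         # (channels, seconds) energy map

keep_ch = np.argsort(energy.mean(axis=1))[-channels // 10:]          # top 10% of channels
keep_t = np.argsort(energy[keep_ch].mean(axis=0))[-seconds // 10:]   # top 10% of seconds
reduced = windows[np.ix_(keep_ch, keep_t)]                           # (100, 3, 1000)

print(f"raw: {data.nbytes / 1e6:.0f} MB -> reduced: {reduced.nbytes / 1e6:.2f} MB")
```

Cutting both axes by 10x reduces the volume by roughly 100x, which is the kind of reduction that makes training feasible on a single workstation or edge box instead of a cloud cluster.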
Speaker 1:Now we'll start looking at some use cases of training on the edge itself. Here we are looking at how training went on two edge devices: an NVIDIA Jetson Nano and an Intel i7. On the left you can see that if we use more than 50% of the raw data, we cannot train on the Jetson Nano because it runs out of resources. However, we can train on the Jetson Nano if we use just 10-20% of the raw data, because that fits within the available RAM and compute. The Intel i7 is more powerful, so we can still do it there. If we look at the transfer learning scenario, transfer learning is only possible on the edge if we use Lightscline compute, meaning that with 100% of the raw data the Jetson Nano is not able to do transfer learning.
Speaker 1:I had some questions on whether we can do on-device learning or self-learning. This is an example of that. Think about it: if person A has a specific walking signature and we want to adapt the model to person B, we can do that on the device itself. We don't have to run the retraining in the cloud; we can retrain the last two layers essentially on the device itself, and that is how we enable on-device learning.
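As a generic, hedged sketch of the "retrain only the last two layers" idea (standard PyTorch fine-tuning, not Lightscline's specific on-device implementation): freeze everything except the final layers, so the on-device update touches only a small fraction of the parameters and memory.

```python
# Generic last-two-layer fine-tuning sketch; not Lightscline's specific on-device code.
import torch
import torch.nn as nn

# A small pretrained-style activity classifier (weights would normally be loaded, not random).
model = nn.Sequential(
    nn.Linear(100, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 4),
)

# Freeze everything, then unfreeze only the last two Linear layers.
for p in model.parameters():
    p.requires_grad = False
for layer in (model[4], model[6]):
    for p in layer.parameters():
        p.requires_grad = True

opt = torch.optim.SGD((p for p in model.parameters() if p.requires_grad), lr=1e-2)

# Adapt from "person A" to "person B" with a handful of on-device samples (synthetic here).
x_b = torch.randn(32, 100)            # person B's gait windows
y_b = torch.randint(0, 4, (32,))
for _ in range(10):                   # a few cheap local epochs
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x_b), y_b)
    loss.backward()
    opt.step()
```

Because only the small final layers are updated, the optimizer state and gradients stay tiny, which is what makes this kind of adaptation plausible on memory-constrained edge hardware.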
Speaker 1:This is an example of the smallest form factor we have deployed our AI in. This is a Raspberry Pi Pico, which has 264 KB of RAM and draws a few milliwatts of power, and we have been able to deploy on it. It is actually running inference: we are collecting time series data and then running inference on top of it. So these would be the benefits: this can enable training, inference and transfer learning on your current chips, because it frees up a lot of compute and power, and newer products can have built-in self-learning capabilities. To end, I think we've partitioned the audience in two ways: one is chip companies or integrated device manufacturers, and the other is system integrators, so we're happy to work with you. You can scan the QR code and put in any use case that comes to mind. Ayush and I will both be here, so we'll be happy to discuss over the next few days. Thank you. Thank you, Ankur. So we have time for some quick questions for Ankur.