EDGE AI POD

Counting What Counts: How Edge AI is Saving Japan's Seafood Industry

EDGE AI FOUNDATION

When SoftBank's team was approached by Japanese fishermen struggling with declining catches and inefficient practices, they knew technology could offer a solution. Their response: an innovative Edge AI system that transforms traditional aquaculture through sophisticated computer vision running directly on smartphones.

Japan's fishing challenges are emblematic of global food security concerns. With the country's self-sufficiency rate at just 38% and fish catches halved from peak levels, the stakes couldn't be higher. Traditional aquaculture has relied heavily on intuition rather than data, making it nearly impossible to optimize feeding (which represents 60-70% of operational costs) or accurately monitor fish populations.

The SoftBank and AIZip collaboration tackles these challenges through a remarkable edge computing approach. What makes their solution particularly groundbreaking is how it functions in environments with zero infrastructure—no power supply, no connectivity, and corrosive saltwater everywhere. By developing density-based crowd counting AI models that run efficiently on edge devices, they've created a system that can count hundreds of fish with 96% accuracy in clear water and 86% accuracy in muddy conditions.

Perhaps most fascinating is their development process. Unable to collect sufficient real-world training data underwater, the team developed a sophisticated Unity-based simulation that generates realistic fish behavior under various conditions. This simulator provided 65% of their training data, complemented by manual observation from divers who documented actual fish behavior at different depths and feeding stages. The result is an AI system that not only counts fish but can potentially detect hunger levels, health issues, and optimize feeding schedules.

This CES 2025 Innovation Award-winning technology demonstrates how Edge AI can transform traditional industries without requiring massive infrastructure investments. By bringing intelligence directly to the point of data collection, even in the most challenging environments, we're witnessing the beginning of a new era in sustainable food production. Whether you're involved in agriculture, environmental monitoring, or any field requiring intelligent sensing in harsh conditions, the lessons from this underwater AI revolution could transform how you approach your next challenge.

Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org

Speaker 1:

Thank you, okey dokey. All right, morning Davis, how you doing? Good morning. We're back after a week in Austin, Texas, which was pretty exciting.

Speaker 2:

Yeah, no, I had a snowy welcome, so I wasn't in Kansas, I wasn't in Austin anymore. That's pretty clear from the current climate here.

Speaker 1:

Yeah, no, we had a big US event for the EDGE AI FOUNDATION in Austin last week. Three full days of talks and demos and workshops and beer and barbecue and all that good stuff. So I had my fill, I'm good with that. Next week is Embedded World.

Speaker 2:

I need to upload files. Yeah, the saga continues. I mean, it's an edge AI start to the new year. I think I need a downlink, or an upload, of all the firsthand experience, conversations and presentations to really digest it all. It's been growing.

Speaker 1:

Yeah, Embedded World next week. I was on a call this morning; I'm hosting a few panels there. Apparently they're expecting 190,000 people in Nuremberg, so that's kind of insane. We're going to be there with a booth, people should stop by Hall 4, Booth 602, and we're also sponsoring an event with IoT Stars on March 11th. If you're in Nuremberg, come by or get some tickets. It's going to be a big party, IoT Stars, big community. So do that, visit our booth, and visit the NXP booth. I'm sure you'll have some gargantuan pavilion there with state-of-the-art demos and all that good stuff. So it's going to be another intense week of embedded edge AI stuff.

Speaker 1:

Yep, the other thing I want to do is another PSA. Lots of good demos there, by the way. A PSA: you can see the gear I'm wearing, my hoodie, let's try this side. We now have hoodies, and they support our scholarship fund, if you want to go to our store. We have a scholarship fund that supports fellowships and travel grants and underwrites education programs, and we're now selling gear to support that. So if you go to H-O-O dot B-E, forward slash, E-J-I (we'll put that in the chat), there's a little store there where you can buy. This hoodie is really cool. It's actually reversed on camera, so it's a little disconcerting.

Speaker 1:

But yes, you can get this, you can get a coffee thing, you can get some stuff, so we appreciate that. If you want to gear up for Embedded World, there's probably still time.

Speaker 2:

But let's talk about what we're going to talk about today.

Speaker 1:

So this is exciting. We're going to talk about aquaculture. Aquaculture, not to be confused with agriculture, but it's very similar.

Speaker 2:

And we're going to talk about edge AI and aquaculture with SoftBank and AIZip. Interesting as usual, and a problem we can all relate to: how do we acquire food, how do we harvest what it has for us without compromising integrity or sustainability, and do it in intelligent, clever ways, as always. I think we've really seen a bit of a trend with these real-world deployments, so I'm really excited to have our guests today. Again, good problem, good solution.

Speaker 1:

Yeah, why don't we bring them on? We'll bring on Yuan Lu and, from SoftBank, Yuko Ishiwaka. Yuan is president of AIZip. Actually, let me put you in a cooler format here. You guys get the bigger squares because you're guests; we're in the small squares. Thank you. But yeah, welcome to EDGE AI Blueprints. And Yuko, I know it's a little bit early in the morning for you there, so I appreciate your indulgence.

Speaker 3:

Yes, a little bit early, it's only 1 AM.

Speaker 1:

Maybe we should do an evening version of the Blueprints livestreams. Maybe that's better. And Yuan, where are you dialing in from? I think you're in California, right?

Speaker 6:

I am from the Bay Area.

Speaker 1:

Yeah, okay, so that's easy.

Speaker 6:

You and I are on the same page.

Speaker 1:

That's right. Davis, you're in Toronto, right? You're on East Coast time.

Speaker 2:

Close Ottawa.

Speaker 1:

I'll be a few hours ahead, but nothing like what you have to go through, so yeah. Sorry, no worries. We're going to have a big audience today, which brought in a lot of questions, so we're really excited about digging into this. My question for you, before we dig into the presentation and everything, is: what was the origin of this project? How did AIZip and SoftBank get together on this and attack this problem? What's the backstory?

Speaker 3:

Yuan, will you start, or should I?

Speaker 6:

Well, I think Yuko-san, you should start. The problem comes from you guys, right?

Speaker 3:

Actually, SoftBank started with aquaculture production, because Japan has a big problem producing fish. A fisherman asked SoftBank to solve the problem. So we bought three fish cages in Japan and developed an AI solution. But unfortunately it needed cloud services and a GPU, so we asked AIZip to embed it on edge AI.

Speaker 6:

Yes, indeed. The problem itself, you will see in Yuko-san's very beautiful slides that talk about the problems. From the AIZip side, I think the value we provide is exactly as Pete said at the beginning: providing a solution on the edge, on the phone.

Speaker 1:

And is this unique? Is this the first time that you know of that this kind of AI vision and edge AI has been used in aquaculture specifically? Fish farming and that sort of aquaculture has been around for a little while, but I suspect this is kind of a new twist.

Speaker 3:

Yes, actually, just this year we succeeded in doing it, compared to the ATA.

Speaker 1:

Wow, fantastic.

Speaker 3:

Yes, but we already had a booth at CES. Oh, right, right, right, yes, yes, that's right.

Speaker 1:

Yeah, I think I saw a post on it, and I think that was one of the things that piqued our interest. It was like, wow, this is a really unique situation. Because probably the challenge with aquaculture, and you'll get into this, is efficiency. We were talking about this last week in Austin; we said efficiency is the new currency, the new metric for success, especially with edge AI: doing more with less cost, less power. And I suspect in aquaculture, being efficient with resources is critical for the success of the business. Right?

Speaker 3:

That's true. Yeah, I think so.

Speaker 1:

Yeah, sure, of course. Okay, we started to get our first question in. Actually, let me throw the question up and then maybe we can help this person clarify. So Hubom asked: can you be a little more clear about the problem and the Japan example? Which I think you're going to get into.

Speaker 3:

I think that's covered at the start of my presentation.

Speaker 1:

Maybe we should just start it, and then he or she will get the answer.

Speaker 2:

If it's still not clear, we can always come back to it. We have an hour.

Speaker 1:

Yeah, that's true, that's true. So we'll bring this up, and then if you want to get into it, Davis and I will sort of fade out, although we'll be close by. As questions come in, we may jump in and throw a few questions in here from the audience. And we really encourage the audience: this is an opportunity, as you see things, to ask questions, learn a little bit, learn something new today and learn about the solution. So does that sound like a good start, Davis? Any final words of support and wisdom?

Speaker 2:

Oh, let's have some fun. This is a great opportunity to learn new things and go deeper, so let's cut to the chase.

Speaker 1:

All right, let's do it. Yuko, it's all yours.

Speaker 3:

Okay, thank you very much, I'm going to the second slide.

Speaker 6:

Do you want me to go back? Okay, sure, let me go back.

Speaker 3:

Okay. So today I'd like to talk about our approaches to solving the challenges in the fishery industry, and this solution won an Innovation Award at CES this year. I'm Yuko, and Yuan came from AIZip. Could you go to the next slide? Oh, it's so slow. Yes, my name is Yuko Ishiwaka, and I came from SoftBank. Actually, I was on the academic side in Japan; I was an associate professor at Hokkaido University, and then I moved to SoftBank, where I have done different research work, for example neuroscience (I'm still doing neuroscience), multi-agent systems, and for these last three years the aquaculture project. I got the national marine diver's license to do this project. And we made a video, so please watch the video first. Could you go to the next slide?

Speaker 4:

SoftBank has built its own fish cages in order to conduct research and development into making aquaculture smarter with technology. Currently, we are trying to help fish farmers with AI, not only to make their work, often dependent on intuition, more efficient, but also to solve their problems. First, it is necessary to understand the internal conditions of the fish cage. At SoftBank, multiple cameras are installed underwater to observe the fish in the fish cages. It has become clear that there are differences in the behavior of fish depending on the depth and the season. In order to obtain information about the inside of the fish cages that cannot be captured by cameras, employees who have obtained diving licenses dive into the fish cages themselves to take photographs and collect environmental data. The various experimental devices used in experiments are custom made as necessary by employees themselves through repeated trial and error. In the past, we only measured body weight, but we have succeeded in developing a body measurement device that simultaneously measures four data points: body weight, body length, body height and body width, which we now measure regularly. We have set up an experimental base near the fish cage where we hold briefings, prepare for experiments and repair experimental equipment.

Speaker 4:

The various information obtained from the fish cage experiments is used in a 3D CG simulation. We take the recorded video and apply it to the simulation. The data from the CG simulation is used as training data for machine learning. It is also possible to simulate changes in ocean environmental data due to seasons and weather. For example, the simulation can adapt to farms with varied cage shapes by simulating the behavior of fish before introducing a new kind of cage. It is possible to run simulations to optimize the timing and amount of feeding. When using our AI to count fish in actual footage, it was able to produce a result similar to humans in 1/60th of the time. We are working to solve many problems faced by fish farmers to help stabilize their business. SoftBank will continue to work towards sustainable societal adoption of these technologies. Okay, please go next.

Speaker 3:

I will explain the details from this video later. So, what is happening in the world today?

Speaker 6:

I understand what you're trying to do.

Speaker 3:

Yeah, sorry, I couldn't use my PC. So, we have so many problems in the world, for example terrorism or inflation, and humans have also damaged the environment, and this threatens food security. Next slide, please. The trend in global marine product production is something like this: capture fishing is stable and aquaculture is growing. But unfortunately Japan's food self-sufficiency rate is quite low, now only 38%, and food imports are increasing. Next, please. Also, the amount of catch is at about half of its peak, and the number of people employed in the fishing industry has been decreasing. It is a big problem for Japan.

Speaker 3:

The role of SoftBank in the fishing industry is that we want to expand the demand and supply of aquatic products. Today I'm talking about the supply side, where we are trying to create smart aquaculture products. So why is aquaculture growing around the world? Because it allows sustainable production. Please go ahead. It looks like an optimized industry, but it has some issues, so we want to break free from the conventional manual approach with data-driven management. The current problems are production efficiency and unstable management; today, fishermen need intuition, experience and bravery. In the future we aim to optimize feeding. We want to stop overfeeding, and to do that we need the number of fish in the cage, the size and weight of the fish, and feeding control. Also, if we can estimate and control the catch, it is very helpful for the economics of the fishery. Please. So these are the main tasks, and the risks and costs of aquaculture. We need feeding optimization, which means understanding fish conditions and overfeeding, and we also need a new feeding machine and better work efficiency. Every day some fish die, and if dead fish remain in the fish cage, disease increases: once one fish is sick, the sickness spreads to the rest of the fish. So the farmers have to remove the bodies, clean the cages, and do the feeding too. For an estimated and controlled catch, they need the size and the weight, because that determines the price, and also the feeding and the potential for growth: they sometimes want to change the food, but they don't know how effective the new food is. They also want to reduce the death rate. Could you go next? The main cost of fish farming is the food; it is 60 to 70% of the cost, and they are overfeeding. That is why we need feeding optimization.

Speaker 3:

So, what we are doing now: here are SoftBank's fish cages. Could you go to the next slide? These three are SoftBank's fish cages, we have three of them, and we are farming red sea bream; per cage we have about 6,000 fish. Please. And these are the SoftBank members: some of them have the national marine diver's license, and some are boat drivers. I have the boat license, so I can drive the boat and also dive into the cage. Okay, please go.

Speaker 3:

And here is preparation for the experiments; they are also trying to synchronize the cameras. And here is an example of the dataset. Actually, we tried to get the data from underwater, so one source is the mounted cameras, and we set a camera every meter in the fish cage. The depth is nine meters, so we use at least eight or nine cameras per cage to capture the full situation of the fish cage. But because of the FOV and the underwater conditions, we cannot see the whole situation in the cage. So I decided to dive into the cage myself, observe the fish behaviors and collect the data. Okay, please. Also, we want to estimate the size and the weight, so we collect that data from the fish with the size measurement device. We made these materials ourselves, because we have to measure very quickly, otherwise the fish will be damaged. We wanted something that measures everything simultaneously, so we made it.

Speaker 5:

Yes, please.

Speaker 3:

After observing the fish cage for one year, we found some interesting fish behavior. Their behavior is different depending on the depth, and they also change their behavior before feeding, during feeding and after feeding. So maybe we can detect whether a fish is hungry or not. Okay, please. Deep learning would be very helpful here, but the problem is how we can get the ground truth dataset. Our approach is to create computer graphics. The top left is the actual fish cage and the right one is the computer graphics fish cage. We also create the camera, and once we succeed in creating an object, we can change the size of the object, its shape or anything else in the computer graphics. Please. We also simulate the water condition. Normally, as you know, with light in water, red disappears first, then green, and blue is the last to disappear. But if the water condition is horrible, something muddy, then the order of disappearing colors changes: the first is still red, but the second is blue, so sometimes we can only see green underwater. We need to simulate that part. So we create the simulator of the environment, and after we create the environment, we put in the fish.
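The wavelength-dependent fading Yuko describes (red vanishing first in clear water, with the order changing in turbid water) can be approximated with per-channel exponential attenuation. The sketch below is a simplified Beer-Lambert-style illustration with made-up coefficients; it is not the model used in SoftBank's Unity simulator.

```python
# Hedged sketch of depth-dependent color attenuation underwater.
# Coefficients are illustrative only; a real simulator tunes them per water condition.
import numpy as np

CLEAR_WATER = {"r": 0.45, "g": 0.07, "b": 0.04}   # red is absorbed fastest
MUDDY_WATER = {"r": 0.45, "g": 0.10, "b": 0.60}   # blue also drops, so green dominates

def attenuate(rgb_image: np.ndarray, depth_m: float, coeffs: dict) -> np.ndarray:
    """Scale each channel by exp(-k * depth); rgb_image is float32, HxWx3, values in [0, 1]."""
    k = np.array([coeffs["r"], coeffs["g"], coeffs["b"]], dtype=np.float32)
    return rgb_image * np.exp(-k * depth_m)

frame = np.random.rand(480, 640, 3).astype(np.float32)   # stand-in camera frame
at_5m_clear = attenuate(frame, 5.0, CLEAR_WATER)          # reds mostly gone, blue/green remain
at_5m_muddy = attenuate(frame, 5.0, MUDDY_WATER)          # mostly green survives
```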

Speaker 3:

The bottom left shows our simulation result together with the real one. The left-hand side is the real recorded data and the right-hand side is our simulated data, and it's quite similar to the real one. Once we succeeded with that, we could add fish in numbers similar to the actual fish cage. Please. Once we have the simulation, it's easy to get the ground truth data, because we know everything: for example, 3D bounding boxes, 2D bounding boxes and also segmentation; we can get all the data. Then we tried to do fish counting.
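Because the simulator knows every fish's position, count labels can be generated automatically. A common recipe for density-based counting, sketched here under that assumption, places a small Gaussian at each annotated fish center so the label map integrates to the true count; this is a generic illustration, not necessarily the exact labeling pipeline used by the team.

```python
# Hedged sketch: turning known fish centers (e.g. exported from a simulator)
# into a density-map label. Each fish contributes a Gaussian blob that integrates to ~1,
# so the whole map sums to roughly the true fish count.
import numpy as np
from scipy.ndimage import gaussian_filter

def density_label(fish_centers, height, width, sigma=4.0):
    """fish_centers: iterable of (row, col) pixel positions of fish in the frame."""
    label = np.zeros((height, width), dtype=np.float32)
    for r, c in fish_centers:
        if 0 <= r < height and 0 <= c < width:
            label[int(r), int(c)] += 1.0
    return gaussian_filter(label, sigma=sigma)   # smoothing keeps the total mass ~unchanged

centers = [(120, 200), (130, 215), (300, 400)]   # toy positions for three fish
label = density_label(centers, height=480, width=640)
print(round(float(label.sum())))                 # -> 3
```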

Speaker 3:

We succeeded in reaching more than 90% accuracy. Even human counting is not accurate, because people use their eyes and a counting machine, so sometimes they double count and sometimes they miss a count; no one knows the exact number. But it's quite accurate. So, what we are working on with AIZip: this video shows the site of the actual fish cage. As you can see, there are a lot of holes, it is very unstable, and sometimes a very strong wind comes, so we really have to keep our balance. Also, there is no power supply and no Wi-Fi connection. So please go next. Do I move forward?

Speaker 3:

Yeah, and we also have to protect all equipment from the sea water, so everything breaks very easily. Please. And as I already said, there's no Wi-Fi connection on the water and no power supply. That is why we need edge AI. Yes, please. So we made Watatsumi, and that one is for the fish farms. Please see the next video.

Speaker 5:

The aquaculture industry has expanded, but most fish farm tasks are still done manually, using traditional methods. More complex tasks can only be performed by professional producers, or are neglected. Therefore, there is an opportunity to contribute to lower costs and improved efficiency for small-scale aquaculture producers worldwide. Lightweight and low-cost AI solutions are crucial for tracking fish growth, counting fish during cage transfers, determining quantities for sales or transport, and optimizing feeding to minimize fish meal waste. From a management perspective, these solutions promote safer and more sustainable aquaculture practices for both people and the ocean. Our solution uses AI models to automatically count fish, estimate their size and detect fish net damage. These models can be deployed directly on edge devices near underwater cameras. To overcome the challenges of collecting underwater video data, we utilize 3D computer graphics technology to generate synthetic, ultra-realistic fish farm data for training our neural networks, leveraging AIZip's innovative, efficient neural network architectures and TinyML platform. Our models are embedded in edge devices while achieving superior accuracy. This solution has the potential to revolutionize the aquaculture industry.

Speaker 1:

Cool. Yeah, I thought we could pause for a second to take a few questions. We got a few questions piling up here. Maybe we can jump into that. Davis, there's a little bit of background noise for you, just FYI.

Speaker 2:

Okay, I'll try to mute it.

Speaker 1:

One of the questions that came up, and we'll just start from the top: in terms of the camera deployment, the imaging deployment, do you have cameras working at night as well? What are you doing about low-light situations? Have you considered using technologies other than vision, like radar, LiDAR, things like that?

Speaker 3:

Actually, we don't use the cameras at night, and we already tried radar, but it doesn't work well because the water blocks the radar.

Speaker 1:

Right, okay, cool. A similar one from Alexander, a follow-on question. This is a question we always get with edge AI deployments: what kind of compute is happening at the point of collecting the data? Are you using any kind of heavy computational gateways on the back end, or any cloud resources, or is this all happening in the camera itself? How does that work?

Speaker 3:

Actually, the cameras don't have any power source, so we use batteries, and we just use a micro SD card to record the data. That is the only way we can get the data from the camera.

Speaker 1:

I see. So, in terms of running the AI workloads to do counting and size estimation, are those workloads running on land, using the cloud, or using the phone?

Speaker 3:

Actually no, it's just only running on the phone.

Speaker 1:

On the phone. Okay, excuse me, on the iPhone 14. Cool, cool. Excellent.

Speaker 6:

Go ahead. I just want to add one thing about the phone: we do have underwater protection for the phone, so the phone is not exposed to the water.

Speaker 1:

Right, the phone's not exposed to the water, but you're using it as sort of the AI platform for running the models. Exactly. Right, yeah. Here's another question.

Speaker 2:

Oh sorry, no, I was just, very quickly: is it a limitation of the compression available for these models that you need a smartphone CPU or accelerator? Or do you think you could port them to even lower-powered platforms, like an edge SoC?

Speaker 6:

So, Yuko-san, you want me to answer that?

Speaker 3:

Yes.

Speaker 6:

As a matter of fact, when we started we discussed this. Instead of reducing the resolution, we initially wanted to increase it. If you look at some of the images here, you can see that in some cases, in March and April, the water is very muddy, so we wanted to increase the resolution. But fortunately, according to our trials based on the simulator that Yuko-san has, plus some real-life images, it seems like the accuracy is pretty good with a given phone. But Davis, the question you ask is really good. As a matter of fact, we believe at the moment that we can reduce it; we don't need 4K or 2K images to do good counting. The biggest problem we have is the field of view of the phones. Yes. Interesting.

Speaker 1:

Let me throw one more question up there before we jump back in. The question was, and I think I saw a slide on this: how many different types of fish? This is where I thought it was also interesting: you're using simulated data to do your training, which is a whole other discussion. But does this work with, are you training it on, multiple types of fish?

Speaker 3:

I saw three types of fish in there. Yes, actually. It works for red sea bream and salmon, it succeeded, but maybe not for eels.

Speaker 1:

Eels.

Speaker 3:

Eels.

Speaker 1:

Maybe it doesn't work.

Speaker 4:

We need to turn it a bit.

Speaker 3:

But fish are okay, because I visited some aquariums and I used this application over the fish there, and it works. And so typically, I suspect, in these aquaculture nets, in these cages, they're all the same type of fish? Yes, in the net, yes, the fish shape is very similar. Right, right. Interesting.

Speaker 1:

Okay, cool. Why don't we let you jump back into your presentation and then we're collecting more questions on the back end and then we'll take it from there. How does that sound?

Speaker 3:

So the next two slides are for Yuan. Please explain your technical details.

Speaker 6:

Yeah, as a matter of fact, as Yuko-san mentioned early on, the key factor that makes this possible is that we have a simulator which can generate a lot of data for us. A typical thing we need to deal with is what kind of network to select. Looking at the real-life images, we very quickly saw that object detection doesn't work; the density of the fish is just too large. We are seeing images with hundreds, even thousands, of fish. So instead of using traditional object detection methods, we selected a different method called density-based crowd counting. We looked at the state-of-the-art methods and came up with our own architecture that is able to run on the edge. Our goal is to work with the resolution from the phone, normal images, even with muddy water, and to support enough frame rate that the phone can count in a reasonable time, using the NPU possibly, or the DSP on the phone. These are some of the parameters and problems we are dealing with, and the data supporting what we have done. As I mentioned in the previous slides, the total number of fish can be over 500. As a matter of fact, at CES, Yuko-san had a large screen with a lot of fish, and we had a live demo with the phone in front of the screen, able to count over 500 fish. Fish at a distance in the water are very difficult to see, but we still need to somehow count them. That's why having enough data from the simulator is very important. Even when a fish is very difficult to see with the eyes, we can still see its shape, and the shape of the fish is important. As I mentioned, the biggest problem we have to deal with is muddy water. The water itself is even green in the spring, so this domain shift from normal water is very difficult.
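To ground the idea, here is a minimal sketch of what a density-based counting network can look like: instead of detecting each fish, the model regresses a per-pixel density map, and the count is the sum of that map. The layer sizes and names below are illustrative assumptions, not AIZip's actual architecture.

```python
# Minimal, illustrative density-based crowd counting model (not AIZip's real network).
# The head outputs a 1-channel density map; summing the map gives the estimated count.
import torch
import torch.nn as nn

class TinyDensityCounter(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(64, 1, kernel_size=1)   # per-pixel density

    def forward(self, x):
        density = torch.relu(self.head(self.backbone(x)))   # keep densities non-negative
        count = density.sum(dim=(1, 2, 3))                  # integrate map -> count per image
        return density, count

model = TinyDensityCounter().eval()
frame = torch.rand(1, 3, 480, 640)        # toy phone-resolution frame
with torch.no_grad():
    _, count = model(frame)
print(f"estimated fish count: {count.item():.1f}")
```

Unlike object detection, nothing here depends on separating overlapping fish, which is why the approach tolerates very dense, partially occluded scenes.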

Speaker 6:

Okay, so, based on all this information that we obtained from SoftBank, we made some decisions on the training data. We take 22 percent from a publicly available fish dataset. Then 65% of the data comes from the synthesized datasets generated by SoftBank's simulator. We also take 5% from the COCO dataset, by running the approach that Yuko-san already has to extract the useful images, and 5% from public facial video frames, and we use that to form the training data and train the model. Okay, so, as I mentioned, the model itself is a density-based crowd counting model.
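To make the stated mix concrete, here is a hedged sketch of assembling a training set from those sources. Pool names and file identifiers are hypothetical placeholders rather than the team's actual pipeline, and since the quoted fractions add up to 97%, the resulting set comes out slightly smaller than the requested total.

```python
# Hedged sketch: building a training set with the stated source proportions
# (22% public fish data, 65% Unity-synthesized, 5% COCO-derived, 5% other video frames).
import random

SOURCE_FRACTIONS = {
    "public_fish": 0.22,
    "unity_synthetic": 0.65,
    "coco_extracted": 0.05,
    "other_video_frames": 0.05,
}

def build_mix(pools: dict, total: int, seed: int = 0) -> list:
    """Sample from each pool so the final set roughly matches the target fractions."""
    rng = random.Random(seed)
    mix = []
    for name, frac in SOURCE_FRACTIONS.items():
        k = round(total * frac)
        mix.extend(rng.choices(pools[name], k=k))   # sample with replacement per source
    rng.shuffle(mix)
    return mix

# toy usage with dummy image identifiers
pools = {name: [f"{name}_{i}.png" for i in range(100)] for name in SOURCE_FRACTIONS}
train_set = build_mix(pools, total=1000)   # ~970 items, given the 97% total
```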

Speaker 6:

When we started, we were not really sure this would work, but fortunately it works very well when you have a large number of fish. We built up a test set with real human counting for the clean-water images as well as the muddy-water images. I still remember there was one day when the AIZip folks sat together and counted every fish in every image we use. The accuracy for clear water is quite good, 96 percent. In muddy water we lose about 10 percent of the accuracy. Yuko-san, I think I'll probably go back to you.
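The exact scoring formula wasn't spelled out in the talk; one common way to express counting accuracy against human counts, sketched here as an assumption, is one minus the mean relative counting error.

```python
# Hedged sketch: counting accuracy as 1 - mean relative error vs. human counts.
# This is one plausible definition, not necessarily the metric the team used.
def counting_accuracy(predicted, ground_truth):
    rel_errors = [abs(p - g) / g for p, g in zip(predicted, ground_truth) if g > 0]
    return 1.0 - sum(rel_errors) / len(rel_errors)

# toy numbers: three clear-water frames with human counts of 500, 500 and 600 fish
acc = counting_accuracy([470, 490, 570], [500, 500, 600])
print(f"clear-water accuracy: {acc:.2f}")   # ~0.96 on these toy numbers
```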

Speaker 3:

Okay. So, what do we want to do next? We would like to create AR monitoring to detect fish growth. That will reduce cost and also increase profits. Please. And we would like to rebrand the fishery industry from harsh working conditions to attractive working conditions, actually from 3D to 3H. Yes, please. Also, we would like to redefine the industry with the power of technology, working towards a sustainable society. Actually, in the fish farm, the fish food contains fish, so we have to change the food of the fish from fish to something vegetable-based or something different. But normally the fish don't want to eat the vegetable food, so the farmers want to track the growth of the fish if they change the food. That's why we need a system to monitor the fish. Please. And actually, we got the CES 2025 Innovation Award and we got a booth, and there were so many people coming to our booth. It is very appreciated.

Speaker 6:

Indeed.

Speaker 3:

Yeah, okay, and that will be the last slide. Yeah, thank you for listening.

Speaker 6:

Thank you.

Speaker 2:

Wow, there's a tremendous number of questions coming in from the audience. I don't know, Pete, if you remember where we left off in the previous talk track.

Speaker 1:

Yeah, yeah, we have a lot of good ones in here. I had a question too about the simulated training. You mentioned 65% was simulated data, and someone had a question around simulation tools. Were you using generative AI for the simulated fish? How did you go about it? What was your toolchain for the simulation data?

Speaker 3:

Actually, we use Unity.

Speaker 1:

Oh, Unity. Okay, so the gaming engine kind of thing.

Speaker 3:

Yeah, the gaming engine, and everything from scratch. We don't use any other tools, just Unity, and we create the models and the animations ourselves.

Speaker 1:

Hmm, okay Good.

Speaker 3:

Everything is original.

Speaker 1:

Right yeah.

Speaker 2:

One comment I noticed from yourself, Yuan, on AIZip's behalf, was that it sounds like no matter how good AI models get, we'll always need people, because we'll still need people to label the fish, or to get the ground truth, or to get some kind of cold start. So I just wanted to highlight that. It's always the two working together; you can't replace one or the other.

Speaker 6:

Yeah, that's very true. If you look at four years ago when we started, without good data we cannot really train anything. Over the last four years, the Unity simulator from Yuko-san is one thing: the first time I saw it, I was shocked. SoftBank was able to come up with this beautiful simulator that not only simulates the fish but also the water conditions, and then the outcomes. At least from our experience, it's not 100 percent; it will give you 30 percent in some cases and 40 percent in other cases. But in order to do well in real life, we do need human involvement for sure. For example, the test data I mentioned is done by humans. We cannot trust simulation data for testing purposes; it has to be real life.

Speaker 1:

All right, got it. A couple other good questions here. I'll just kind of bounce around. Abhishek asked about metrics. What are your KPIs to validate performance of the model, like what we're looking for in terms of accuracy and other metrics?

Speaker 6:

Let me answer from the AIZip side. For sure, the KPIs come from SoftBank, right? From our standpoint, our current model has done live testing in real life. Yuko-san took the camera into the water and tested it, and in the normal cases it works quite well. We have seen somewhere between 92 percent and 97, 98 percent accuracy in real-life testing, so we have reasonable confidence that it will work. The only thing is, how do we deal with various scenarios? We probably need one year of testing to validate, at least by our definition of robustness, before we claim this is really top quality or not.

Speaker 6:

And also, we expect there will be feedback. That's very typical from our experience of doing production. Yuko-san will test it and have some images that don't work well, and then we need to come back and retrain the model. That's what's expected. The good news is that we have actually tested this model using not only in-domain data from Yuko-san; we also tested with out-of-domain images from the wider world, and it's pretty reasonable.

Speaker 1:

Okay, cool. Here's another one, not really edge AI related, more about salt water. I know there's IP68, IP72... is there a designation for salt-water-tolerant microelectronics? Is there an IP designation for that, or what's the requirement? Maybe that's more of a question for Dr. Ishiwaka.

Speaker 3:

Yeah, so actually, what are the requirements for salt water? Salt water is the enemy of all electronics.

Speaker 1:

Right. The requirement is: don't get it wet. I think that's the requirement.

Speaker 3:

And actually, if it's not salt water, sometimes microelectronics survive after they dry out. But with salt water, once they get salt water on them, it's horrible; no electronics survive. So we needed waterproofing, and we always use some kind of housing, so the electronics are protected by the housing. But when we tried to put electronics in the water without any hard cover, how can I say, we used a very soft plastic pack to waterproof the phones, and then we put the electronics in the soft pack, and it survives down to five meters.

Speaker 1:

But then we put it at nine meters and it broke. Oh, I see. Yeah, so water pressure is also our enemy, right?

Speaker 3:

Yes, all right, interesting, yeah, I can see that.

Speaker 1:

Good, let's see. We had a couple of other good ones here. Davis, feel free to chime in if you see one that catches your eye. Let's see.

Speaker 2:

Well, I think they're talking about the size of the models. Shuman also asked a question similar to what I asked earlier about the compression: is a smartphone the right edge inference engine, or is it a limitation? So there were some more specific questions that maybe AIZip can comment on, about the model size that's deployed on mobile and maybe future deployments as well.

Speaker 6:

Right, yeah, sure. As a matter of fact, Yuko-san was very much involved as well, because it took us about two weeks to decide which machine to run on. We picked mobile phones not because we want to use them permanently; it's more the convenience, and the phone does have a little bit of waterproofing. At the same time, the iPhone 14 already has an NPU inside, the computation is good enough, and the connection from the camera all the way to the NPU is a given.

Speaker 6:

And as a matter of fact, following the previous question that Yuko-san answered, we do want to have a unique, separate, self-built device, because of all these issues: water pressure, salty water, and what happens when you leave something in the water for a few days. I learned from Yuko-san that there will be vegetation climbing over your phone, over your cameras. So all this stuff has to be well thought out as well as well made. Going back to the phone, in my opinion, just my opinion, I think we can probably use a less powerful chip than the current iPhone 14 and run even more easily, if we go to production and do more optimization.

Speaker 1:

Right, yeah. I suspect you'd probably lower the price of your device, increase its ruggedness, and maybe also extend the battery life as well.

Speaker 6:

Yeah, yeah, you bet.

Speaker 1:

Which is key. I mean, you don't want to have to recharge these things. Yeah, interesting. Let's see, I had a question come in here from Sam Fuller: do you aggregate video from multiple cameras into a single processor, or do you do the computation distributed per camera? Topologically speaking, how does that work?

Speaker 6:

So, Yuko-san... I can probably only answer half of this, because the video comes from Yuko-san. Really, from my standpoint, I don't care how many cameras there are; from the AIZip side, it's just every single image we need to count, right? So Yuko-san can answer this differently. Go ahead, Yuko-san.

Speaker 3:

Yeah, actually, on our side, the reason we use multiple cameras is that we want to observe the flow inside the fish cage. We try to synchronize the cameras first in the air by using our voices, and then we try to stitch all the images together. But we are still working on that part. It is not easy, and we cannot adapt the usual tools, because all the fish look similar, so we cannot choose which one is the right one. Two cameras might see the same fish, but we humans cannot detect that it is the same fish.

Speaker 1:

Right, right yeah.

Speaker 3:

So that's what we also. We have to use the computer graphics use the computer graphics. I see yeah.

Speaker 1:

Here's an interesting twist on that. You had mentioned that you're using the cameras to also detect damage to nets and damage to the cages. What about? Can you detect anomalies in the fish themselves? And look for diseased fish. Has that been tried?

Speaker 2:

Anomalies.

Speaker 1:

Is that in the pipeline?

Speaker 3:

Actually, we are observing the fish behaviors, and an anomalous fish behaves quite differently from the normal behaviors. I suppose we are still working on it, but I think it's possible.

Speaker 1:

Cool, and here's a related question Does it work for shrimp?

Speaker 3:

Ah, yeah. It doesn't work for shrimp yet, but maybe; we are trying. Yeah, some fish farms are asking us to do that.

Speaker 1:

Okay, good yeah, I'm sure. Yeah, shrimp TBD, I guess.

Speaker 3:

Yeah, we have to create objects.

Speaker 2:

Yeah, some seafood fans in the audience. Yeah, that's right. Everyone's kind of figuring out their favorite seafood. Lobsters.

Speaker 1:

I'm glad it's not working for eels, because eel is my least favorite; I'm okay with that issue. A couple of other things. What do you see here, Davis? There are so many questions coming in.

Speaker 2:

I mean, there are a few more. I like the ones that are kind of non-technical, like what we're discussing here, but there was one that caught my attention from Alexander, or actually two people asking a similar question: what does the development flow look like for the AI? Are you training in a common framework, PyTorch or TensorFlow, and then converting to TF Lite for the phone, or using ONNX, or using Qualcomm AI Hub? Yeah, let's get a peek at that.

Speaker 6:

So, Yuko-san, let me answer that. This is from some of our experience. Four years ago, TF and TF Lite were the primary thing; TensorFlow was the primary thing.

Speaker 6:

But when we started, we realized that the PhDs at my company refused to use TF Lite or TensorFlow. So we actually started with PyTorch. Fortunately for us, that has become a trend; people have been using PyTorch a lot over the last few years. And then the problem we had is: how can you get PyTorch to deployment? I think tinyML and the EDGE AI Foundation have a lot of people contributing to that besides us, so we are just one of them. For us, going from PyTorch to C, and from PyTorch to TF Lite, and then to ONNX, are all valid approaches.

Speaker 6:

The only thing we realized, which we have been sharing with our partners as well as customers, is that when you build these compiler SDKs, please leave the quantization to the AI companies, because quantization is a critical element. If you do it in your compiler and we don't have a way to play with it, we cannot do mixed precision, and that's oftentimes a big hurdle; we're not able to do it. Davis, this is the one thing I shared with Ali before as well: give us the option to quantize, and then we give the TF Lite or ONNX model to you, and you take that and compile it. I have been sharing this with a lot of people.
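To illustrate the hand-off Yuan describes (the model team exports a deployable graph and decides quantization, the device toolchain then compiles it), here is a hedged sketch using a standard PyTorch-to-ONNX export. The stand-in model, file name and input shape are placeholders, not the production pipeline.

```python
# Hedged sketch of the export step: train in PyTorch, export a graph (ONNX here),
# and keep quantization / mixed-precision decisions on the model side before the
# vendor compiler takes over. Model and shapes are placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(                      # stand-in for the trained counting network
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
).eval()

example = torch.rand(1, 3, 480, 640)        # example phone-resolution input
torch.onnx.export(
    model, example, "fish_counter.onnx",
    input_names=["frame"], output_names=["density_map"],
    opset_version=17,
)
# Which layers stay float and which go int8 would be decided on this graph,
# before it is handed to the phone NPU / DSP toolchain for compilation.
```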

Speaker 2:

Yeah, and one technical follow-up on that; this has become my favorite question nowadays. Have you heard of ExecuTorch, this initiative from PyTorch? Yeah, okay, yeah.

Speaker 6:

As a matter of fact, we are working quite closely with the Meta team, including Yuko-san, so if ExecuTorch can directly compile from PyTorch to an executable, that actually solves a lot of problems that we have. Yeah, we're watching it very closely and trying it.

Speaker 1:

Cool.

Speaker 5:

Agreed.

Speaker 1:

Yeah, cool. I had a couple of questions come in, Davis, on the training aspects. One was going back to what you mentioned about muddy water and things like that. What are you thinking about in terms of model training to mitigate the risk of that challenge, like adding noise into the training data? This is one suggestion from Abhishek. I think, Yuko-san, this one is probably for you.

Speaker 3:

Yeah, yes. Actually, we try to use sim-to-real techniques to cross over to the real world, so I think it's very useful, yeah.

Speaker 6:

So I think the key solution is...

Speaker 3:

What kind of noise? That's the problem, what kind of noise we should add.

Speaker 6:

Yeah, I just want to mention this: the Unity-based simulator is the key. Without it, we don't have enough muddy data to train on. Having the simulator becomes critical for these muddy situations.
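For cases where simulator frames are not available, one hedged way to add the kind of "muddy water" noise Abhishek suggests is a simple turbidity augmentation that blends clear frames toward a greenish haze. The color and noise parameters below are illustrative assumptions, not the project's actual augmentation.

```python
# Hedged sketch of a turbidity augmentation for clear-water frames (illustrative parameters).
# Blends the image toward a greenish water color and adds mild noise, mimicking muddy water.
import numpy as np

def add_turbidity(frame: np.ndarray, strength: float, rng=None) -> np.ndarray:
    """frame: float32 HxWx3 in [0, 1]; strength in [0, 1] controls how muddy the result looks."""
    rng = rng or np.random.default_rng()
    haze = np.array([0.15, 0.45, 0.25], dtype=np.float32)   # greenish water color
    noisy = frame + rng.normal(0.0, 0.02 * strength, frame.shape).astype(np.float32)
    out = (1.0 - strength) * noisy + strength * haze
    return np.clip(out, 0.0, 1.0)

clear = np.random.rand(480, 640, 3).astype(np.float32)   # stand-in clear-water frame
muddy = add_turbidity(clear, strength=0.6)
```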

Speaker 1:

Yes, there's a question here around Unity. Actually, it's more for the application development for the phone. Did you use Unity? I mean, what's the workflow on developing the application that you used on the phone?

Speaker 3:

Actually, Unity is just for the simulation. Okay, yeah. So how about the answer from you, Yuan?

Speaker 6:

So Unity is not on the phone. When we started, we essentially enjoyed the fruits of SoftBank's work: they already had a Unity-based simulator and they can tweak it, so we just became users of this simulator, and they create the data for us. Everything is running on GPUs, on servers. On the phone side, you just get the model and run the model on the phone only.

Speaker 3:

Yes, just using the Unity for creating the ground truth data.

Speaker 2:

Right. What about Cosmos from NVIDIA and some of these physical AI world models? Maybe it overlaps with what Unity already covers. Do you guys see validity in what they're doing? Of course they've done computer graphics for decades, and now they're doing this physical AI world modeling, simulation models as well. I'm curious if that is of interest.

Speaker 6:

So, yuko-san, how about you answer first, then I follow.

Speaker 3:

Yeah, actually I didn't catch half the question, so maybe... I think Unity has a very good engine and we can use it for the computer graphics. Actually, in the world of computer graphics there are two options: one is Unity and the second one is Unreal Engine. Unity has a lot of assets, for example an ocean environment simulation, so it's very easy to implement that kind of thing.

Speaker 6:

Yeah, regarding world models, this is definitely one direction. I think it has been a buzzword for the industry since last year, so we are watching it. We haven't done much on that ourselves, given our schedule. We definitely watch it, but for us it's not a direction that AIZip is taking; instead, we'll be users if the model is ready. So I don't have enough experience to say more. That makes sense. Thank you.

Speaker 1:

Cool. Well, we're kind of running up on our time here, but we'll continue the conversation offline with Q&A and things like that. I really appreciate your time. Again, Dr. Ishiwaka, joining us early in the morning from Japan.

Speaker 1:

Thank you so much. And Yuan from AIZip, in the more reasonable time zone of California, it's great having you here. I think we learned a lot, I think the audience learned a lot, and this is an interesting pattern that I'm sure could be applied to lots of different commercial problems out there. It's great to see this kind of pioneering work happening. So thank you again.

Speaker 2:

Yeah, thank you for being here.

Speaker 6:

Thank you all Thanks to the audience as well.

Speaker 2:

Thanks for the engagement. I think that was a fantastic engagement, yeah.

Speaker 1:

Yeah, thank you, great, all right, see you around everybody. See you next time, thank you.