Voices of Video

Your Sports Car Is Cool, But The Taxi Wins On Power Bills

NETINT Technologies Season 3 Episode 31

What if your heaviest video jobs spun up in seconds, sipped power, and scaled wherever your viewers are?

In this episode, we run NETINT VPUs inside Akamai Cloud and push them across live and just-in-time workflows—multi-codec ABR (AV1/H.264/HEVC), synchronization, DRM, and low-latency packaging included.

We start with deployment trade-offs: on-prem cards (control like a tuned sports car), cloud resources (on-demand like a taxi), portable containers, and Kubernetes for orchestration and autoscaling. With VPUs available in-region on Akamai, you cut CPU burn, lower watts per stream, and keep compute close to contribution or audience—ideal for local ingest, regional ad splicing, anti-piracy, and edge turnarounds.

Then we get hands-on. Scalstrm launches a live channel with a single API call—multicast in, three profiles out, catch-up enabled—in a couple of seconds. Advanced toggles cover time-shift TV, HLS/DASH, low-latency, trick play, iframe playlists, DRM, and ad insertion. Robust monitoring and analytics surface sync issues early to avoid blind troubleshooting. For VOD, we flip to just-in-time: store the top profile, regenerate lower rungs on demand, and save ~50–60% storage—while enabling instant ad asset playout.

For builders, we walk the Kubernetes path: provision a cluster in Frankfurt, label nodes for NETINT VPUs, deploy drivers from Git, wire up object storage, and run a pod that watches a bucket and invokes FFmpeg with hardware acceleration. We generate Apple’s ABR ladder across AV1/H.264/HEVC and finish a 5.5-minute asset in under four minutes—setup included—while power draw rises smoothly from idle without spikes.

If you care about power efficiency, global scale, and faster launches, this is a blueprint you can reuse today. Share it with the teammate who lives in FFmpeg, and tell us which part you want open-sourced next.

Key Takeaways

  • Deployment models: on-prem, cloud, containers, Kubernetes—when each makes sense
  • Why VPUs: higher density, lower power per stream, sustainability benefits
  • Akamai reach: edge and cloud tightly coupled for minimal latency
  • Scalstrm live demo: API setup → multicast in → three profiles out → ready in seconds
  • Advanced features: sync, time-shift TV, DRM, low-latency, trick play, iframe playlists, ad insertion
  • Observability: monitoring/analytics to reduce tickets and speed root-cause
  • Just-in-time VOD: keep highest profile, regenerate lower rungs on demand (~50–60% storage savings)
  • Kubernetes workflow: drivers, node labels, buckets, FFmpeg with NETINT acceleration
  • Performance proof: multi-codec ABR in minutes, end-to-end

📄 Download the presentation →
💡 Get $500 credit to test on Akamai →

Stay tuned for more in-depth insights on video technology, trends, and practical applications. Subscribe to Voices of Video: Inside the Tech for exclusive, hands-on knowledge from the experts. For more resources, visit Voices of Video.

Voices Of Video:

Voices of video. Voices of video. The voices of video. Voices of video.

Anders Nasman:

Hello, everyone. Yes, you over there, hello. We're going to talk about how to run hardware in clouds, and I'm here with a friend as well, so we're going to do a little bit of experimentation and set some things up. I'm Anders Nasman, and I work as a solution engineer for Akamai, helping our customers get started with the technology that we provide. One technology that we do provide is NETINT cards. And we have Dominique here as well.

Dominique Vosters:

Dominique Vosters from Scalstrm, mainly business development, but I'm also into the technical stuff. So today we will do a demo of the NETINT solution running on Akamai Cloud.

Anders Nasman:

Oh, perfect stuff. So this is how these cards work, and you might think: what are a cloud provider like Akamai and a software provider like Scalstrm doing on stage in a hardware booth? Well, these are the guys that build these cards; they build a lot of things, but these are the ones that we actually use. What we've been doing is deploying these cards into Akamai Cloud so you can use them in a cloud environment; you don't need to fiddle around and put them into a hardware box yourself. Just a little bit on what Akamai Cloud can do: we have a lot of things. We have normal CPUs you can run on, we have storage, we have Kubernetes, we have an app platform where you can deploy and run Kubernetes workloads quickly, there's object storage and all sorts of things. But today, this is the piece of the pie we're going to talk about: the VPUs. Let's think a little bit about different deployment methods. We talked about putting cards into hardware boxes. That's one way, and it's a little bit like driving your own sports car: you can tweak it the way you want, you can even set the power levels of the card if you want to. I know we were fiddling with speeding up CPUs for our gaming computers early on; those are the things you can do when you own your hardware, and as you can see, these cards can go into your own hardware at massive scale. But we are operating a cloud. Akamai operates both an edge platform and a cloud platform, and they are built together very tightly. That's a little bit more like riding a taxi: you call for a resource when you need it; otherwise your box is sitting in your backyard next to your sports car. There's another way to do it as well, and that's to run what are called containers, in this case Docker containers, which are portable. That's why we compare them to a truck: you can move them around and put them in different locations, and we'll get into that in a bit. And the fourth option is to run all these small containers in a fully orchestrated environment where you control them from one single control plane, so you can scale, and actually auto-scale, if you want to. Why did we go in and start using hardware in our cloud? Transcoding consumes a lot of CPU. This is just comparing doing it in the cloud environment, with Akamai here, against the same setup in a colo. Just by using the cloud itself, you reuse so many things, like backplanes, networking and so forth, so you drastically reduce the power needed to run those CPUs. Now that we bring the VPUs into the cloud, you can see that the savings are, percentage-wise, even bigger. So yes, you can deploy these cards in a colo, but doing it in a cloud environment reduces the power need even more. Again, the absolute best way of deploying these fantastic cards is in a cloud environment. So where are those devices available, you might think? Well, we have people from all over the world here, I see all sorts of nationalities, and our cloud compute is available pretty much everywhere, but the cards are right now distributed like this. There should be devices like this available almost anywhere you are. Why would you need that?
There are a number of scenarios. We have a big customer rolling this out for user-generated content: you bring in a lot of content from end users, and they could be anywhere in the world. Local ingest could be TV stations, local TV channels, your local programming. Maybe you're producing your main channel somewhere in Asia, but your audience is in Europe, and you want to re-transcode or do things that are local to that region, in a regionalized way. There's local ad insertion, and there are things like anti-piracy. We're doing a really cool thing together with NETINT and Scalstrm where we are splicing streams in real time using these cards. So there are a lot of great opportunities in having local presence for these cards. And then the busiest slide of today, kind of representing how traffic is distributed over the world, latencies and so forth. It happens that Akamai also has a fabulous edge platform. With that, we are present in 4,100 PoPs, with 400,000 servers spread out all over the world, actually the most distributed CDN platform that exists, and the cloud platform, where the cards are, and the edge platform are completely married together. So we have almost no latency between these platforms, and you can deliver the experience wherever you want. I'm going to hand over to Dominique.

Dominique Vosters:

Yeah, okay. Good, thank you. So, all these platforms, on-prem, cloud, Kubernetes: we can run on all of them, on-prem, in the cloud, in Kubernetes. We're an application on top of, in this case, Akamai, which is what we will show today. A bit of background on Scalstrm: we have a few products, from transcoding (offline VOD transcoding, just-in-time transcoding, live transcoding) to origin and delivery towards the CDN, but today we will mainly focus on just-in-time and live transcoding. I want to give a quick walkthrough and demo of how to set this up on our system, with two main use cases: let's start with live, then just-in-time. I also want to show how easy it is to spin up; even if it might sound complex, it's quite easy and straightforward. In this demo we use an API call with a configuration file. We have a multicast input stream, and in this case we transcode it into three profiles. We enable catch-up for a few minutes, then we launch the API call, and you immediately see that the channel is created. Now the channel is created and spun up: the system detects the stream, the transcoding job starts, and as you see, in two or three seconds the channel is up and running. So it only takes two or three seconds to spin up a channel. Besides that, we can of course enable fancier features like synchronization: we can synchronize the streams at the transcoding level and also at the packaging level, just to make sure that the streams are all in sync, so if the player fails over from one to the other, there is no impact for the player. We can also set up time-shift TV, which is ready on the fly. Here you see the original stream; we transcode it into three profiles in this case, and of course that's configurable. In the output section you can manipulate the content: we can set settings for DASH and HLS, enable ad insertion, low latency, trick play, iframe playlists, and DRM, and we can customize these settings even further. One thing we invest a lot in is monitoring, analytics, and debugging, because it's a win-win: a win for the customer and partner, who know what's going on, and a win for us, because we should get fewer tickets. In this case, for example, we're scanning the input stream of the transcoder to see whether everything is in sync, and so on. So that's quite straightforward.
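
To make that setup concrete, here is a rough Python sketch of what a channel-creation call like the one described above could look like. The endpoint URL, headers, and field names are illustrative assumptions, not Scalstrm's documented API; the payload simply mirrors what the demo describes: multicast in, three profiles out, catch-up enabled, plus the output toggles.

    # Hypothetical channel-creation call; the endpoint and field names are assumptions.
    import requests

    channel_config = {
        "name": "demo-channel",
        "input": {"type": "multicast", "address": "239.1.1.1:5000"},   # multicast in
        "profiles": [                                                    # three ABR outputs
            {"codec": "h264", "resolution": "1920x1080", "bitrate_kbps": 6000},
            {"codec": "h264", "resolution": "1280x720",  "bitrate_kbps": 3000},
            {"codec": "h264", "resolution": "640x360",   "bitrate_kbps": 800},
        ],
        "catchup": {"enabled": True, "window_minutes": 10},              # catch-up window
        "output": {"hls": True, "dash": True, "low_latency": False, "drm": False},
    }

    resp = requests.post(
        "https://scalstrm.example/api/v1/channels",   # placeholder URL
        json=channel_config,
        headers={"Authorization": "Bearer <token>"},
        timeout=10,
    )
    resp.raise_for_status()
    print("channel created:", resp.json())
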
All right, let's jump into the second use case, just-in-time transcoding. This is something that can be used where we record only the highest profile and transcode the lower profiles on the fly; by doing this you can save roughly 50 to 60 percent of storage space. In this case we have a recording from a few hours ago, where we still keep all the profiles, because we don't want to trade storage costs against transcoding costs too early. So for the first period, in this case two days, we keep all the profiles in storage; after two days we remove the lower profiles, and if a customer then requests one, we transcode it on the fly. If we go back in the UI to a recording from a month ago and play it out, the highest profile is the original profile from storage, so we don't need to re-transcode it. But if we go to a lower profile, we add a logo to show that it's being transcoded just in time. So this is the playout of the highest profile: the original stream coming straight from storage, no transcoding needed. And if we go to one of the lower profiles, you see the logo appear, because that rendition is transcoded on the fly. This can also serve other use cases beyond saving storage space: for example, ad-insertion use cases where you fetch the file directly from the ad server and transcode it on the fly so it's immediately available for playout, and you don't lose any money waiting for it to be transcoded. So that's a bit on the demo; I think I'll hand it back over to Anders. From our end, I think the main differentiator is performance. That's something we already have on our origin platform, and together with the NETINT cards we can do the same on transcoding: be really different compared to others, be sustainable, and deliver good performance. So I hand it back over to Anders.
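
A minimal Python sketch of the just-in-time decision logic described above: keep every rung for a retention window, then keep only the top profile and regenerate lower rungs on request. The function and storage names are placeholders, not Scalstrm internals.

    from datetime import datetime, timedelta

    RETENTION_ALL_PROFILES = timedelta(days=2)   # keep every rung this long

    def read_from_storage(asset, profile):
        # Placeholder for fetching a stored rendition.
        return f"storage://{asset['id']}/{profile}"

    def transcode_just_in_time(asset, profile):
        # Placeholder for an on-the-fly transcode from the top profile.
        return f"jit://{asset['id']}/{profile}"

    def serve_profile(asset, profile, now=None):
        now = now or datetime.utcnow()
        if profile == asset["top_profile"]:
            return read_from_storage(asset, profile)      # original is always kept
        if now - asset["recorded_at"] <= RETENTION_ALL_PROFILES:
            return read_from_storage(asset, profile)      # lower rung still retained
        return transcode_just_in_time(asset, profile)     # older asset: regenerate on demand

    # Example: a month-old recording, low rung, takes the just-in-time path.
    asset = {"id": "rec-123", "top_profile": "1080p",
             "recorded_at": datetime.utcnow() - timedelta(days=30)}
    print(serve_profile(asset, "360p"))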

Anders Nasman:

Thank you. Super interesting, and I'll actually be trying this myself. I forgot an instance of Scalstrm running with a number of streams, and after the summer, six months later or something, it was still running, no one had touched it, and everything was still the same. So, really cool. Now I'm going to do something more programmatic, and maybe geeky; I like geeky stuff. I'm going to transcode using Kubernetes, and there will be a bit of code and such shown on the screens, but don't worry, I'll explain what I'm trying to do. We have an input bucket where we put some videos, could be one or a hundred thousand, it doesn't matter. We have an output bucket that we send things to, and in between we do some transcoding. In Kubernetes, the machine we would normally think of, the thing these cards would run in, is called a pod. We're going to connect that to a VPU card that exists in the cloud. We're going to take an input file, a normal 1080p file at 26 megabits per second, and turn it into Apple's original ABR ladder, maybe not the most used ladder today, but that's what I use. And I'm going to do it in AV1, H.264, and HEVC, and see if we can do this in minutes. First of all, you need a cluster to run your Kubernetes instance on. I'm running this in Frankfurt, and these are the machines I can include in my cluster; there are three options, and I'm choosing the smallest one. I choose to have three of them, because then you have full redundancy in the cluster and you can also migrate things, so that's a good thing. That ends up costing a little bit of money, obviously, but running this as your own hardware and maintaining it yourself will cost you more anyhow. So now let's start. First I take what's called a kubeconfig, the configuration I need to talk to my cluster, and put it onto my local machine. Then I give all my machines a little bit of information, a label saying there is a NETINT device in that machine. If I have a big cluster with 10 GPUs and 150 VPUs, I mark the VPUs as NETINT devices so I'm not mixing them up and sending workloads to the wrong place. Next, I get the driver for the NETINT card in place: I do a git clone, since it's all available on Git, so it's very easy to get hold of, then install it and deploy it to my cluster. For everyone who doesn't know, that's what Kubernetes is for: it takes your software and installs it onto the right hardware in the cluster; it's nothing we need to do anything about, it just does it itself. Then there are two more things before we're ready to start the work. We put in some keys so we can log in to the buckets. Remember the two buckets? I had one object storage bucket here and one over there, so obviously I need to be able to log into both. I also put in some configuration with their names and so forth, so that every instance knows them.
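
For the node-labeling step mentioned a moment ago, here is a minimal sketch using the official Kubernetes Python client. The node names and the label key/value are illustrative assumptions; use whatever label your NETINT driver deployment actually expects.

    # Label the nodes that carry NETINT VPUs so workloads can be scheduled onto them.
    from kubernetes import client, config

    config.load_kube_config()        # the kubeconfig downloaded from the Akamai cluster
    v1 = client.CoreV1Api()

    vpu_nodes = ["lke-pool-1-node-1", "lke-pool-1-node-2", "lke-pool-1-node-3"]  # hypothetical names

    for node in vpu_nodes:
        # The label key/value here are examples; match the key your device plugin selects on.
        v1.patch_node(node, {"metadata": {"labels": {"netint.example/vpu": "true"}}})
        print(f"labeled {node}")
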
Now we open VS Code. What do we have in here? We have a Dockerfile, which essentially explains to the system what one of these pods should contain, plus a little bit of configuration. As you can see, it's not much, but it's going to get a bit more complicated now. This is the script that my pod will run; as soon as I start the pod, this is what happens. It could also run in a VM if you want, and the same thing will work there. It scans a directory, going through it again and again to see if new files are coming in. If it finds new files ending in .mp4, it starts FFmpeg, using a decoder from NETINT and an encoder from NETINT. And you see there are a lot of renditions here: we're doing HEVC, AV1, and H.264 at the same time, so we're going to push this really hard. AV1 is tough to transcode, but with the NETINT card it goes like a breeze. Then, for the pod to be active, we also need a YAML file that describes how it should run and when it should start; basically: start my job as soon as you have been deployed. So now we deploy. Even more code, interesting. We check the input: this was the left bucket, and it contains a file called VD1. The output is empty. You have the timer going now, right? We know we're aiming for the four-minute mark. Okay, so we're building the Docker image, the image our machine runs, and we're deploying it, pushing it out. And now we check: we don't have the job running yet, we haven't deployed it into our cluster, but we do have one small pod that I'll connect to. This is just for monitoring, so I'll show you what I'm going to do with it. I log into this pod and check how the card is doing. This is information from the NETINT card: we can see it's drawing roughly five watts. This is idle mode: it has no load, it's not doing any decodes or encodes, it's not scaling, and it's not using the AI engine at all. So right now it's idling at about five watts. Okay, so now we push the code I showed you before. Now it's actually active and starting to look for new files coming into the left bucket, if you remember that one. And we see: it's running. (This is recorded, by the way, so obviously it's running.) I check the log of that pod, which is like the console of a virtual machine, to see what's happening. Everyone who has played with FFmpeg immediately recognizes this as FFmpeg output; this is how it looks, it's very boring, and after a while your eyes start bleeding because you stare at it like this. But if we check the status of the card now: oh, we're up to 12 watts. Remember how many AV1s, how many HEVCs, and how many H.264s we're doing at the same time: 12 watts of power, 90 percent load. Otherwise it's not doing too much, it's pretty idle, so you could probably easily fit a number more transcodes in here. So let's check the log from the pod again to see if it's done.
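
As a rough sketch of that watcher loop, the Python below polls a directory for new .mp4 files and runs FFmpeg once per rendition. The demo actually builds a single FFmpeg command producing all renditions at once, and the hardware encoder names below are placeholders; substitute the real NETINT encoder and decoder names from NETINT's FFmpeg documentation.

    import subprocess
    import time
    from pathlib import Path

    WATCH_DIR = Path("/input")    # synced from the input object-storage bucket
    OUT_DIR = Path("/output")     # synced to the output bucket
    ENCODERS = {"h264": "h264_netint", "hevc": "hevc_netint", "av1": "av1_netint"}  # placeholder names
    LADDER = [("1080p", "1920x1080", "6000k"),
              ("720p", "1280x720", "3000k"),
              ("360p", "640x360", "800k")]

    def transcode(src: Path) -> None:
        # One FFmpeg invocation per codec/rendition pair (simplified versus the demo).
        for codec, encoder in ENCODERS.items():
            for name, size, bitrate in LADDER:
                dst = OUT_DIR / src.stem / f"{codec}_{name}.mp4"
                dst.parent.mkdir(parents=True, exist_ok=True)
                subprocess.run(["ffmpeg", "-y", "-i", str(src),
                                "-c:v", encoder, "-s", size, "-b:v", bitrate,
                                "-c:a", "copy", str(dst)],
                               check=True)

    seen = set()
    while True:                                   # simple polling loop
        for f in WATCH_DIR.glob("*.mp4"):
            if f not in seen:
                seen.add(f)
                transcode(f)
        time.sleep(5)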

Dominique Vosters:

I think so, yeah.

Anders Nasman:

I forgot to say that this video is five and a half minutes long, so we're running well faster than real time here. It takes roughly a minute to transcode it, and remember, that's into all those profiles. We can see that our video artifacts, the output in the right bucket, now hold a directory, so my process has worked, and we check what files are there. Yes, we have the AV1s, the H.264s, and the HEVCs in that directory. So under four minutes, and that's including installing everything in the cloud. So go ahead, use your phones and scan that code, or go around the corner there and get a lot of compute for free to try the cards right now. That's my recommendation. Or go talk to Dominique and get a really nice demo of how you can run this if you don't want to do the code thing, and my colleagues are around the corner as well if you want to talk about how to do this with code, without code, or well, almost without code, but not by installing it into a box. We're not doing that. Anything more you want to say, Dominique? I think there's something more to say.

Dominique Vosters:

No, not really. I think you gave a good presentation. But as Anders already mentioned, we have a stand here, so if you want to see a more detailed demo, you can ask my colleagues. We also have a booth in Hall 5, H554, where we show more things like ad insertion and so on. So feel free to come by. Thank you. Thank you so much.

Voices Of Video:

This episode of Voices of Video is brought to you by NETINT Technologies. If you are looking for cutting-edge video encoding solutions, check out NETINT's products at netint.com.