Fun with AWS Fargate

I was finally able to get the Streamlit app for the transcription approach to work with AWS Fargate. It was a LOT more complicated than I initially expected. It all makes sense now that I know how it works, but putting all the pieces together was tricky. Hopefully I can get the rest of the app connected before the meetup and show a fully functional transcription application.

Transcript

Ooh, recording. There we go. All right, so this is basically a quick-ish rundown of what it took to get my Streamlit application hosted on AWS Fargate. So let me share my screen and I can walk through some of this fun stuff. Entire screen, yes, this one. All right, so what I have is a Streamlit application, which is fairly simple. Wow, that's too big.

It goes through and sets up an S3 bucket and a folder to push files to, does some logo stuff, and sets a page config to get a favicon and the page title, things like that. A simple image, a file uploader that grabs the details, and if it's audio or video, it takes that. Right now I've also got it accepting plain text; I'll have to change that afterwards. Anyway, you get the idea. On init it grabs the file extension and file name, does some hashing with a timestamp, and uses that as the key it will push into an S3 bucket. Actually, before we push anything to the S3 bucket, we're setting up a queue with Simple Queue Service, also from Amazon, which I'm not really talking about tonight, but it plays into this.
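
The app itself is only shown on screen here, so take this as a minimal sketch of the upload flow described above; the bucket name, prefix, and image file names are made up for illustration:

```python
import hashlib
import time

import streamlit as st

BUCKET = "hsvai-transcribe-uploads"   # hypothetical bucket name
PREFIX = "incoming/"                  # hypothetical folder the transcription Lambda watches

# Page config gives us the favicon and title; the logo is just a static asset.
st.set_page_config(page_title="HSV AI Transcribe", page_icon="favicon.png")
st.image("logo.png", width=200)

uploaded = st.file_uploader("Upload audio or video", type=["wav", "mp3", "mp4", "m4a"])

if uploaded is not None:
    # Hash the file name plus a timestamp so repeated uploads get unique S3 keys.
    ext = uploaded.name.rsplit(".", 1)[-1]
    digest = hashlib.sha256(f"{uploaded.name}-{time.time()}".encode()).hexdigest()[:16]
    key = f"{PREFIX}{digest}.{ext}"
```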

Then there's another button up there that says, hey, do you want to transcribe this? If they say yes, the next thing we do is push the file to the S3 bucket, which triggers a Lambda that we've talked about before. That goes through, grabs the file, and does the transcription, and while it's transcribing it pushes messages back through this message queue. So we're just printing out what's going on to the screen, and then when it either completes or fails, we finish up. Just a quick thing, shouldn't be that hard; roughly, the transcribe-and-poll piece looks like the sketch below.
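
Again a hedged sketch rather than the actual code: it continues from the snippet above (reusing `uploaded`, `BUCKET`, and `key`), and it assumes a queue URL and a message format (a JSON body with `state` and `message` fields) that are stand-ins for whatever the Lambda really sends.

```python
import json

import boto3
import streamlit as st

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/transcribe-status"  # placeholder

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

if uploaded is not None and st.button("Transcribe this file"):
    # Dropping the object into the bucket is what fires the transcription Lambda.
    s3.upload_fileobj(uploaded, BUCKET, key)

    status = st.empty()
    done = False
    while not done:
        # Long-poll the status queue the Lambda writes its progress messages to.
        resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=10)
        for msg in resp.get("Messages", []):
            body = json.loads(msg["Body"])
            status.write(body.get("message", ""))
            if body.get("state") in ("COMPLETED", "FAILED"):
                done = True
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```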

The whole thing lives in a pretty simple Dockerfile: a Python 3.8 base image, port 8501 exposed, the Streamlit app copied in, and some config to automatically use the dark theme and a few other things. It's just a pip install to upgrade pip, then install streamlit, and then the requirements. I don't think those two installs have to be separate steps, but I was running into issues without it. The entry point is streamlit run app.py. Fairly simple. So you build all of that; I've got some notes in here for the next part. The containers that you run on Fargate have to be hosted in your Amazon container registry, ECR. So you have to initially run a command to get a login token and pipe that into a docker login, then you tag your container, and then you can push it to the container registry.
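
Those are normally the `aws ecr get-login-password` / `docker tag` / `docker push` commands from the ECR console's push instructions. If you'd rather drive it from Python, a rough equivalent with boto3 and the docker SDK would look like this; the repo name and tags are placeholders, not the real ones:

```python
import base64

import boto3
import docker

# Ask ECR for a temporary login, then hand it to the local Docker daemon.
ecr = boto3.client("ecr")
auth = ecr.get_authorization_token()["authorizationData"][0]
username, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")
registry = auth["proxyEndpoint"]          # e.g. https://123456789012.dkr.ecr.us-east-1.amazonaws.com

client = docker.from_env()
client.login(username=username, password=password, registry=registry)

# Tag the locally built image with the ECR repo URI and push it.
repo_uri = registry.replace("https://", "") + "/hsv-ai"   # hypothetical repo name
image = client.images.get("streamlit-app:latest")          # hypothetical local tag
image.tag(repo_uri, tag="streamlit")
client.images.push(repo_uri, tag="streamlit")
```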

Whoops, didn't mean to do that on AWS. Fun stuff. So let me go back over to this window. Let me go find my, whoops, fun stuff. Let me log in again, which is all two-factored now. Let me go bring that up. At first I thought the whole two-factor thing was going to be a pretty big hit as far as getting things done, as in, hey, how long is it going to take to do this every time? And it's really not that bad.

So let me jump over to Elastic Container Registry. I've got a private repo for Huntsville AI that has, what, 20 different images in it by now. I need to go remove some of these. The one tagged latest is actually the one that runs on Lambda for doing the actual transcription; the Streamlit container, which is much smaller, is the Docker image running the Streamlit app that I built. So the next thing is, okay, let's get this into Fargate, run it, and throw a public load balancer in front of it, that kind of thing. Because it sounded easy.

As it turns out, I started tracking what all you need to do for that. So let's start with the things you have to know. You've got to know Docker and containers. You've got to know AWS ECR, which is what we just looked at. You've got to know how to set up a VPC, availability zones, subnets, CIDR notation (which goes with the subnets, but shows up in a couple of other places too), an internet gateway, route tables, security groups, load balancers, access control lists, endpoints, clusters, task definitions, services, tasks, target groups, IAM roles, and IAM permissions. And then, after all of that, you too can have a container running on Fargate. Easy enough, right?

So I started actually walking through it, and one of the other interesting things is that the tutorials you find enter this kind of setup at various points. I started trying to graph out what I wound up with. I've got to use a load balancer, and you have to cover two availability zones, which forces you into two subnets, and all that kind of stuff. Anyway, there are a couple of different ways to think about it. One is to start from the networking side, which would be to set up your VPC, your virtual private... whatever the C in VPC stands for... and kind of work your way back. The other, which is what I did, and then had to figure out what I broke, is to start from Fargate itself and work forward.

So starting off, let's go to... and of course, these are all the different services I've had to jump through recently trying to get all this stuff set up. I'm looking for Fargate, which I think is under Elastic Container Service. Yep. So the first thing you do with Fargate is create this thing called a cluster, and it'll ask you which VPC you want, or it'll create one by default. I wound up creating one using whatever the default was, and that didn't work, so I wound up having to do some other stuff. But a cluster is basically the collection of services or tasks that you want this thing to run. What you wind up with is that you can create either services or tasks, which is kind of interesting and confusing at the same time.
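
For what it's worth, the cluster step is a one-liner if you script it with boto3; the cluster name here is just a placeholder:

```python
import boto3

ecs = boto3.client("ecs")

# A Fargate cluster is just a named home for the services and tasks that follow.
ecs.create_cluster(clusterName="transcribe-cluster")
```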

To start off with, you go to task definitions, and you have to create a task definition. So we can just create one of these for the heck of it and see; we'll call this one "new definition". You name it, and you have to give it your image URI, which for some reason doesn't let you do a lookup, where a lot of the other places, like Lambda, give me a piece where I can look it up by clicking through. Here I actually have to go find my image tag, copy my image URI, I think that's right, yes, then go back over here and paste it in. For port mappings I want 8501 in this case, because it's a Streamlit container. You can add more or fewer; I don't really care about all the environment stuff. Whoops, I have to give a name for my task definition. The other thing I ran into trying to create some of these: I had a Streamlit task definition and it broke, so I removed it and created a new one, and then it complained that the name was already in use. For some of these items it apparently takes something like an hour for the name to clear so you can reuse it, which was a little tough.

For the app environment we're going with Fargate, for serverless. The resource settings I basically rolled back as low as I could possibly go: I want it to use a quarter of a vCPU, and then I can drop the memory down to 0.5 GB. Task roles I'm skipping; we'll get into some of this later, because I think that's a part that's still broken for me. I don't really want a storage volume, because I get 20 GB for free or something like that. Logging is turned on; that's something I need to go figure out, whether it's actually costing me money and where that's coming from. Anyway, you hit next, container 1, blah, blah, blah, I think that's right, that's what we had done, and now it's being created. Fun times. It says it's active; I'm not quite sure I trust that, but if I look here, it does look like it's active. So I've got a new task definition.

Then the next thing you wind up doing is you go back to the cluster you've got. I have a task definition; next up is to run something using it. Hold on one second, let me see what this message was. All right. So you have tasks and you have services. The thing that's weird is that if you run a service, it creates a task for you, which makes you wonder why we weren't just running a task. The main difference is that when a service spins up, like the one I've got for the Streamlit service, you can tell it how many tasks you want running at any time, with a minimum and a maximum, and depending on resource usage it will go ahead and create new tasks for you. And apparently I just exposed that thing on screen, so anyway. If you go to configuration and tasks, right now it only shows me one task running; yesterday it showed me about the last ten that had failed, and it would try about every eight minutes to spin up a new task, because I didn't have one actively running and I had told it I needed at least one.
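
Scripted, that task definition step comes out roughly like this; the execution role ARN and image URI are placeholders for whatever your account actually has, and the CPU/memory strings are the smallest Fargate sizing mentioned above:

```python
import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="streamlit-transcribe",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",                 # required for Fargate
    cpu="256",                            # 0.25 vCPU
    memory="512",                         # 0.5 GB
    # Role ECS uses to pull the image from ECR and write logs (placeholder ARN).
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    containerDefinitions=[
        {
            "name": "streamlit",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/hsv-ai:streamlit",
            "portMappings": [{"containerPort": 8501, "protocol": "tcp"}],
            "essential": True,
        }
    ],
)
```
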
So then, under services, we can deploy a new service: transcribe cluster, I want a service. "Family" — I wish they had used the words "task definition" here, because that's actually what this is. A service name, desired tasks, deployment options, load balancing, which we're going to leave off for now. I think you're supposed to be able to use an existing load balancer; I don't want this one. Target group — here's the tricky thing: I wasn't able to get this part to work, so I'm not going to try it now, but this load balancer already has a target group associated with it, so I don't know why it's offering to create a new one. In this case I'm just going to say none.

Networking: this is very important. Oh, virtual private cloud, that's what VPC stands for. So you pick your VPC. Initially it had created four different subnets and I added them all in, but when I actually set up my load balancer and some of the security groups, I had only picked two out of the four. The tricky part is that the service will spin up its task and associate it with one of these subnets. At one point I was all messed up because sometimes it would pick a subnet that was routed correctly and it would work, then something would happen, it would fail, and when it came up again it couldn't create the container because it was putting it in a subnet that didn't have a route to something. So that's where you've got to know about subnets. Security groups are very important too; we're going to use a security group, and we'll get into that in a minute. And of course I'm going to use a load balancer to actually access this, so I do not want a public IP for it. A lot of the tutorials you walk through will just throw a public IP on it and you're good to go, which is super simple; I'd consider doing that for some kind of quick and dirty demo app that's going to live for maybe a day or two, just to show somebody that something's working. But long term you definitely don't want that.

So I hit deploy, and it says, hey, the deployment might take a few minutes. Some of these screens automatically update, some of them don't. Now I've got my new service name. I can go in here and see; I really don't have anything yet. Under configuration and tasks it says, okay, I've got this task and it's provisioning. If I click on it, I can see it's already got a private IP address, a subnet ID, and an ENI ID, which is basically the network interface that connects this container, at that particular IP address, to the VPC. The one thing that got really hard to troubleshoot was that when the task fails and the container doesn't get created, it goes ahead and removes that network interface, but the task details still link to it as if it were still there. So that's tricky. We'll wait on that to come up and run and see if we can add that as a link. So let me jump back over; any questions so far on the fun-ness of this stuff? "Yeah, the load balancing stuff just looks much more complicated than I expected." Yeah. So let's drop through. Is this new service actually up and running? It looks like we're up and going.
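
The equivalent create-service call, with the networking block that caused most of the grief, looks something like this; subnet IDs, security group ID, and names are placeholders:

```python
import boto3

ecs = boto3.client("ecs")

ecs.create_service(
    cluster="transcribe-cluster",
    serviceName="streamlit-service",
    taskDefinition="streamlit-transcribe",   # the "family" the console asks for
    desiredCount=1,
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            # Only list subnets that actually have a route to what the task needs;
            # mixing routed and unrouted subnets is exactly what bit me.
            "subnets": ["subnet-aaaa1111", "subnet-bbbb2222"],
            "securityGroups": ["sg-service0000"],
            "assignPublicIp": "DISABLED",     # traffic comes in through the ALB instead
        }
    },
)
```
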
Good. You can't get to it yet, because that IP isn't reachable from outside; we'll hit that in a second. So let me drop over to, I don't remember which windows I have open, but the VPC console is basically your second home. You wind up with two windows always open: one looking at your ECS stuff, which has the clusters, task definitions and all, and the other is your virtual private cloud. This is where your subnets live, and all of your VPCs. I have two VPCs, because one of them gets created by default; every new AWS account automatically gets a basically blank VPC. So that's kind of fun and distracting and confusing. We're going to pick this one, and I was hoping this part over here, refresh resources, there we go... well, it still shows me two VPCs even though I've filtered by this one. Anyway.

If I look through here, okay, I've filtered to my VPC, and if you click on it, it shows you all of the different things associated with it. Another interesting thing, from one of the default tutorials I walked through: AWS will offer to automate some of this, kind of like what we were seeing with the load balancer — hey, do you want us to create one of these for you? The first time through I said sure, but then some of the options they give you don't work, or they work for one thing, and then you change something and now it doesn't work. So it was important not to do that, and instead to start over, figure out what I was doing, and then kind of walk backwards. I think this is the third or fourth VPC I've created trying to troubleshoot all of this. The ones that were created automatically would wind up with a routing table per subnet, multiple subnets, and a bunch of stuff that didn't matter. In one case it created four subnets, two for each availability zone, one named private and another named public, even though they were actually both private. It got really, really weird.

So anyway, this routing table is important. We have a couple of routes. You have to have this IGW entry, which is the internet gateway; it takes basically everything coming in from the internet, the 0.0.0.0/0 CIDR, and routes it through the internet gateway, because if you don't have an internet gateway, you don't have a connection to the internet, so your load balancer is never going to see anything. That was fun. The local route is grabbing 10.0.0.0/16, and a smaller number after the slash actually means a wider section of addresses. We used to carve networks up into class A, class B, class C, and apparently that's not how you do it anymore; I just know the /16 means any address in that highlighted range is available on this VPC. Good thing to know. Subnet associations: we don't have anything explicit, these just get picked up automatically. And then there are the two subnets in my VPC: the first is 10.0.0.0/20, and the /20 means it basically goes up to 10.0.15.255, and the second one picks up from there at 10.0.16.0 and runs up to 10.0.31.255. I don't have anything set up for edge associations or route propagation.
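
If you build that minimal network from a script instead of the console wizard, it comes out roughly like this; the CIDR blocks match what I described, and everything else (availability zones included) is a placeholder:

```python
import boto3

ec2 = boto3.client("ec2")

# VPC covering 10.0.0.0/16, with one /20 subnet in each of two availability zones.
vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]
subnet_a = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.0.0/20",
                             AvailabilityZone="us-east-1a")["Subnet"]["SubnetId"]
subnet_b = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.16.0/20",
                             AvailabilityZone="us-east-1b")["Subnet"]["SubnetId"]

# Internet gateway plus a default route, so the load balancer can actually be reached.
igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

rt_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
ec2.create_route(RouteTableId=rt_id, DestinationCidrBlock="0.0.0.0/0", GatewayId=igw_id)
for subnet in (subnet_a, subnet_b):
    ec2.associate_route_table(RouteTableId=rt_id, SubnetId=subnet)
```
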
So that's subnets and internet gateways; I think we've already got those set. The other really important thing is security groups. ACLs first: if we go look at the ACL, it's fairly boring; the inbound rules basically allow everything, and the outbound rules allow everything. There's all kinds of information out there on how to make this more secure; I haven't gotten there yet. The other thing is you also have to have the ACL associated with your subnets.

So that's kind of interesting. Security groups: if you go through the Streamlit tutorials you can find on how to run with Fargate, they'll teach you to open a security group and open up port 8501 on it, because that rule has to be added — everything else assumes port 80, that kind of thing. And of course that works until you add a load balancer, and then you're all messed up. So I created two security groups: one for the ALB, the application load balancer (I guess it could have had a better name), and the service one, which is for all of the containers we've got running. If I look at the security group for the containers, the way it's set up is to allow — let's see, inbound rules, there we go — all traffic as long as it's internal, in other words, if it's coming from within the subnet, that's fine; otherwise, allow all traffic from this other security group. That was something I'd never run into before, where one security group basically admits traffic from another security group, and it was a little tricky to get set up right. This was also the one where initially I had everything public and my containers worked fine, and then after I moved behind a load balancer and added this security group, I had actually removed this line. Unfortunately, that also meant that the service that spins up the container didn't have access back out to S3 or the container registry or anything else, and that was really, really tricky to troubleshoot. So, okay, the containers pull in traffic from this other security group; if I go back to my list and look at that one, it shows I've got only port 80 open, basically from the internet, so nothing except port 80. This is one where, as soon as I get a certificate loaded on there, I'll set up a rule somewhere to redirect 80 over to 443 and then only leave the secure port open. So that's security groups; the sketch below shows roughly how the two groups fit together.
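
A hedged boto3 version of that two-group arrangement; the group names, VPC ID, and CIDR are placeholders matching the setup described above:

```python
import boto3

ec2 = boto3.client("ec2")
vpc_id = "vpc-0123456789abcdef0"   # placeholder

# One group for the load balancer, one for the containers behind it.
alb_sg = ec2.create_security_group(GroupName="transcribe-alb", Description="ALB ingress",
                                   VpcId=vpc_id)["GroupId"]
svc_sg = ec2.create_security_group(GroupName="transcribe-service", Description="Fargate tasks",
                                   VpcId=vpc_id)["GroupId"]

# The ALB accepts port 80 from anywhere.
ec2.authorize_security_group_ingress(
    GroupId=alb_sg,
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
                    "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
)

# The containers only accept traffic from the ALB's security group on the app port,
# plus anything coming from inside the VPC's own CIDR.
ec2.authorize_security_group_ingress(
    GroupId=svc_sg,
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 8501, "ToPort": 8501,
         "UserIdGroupPairs": [{"GroupId": alb_sg}]},
        {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    ],
)
```
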
The other thing: I think load balancers are on this list somewhere. Let me see. Nope. How do you get to your load balancer? That was routing tables... is it back over here? No. I was thinking it was over here somewhere. Let's look. All right, EC2. Yes. Okay, so this is fun because, again, it gets really interesting — this is why I started working through this piece and trying to draw it out — a lot of these things connect to a lot of other things, and there are multiple windows they show up on. It's like they were trying to figure out how to organize all of this, and then went and drank a fifth of Jack Daniels and decided. So a lot of it makes no sense: I've got security groups here under EC2, which we just walked through, and I'm already lost; over in VPC I've also got subnets, routing tables, and, you know, security groups. So that's fun.

So anyway, on to the load balancer. I decided I wanted to create a load balancer because I read a cool article on how it works and all. So you create one, you get it set up so it's listening on port 80, and you also decide which availability zones and subnets you want it to listen on. The other interesting thing that happened: I mentioned that initially it had created a public and a private subnet for each availability zone, and when you create a load balancer you pick an availability zone, and it was giving me both subnets and I could only pick one. So I wound up in a weird situation where my load balancer was listening on subnets one and two, but I had all four in my service description. If it happened to create the container in subnet one or two, I was fine; if it created the container in subnet three or four, I was lost trying to figure out what it had done. So, as I said, I wound up throwing away that VPC and creating one from scratch, adding only the bare minimum things I had to have, which I should have done to start with — but, you know, this is why it's called learning. The load balancer also has to be associated with a security group, which is what we just walked through. A lot of fun stuff enabled.

Then you get to the point where your load balancer has what are called listeners. You can create a listener on one port at a time, and then basically route that to — and this is kind of interesting, let me see if it'll show up — a target group. View and edit rules; let's see what this looks like. So when you set up a listener, you can do things like forward, or you can redirect, things like that. I believe after I get done with the HTTPS side of this, I can keep a listener on 80 and use this piece to redirect back over to the listener for 443. So that was fun.

The other thing I'm trying to think of what I'm missing here... oh, target groups. I was expecting to hit the load balancer and find the settings for how it balances on the load balancer itself, considering the name, but that work is actually down in a target group, which is kind of interesting. Let's go back a little bit and start creating a new one from scratch, because this is showing you some things that got tricky as well: for Fargate, the target type selection has to be IP addresses, because otherwise it just doesn't work. The other thing I messed up initially was the port; I figured, well, it's connected to the listener on port 80, so I'll leave that at 80, but actually this needs to be the port that you're routing to, 8501. That threw me for a while. You have to pick your VPC; we're going to pick the one I set up for the transcribe app. I haven't messed with any of the protocol versions, and the health check is something it does automatically. Then you hit next. And the other fun thing: at first I was trying to do a bunch of stuff here, because this is where it put me, and it took a while to figure out that it drops you halfway down the page and you actually have to scroll up to do anything, which is a little weird.
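
Wired together in code, the load balancer, target group, and listener look something like this. The Fargate-specific parts are TargetType="ip" and pointing the target group at 8501 rather than 80; the subnet, security group, VPC, and name values are placeholders:

```python
import boto3

elbv2 = boto3.client("elbv2")

alb = elbv2.create_load_balancer(
    Name="transcribe-alb",
    Type="application",
    Scheme="internet-facing",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],   # two availability zones required
    SecurityGroups=["sg-alb0000"],
)["LoadBalancers"][0]

# For Fargate tasks the targets are IPs, and the port is the container port (8501),
# not the port the listener receives traffic on.
tg = elbv2.create_target_group(
    Name="transcribe-streamlit",
    Protocol="HTTP",
    Port=8501,
    VpcId="vpc-0123456789abcdef0",
    TargetType="ip",
    HealthCheckPath="/",
)["TargetGroups"][0]

# Listener on 80 that forwards everything to the target group.
elbv2.create_listener(
    LoadBalancerArn=alb["LoadBalancerArn"],
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
)
```
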
So here you can actually add the IP addresses of the containers you have running, which I'm still not quite sure how this connects: if you have containers automatically getting spun up, how do they know how to attach to the existing load balancer? That's something I've still got to work out. Anyway, this is where you add targets and things like that. We're going to cancel this and just go look at the other one. Wait, did I have to do something different to cancel? Are you sure? This is how bad some of this UI is: no, to cancel, I have to leave the page.

So at this point, what I want to do is go back over to where I had my cluster, the transcribe cluster. I've got these services running; I want to grab my new service, which has this container running at this IP address, so I want to copy that. I go over to my target group, and I've got this one already set up with a set of registered targets — those are the ones that had failed and been killed. So I can add a new target here: paste in that IP address, and it automatically fills out the port. The next thing you have to do is hit "include as pending below", and then you have to scroll down and hit "register pending targets", because just being on this list doesn't mean it's registered anywhere else. That was weird. So I get that set up, and at some point this will come back and say healthy, because it's actually doing a health check on it. I'm trying to find — somewhere under actions, no, I don't want delete, I may have to wait. Okay, so I've got two healthy. Under monitoring and health checks there's a place where the load balancing algorithm is set; right now it's round robin, and this is where you can make some changes. I'm looking for the setting where you can say I want 10% of the traffic to go to target one and the rest to go to the other targets; I think I've seen somewhere that you can do that, but that's a little beyond this one.

So at this point, if you go back to the load balancer, I have a server at that DNS name that is routing traffic to two different containers. That got me a little excited, because it was a large, large pain to get all of this set up when all I want to do is host a container. A couple hours in, I'm wondering why I didn't use Heroku; a few more hours in, I'm looking at whether I should just pay Streamlit for their hosting service; another day in and I'm wondering, shoot, should I have just looked at how Hugging Face does their Spaces, or how they host stuff. So anyway: a major pain, but it does give you absolute control over what's going on. What I'm looking at next is getting a certificate for HTTPS. There are a couple of different ways to do that; the way I'll probably go is to create an A record on the HSV AI DNS so it can validate pretty quickly, and then set up that Streamlit instance at something like transcribe.huntsville.ai and route that to the Streamlit container. So that is pretty much it. Let me... what time is it, 6:50? That's about right.
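
Registering a task's private IP by hand, and the 80-to-443 redirect planned for once the certificate is in place, look roughly like this. The ARNs, the IP address, and the certificate are placeholders; the comment about ECS keeping the target list current assumes the target group gets attached to the service rather than managed by hand:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Manually register a running task's private IP with the target group.
# (If the target group is attached to the ECS service instead, ECS registers and
# deregisters task IPs itself as containers come and go.)
elbv2.register_targets(
    TargetGroupArn="arn:aws:elasticloadbalancing:...:targetgroup/transcribe-streamlit/abc",
    Targets=[{"Id": "10.0.1.23", "Port": 8501}],
)

# Once an ACM certificate exists, add an HTTPS listener...
elbv2.create_listener(
    LoadBalancerArn="arn:aws:elasticloadbalancing:...:loadbalancer/app/transcribe-alb/def",
    Protocol="HTTPS",
    Port=443,
    Certificates=[{"CertificateArn": "arn:aws:acm:...:certificate/placeholder"}],
    DefaultActions=[{"Type": "forward",
                     "TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/transcribe-streamlit/abc"}],
)

# ...and turn the existing port-80 listener into a redirect to 443.
elbv2.modify_listener(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/transcribe-alb/def/ghi",
    DefaultActions=[{"Type": "redirect",
                     "RedirectConfig": {"Protocol": "HTTPS", "Port": "443",
                                        "StatusCode": "HTTP_301"}}],
)
```
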
Let me take any questions, and then I'll kill the recording and cover anything else. Any thoughts? I mean, it was a heck of a lot more than what I thought I was getting into.

"Yeah, I thought Traefik was complicated to get set up to do a lot of those things, but it's significantly easier than using AWS's stack of tools. I could see this being useful if you need to scale across all their zones and spin up thousands of instances really simply and easily. But man, that's a lot of hoops to jump through. Everything you're looking to do, you can do with Traefik, and you could do it in about 20 minutes if you've just got a machine to run it on."

The other thing I thought about is that Amazon has CloudFormation, which is basically a giant YAML file or whatever, where you just spec out what you want. And we didn't even get into permissions — shoot. Anyway, more for next time. With everything I've got, what I wish I could get was: okay, I have this current setup in a VPC with all of this stuff; can you tell me what CloudFormation file would have generated it? You know, go backwards. Because that's something Matt Brooks had mentioned: the problem with this whole thing is that if something gets screwed up, or there's a hiccup and it gets lost somewhere, now I've got to go through all of these button clicks and all of this crap again, and it's not versioned. I can't just say, oh, I totally hosed that up, git reset --hard, please.

"Yeah, you're going to need YAML configs and something like Terraform to drive recreating it." Yeah, and I've seen that; Terraform actually has a way to do this, you can build this out with Terraform. Maybe the problem is that it doesn't really have anything to do with ML, so I'm just kind of like, well, yeah, that's great, but that sounds like somebody else's meetup. "Yeah, I try to rein myself in too, and that's one reason why I don't deal with a lot of hosted Amazon services at this point. I just want to get a demo site up and hosted, and it's easier for me to just create a VM on AWS and spin up the containers on that machine with the Traefik reverse proxy in front of it. Just get a static IP address and point the domain name at it, and Traefik will handle getting certificates and routing to the services. I think I've been able to spin up more services quickly with Docker Compose files; I don't have to dive through AWS menus just to create something."

Right, we may want to walk through that at some point; maybe we will. But yeah, I probably would have gone that route if I'd known how hard all of this was going to be. The hard part is all the troubleshooting, because it's all in weird log files: I can't access this IP, and you're like, well, crap, why not, and then you have to go figure out, between two subnets, two security groups, an access control list, and a routing table, why you can't get there from here. But that being said, let me show this on the video while we've got it. So here's my whole little thing actually set up on Amazon. I go browse files and find some kind of a test WAV file. It's going to complain at me in a minute, because I don't have the permissions set up to actually load this into S3 from here. And then here's the little password that I shared earlier. So it can't upload to S3 yet, but what I should be able to do...
Actually, let me stop the video and I'll continue in a minute, because we'll cover some of this next time. All right, if I can figure out how to stop — not stop video, stop recording. Yeah, there it is.