Transcription provided by Huntsville AI Transcribe
Okay, so what we’ve been looking at lately, or what I’ve been looking at lately, with the day job being overwhelming and somewhat of a dumpster fire, is trying to figure out, hey, what is it that we can talk about. Somebody came up with the bright idea on a Discord server that, hey, let’s throw some agents at something like your transcription service thing and see if we can just create one. I thought that was a good idea. Then the second thing I thought was, oh shoot, it’s probably going to be better than what I wrote, and it’s going to be quite embarrassing. But hey, that’s kind of what it’s for: let’s see what happens.
So the other side of that conversation: if you’re here, or if you’re online or watching, the Discord server we hop on is Tech256, which kind of hosts our channel for Huntsville AI. It’s a pretty good place. We’ve got good conversations going, sometimes people come across stuff and say, hey, what about this, have you seen this, and usually it’s good stuff. It is a troll-free environment, so there’s that. That was a fun week. Yes, it was a very interesting week. The thing that I’ve learned through a lot of the experience dealing with that is, if you do the work and you’re active and actually doing something useful, then when things like that happen you don’t even have to stand up for yourself. Other people do, you know what I mean? It was kind of interesting. I haven’t removed anybody from Discord; that wasn’t me. Somebody else saw what was going on, saw the whole thing, and went, okay, this is not, yep, bye. So there’s that.
Another little bit of intro stuff, coming off of the AI Huntsville Task Force meeting today: they are now licensed, certified, all the right stuff, registered, I guess, is the right word, as a 501(c)(6), so they are now an official charity. That’s good, because up until now they were just waiting to get the thing back. They’ve got the city and others trying to give them money to do things, and they’re like, okay, you gotta wait, I can’t take money yet because I’m not a charity yet.
The things they’ve done so far have been connected with the Alabama STEM Council, where the task force is working mostly on, like, 9 through 12 education. They’re doing a full series in July, hosting a hundred and something teachers, going through: here are the education system laws right now, here are the policies, here’s what you have to do, here are the state guidelines. You have a federal guideline that says we want more AI in schools, and you have a state guideline that says you can’t have phones in the classroom, they have to physically be locked up or something. So they’re trying to figure out, okay, we have to have AI, but if I’m not a City of Huntsville school, I don’t actually have a device to do things. I’m in the county, I don’t have everything. So they’re trying to figure out, what do we do? Anyway, they’re talking about things like that.
The next thing on that docket: I’m doing a session for them in August, more for the general public, you know, how do people use AI and stuff like that. I’ve come across two very interesting things that I’m interested in a little bit of feedback on, or if y’all have other stuff to look at. One of them is a friend of mine, and a friend of some of the others in here, who is a product manager and also a mom. I don’t know if this will post until later, because she’s got to tell her husband she wrote it, but she made a children’s book using ChatGPT. Just a bunch of pictures and stuff, and made new pictures and all kinds of stuff. It’s like the adventures of Jack, whatever. All the images were ChatGPT, and the text, the storyline, was heavily, all that kind of stuff. That was like zero to that, and according to her, it was fun. So that’s kind of an interesting thing from a general-public angle: you could do things like that.
I came across another friend over the weekend, actually doing tree cleanup after some crazy winds hit our mountain, and while I was helping with that, the husband said, hey, I heard you do AI stuff. I’m like, yeah, why?
So he’s like, well, I’ve gotten interested in it, things like that, and you know, looking in, I’m running a podcast, stuff, David, you’re a painter, you paint, this would be a commercial painter, house painter, all that kind of stuff, he has chat GPT on his phone, connected to his headset, using voiceover, he talks to chat GPT for about eight hours a day, he has, he’s got, I think one main use, I mean, he’s looking at writing a book, so he’s doing research on this, research on this, store this to this, do this, I don’t know how the, because I mean, there’s a secondary thing, I think better when I want to walk, I’m physically moving, doing something that doesn’t have to engage my brain, but my brain doesn’t have to think about my, you know, it’s just a weird thing, for him, it’s painting, it’s therapeutic, lethargy, whatever right word there is, but he was like having conversations with, I’m like, how did you, well, somebody said something about it, I tried it, I’ve been using it for like six months now, you know, I’ve got an outlook, I’ve got an outline for a manuscript, this, this, this, I’m like, oh my gosh, I don’t think this association, a career path, I need to talk to this guy, he’s a very, he is very philosophical, thought provoking guy, his oldest, I think is 14 or 16, I can’t remember, and he is, I mean, the most of our conversation was what kind of job is that, is going to be there when he gets out of college, you know, you’re a good, I don’t have any years that is, 22, it depends on how smart you are, and how many tours of college you take, but we’re just looking through that, as far as what kind of, what did it, where did you, what jobs are, what does my job gonna look like, I don’t even know, so the only advice I had for him was, you got to be good at problem solving with the tools you have, and then when you run out of tools, be able to make more, do that, and then do anything, so anyway, just a very interesting kind of a set of events or something, 
So now for August, I’m kind of curious about, hey, here is somebody that’s not an AI person that’s using this to do this and this. So if you’re working on something, it’s kind of like, while you’re at it, do these things. The book thing was pretty cool, though. I’m waiting to see it; I don’t think she’s got it yet.
So with that, I guess that gets us to tonight. Here’s what we’re going to try to do. Here’s a transcription service. It’s pretty dumb. You put in your email address, you upload a file, and if you don’t know the secret little code to use, you’ve got to pay five bucks through Stripe. After you do, it enables a button that says transcribe my file, and it’ll go off and transcribe your file. While it’s transcribing, it does a live update to the screen, so you can actually see what it’s doing.
It’s a little slow, because right now it’s running entirely on an AWS Lambda. The good news is it’s incredibly cheap, to the tune of, I think I’m at like eight cents per hour for transcription, where some places you’re going to pay 25 cents a minute. There aren’t a lot of people that use this, and it’s probably not even enough to pay for the hosting, but I like to use it as an example myself when I go try to roll out something. It’s a good playground, and it’s a live thing that I can actually show.
The key item, from a market perspective, I don’t know what you’d call it, the main thing: you don’t have to subscribe to anything, you don’t have to make an account. It’s easy. I don’t care that you don’t come back. It’s like, okay, here’s the transcript, and you don’t have to wait for it, you can go, I’ll send it to you in email, that kind of thing.
The other is that it’s all containerized, and I really don’t want to pay to store your stuff, so we transcribe and throw everything away. I don’t even have it. The bad news is, if something goes wrong and I get an email saying, hey, this sucks, I don’t have the thing to go back and see what happened. But that’s okay, that hasn’t happened yet. I use this to actually do the transcriptions of the videos that we record, so I am transcribing myself, and that’s the transcription you’ll see if you look at the videos.
With that, I had pretty much written down the technical stack that I use. Everything’s on AWS. Docker for everything, mostly in case I wanted to move off of AWS, which is what I’m working on now. I’m looking at moving this off of Lambda over to RunPod, because I also figured out the other approach to lowering the cost is the speed at which you can get a transcription. Yes, Lambdas are super cheap, but it’s extremely slow. It’s all CPU, and I think I’ve got a medium model running on that, something like that. I have found that the later version of Faster Whisper, I’m not sure what version it is, has batched inference, where if I run it on a big GPU, I can transcribe, I think I did an hour and a half worth of audio in, I think it was like a minute, maybe a minute and a half, on something that I’m paying 26 cents an hour for. So if the pricing works out as a wash, but I get it in a minute and a half, that’s a win. And I can’t even do that on a Lambda; I have to actually chop the file up, because the longest you can run a Lambda is 15 minutes. So anyway, just fun stuff, but that’s the reason for Docker.
Of course, we’re using Faster Whisper; we’ve talked about that a good bit. In this case, I wouldn’t be surprised if some of the vibe coding throws you at an OpenAI server to, in effect, use their version of Whisper. That’s perfectly valid, but the reason I didn’t is because I would like for this thing to be totally enclosed, as in nothing crossing over to things I don’t control.
Right now I’m using Flask for a backend. It’s just Python, and it’s super simple as far as routes or whatnot: one that does the file upload, one that does the interaction with Stripe, things like that. I’m using Stripe for payment processing for the single reason that it’s stupid simple and really easy, and a lot of people use it, so if you have problems, you just go ask, and there are examples of what you screwed up. The main thing that I screw up is I’m doing testing and I forget I’ve still got my test key in, and now something doesn’t work. Anyway, there’s that.
We’re using Python and TypeScript: Python on the backend, TypeScript for the frontend with React. Bonus, the thing is also set up in a full CI/CD pipeline with GitHub, so if I modify the frontend and commit that and merge it, there’s an automatic pipeline that picks up that frontend change, pushes it over to another repo, builds the Docker image, uploads the image to the registry, and goes and restarts the service. So it is a button push, and then within 30 seconds it’s live. You don’t have to do that, but I got tired of remembering how to do that, so I automated it. That’s all using GitHub workflows, plus Terraform to actually manage the infrastructure.
These days I would probably not use AWS for the full infrastructure. Everything you do gets nickel-and-dimed to death. It’s, oh, you need an internet gateway, oh yeah, it’s an interface, so I have to... Some of the things are free, some of the things are not free. Load balancing, not free. And the thing is, I don’t need load balancing. That’s fine, you have a load of one, which is balanced, so here’s your fee. I’m like, but that’s not... anyway. So that’s kind of the tech stack.
The basic requirements: a user can upload a video or audio file and set the email address, because this comes back to you in an email. It’s a somewhat nicely formatted email that says, hey, thanks for using this, here’s your transcript, along with a click-here-for-your-next-one at the end, to try to draw you back. They can pay, and that starts the transcription. The app updates live as it’s transcribed. That was one of the hardest things to actually implement: I’ve got a Lambda doing a transcription, but I really don’t have a connection from that back to the web app. Right now, the user puts the file in the web interface and uploads it, which pushes it to a container, which throws it to an S3 bucket and then triggers something to run, and then that sends the email. So to cross back over, I wound up using one of their pub/sub messaging broker things to actually shove the updates back over live. That’s probably not the best way to do it, but it was an hour’s worth of work and it worked, so I left it alone.
You get the email when it’s complete. I don’t know if we would go so far as to worry about it, but there’s a whole other set of work to make sure that email is signed correctly and all this kind of stuff, so it doesn’t wind up in their spam folder. That’s hard; I’ll cover this in a second. Then after you’re done, everything gets torn down and thrown away, and we don’t keep anything other than the payment receipt, in case they need a refund for some reason.
And then the other thing, which was kind of my primary thing to start with, because I got so tired of it: I just need to transcribe a file, and everything wants an account or a subscription fee or whatever. I’m like, I don’t know if I’ll have more than one of these, can I just do this? There are free ways to do this now, it’s not that hard. If you’re using Teams, or if you’re using some of these other things, they already have a way to transcribe a meeting, it’s already there. But from an AI app perspective, it shows how fun it is to build something, only to find out it’s offered for free in a lot of other places by the time you get done. That’s always a thing.
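The cost figures quoted earlier can be put side by side with a little arithmetic. This is only a rough sketch: it takes the numbers as spoken (about 8 cents per audio hour on Lambda, 25 cents per audio minute at some commercial services, and a GPU rented at 26 cents per hour that got through roughly 90 minutes of audio in roughly 90 seconds), and it ignores cold starts, idle time, minimum billing, storage, and data transfer. The function and constant names are made up for illustration.

```python
# Rough cost comparison using the approximate figures quoted in the talk.
# All names are hypothetical; overheads such as startup time are ignored.

LAMBDA_USD_PER_AUDIO_HOUR = 0.08     # "eight cents per hour for transcription"
COMMERCIAL_USD_PER_AUDIO_MIN = 0.25  # "25 cents a minute"
GPU_USD_PER_WALL_HOUR = 0.26         # "26 cents an hour" for the rented GPU


def lambda_cost(audio_minutes: float) -> float:
    """Cost on Lambda, billed here per hour of audio transcribed."""
    return LAMBDA_USD_PER_AUDIO_HOUR * audio_minutes / 60


def commercial_cost(audio_minutes: float) -> float:
    """Cost at a service charging per minute of audio."""
    return COMMERCIAL_USD_PER_AUDIO_MIN * audio_minutes


def gpu_cost(wall_seconds: float) -> float:
    """Cost of renting the GPU for the wall-clock time the job takes."""
    return GPU_USD_PER_WALL_HOUR * wall_seconds / 3600


if __name__ == "__main__":
    audio_minutes = 90     # "an hour and a half" of audio
    gpu_wall_seconds = 90  # "maybe a minute and a half" of GPU time
    print(f"Lambda:     ${lambda_cost(audio_minutes):.4f}")
    print(f"Commercial: ${commercial_cost(audio_minutes):.2f}")
    print(f"GPU:        ${gpu_cost(gpu_wall_seconds):.4f}")
```

Under these assumptions the raw GPU compute is well under a cent for the job, so whether the move is really "a wash" depends on the per-request overhead the GPU host adds, which the talk does not specify.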
The other interesting thing from an AI app perspective is all of the minute little detail things that drive you crazy, that have nothing to do with AI, or computer science, or anything else, you know, send an email, okay, there’s a service that sends an email, that’s great, what address are you sending it from?
I’m sending it from transcribe@hsv.ai. Okay, I need you to do this, and this, and this, to be able to send. I mean, you need half an IT background to do some of this stuff. Anyway, there’s that.
So, we’ve been talking lately about Cline. Before that we did All Hands dev, or OpenHands, first, and all this video stuff is on there, where you attach something to your workflow, and it kicks off a thing, and it goes and does a task for you. Great, it’s like an extra person doing a review of my code. We tried a couple of things, and it wasn’t super great. It built something that almost ran; I think Jack was able to get it to run. What we did with that was point it at our example repo we use for hackathons and stuff, and said, hey, I’ve got another thing to add to this, and it went and created a Jupyter notebook to do the MNIST dataset, and walked through that, and it almost worked. It created its own merge request, which was kind of cool. It did do something weird, but it wasn’t hard to get working, which is kind of where we’re at right now. It seems like a lot of these models and agents can get you 90% of the way, and it’s just the little pieces that have to have a person go connect the dots, or go, oh, okay, you really meant to do this.
The cool thing about that one: it brought up a thing in a Jupyter notebook where you could actually draw your own number. I didn’t even know that existed. One of the issues you run into with any developer is we use the tools that are in our toolbox, and sometimes we don’t actively go try to learn new things. If this is in my wheelhouse, you’re going to get what I know, especially if I know it can meet the requirements. But it is kind of interesting to see something pull in something totally outside the sphere of what I’m used to using. I didn’t even know I could do that.
Last time we went through Cline plus OpenRouter to do some code stuff directly in VS Code, things like that. I’ve been still playing around with that some. I’m stalling a little bit to see if any of you guys that are actually working on this are anywhere close to having anything.
But yeah, so I put it in right when we started, just into Opus 4, and it’s at 10 separate code files, plus several documents laying out the plan for making it and how to implement all the code, and it’s still running.
Can you share that for a second?
That’s what I’m working on, give it up here for us. Okay, I can throw the host over to you for a second, just to, and that’s Opus 4? Yeah.
Okay.
Yeah, I’m on like 14 cents, 14 cents. I don’t know what we’re on for OpenRouter at the moment, we’ll log into that for a second.
I’m using the cheaper client, so I can use it all day long. Oh yeah, let’s see. I had been fighting to get this unit test framework going, still on the Flask side. Right now I’ve got something weird on the tests. Initially we did this last week, and it built a full-up unit test suite. Let me see where we’re at. Activate, and then if I go to Docker, last image, pytest.
So a little bit further: what we found with the suite it built was that it looked at the routes that were available, all that kind of stuff, but there were things it would not have been able to find. I have certain parameters set in a .env file that it doesn’t have access to, so it has no way to know what my credentials are. I got that part fixed, and now it’s working through the Socket.IO part that I use for the live feedback. There’s something with the session, where the session on the Flask app is separate from the session that goes with Socket.IO, and for the life of me, I haven’t figured out how to fix it.
Anyway, it’s running through some stuff, it’s still got a lot of failures, but it was worthwhile to at least get, I think like 26 unit tests actually written, even though they don’t exactly pass, but they are the things that I would test.
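One way to give a generated test suite the settings it cannot see is to load a KEY=VALUE settings file before the tests run, falling back to dummy values when the file is absent. The sketch below is a minimal standard-library version under stated assumptions: the helper names are hypothetical, and the real project may well use a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines, comments, and junk."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path, defaults, environ=os.environ):
    """Populate environ from the file at path, then from defaults.

    Values already present in environ are never overwritten, so real
    credentials set by the developer win over test placeholders.
    """
    file_values: dict[str, str] = {}
    env_path = Path(path)
    if env_path.is_file():
        file_values = parse_env_text(env_path.read_text())
    for key, value in {**defaults, **file_values}.items():
        environ.setdefault(key, value)
```

In a test run, `load_env_file` could be called once at startup with placeholder values for anything the tests only need to exist, so the agent-written tests never depend on real credentials.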
So if I look at all the routes I’ve got, and I look at the different ways I would do stuff, yeah, it’s solid as far as what it’s trying to do.
Have you asked it to add some debug so that it can...?
So far I’ve asked it to switch between two different mocking systems for the AWS side, because AWS on the Boto3 stuff actually has its own mocking framework that’s separate, which is interesting. I might have an hour or so into this, and it’s got 38 unit tests that it created, all of which are things that would need to be tested, and 20 pass. And it did a thing that was really useful, because it’s hard enough for me to write the tests that I have neglected to do so. At that point, post-product development, you get something that works good enough, and then you stop spending time on it and move to the next thing. So that is one of the places where, right now, I like to use agents and the AI side: to extend what I’m working on, to do the parts I didn’t want to write in the first place. And I need to go re-up on the OpenRouter side that we’re using.
The other thing on this that we covered last time, it’s very interesting to see. Yeah, I think here’s where I said, hey, change the mocks to use botocore’s Stubber, and that’s all I gave it. It figured out what that was, it went and changed out what it had written before, and now it’s using that, and you can see what it did. If I scroll all the way up, you can see the full... okay, so this is where it’s basically explaining what it’s doing and coming up with its plan. I know we’re covering some of this again, but for the Cline tool, there are some things you can have it auto-approved to do, and there are other things it can’t do until you click a yes-do-this button. In my case, it can’t actually modify any file I have unless I say yes, I want that, but it can do a lot of other stuff.
This is kind of interesting, watching it go. You’ve got its response, and it’s got the actual, is this the tool-calling piece, the JSON part? Where it’s, okay, I need to read a file, I have a tool that reads a file, tool, please go read this. Yeah, that’s the JSON, or the JSON call. Then it gets that information back, and it’s, hey, I’ve read through this, I see test routes, I see this, next step, okay, let’s read this other file, okay, I see that. It’s kind of interesting to watch. The thing I don’t like seeing is the price incrementing as it goes, because it’s making calls and calls to something I pay for. But it was super interesting. Let me hop over to OpenRouter.
So I can either publish artifacts on Discord or I can share the screen.
Let me make you, if you’re, if, are you locked?
Yeah, we’re waiting on you to let us know. Oh yeah, no, I’m not ready.
Okay, uh, I’m sorry, okay. It actually, it actually just finished. So perfect timing. You are the host and you can share your screen. Reminder to close any windows you don’t want to share. I’m just going to share, sorry, this is a slow laptop. All right. Not coming through yet. You know, you have to stop sharing first. I can. Hey, yeah, if you want to take it back over after enable some stuff again, it’s been off of my computer. See, what’s the joy on that? Getting you back into Zoom after a while because you have to reset your permissions to share. Yes, I’ll batch her a little bit. Are there other tools besides Zoom that y’all normally use or like better?
Yeah, it’s, it’s just, it’s much easier to control like which screen you’re going to share and it’s, uh, it’s much easier for people to get in on the voice. Okay. It tends to be clearer. Okay. Can you record?
I guess you can. I could use other things on my computer to record while I’m on. Good. I haven’t tried recording on Discord.
Okay. I’m not sure if you can see it. Okay.
Is this Zoom?
I pay for the next year, like in January, and it’s a little over a hundred bucks, something like that.
Not a lot, you know, it gives us what we need for doing this. The main thing that bugs the heck out of me about Zoom is if you want more than one person to be able to start a meeting, you have to add them as an active account, which is another full price of a thing.
So next week, we’re doing a paper review on Wednesday.
I will be in the mountains of North Carolina on Wednesday.
But at 5:45, I will pull over, open up my phone, start the meeting, Josh will log in. That’s what we did one of the last times, I think.
Make host, and then I flip over to audio in case I drop a connection or whatever.
But it’d be so much easier to say, hey, here’s the three people that can start meetings. I don’t know if Google keeps throwing me their stuff because I use Google Workspace, but I don’t know if anybody’s got a lot with Google Meet or if that’s fun or not fun or anything.
It’s always worked well for me, but they’ve always been about the same as everything else, at least on the receiving side, I’ve not set up as much. Okay, I knew it, because I already get that. And they keep going up on their prices for Gmail stuff because of all the AI stuff they have that I don’t use. Let’s see if you can get me back in there. Okay. You are, I think you’re still host, but I can… I said host disabled attendee screen sharing, yeah. Okay, let me reclaim host. I will go see if I can make host. I’m missing something on here. We’ll see.
Anytime I click screen share, it still says host disabled attendee screen sharing. Did you disable attendee screen sharing? You’re the host. Let me see.
We won’t let him take it. I keep seeing you as joining. Maybe, wait, I have a David S that’s a host and a David S that is a stand-up. Okay, so my last instance is still in there somehow.
Okay, I only see one now.
Okay, so try it. You might. Are you the host?
Make a yes. I am now.
That was the problem. Hey, and we haven’t made two clients.
Cool.
Should I enable my audio or just can they hear? They can, we can hear. I would make sure you’re muted. I am muted.
So all I did is as soon as you showed on the screen of the presentation, as soon as you showed what we’re going to ask in form, I just went into the email and copied over. I said I need help making a transcription service. Constraints, basic requirements, word for word we had.
Please lay out the structure and everything for this needed with simple step-by-step instructions.
Did you ask us to be free?
And you’ll see the only thing I had to do is hit continue at one point where it was finally like, yeah, I’ve worked long enough to I’m going to keep working on this. So everything it’s done was with that one prompt. I haven’t checked or corrected anything.
And it started off for people that don’t know Claude, this is the artifact window for all the files that it creates. And it began with the project directory structure for how everything should be organized.
And I find that’s pretty great from Claude because I can usually take this other AI tools that aren’t as good as organizing the structure, give this to like a Gemini and say, hey, start from something that has no dependencies, make that first then start building whatever has dependencies on it.
And it works really great off of Claude’s layout. So this first, next up.
It’s very detailed, especially on the AWS side. I will share the actual code for the transcription service later, and you’ll see that what it did pretty solidly matches the way I laid out the code. Yeah, so next up it decided on the Flask backend, so it laid that out next. For any of these you can either directly copy and paste, or you can choose to publish to a link, or you can download to your computer as whatever file type it should be. So it’s a Python script.
You’ll download as a .py file.
It’s named Gregory, you can just plug it in. Next, the transcription worker service. The copy button used to work though.
You could just hit copy and then paste it into your IDE. Yeah, yeah. But now it doesn’t. Now it puts extra characters in there and it messes it up.
It says they updated it. Oh, we got an error here for the React frontend.
So I can tell it to just try to fix this.
Yeah, we can go back and look at the others while it’s doing that.
Can you use Clip-Up Link?
Or is it only on the… Yeah, with the handset. I mean, the phone version. Yeah, and I think everything goes through Anthropic’s servers. Yeah, well, it’s a Gemini work. It’s an excellent business. Is there like a download entire directory string?
Can you get a tarball of all the stuff it did?
No, so that’s generally when you tie Claude in, like if you work with Claude Code, so it has access to it directly, or if you use Cursor and you use Claude in there and have it working with everything at once.
If you’re just using Claude straight up, then you’ll just copy or download each individual file and put it up to the directory yourself.
It’d be interesting to throw Claude in there.
Yeah, it’d be interesting to throw Claude in there. And throw Cline over. Yeah, that’s what I did. Okay, we’ll probably, we’ll switch over to you in a minute. Okay. If that works.
I don’t have a whole lot, but I can show you my thought process, please. Okay. Yeah, so after it gets through all the code that it decided it needed, see it gave us- Emergency procedure.
Yeah, it gave us a quick- Oh, I don’t have an emergency procedure. It gave us a quick reference guide, quick start.
Oh.
Just basically everything you need to reference or tie together.
I notice it gets tired of telling you some things. It’ll go like, I thought you knew that. I already told you three times.
Will you scroll up just a little bit to the, I’m curious on, okay.
So workspace, right here. Okay, so it’s got a region, the bucket names.
It also went with an SQS queue to actually the same thing I wound up doing to feed the info back to the website.
It’s using, I think I’m at, yeah, I’m on DynamoDB. It’s got the keys, the hooks. Oh, okay.
It went with a webhook.
That’s fine.
And the from email. Stripe has two different ways to pop it into your front end. If you’re using React, it’s great: there’s a React component that wraps the Stripe things and puts it right into your page, so it doesn’t feel like you’re going anywhere else. Their webhook version kind of throws you to another page that’s hosted on Stripe itself. You fill out the stuff, Stripe does the thing, and it sends you back. The first iteration of the transcription service had that, and tracking usage, people were getting to that point, but going to another site instead of the one they were on, it dropped a hundred percent. Yeah. The ones that used the little key thing that let them bypass the payment would keep going.
You know, just that friction of it’s a little hokey.
Nobody knows what this is.
I don’t want to put my info, you know, that kind of thing. Yeah. So it fixed that React front end test.
Okay.
So that’s what’s pulled up now. So you can go through. So that’s what I built for you.
It’s just wait for you to put a picture or whatever in.
Upload some audio or a video to it. Just test if the front end works. So yeah, there is. And obviously it would take more plugging all this together and then see if everything works. Can I see the Terraform infrastructure?
Okay.
So they’re using, they got the VPC.
We got that.
And they got the region broken out. And they did modules. A lot of my stuff was based on that: I initially went with a module layout like that, and I wound up throwing that out and reorganizing based on the task. So I’ve got the backend part that needs to pull the uploaded file, run the transcription, and then send the email and stuff, and I’ve got that in one thing. The other thing is more geared towards the front end part, where I need a load-balanced UI with an internet gateway, all that stuff. It got really weird when I started crossing the streams. So anyway, they may have it better.
Oh my, this is like literally matches what I wrote.
It just makes me upset. Just a little faster, right? Oh, a little faster than me.
Yeah.
So even in the same region, yay. Environment, something I haven’t used. Okay, so, and it’ll figure out that it’s got Docker images that it needs to handle.
And then it’s, okay, that works. I didn’t know, I’m assuming that’s true where Terraform has a sensitive flag on keys that keeps it from printing it out places that, you know, if you do a Terraform plan, it spits out what it’s about to do and what it’s about to set in places. And if you do that wrong, your Terraform plan will actually say, hey, I’m setting this environment variable in this container. And if that happens to be a key, it’s there.
And then if you happen to build in a GitHub workflow, that’s in the log file.
If it’s, you know, there’s tools to try to keep that from happening, but they’re also geared on things like the name of a path, you know, the word password, word key.
It’s trying to guess what is sensitive.
And if you had something that wasn’t named something, it’s probably just going to print it right out.
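To make the name-guessing problem concrete, here is a minimal sketch of a scrubber that masks values only when the variable name looks sensitive. The pattern list and the variable names are hypothetical, not taken from any particular tool; the point is only that a secret with an unremarkable name passes straight through. Terraform’s own `sensitive = true` on a variable or output avoids this by having you declare sensitivity explicitly instead of guessing.

```python
import re

# Hypothetical name patterns a log scrubber might look for.
SENSITIVE_NAME = re.compile(r"(password|secret|token|key)", re.IGNORECASE)


def redact_by_name(env: dict[str, str]) -> dict[str, str]:
    """Mask values whose variable NAME looks sensitive; pass the rest through."""
    return {
        name: "***" if SENSITIVE_NAME.search(name) else value
        for name, value in env.items()
    }


# Illustrative values only.
planned_env = {
    "STRIPE_API_KEY": "sk_test_example",  # name matches a pattern, gets masked
    "LOG_LEVEL": "info",                  # harmless, printed as-is
    "BILLING_CREDS": "hunter2",           # a secret, but the name gives no hint
}

redacted = redact_by_name(planned_env)
for name, value in redacted.items():
    print(f"{name}={value}")
```

Running this masks `STRIPE_API_KEY` but prints `BILLING_CREDS` in the clear, which is the failure mode described above.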
Okay, so it’s got the VPC.
It’s got, there’s your interface.
It’s got the, it’s got the, there’s your internet gateway.
Yeah, it even names things similar. It’s got that, okay.
This, yeah, wow, that’s cool. Okay, that’s good. I’m waiting to see if it’s logging to, I can’t remember the log framework AWS uses. If it’s got that, it’s going to be kind of interesting.
Configuration.
CloudWatch, yep, okay.
Then you still have to paste that into your main.tf, your variables.
Yeah, of course, for me, my normal approach is to use a .env that’s in a directory that is not even in my repo.
It’s something I could gitignore, but I still don’t trust myself.
So I go over to another directory, source my .env file, and then I’ve got my Terraform stuff picking things up from my environment and passing that down, instead of even checking it into a file that gets pushed.
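A rough sketch of that workflow, assuming a simple KEY=VALUE .env file kept outside the repo. Terraform reads input variables from environment variables named `TF_VAR_<name>`; the helper names and the lowercase mapping here are assumptions made for illustration, not the speaker’s actual setup.

```python
import os
import subprocess
from pathlib import Path


def load_dotenv(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def terraform_env(secrets: dict[str, str]) -> dict[str, str]:
    """Copy the current environment and add each secret as TF_VAR_<name>."""
    env = dict(os.environ)
    for key, value in secrets.items():
        # Assumption: Terraform variables are named as the lowercase .env key.
        env[f"TF_VAR_{key.lower()}"] = value
    return env


def run_plan(env_file: Path, terraform_dir: Path) -> None:
    """Run `terraform plan` with secrets supplied only through the environment."""
    env = terraform_env(load_dotenv(env_file))
    subprocess.run(["terraform", "plan"], cwd=terraform_dir, env=env, check=True)
```

Nothing secret is ever written inside the repo this way, so there is nothing to forget to ignore.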
Because I have checked stuff in before, and it was easier to rip the key out and just change it than it was to try to chase down everywhere it might have also been published. That’s still cool. All right, want me to turn that back over? Yeah, turn that back over to me and then I’ll reclaim host and I will throw it over to Charlie and we’ll walk through what Cline was able to do. Do you have a cost associated with what that was? Yes, so you’re probably on a plan that does all the things.
Yes, I’m using the top level of Claude Max, so 200 a month for that, but I can’t run into usage limits the way I’m using this. There are no usage limits.
There is, but I think you have to use it solid for the eight hours a day to start running into.
So I’ve looked at it token-wise and I’m guessing if I was straight up using like the API and paying per token, I’d easily be paying 75, 80 bucks a day.
Okay, yeah, for what you just did, I would guess I probably had, and that’s if you discount the parts that I threw away because it didn’t work out and went to what you wound up with, probably a man-month, easy.
160 hours matching everything ever.
I’m not sure it’s right. Anyway, and you can get access to everything I did with lower usage limits for 100 a month, so half as much. You just get a quarter as much use, but if you’re using it two or three hours a day, that will cover you for that. All right, so Charlie. Yeah, it took me a second there. So I started to go the same route, just grab the constraints and the basic requirements of this and set that as a markdown file for Cline to reference. And then I figured why not go and grab a few more. So I grabbed some documentation on Faster Whisper, on Flask, React, Stripe. I also went and grabbed the Transcribe website just as another reference, because it might need to take a look at it.
That image of a robot transcribing an audio file, that is the prompt I use with ChatGPT to create a mistake on the transcription. Okay, I didn’t realize I did that. So I started with planning using, you’re not gonna see my keys or anything. I started using Anthropic Claude Sonnet 4.
Nice.
And there’s the prompt that I used up there, just build me a project plan to implement this with the constraints and the basic requirements.
And it went through, read all of the references that I had, created a project overview.
I thought this was really cool. Thanks. Showing me the architecture, breaking up the backend and the front end, talking about what the AWS infrastructure would be, then planning things out in phases.
Even touched on the bonus features.
Once it had a total of 11 to 16 days. Okay, sure.
So from, so it’s not a fast night just on the plan, right?
So from there, I switched from Claude Sonnet 4 to, what did I use, Gemini?
Yeah, I did the 2.5 flash preview and went from plan to act and just said, go. And it builds, it builds all of this in about five minutes.
It’s not at all operational, but for the little bit of time that it had, it did pretty well.
So it ran out of time is why it stopped there?
Well, I only let it run that far because after a while, Gemini said, you went through too many tokens in such a short amount of time. I said, okay, I’ll just stop here. I have something to share too. Oh yeah, yeah. Before you, yeah, I just want a file I was looking at.
Status, health, transcribe, okay.
Looks like a Flask app, looks like a Boto3 client, looks like a, I didn’t do any threading because that’d be interesting. Yeah. That’s pretty cool. All right, let’s flip over to Josh because we are at… There you go.
Okay, let me go over to… So Gemini, are you on a free plan or have you just got a developer key? Yep, totally free. And it was, I think, it showed that I used 50 cents of API calls, and 45 cents of that was from Anthropic.
On the plan.
Yeah, and it was just their basic plan. I didn’t even use the $200 thing. Well, and even with access to pretty much unlimited, like I have, what I do a lot is I’ll start off and I’ll get Opus to make a plan and then I’ll take it to Gemini and GPT, let it run, take it back into Anthropic.
It’s nearly like you’ve got the architect doing stuff and then here’s my implementation team to go build the thing.
Nearly. All right, go ahead. All right, so I’m using O3 Pro and I used Claude Sonnet a few different times. So I’m kind of using the full gamut of stuff and I’m kind of taking it with two approaches. I use lots of Cursor, lots of O3, to do pre-planning to explore the space. So I basically embedded the thing that Jay has.
It says, give me an app and build me a PRD. And then I just kind of talked through it a little bit. So it gives me a requirements document, tries to define out some goals and stuff like that. And some of this I’ll take over and put it into the repo and all that sort of stuff. A lot of it’s crap that I don’t actually care about. Lots of over-engineering stuff. What I wanted was to kind of generate out this IaC cradle using Ansible to kind of generally decide exactly how it’s going to build the stuff, so I can throw it all away when it sucks and then rebuild it. And so that’s kind of what all of this is looking at.
What are some ideas that we can do for some stuff, how to run?
I’m actually going to run it on this Mac.
And so this is kind of the area that I have now.
So you can see over here, I’m working with Cursor and I kind of have this set up here with a plan mode where I’ve taken away its ability to edit any files. I have a lot of other ones that I have on my actual machine but this is like the one that I take everywhere.
So it can only plan and do stuff, just kind of like what Cline does out of the gate.
But then it also has the agent mode and I’ll kind of flip back and forth as it’s going along.
But you can see here, I’ll just show kind of what it got to.
So it’s running on this Mac and these Macs have a very special MPS sort of Metal runner.
And so I actually had it build me a script and an image for running with the Metal acceleration so that I can run Whisper on here.
And I asked it where I could go find some audio and it gave me the site that has a bunch of Supreme Court things and I said, good, that’s sound with words, which is my criteria. And so it gave me this thing which I haven’t listened to but I’m sure it’s something. And you can see here, I’ll increase the size a little bit, that it’s starting to do stuff.
You know it’s AI driven because there’s emojis.
And so you can see it’s doing the stuff. It’s working with the small. So it gave me the small one with faster Whisper and it’s just kind of plugging away. It’s a very long one. I let this run a little bit earlier.
It took about five, six minutes to go so we can leave it up doing that. Hopefully it won’t break. But what I will show is the other side here which this might be not something you guys have played with but I think it’s very useful because it gives you lots of incremental breakage is this playbook stuff.
And so with that, I can make it kind of build the thing from scratch.
I get incremental feedback about all the stuff it’s doing. And for the most part, all the stuff, I did some redirects as we went along especially when it’s kind of really going off the rails. I didn’t really do a lot of manual code editing to stay true to the vibe coding nature of the challenge. But you can see here that it’s going through. It’s actually, there’s so much stuff that it’s falling off.
You can see here it’s looking at the MLX configuration.
It’s detecting that I’m on Darwin so that it needs to run that and do the Apple silicon path, checks if uv is available, installs a whole bunch of stuff that I need for acceleration and kind of generates that stuff specifically.
If I was on Ubuntu, it would do something different.
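The host checks described here (OS, CPU architecture, whether uv is on the PATH) can be sketched with the Python standard library. This is only an illustration of the decision logic, not the Ansible playbook from the demo; the backend labels are made-up names.

```python
import platform
import shutil


def pick_backend(system: str, machine: str) -> str:
    """Choose an acceleration path from the OS name and CPU architecture."""
    if system == "Darwin" and machine == "arm64":
        return "apple-silicon"  # e.g. an MLX / Metal-based setup
    if system == "Linux":
        return "linux"          # e.g. a CUDA or CPU setup on Ubuntu
    return "cpu-fallback"


def describe_host() -> dict[str, object]:
    """Collect the facts a setup script would check before installing anything."""
    system = platform.system()
    machine = platform.machine()
    return {
        "system": system,
        "machine": machine,
        "backend": pick_backend(system, machine),
        "uv_available": shutil.which("uv") is not None,
    }


if __name__ == "__main__":
    print(describe_host())
```

Keeping the decision in a pure function like `pick_backend` makes it easy to test the Ubuntu branch without leaving the Mac.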
And this is all for a dev configuration. So in this Ansible directory, it’s got sort of all these things for dev, prod, and staging where right now I basically, I have a mock Stripe server.
So it’s fake. You mock Stripe? Yeah, I mock Stripe. Yeah, because I just took the OpenAPI thing down. And we have a, under my docs, actually I have an OpenAPI endpoints where I did a, here’s based off what you described.
This would be the TSV of what that spec looks like.
And so it took that and then generated out the routes for that based off the spec.
And I actually have a GitHub action that tests it.
There’s something that you can do called, I don’t know what it is, but there’s some sort of test you can do to make sure that your server still holds to that spec.
But generally just trying to get it so that it’s giving it feedback.
So when it goes off the rails, it immediately yells at it.
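One simple form of that spec check is comparing the routes the server registers against the `paths` section of the OpenAPI document. The sketch below assumes the spec is already loaded as a dictionary; the spec fragment and the route table are hypothetical, and real contract-testing tools also validate request and response bodies, which this does not.

```python
def missing_routes(spec: dict, implemented: set[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs the spec declares but the server lacks."""
    http_methods = {"get", "post", "put", "patch", "delete"}
    declared = {
        (method.upper(), path)
        for path, operations in spec.get("paths", {}).items()
        for method in operations
        if method in http_methods
    }
    return sorted(declared - implemented)


# Hypothetical spec fragment and route table, for illustration only.
spec = {
    "openapi": "3.0.0",
    "paths": {
        "/health": {"get": {}},
        "/transcribe": {"post": {}},
        "/status/{job_id}": {"get": {}},
    },
}
implemented = {("GET", "/health"), ("POST", "/transcribe")}

gaps = missing_routes(spec, implemented)
print(gaps)
```

Run in CI, a non-empty result fails the build, which gives the agent the immediate feedback described above.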
It’s always running the thing. It’s not triumphantly generating out .py files with slop and not running it. You want to always be able to kind of detect what’s going on. It’s even implemented SSE with streaming and all that sort of stuff to stream the transcription coming out. I see it went with Redis, which is cool. Yes. It also did a Postgres thing.
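For readers who have not used SSE: the server keeps one HTTP response open with content type `text/event-stream` and writes small text messages, each ended by a blank line. A minimal sketch of formatting transcript segments that way follows; the segment fields and event names are assumptions for illustration, not what the generated app actually emits.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_event(data: str, event: Optional[str] = None) -> str:
    """Format one Server-Sent Events message; a blank line ends the event."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload needs its own "data:" field.
    lines.extend(f"data: {part}" for part in data.split("\n"))
    return "\n".join(lines) + "\n\n"


def stream_segments(segments: Iterable[dict]) -> Iterator[str]:
    """Yield one SSE message per transcript segment, then a final 'done' event."""
    for segment in segments:
        yield sse_event(json.dumps(segment), event="segment")
    yield sse_event("{}", event="done")


# Hypothetical segments standing in for transcription output.
demo = [
    {"start": 0.0, "end": 2.5, "text": "Okay, so what we've been looking at"},
    {"start": 2.5, "end": 4.0, "text": "lately"},
]
for message in stream_segments(demo):
    print(message, end="")
```

A web framework would return the generator as a streaming response body, so the browser sees each segment as soon as the transcriber produces it.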
I just let it do it.
It’s wanting to save the stuff off. It’s what it wants to do. So it does have persistence. Yeah, it’s pretty cool. One thing I do play a lot around with, and you see here I’m arguing with it about something or the other, is that if it’s starting to talk, so let’s see, let’s see. Run the test script. I really want it to generally do the actual run. I found generally that Sonnet is a lot better at running chain calls.
And I’m going to show something that’s really cool. With Cursor right now, I’m sure that they will patch this at some point.
But right now, this whole thing probably cost me about maybe $0.50.
Because with Cursor right now, they only charge for when we make a submission, and it’s always $0.04 per submission. And so if they keep calling tools over and over and over and over again, still costs $0.04. And I found out that you can actually change the commands.
And I can change whatever crap it’s putting in instead of stopping it and starting it again. I can do echo, hey, buddy, why don’t you run that script that you’re talking about instead of posting your read me?
And it will not charge me because I’m hijacking the terminal. So there’s lots of stuff like that.
You kind of get your control over your environment that you can do this plus. Yeah. Claude is very fussy.
But usually it’ll stay on.
Usually it’ll stay on task. If I tell it to do a command, it’ll redirect.
You can see there that it can’t get in.
Yeah, that’s what it’s kind of doing. Right now, I think, let’s see. There is the actual test script. So it has a transcribed sample.
And I’m having issues with getting it to actually stream the audio, but I see it doing it in the background.
You can see here that it’s going in. The zoom is in the way, so it’s hard to. Yeah, I don’t know if you can minimize it. It shows it’s a weird thing. You can kind of see what it’s poking at now.
I love that it initially gave you something over engineered. Because if you had looked at what I designed initially, you would have said it’s over engineered. And you were right. Right, so it’s trying to do a batch. And so it’s failing on the SQL. We’re basically doing a batch queue.
Yeah, I mean, it has a batch queue. It’s trying to do it.
It builds. It’s running off into an image. You know, there’s a deployment directory.
It’s building its own sort of stuff. Yeah, pretty cool. It’s very cool. Yeah. Okay, how much time did it take?
How much did it cost?
I’d say I started when I got here.
So I’ve been five.
I got here at five.
So I’ve kind of been poking at a lot of the outlying stuff.
But as far as what it cost, I’d say probably no more than a dollar. A dollar, okay. You’re probably at least at where I was, you know, a man-month in or something like that is my best guess.
Oh yeah, that would have taken me forever to do by hand.
Dollar a month. You got anything? I’ll do you really quick.
Josh covered a lot of it. But I’ll share really quick just to share a couple things. And I really like this. This is fun. Yeah, yeah, yeah. This is, I really appreciate the idea to start with. And then I’ve actually learned more than I am sharing. Anyway, it’s not, it’s normal that I learned more in these than I’ve actually talked about, even though I put together a session sometimes. Oh yeah. I thought you were going to say you ran it locally, but the money, you know, I’m writing the small model here. Okay, so I did kind of similar. I took the prompt.
One thing I always like to do for anything like this is I always get Gemini to ask me questions to refine things. So I took that, just asked it to generate some questions.
So it did that. And so we had live updates, visual design, just some general kind of stuff.
So I gave it that back on kind of, you know, what I wanted there. And then it went through and generated kind of the outline for me of all of the code. So I think in total it generated, and I’ll jump down to here.
This is everything it generated.
So it gave a pretty high level Terraform environment.
I’d have to go in and do a lot more to that.
Basically everything else that had kind of a good start on. So then I jumped into VS Code.
I use Copilot, but I’m using 2.5 Pro.
So it’s just bundled in as part of that subscription cost.
And let me zoom in a little bit more.
So I copied everything over, generated that. I had a couple issues with the Docker build. But once I worked through those, I’m getting the main thing, honestly, for this, because I asked it for Stripe specifically, I ended up having to pass in this Stripe public key. It kept complaining about that because it was looking for that as part of the kickoff. But after deploying it, I do actually get a front end here. Nice.
It does not work.
If I type this in here and then I choose a file, I have one here, it will act like it’s going to work.
The issue is that because I don’t actually have that Stripe key set up, it’s going to immediately fail when it tries to do that. But I then tested out the Terraform deploy. And that successfully worked. It successfully built my ECR images here. Nice.
Back end and front end.
But then after that, I got to the next piece that I’d asked it on was to update my deploy.yaml, which is that GitHub workflow that was doing an ECS task definition and then the Terraform set up to do that. And it gave me all that code. It would be a lot to then go in and update and then you’re getting into a little bit more on the pricing side of going through that.
But it did generate and I’m not super good at Terraform, but from what I do know, it looks like it generated pretty much everything that I would need for that.
And I would have to go in and update and then there’s some environment variables I’d have to set on GitHub that it was asking me to do because it’s building that task definition as part of the workflow, which is fairly standard, but I didn’t want to go through that for putting all that into GitHub currently. So, but I mean, here, it looks like it’s generating everything you would need. It’s got both the ECR already, the ECS, your task definitions, your load balancer, and I believe there’s VPC farther up, but yeah.
So exactly what you said, something I’d normally do with Claude is after I put in the prompt, I would say, before you start, ask me for any clarifications on this that you would need. So if I was actually going to make something, I would do that step first. But one thing I like testing, it’s still only like a week and a half, two weeks since release.
I like seeing what direction a model chooses to go without it just to see like, what are you going to do on your own?
But if I’m actually making something, I like to be like, hey, ask me for clarification before.
It was really weird for me to try and do this really fast because generally my, like my vibe coding, I’m usually like talking to the model for a day whenever I’m doing this sort of stuff, kind of getting this stuff specked out.
So it’s, it’s, you kind of have that framework to, you know, so the concept of just being like, go, go, go, go, go.
I see, I see it going down around.
It’s like, ah, no, I don’t have time to tell you not to use Stripe that way. If you run into anything, like when you’re doing this, like one day worth of stuff, if you run into any issues where you have a model that’s kind of headed in a specific direction, trying to get it yanked back into the stubbornness, there should be- I bounce between models a lot. I said, Claude starts off great. So any hesitancy from it, I take everything it’s done through Gemini or OpenAI and then whenever that outputs, I bring it back to Claude. I’m like, hey, I started working on this. What do you think? And then Claude takes off running again.
Yeah, yeah, I’ve heard that once you get past a certain number of attempts to try to veer back on course, just stop, go to a completely new session and say, hey, here’s where I am.
I’m working on this.
Can you take it from here?
I love when it compliments me on stuff that it’s done as well.
Oh, this is great.
I did notice on yours, it did say excellent after you.
I’m like, it’s self-affirming. I do appreciate that. I’ve needed that today. One thing I will say that I found helpful when you do move to that new chat, before you leave it, ask it to generate a summary and then use that as your starting point. That’s been very helpful.
I’m using Claude, the lowest paid version. And when I ran out of chats, it’s like, oh, man, I ran out of output so it can’t even output a summary to start a new chat.
What’s the… Oh, Opus is so expensive.
Okay.
So we had a question in the chat from Lauren asking about the Claude and whether it was a paid subscription or what. I think you covered that on the… You’re on the full-up. I’m on max. I hear that’s really good.
And then the other thing is, the reason I really like OpenRouter for a lot of this is I’ve got access to so many different models to go pick from and use.
And it tells me the pricing.
I’m definitely not going to pay for this one at $15 a million tokens.
I think it’s a million tokens, maybe, for input.
Anyway.
Yeah.
My go-to at the moment for a lot of stuff, I’m not even on Sonnet 4, I don’t think. But I’m on 3.7, probably. Didn’t O3 Pro just drop?
They dropped their price. They’re cheaper than Google Gemini 2.5 Pro. Yes. Wow.
It’s good. O3 Pro drops.
Crazy. What? O3 Pro’s price dropped through.
I would consider, and the Pro, I would consider OpenAI actually back on top for a moment. It’ll be like a week, but it’s actually on top for once.
I mean, the thing I really like on the OpenRouter side is you can actually use price.
If you’re running something and you’re not at the little nitty gritty things of where one model versus another affects you a lot, you can actually hit lowest price for today.
Go that way.
Lowest price for tomorrow. So what I was going to do, and if anybody actually wants access to this repo, I can add you as a, it’s a private repo just because I don’t want some troll somewhere stealing it and then act like it’s his. So I’ve got some design folders. I’ve got examples where I put all the crap I wrote to try to figure out stuff.
If you remember, we were talking about this transcription thing way before Whisper.
I mean, we used, I can’t remember the Mozilla project, DeepSpeech, we used that. We used another, I mean, we’ve used wav2vec. I mean, we used all kinds of stuff initially. This goes back to one of our NeurIPS, our first one probably when we had George in the group trying to figure out how do I put up a live transcription of a video, all of that. So that’s where all this started.
2019 for the NeurIPS thing, the first one, and then COVID.
Oh, we did it for that too. So let’s see.
Some of these are just intermediate things or files that I’ve never checked in or whatnot.
The main thing I’ve got is Docker.
I’ve got my Flask image that’s got one giant app file that’s got all of our routes and stuff in it.
You know, getting through that.
I’ve got, here’s the worker for the Faster Whisper image that actually does the transcription right now. You know, that’s all, you know, prediction handlers, things like that. Terraform, actually, I can take that back.
I got it. Maybe I went backwards on that.
Container registry, the database. Okay, that’s where I wound up with.
I had some aspects of the Terraform that was the same for either the backend or the frontend.
Those wound up being common.
And then the parts that were like the thing for Faster Whisper that aren’t common in anything, I wound up with that kind of a thing. And okay, yes, that’s still the app. But yeah, I mean, this thing came out, what y’all were talking about came out pretty close to what I wound up with after, you know, a couple of throwaway attempts, a couple of iterations, a couple of whatever. The thing that would be interesting for me that we may want to look at at some point would be, I’ve got this repo out there that’s got all of the issues because I’ve been going through writing issues and things as I had to fix stuff using a project board for, you know, tracking and stuff. It would be interesting to give it access to that as well. And you can see the progression from zero to, I think I’m on version three now. Moving to Faster Whisper on Lambda jumped me to three, and getting off of Lambda and going to RunPod would probably be version four of this thing.
But it’d be interesting to see if it knew the full history in the repo, what it would pick up and use.
But anyway, this was fun. So let me jump back over to the crazy thing that we just talked about. This project was done three times in an hour. One, two, three, four. It was also nice to do a full-up session of vibe coding. I didn’t even write any code. It was fun. Well, to be fair, I didn’t either. It brings you back to some of the interesting things. You’re back to where, can I clearly specify the requirements of a project?
I just told you to write code.
No, that’s the harder part of anything. And the most expensive part to get wrong. So there’s that.
Next week, we’re doing a paper series again.
I know you sent it to me. It’s on Zero Trust for AI Agents. So talking about identity for them, how do you deal with authentication, secrets, all that sort of stuff, and blockchain. It’s nice.
How do we use distributed identity?
So cool paper.
It’s a very cool paper. So just kind of going through all those sorts of things. We’ll cover core concepts and then talk about that specific paper. So for those that were around, like the co-working group years and years and years ago, there was a giant set of folks that were all into blockchain for a minute back before.
Like right when Bitcoin started, I mean, it was huge. It’d be kind of funny if any of them just kind of still twitch when blockchain shows up. I am not one of those people.
So this is me saying like, this is actually a useful thing for this.
That’s right. That’s the next thing while we make the Spill it.
Yeah, I guess I’ll leave the Hustling AI in a piece. Yeah, I’m surprised somebody else hadn’t tried that already.
We’ll let Jerry, the other guy, say it’s his.
I know, I was going to say it. And then after that, I believe, is when we’ve got Tom coming to talk about NVIDIA certificates. I think, actually, it’s in front of them. It is. Yes.
And then looking at doing a social night the week after July 4th. So, July 4th is a Friday.
A lot of folks will be kind of out or whatever, and I think it’d be good to just kind of hang out for a minute.
Still trying to figure out, we always go to Stove House. I don’t know if we want to keep going to Stove House or if we want to try somewhere else. I need somewhere where there is beer and food and we can just kind of hang out and talk. So places like that, if you got thoughts, let me know. Let me see if there’s any more in the chat.
If I can figure out how to get to the chat over here. Chat. Got it. All right. So we’re good as far as chat. Any questions or any comments or thoughts?
That was fun. That was fun. That’s a hackathon in a box, man. We actually tried to do something similar back in March and it wasn’t nearly as good as what we’ve got now because that’s when we went to the HudsonAlpha Tech Challenge and we actually tried using nothing but ChatGPT and a couple of other models to solve a challenge. And I think we could have won, but I think the judges were mad that we just vibe-coded the whole thing. It wasn’t a challenge for us. And it’s like, oh yeah, it’s an AI challenge. Maybe we should show AI a song challenge. That did not translate well. Anyway, we’ll stop the recording.