Develop Yourself

#280 - The Missing Guide to AI for Web Developers

Brian Jenney

I just quit my job as a senior AI developer, and while helping hire my replacement, I realized how few people actually know how AI apps work. 

In this episode, I walk through Retrieval-Augmented Generation step by step—the same system I built at work—and show you the real skills developers need right now.

👉 Grab a hands-on project with a video you can follow along with here 👈


Hold your spot in the January cohort and become the go-to person in your organization for integrating AI into your web applications. Hold spot here

Send us a text

Shameless Plugs

Free 5-day email course to go from HTML to AI

Got a question you want answered on the pod? Drop it here

Apply for 1 of 12 spots at Parsity - Learn to build complex software, work with LLMs and launch your career.

AI Bootcamp (NEW) - for software developers who want to be the expert on their team when it comes to integrating AI into web applications.

SPEAKER_00:

Welcome to the Develop Yourself Podcast, where we teach you everything you need to land your first job as a software developer by learning to develop yourself, your skills, your network, and more. I'm Brian, your host.

So a few hours ago, I quit my job as a senior software engineer and AI developer at an AI startup where I helped build a product from zero to one, and it's going to be launched soon. The product is a platform that helps large record labels find TikTok influencers. Yeah, super fun, way different and less nerdy than a lot of other things I've done in the past. But the time has come for me to move on. And as I'm helping to find my replacement, it's exposed a very big knowledge gap that I see in a lot of software engineers nowadays. And I want to take time to outline the skills that I really think you should be building up in 2025, going into 2026. Whether you're a college student, a bootcamp grad, or you're thinking about learning to code, these are the skills I see becoming increasingly important as the AI hype dies down. And we realize, okay, AI is not replacing software engineers. If you actually think that, I've got a bridge to sell you in Brooklyn somewhere. But for the rest of us, the serious people who are left and understand that, okay, now there's going to be some mess to clean up. Now there are a lot of tools and a lot of actually valuable use cases that don't include buzzwords. The real work is going to begin. And some of the work I've done here and at my previous company, which was also an AI startup, has shown me what I think a lot of us will be doing in the future. These companies are really small, and they've grown quite a bit over the last year. And I see that they're essentially doing the same thing in many ways and using a lot of the same core technology that I wish more people were paying attention to.
I've done a few podcasts on this already, so forgive me if it sounds like I'm beating a dead horse here, but I don't feel like a lot of people got it. I've gotten a lot of great feedback on some previous episodes about RAG, retrieval-augmented generation, and another one where I had a guest come on and speak about his experience using RAG. Now I want to go a little deeper down the rabbit hole of the exact technologies I think you should be learning if you want to not only be hirable, but remain hirable and be hyper-competitive going into 2026. The reason I'm so optimistic and really bullish on learning these four skills, which I'm going to get into, is that they're new enough that you don't have a lot of experts in the field. I spoke with a consultant a couple months ago who came to our AI startup and was talking about our implementation of our agentic architecture. Basically: how are we using AI and APIs like OpenAI's, how are we building our agents, and how are we doing things like RAG? And we were speaking the same language, me and this guy. Now, he has a lot more experience than I do, and he referred to me as an AI developer in this conversation. And I thought, whoa, am I actually an AI developer now? I thought that was pretty cool. Anyways, what I saw is that he's doing a lot of the same things I'm doing, and he only has a couple more years' experience than I do, which makes him basically an OG in this game right now. So this is a really cool period where if you know a little bit of knowledge, it's a lot. And there are not a lot of web developers, especially, that are learning this. When you think about AI, what do you think about? Python, machine learning engineers, data scientists?
But the work that I'm seeing increasingly that we need, and now that I'm actually helping to craft the interview process for this new full-stack AI developer role at the company I'm leaving, I'm seeing, oh man, there's a big gap between what a lot of software developers know and what people actually need. A lot of companies are having a hell of a time finding these kinds of people, and the pool is small. So this is that magical time to learn this stuff. So without further ado, I hope I've sold you on why you want to learn these things. Now I'm gonna outline exactly what I think you as a software developer slash web developer should be learning. If you're a Python developer or a back-end person who uses a different language, you may not find this as valuable. If you're a person learning JavaScript, TypeScript, React, Next.js, you're gonna want to know this. So, first of all, where do you want to go? What is our North Star here? Our North Star is you being able to smartly integrate AI into web applications, mostly using what's called retrieval-augmented generation. Very quickly, because if you listen to this show, you've probably heard me go over this a bunch of times. And I even have a little bonus material in the show notes that's gonna show you how to actually get your hands dirty and build something with RAG. Check that out. It's gonna teach you way more and it's gonna be super hands-on. But if you don't know what RAG is, basically it's like this: you're in ChatGPT, right? And you're typing, hey, tell me about my company's vacation policy. And ChatGPT says, I don't know your company's vacation policy. Wouldn't it be cool if you could grab the documents from your company, toss those to OpenAI, ChatGPT, Grok, whatever, and then say, cool, here's the documents, here's my query. Now you tell me, based on these documents, what is my company's policy? That's what RAG is in a nutshell.
It is grabbing some information that OpenAI or the large language model does not have access to, providing it along with your original query, and then getting a very custom, detailed response based on this proprietary (or not so proprietary) data. What I was building at this last startup was a very complex RAG system where we'd go find TikTok influencers, get their information, understand what they're about, put this into a vector store, which I'll get into in just a minute, and then be able to search them through a chat interface so record companies like Roc Nation or Universal could find these people and pay them to help make songs big. At the company before this, we did a very similar flow where we would scrape the web, find information on suppliers and buyers in the procurement space, and then surface those in a chat-like interface where people could type and search for these people. You may think, why don't you just do that in ChatGPT? And that's because you can't. ChatGPT does not have access to the most recent legal documents. It can't just do a large web search. It can't comb through millions of documents that your company may have access to but doesn't want to just dump into ChatGPT. So retrieval-augmented generation seems to be one of the main use cases, and one of the only practical ones I've seen, that is going to come out of this whole AI hype cycle we've been living in, where they're basically trying to force-feed us AI and tell us how we should use it in all these kind of ridiculous ways. Like, obviously, you should be using Cursor or Claude or whatever other AI-assisted coding tool of your choice. But that's not where the money is. The money is in things like RAG, where you can do things we've always done, like getting lots and lots of data, summarizing that data, and helping somebody make an informed choice to hopefully make more money or do something faster that would normally take lots and lots of time.
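To make that flow concrete, here's a minimal sketch in TypeScript of the "augmentation" step: splicing the documents you retrieved into the prompt next to the user's query. All the names here are made up for illustration, and the template wording is just one reasonable choice, not the way to do it.

```typescript
// A minimal sketch of the augmentation step in RAG: take the documents
// you retrieved and splice them into the prompt alongside the user's query.
interface RetrievedDoc {
  id: string;
  text: string;
}

function buildRagPrompt(query: string, docs: RetrievedDoc[]): string {
  // Number each document so the model (and you, when debugging) can
  // tell which chunk a claim came from.
  const context = docs
    .map((doc, i) => `[Document ${i + 1}] ${doc.text}`)
    .join("\n\n");

  return [
    "Answer the question using ONLY the documents below.",
    'If the documents do not contain the answer, say "I don\'t know."',
    "",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}
```

In a real pipeline, `docs` would come back from your vector search, and the resulting string would be sent to the model as the prompt or system message.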
So, how do you learn this stuff? Well, if you're a web developer, you're more than halfway there. I'd say you're 75% of the way there. And I'm gonna essentially tell you the exact curriculum, some of the same things I'm teaching Parsity students in the curriculum I literally built last week when I was in Reno, Nevada, holed up in an apartment for five days, making this curriculum and thinking, why is nobody teaching this stuff? So here it goes. If you're a web developer, you need to know these things: JavaScript, of course, and React or Vue or Angular, whatever. Know some sort of front-end framework. Be a competent web developer before you jump into the AI stuff, because that is the layer on top of the layer of web development and practical software fundamentals you should already have. If you don't have those yet, you can't skip the line and go straight to AI. Learn how to deploy a full-stack web application using Next.js, SQL or Mongo or whatever, Node, Express. Understand those things. You're gonna need to get your hands dirty with a little bit of back-end before you can truly dive into using AI and implementing something like a RAG system. Now, here's where the few tutorials I've seen fall short. They dive straight into the implementation details of how to set up a RAG pipeline, basically digging into what RAG is. But before you can truly do that and get a really solid foundation, I highly suggest you watch some of 3Blue1Brown's YouTube series on linear algebra. You need to understand a little bit about vectors and how those translate and work with large language models, in particular vector databases. This is a linear algebra concept that is at the heart of how most large language models work and how vector databases work, and a vector database is usually the mechanism, or the database, that is preferred for retrieval-augmented generation.
In a very, very small nutshell, or large nutshell, or whatever analogy I'm going for here: when you type something into ChatGPT, the words that you type are turned into vectors, or embeddings. Basically an array full of numbers, a large array full of numbers. These numbers encode the meaning of what you said, not necessarily the exact definition, and it's why ChatGPT can work across different languages and work with pictures. It will encode the meaning of the image or the text that you're writing, and then it will compare that to other vectors, basically its massive knowledge base in a high-dimensional vector space. And then it will compare what you've said to other semantically similar vectors, doing some linear algebra to see, hey, what did you say and how similar is it to another vector out there in space? And then, through much more complex logic than I can explain here or even truly know, it will take that and weave vectors together into a coherent response to give back to you in the form of text. Now, there are things like transformers at play, attention mechanisms, weights, probability scores, and all sorts of other things, way more than we can fit into an episode or that I would know how to explain intelligently at all. But at a very high level, think of it as: your words turn into numbers, and numbers turn back into words, given to you as text you can read because you're a human. This is exactly why you should not be using ChatGPT as your psychologist or something like that, because at the end of the day, it's doing really complex linear algebra. It's not actually your friend. It's just been trained to talk like your friend. Anyway, now that you have a basic foundation in linear algebra and understand vectors and things like dot product and cosine similarity, basically 10th grade math stuff that you probably failed, like I did too.
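If you want to see how small that math really is, here's the dot product, cosine similarity, and a brute-force "find the closest stored vectors" search sketched in TypeScript. Real vector databases use approximate nearest-neighbor indexes instead of scoring every vector like this, but the idea is the same; all names here are illustrative.

```typescript
// Cosine similarity compares the angle between two vectors: 1 means they
// point the same way (similar meaning), 0 means unrelated, -1 opposite.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosineSimilarity(a: number[], b: number[]): number {
  const mag = (v: number[]) => Math.sqrt(dot(v, v));
  return dot(a, b) / (mag(a) * mag(b));
}

// "Search the vector store" is, at its core, scoring every stored vector
// against the query vector and keeping the closest ones.
function topK(
  query: number[],
  stored: { id: string; vector: number[] }[],
  k: number
): string[] {
  return stored
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((item) => item.id);
}
```

That `topK` function is essentially what "give me the top five matches" means when you query Pinecone or Qdrant, just done naively over an in-memory array.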
Now that you understand a little bit of that through watching that series, you have at least an intuition for how words become numbers. The next step is to get familiar with some vector databases. You should be familiar with SQL or NoSQL databases by this point, and you understand what a database does: it's a place to store data and persist that data. A vector database is just another flavor of database, and in this case you can guess what it stores. What do you think? Vectors. Embeddings. Big arrays of numbers. Now, I used to recommend Pinecone, and that is actually what I teach in the curriculum for Parsity students, but I've used Qdrant at work. I've also used Vertex AI in GCP, Google Cloud Platform. If I'm being completely honest, avoid Vertex at all costs. I would probably use Qdrant, and the main reason I would now suggest Qdrant is because they have a cool visualization tool. I think a lot of us have a hard time visualizing or conceptualizing how these vectors are stored, and this is really cool because you can see the vectors in this little visualization graph. And you can say, oh, these vectors are clustered here, and these vectors are clustered over here. So now that we understand we need a vector database, we need to fill this vector database with data. And how do we do that? Well, the simplest way would be to scrape the web for some information we find useful. Me, I've written tons of articles. I have about a thousand LinkedIn posts. And so I actually wanted to train a RAG chatbot on my voice. And I thought, what better way to do that than to scrape all my posts and upload them into a vector database? Now, this is also where most tutorials fall kind of flat or short. They don't mention this thing called chunking.
Chunking is incredibly important when you're uploading data to a vector store. Imagine you're working with things that are larger than a quick post on LinkedIn. What if you're looking over massive legal documents or medical documents, or you want to build an app that can do your taxes or something like that? Look up tax codes, quickly summarize them, and tell you whether you're doing the right thing or you're going to get beaten to death by the IRS. Well, these documents can be massive, right? So it's not enough to just store the entire document as one vector. You'll need to split it up, or chunk it. Chunking is one of the most difficult parts of using vector databases, because you're often taking large amounts of text and figuring out, well, how do I separate this text? You can't just count the number of words, because what if you cut off the middle of a word? That's not really helpful for a large language model. If you said, here are some documents related to the tax code the person wants to know about, but it's cut off right at this point, well, the large language model may give you a wrong answer or hallucinate something. So you have to semantically chunk this in certain cases. You'll need to look up things like chunking strategies. Do you chunk at the paragraph? Do you chunk by chapter? What happens if you get a chunk that's too short or too long? How do you figure these things out? This is just good old-fashioned software engineering. This is an interesting problem that requires some logic and some opinions. It also requires some knowledge of what you're looking at. Medical documents versus legal documents versus my posts on LinkedIn require very, very different strategies. What happens when you want to update these documents? Do you just delete them all and start over? What happens when these documents inevitably change? How often will they change?
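As a starting point, a naive paragraph-level chunker might look like the sketch below. This is deliberately a toy, an assumption of mine rather than any library's real implementation: production strategies are sentence- and token-aware, add overlap between chunks, and handle paragraphs that are themselves too big.

```typescript
// A naive paragraph-based chunker: split on blank lines, then pack
// paragraphs into chunks up to a maximum size. Note the edge case: a
// single paragraph longer than maxChars still comes through as one
// oversized chunk, and deciding how to split it is exactly the kind of
// judgment call chunking demands.
function chunkByParagraph(text: string, maxChars: number): string[] {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const para of paragraphs) {
    // +2 accounts for the blank-line separator we re-insert when joining.
    if (current && current.length + para.length + 2 > maxChars) {
      chunks.push(current);
      current = para; // start the next chunk fresh
    } else {
      current = current ? `${current}\n\n${para}` : para;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Swapping "paragraph" for "sentence", "section", or "N tokens with overlap" is where the real strategy decisions the episode describes come in.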
How often do you need to rechunk and re-update them? These are lots of things you'll need to figure out. And there are people who have done this, but a lot of this is brand new, so you're solving problems in a space that doesn't have a lot of answers or a lot of leaders with super strong opinions. This is not something you can just ask Claude to do or just have Cursor do. This is something that requires deep thinking and making sure you get it right. Because if you don't get it right, it is really difficult to update a vector database. It's not like updating a SQL database. These vectors, once they're stored... well, for one, it costs money to embed them. Taking these large blocks of text and creating embeddings, turning them into vectors, may cost a few cents per thousand documents. So you might spend a dollar or two embedding all these things. And the way you embed them is using OpenAI. OpenAI's API is the API behind ChatGPT, and it's the most popular one off the shelf. If you want to be cool and use Anthropic or Grok or DeepSeek or whatever, be my guest. But everybody's using OpenAI. So, in my opinion, spend $5, get access to the API, and then use it to experiment. Five bucks should get you very, very far. You can embed thousands of documents for under a dollar, and you can do hundreds or thousands of requests for another couple bucks. This should get you through at least a month of doing some pretty decent work and learning about RAG. So now that you have your vector database, you've scraped some documents, you understand chunking a little bit, you've chunked up the documents in a smart way, and you've stored them in the vector database, now you're ready for the most interesting and fun part, which is retrieving them and getting some good information from a large language model. This is actually the easiest part.
If you're using something like Next.js, because of course you are. And if you're not, I'm wondering, what are you using in 2025, and what do you think people will be using in 2026? Did you know that OpenAI is very opinionated about the front-end stack you should be using? And guess what they recommend: Next.js, React, HTML, CSS, JavaScript, and Zod as the schema-enforcement library in TypeScript. Turns out OpenAI loves TypeScript, Zod, and Next.js. You can use whatever the heck you want. But we've already seen who's won the game as far as frameworks go, and we've seen who's won the game as far as large language models go. So it's probably in your best interest to use the tools they say they're going to support out of the box. So let's say you want to make a back-end API that will receive a request from a user, take that request, search a vector database, and then finally pass the documents from that search into a large language model along with your query, and give you a response. For example, you say, hey, write something like my favorite LinkedIn influencer, Brian Jenney, about XYZ, about how to become a better junior developer or something like that. So it would take that query, how to become a better junior developer, and it would embed it. It would vectorize it. Why would it do that? So it can compare it to other vectors in your vector store, which is Qdrant or Pinecone, or, if you're crazy, Vertex AI. So it queries that vector store and says, huh, what other embeddings, what other vectors are close in similarity to the one we just created? Well, I have about 10 or 20 vectors that are similar. How many do you want back? And you might say, just give me the top five, right? Vector databases give you basically a fuzzy match. They won't return you an exact match; they'll tell you which ones are closest. It's like comparing angles in a graph and saying which angles are closest to the one I just created. My query was embedded.
Now give me the top five that are close to that query, and it will. Now you've got those top five documents, which may be LinkedIn posts in this case. And you say, cool. Now I'm going to send my query, which was: write an article like Brian Jenney about how to be a better junior developer. And I'm going to send the top five matches that are close to that original query. And then I'm going to say, OpenAI, take these five documents, use them for the style and tone and grammar that Brian uses, and now write a post based on my query. And it's going to take that and say, cool, here's your post, and return that back to you. Now, we could do a whole course on prompt engineering or something like that, but I'll be completely honest with you: I think people really overthink prompts. They should be short, succinct, and include some examples, just like we're doing by running this vector search and then feeding it some examples. Now, the way we've worked with APIs, if you're a back-end developer, is that you typically expect a contract, right? You hit an API and you expect that API to return you something that has certain keys and values. But large language models usually return text. That's not always so great if you're working in a front-end app or using TypeScript and you expect certain keys and values back. Now, for this made-up project we're doing, where we're writing an article, maybe we only want text back. But something you should explore is structured outputs, because OpenAI, the API behind ChatGPT, works really well with TypeScript, and you can use a library like Zod to enforce a schema on what it gives back. So basically, you can enforce that it gives you back a certain object with specific keys and values that are relevant to your query. Maybe in this case it would give us back a summary, maybe the article, and maybe something like a headline.
So you could have three different key-value properties, and you could return this. And you could be really, really sure now that when you hit this API and talk to OpenAI, it's going to give you back something that follows this contract. Just look up structured outputs plus Zod, Z-O-D, and you're going to see exactly how OpenAI recommends you do this. It's going to make your life so much better as a developer, and you're going to be way ahead of people who are just passing text to OpenAI and getting text back, which is the most naive way you could do it. Using structured outputs is like the Chad way of doing it. Now, finally, on the front end, you might want to stream this response, because when you use ChatGPT or Claude or whatever you're using, or Grok, maybe, if you're a weirdo (just kidding, all you Grok users out there), you usually get a stream of text in the front end. Nowadays, you just expect this. And luckily, Vercel, which is the platform that hosts Next.js, has a library called AI. And this AI library has all sorts of cool little AI tools where you can stream stuff. It has a useChat hook, which you can use right in your React application. And it works so well with Next.js and its API routes that you get this cool streaming effect really, really quickly. Now, the last thing you want to look into, if you really want to go deeper into being an AI developer... I almost hate that term because it sounds corny, but it's a real thing. I don't know how else to describe this new role, but I've been called an AI developer, and I'm beginning to just embrace the term, just like I embraced full stack when people used to dunk on that and think it was a stupid term. And now look, it's become the catch-all phrase for basically all software developers. So I don't mind the term AI developer anymore. I'm gonna lean into it. I'm just gonna own the term AI developer.
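Circling back to structured outputs for a second: to see what a schema library like Zod does for you, here's a hand-rolled version of that contract check, using the hypothetical headline/summary/article shape from the article example. In practice you'd define a Zod schema and use OpenAI's structured-output support to enforce it server-side; this sketch just shows the runtime guarantee you're after.

```typescript
// What a schema library like Zod does for you, hand-rolled for
// illustration: parse the model's JSON text and verify it matches the
// contract the caller expects, throwing loudly when it doesn't.
interface ArticleResponse {
  headline: string;
  summary: string;
  article: string;
}

function parseArticleResponse(raw: string): ArticleResponse {
  const data = JSON.parse(raw);
  for (const key of ["headline", "summary", "article"] as const) {
    if (typeof data[key] !== "string") {
      throw new Error(`Model response missing string field: ${key}`);
    }
  }
  return data as ArticleResponse;
}
```

The point of the contract: downstream code (your React components, your TypeScript types) can rely on those three keys existing instead of parsing free-form text.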
Now, if you really want to go deeper, we can't leave out agents. Agents is probably my least favorite buzzword, and it gets overused. Basically, what people mean when they say agents is a small, encapsulated workflow that uses AI for a very specific purpose. For example, the thing we just spoke about, that little workflow that creates an article, could be an agent. It takes in a request and then gives back some information through a uniform interface, just like any other API. Now, where this gets interesting is maybe you have a chat that will support multiple things. The simplest thing to do would be to have two agents: one to first summarize the conversation, and then pass that information to the next agent, which may actually do something. So maybe you type something in like, hey, I want to write an article like Brian, but you're like 20 messages deep, right? So the first thing you might do is pass the last five or ten messages to an agent. That agent would summarize the messages, extract the actual intent of the user, and then say, cool, I actually see that you want to write an article, not just like Brian, but about this particular topic. And now I'm gonna pass that down to our article agent, who's gonna use our vector database, get some articles, and then write the actual thing and do what you want. Or maybe you fat-finger the keyboard and write ASDF, blah, blah, blah, and it says, I don't know what to do with that request, so I need you to clarify what you mean. Right? So now you have an agent that's gonna basically be your orchestrator, or your guard agent, that's gonna say, I can't do anything with that query, so I'm gonna ask you to refine or clarify what you mean before I pass it downstream to our other agent. Or maybe you type something in like, hey, tell me what you feel about Donald Trump or something like that.
And it's like, well, that's not actually the goal of what we're doing here, so I'm not just gonna let you ask weird questions. Unless you're asking something that has to do with an article, I'm gonna kick your request back to you so you can refine it and ask again. This is a very, very basic agent structure. Now, where people go wrong with this is that they have a bunch of agents that pass more and more information downstream. Agents have about a 70% success rate, from what I've read, but let's pretend it's a 90% success rate. Every agent downstream has 90% of 90% of whatever the last agent did, right? So if you have five or six agents downstream, you're talking about around a 50% chance of actually getting an answer that you want. So keep your agents tightly scoped, have them do one thing, and ideally don't pass lots of requests to agents further and further downstream, because you're less likely to get really good answers. And this idea that you can just have an agent do anything, do whatever, have access to everything, is ridiculous. If anybody tells you to do that, they don't know what they're talking about, and please correct them. Do some research on your own, do some work on your own, and you'll see that this is a foolish fantasy being sold by companies that have something to gain. And there's a reason why we don't have a single agent out there right now doing anything of the sort. If there were, you'd better believe everybody would have bought it, or we would have at least heard about it. There aren't agents that just do whatever at this point. There are very few agents that are working, and even fewer that have been successfully deployed. I've actually deployed a couple that are out there living in the wild, being used.
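That compounding-failure point is just multiplication, and it's worth seeing the numbers:

```typescript
// Success rates multiply as a request passes through a chain of agents,
// so reliability decays exponentially with chain length.
function chainSuccessRate(perAgentRate: number, agentCount: number): number {
  return Math.pow(perAgentRate, agentCount);
}

// With a generous 90% per agent:
//   1 agent  -> 0.9
//   3 agents -> 0.729
//   6 agents -> ~0.53, roughly the coin flip mentioned above
```

And that's assuming 90%; at the 70% figure, a six-agent chain succeeds barely one time in ten, which is why tightly scoped, shallow pipelines win.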
And let me tell you, the more you try to pile on, or the more you try to make a single agent do, when your prompts are getting to the length of an entire essay, you probably need to cut back and think, how can I split this up into different agents that do one thing very, very well? Very lastly, as a real software engineer, somebody who cares about the product you're making and how it behaves over time, you're going to need to add some sort of observability layer in here. Observability is basically: how do you know when things are going right or wrong without literally typing into the chatbot or eyeballing the quality of the responses as you use the app? When you have users, or maybe it's just you and your mom using it, you probably want to be able to observe how the tool works over time. If you changed a prompt, how did that change the responses? Maybe OpenAI released a new model and you upgraded. How did that change the responses? Are they better or worse than they were? There are a few really big tools out there, and I've used both of the main ones. Helicone and LangSmith are the two biggest large language model observability tools. Helicone is really cool because, if you do plan on selling your product and you want people to be charged by their usage, Helicone apparently has this out of the box. LangSmith, as far as I know, does not. They're both really, really great tools, and they let you see and inspect what people are asking, the types of responses you're getting, and the data that you're supplying to these large language models. More importantly, they show you the latency of each request, basically how long it takes to return a response, so you can see, hey, did changing that model spike our latency? Is the app slower or faster? What's going to improve the quality? What's going to improve the speed? And then you can see the trade-offs and figure out how to make everybody happy.
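Even before you adopt a tool like Helicone or LangSmith, the core of what they capture on every call can be sketched in a few lines. Everything below is hypothetical naming of my own, not either tool's API; it just shows the prompt/response/latency triple you'd want to record for any model call.

```typescript
// A bare-bones observability wrapper: record the prompt, the response,
// and the latency of every LLM call, regardless of which model you hit.
interface LlmCallRecord {
  prompt: string;
  response: string;
  latencyMs: number;
}

async function observeLlmCall(
  prompt: string,
  callModel: (prompt: string) => Promise<string>, // your actual model call
  log: LlmCallRecord[]
): Promise<string> {
  const start = Date.now();
  const response = await callModel(prompt);
  log.push({ prompt, response, latencyMs: Date.now() - start });
  return response;
}
```

With records like these accumulating, questions like "did the new model spike our latency?" become a query over data instead of a gut feeling.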
And this, my friends, is how you become an actual AI developer, not through a bunch of silly buzzwords or a course on prompt engineering. And I've seen very few courses, in my opinion, that are decent. Boot.dev apparently has a pretty good course. Scrimba, which I like a lot, looks like they have a pretty decent course. The problem, though, is that none of these really go far beyond the surface, in my opinion. And none of them are for TypeScript developers, which is exactly why I made the curriculum I did. But I'll be honest with you: people don't finish courses. At Parsity, we kind of force you to finish the course, because that's how you unlock the next course, so we see that people finish it. They have a capstone project to make. We look at it, we make sure it passes muster, we grade it, and then we can be pretty confident that, hey, you actually did what we wanted you to do. For the rest of you out there, though, I don't believe you're gonna take this course and do anything with it, which is the reason why the only way I'm gonna be sharing this knowledge in a deep way, and actually working with you on it, is through a one-month program I'm gonna do in January, after the holiday buzz has worn off and people are getting into the hiring mood and everybody wants to get ready for the new year. You're going to learn more than most developers, the overwhelming majority of developers, who have zero clue about any of this stuff. But you're not just gonna learn how to do it. You're gonna build it with me and a group of other people over 30 days, and you're gonna learn how to sell this to your company, because my goal is very simple: to make you the leader of AI integration in your company when it comes to web apps. So when it comes to integrating AI into web apps, I want you to be the go-to person. I was that go-to person at this company, and at the company I'm going to, I'm sure I won't be anymore, but it was pretty fun being the knowledgeable guy in the company.
I kind of helped lead the entire implementation, from the idea to the actual code we're deploying that record labels will be using. And I see a massive opportunity for you out there if you want to do the same. Either way, I may make the course available for non-Parsity students, but I'll definitely have it available for those 30 days in January. And if it goes well, we'll probably do it again and keep adding stuff as things progress in this field. So if you want to be down, check out the link in the show notes, or just grab the free stuff and have fun with the free stuff. But I'll also maybe see you in January if there are still spots available. See you around, hope that's helpful. And if you already know this stuff, holler at me, get at me in my inbox, because I may actually have a role open for you at the company I just left, and I'd love to talk to you a little more about it. Anyways, see you around. That'll do it for today's episode of the Develop Yourself podcast. If you're curious about switching careers, becoming a software developer, and building complex software, and you want to work directly with me and my team, go to parsity.io. And if you want more information, feel free to schedule a chat by clicking the link in the show notes. See you next week.

Podcasts we love

Check out these other fine podcasts recommended by us, not an algorithm.