Develop Yourself
To change careers and land your first job as a Software Engineer, you need more than just great software development skills - you need to develop yourself.
Welcome to the podcast that helps you develop your skills, your habits, your network and more, all in hopes of becoming a thriving Software Engineer.
#267 - Step-by-Step: Build a Real AI Project with Next.js & RAG
What does it actually mean to be an “AI Engineer”?
Honestly—not much. The title is overloaded and vague.
But what is meaningful right now is knowing how to build real projects with AI that go beyond toy chatbots and portfolio fluff.
In this episode, I walk you through the exact project I’ve been building at two different AI startups: a Retrieval Augmented Generation (RAG) app. You’ll learn how to:
- Scrape and store content in a vector database
- Use embeddings to turn your text into something a model can understand
- Stream responses back to your frontend with Next.js + TypeScript
- Reduce hallucinations and add structured, reliable outputs
- Understand why this is the skillset employers are actually hiring for right now
👉 Check out the repo and code from this episode here: https://www.parsity.io/ai-with-rag
If you’ve been wondering how to actually learn AI engineering skills that matter in 2025, this is the place to start.
Shameless Plugs
🧑💻 Join Parsity - Become a full stack AI developer in 6-9 months.
✉️ Got a question you want answered on the pod? Drop it here
Zubin's LinkedIn (ex-lawyer, former Googler, Brian-look-a-like)
Welcome to the Develop Yourself podcast, where we teach you everything you need to land your first job as a software developer by learning to develop yourself, your skills, your network and more. I'm Brian, your host. What the heck is an AI engineer? That title is loaded. It means many things to many people and, at the same time, is kind of meaningless. What I wanna do is walk you through, not how to be an AI engineer, whatever that means, but how to build a project that's going to show you how to practically use AI.
Speaker 1:Over the last year, I've worked at two different AI startups building applications that leverage AI in a practical way. At the same time, I've seen this explosion of demand for AI engineers, whatever that really means. So what I want to do today is break down the new tech stack that I think you should be learning, as well as a step-by-step project that you can build with your current web dev skills. If you know TypeScript, if you know JavaScript, if you know Next, if you know React, I want to show you how to build an actual project, a practical one, that's gonna set you apart from the majority of people out there, junior developers especially, or more senior developers that are just not acquiring these skills. I hope you find this really helpful, because I've learned this through a lot of trial and error and being very, very lucky.
Speaker 1:Now, in the last year, year and a half actually, I got laid off from my pretty cushy job as a senior engineering manager. I went back on the market and then I landed at a very, very tiny startup with some of the smartest people that I've ever met. In fact, I did an episode on the developer that I worked with there. He was a principal engineer at Amazon who came to this company, and I worked directly with him really closely for a little under a year and we built a production AI app. Now, when I say AI, that's a pretty nebulous, open-ended term. What we really built is what's called a Retrieval Augmented Generation system. We took information that we found on the web, stored it in a very particular type of database that is well-suited for large language models like ChatGPT or Grok, or pick one of the large language models out there, and used that to summarize and synthesize information to give back to people. That was kind of it in a nutshell, and we're going to walk through exactly that flow and how you can do that as a software developer and build a really, really cool project.
Speaker 1:At the current company where I'm now working, I'm a senior software engineer and I'm kind of doing the same thing. I'm using a different cloud service provider and a little bit of a different tech stack, including Python, but essentially I'm building the same thing. I'm seeing an interesting pattern here, and I've spoken with other CTOs and with recruiters that are reaching out. A lot of people want developers with this set of skills that I'm going to lay out, but there's a big problem: most developers, one, don't even know this stuff exists and, two, don't really know how to acquire the skills. So I'm seeing a proliferation of these boot camps popping up that are kind of teaching this stuff, and they're encouraging people to become data scientists, maybe get into data engineering or machine learning engineering. These topics require a very, very solid understanding of math and, I'll be honest, I don't see a lot of people getting hired for these positions that don't have master's or PhD degrees. So if you're a web developer, before you decide to switch directly into Python or try to go the ML route or the data engineering route, I wanna lay out a project you can do over the course of a week or so, and this is actually basically the project we're going to be doing at Parsity. We're already highly encouraging students to build with AI, meaning like, hey, add some AI into your app, because it's going to teach you a lot about what employers are expecting and it's just going to get you ahead of the game, because this is not going anywhere, right?
Speaker 1:We can't stick our heads in the sand and say I'm not going to code with AI and we don't need to use it, or anything like that. I'm not a super optimist, and I'm also not a super pessimist. I don't see jobs just getting obliterated like a lot of people do. In fact, Amazon's CEO, or actually the AWS CEO, Amazon Web Services, came out last week and said the idea that AI will just replace all juniors, or that we should fire junior developers and just replace them with AI, is the dumbest thing I've ever heard. I'm like, thank you. Finally, somebody sitting closer to the top of one of the largest cloud service providers, or maybe the largest cloud service provider on earth, saying that's stupid. Let's be real. Anyway, we're not here to debate whether AI is going to take your job or not.
Speaker 1:We're here to talk about something way cooler. How do you build something interesting? I'm going to tell you the exact tech stack that I'm using, along with the flow for data and setting up a data system, the types of databases you're going to use and some of the interesting libraries out there that are pretty widely adopted, that you can use to build something way beyond just a chat app. So here we go. First off, I'm using Next.js with TypeScript, and you're thinking, wait, what? Not Python? Listen, at both the companies where I just worked, we used TypeScript. In fact, we had some logic written in Python that the head of engineering at the last company, the Amazon dude, said, hey, just rewrite this in TypeScript. He's like, we're going to put all this in Next.js.
Speaker 1:Why Next.js? Because it has a lot of excellent libraries and server-side capabilities, and it's full stack, so you can write your back-end logic and your front-end logic in one place, deploy it as one large application, and you don't have to worry about having a backend and a frontend deployed separately and all that kind of stuff. Also, large language models tend to do really well with types, so TypeScript is a great choice. In fact, if you're not using TypeScript, I highly suggest you add that to your tool belt. That was one of the things we added at Parsity last year. It was really difficult for me to learn personally, and I really didn't like TypeScript at first. The learning curve is a little bit steep, but once you learn it, boom, you're off to the races. So take time to learn some TypeScript.
Speaker 1:So here's what we're going to build at a really high level. In fact, I'm adding curriculum to the course about this as we speak. We're going to build a knowledge base, basically a glorified chat app, where you can type in a message and send it to your backend, which we're gonna talk about how to create. What it's gonna do is look into a vector database that has documents of particular knowledge. Maybe you wanna store things from a journal you have. Maybe you wanna store articles from your local newspaper about a topic you're interested in. Maybe you wanna store stuff about football players and upcoming sports stats, or whatever. So we're going to build this thing using some web scraping, some old-school stuff and some very new-school stuff, so you can talk to your knowledge base and query it with plain text, something called semantic search.
Speaker 1:Think about a problem you may have encountered yourself. You're trying to watch a movie on Netflix or Comcast or whatever cable service provider you have, and instead of being able to type something like, hey, I'd like to watch a PG-13 movie about 1980s aliens coming to Earth, it doesn't accept queries like that. You have to enter the exact movie title, and that's lame. Wouldn't it be cool if you could search that way? So what we're building is something called RAG, retrieval augmented generation. Think about the generation part: that's what large language models do. Think about ChatGPT or Grok or Claude, or whatever you're using.
Speaker 1:You ask a question, it will give you an answer. The problem is, it has limited knowledge. In fact, its knowledge base is not updated frequently. Those updates happen every year, every six months, or whenever they decide to do them. So if you ask it, what's today's weather? It's like, I don't know, I'll have to do an internet search to tell you that. Or if you say, hey, what did I write about in my journal yesterday? It's going to say, I have no clue what you wrote about in your journal yesterday. Also, these models are incentivized to hallucinate, meaning they're incentivized to give you an answer despite having zero knowledge of the topic. This is where having a knowledge base comes in handy. If you can give the LLM, ChatGPT or whatever, some context, some information that is proprietary, that you own, or some very specific knowledge that you want it to have, then it not only greatly reduces hallucinations, you can also get very, very specific information about something without having to look it up.
Speaker 1:I, for example, built a small version of this for my duty on my child's school board. I'm part of a parent faculty club and I'm the parliamentarian, meaning I'm supposed to know all the rules for this club, and there are pages of them. I'm like, there's no way, how am I going to know this stuff? So I built essentially this tiny RAG app where I could upload all the documents and say, hey, what are the rules regarding this particular law? During elections, what should I tell people? How am I supposed to run elections for the school board? And it could tell me. Now you think, well, why don't you just copy and paste that into ChatGPT? Well, there's a limit, right? You can't paste 20, 30, 40 pages of documents just yet. So this is why you need something called a vector database, which we're gonna get into in just a second: somewhere you can store information that your large language model can retrieve before it gives you an answer. Basically, it's just a way for you to give very, very specific context to a large language model. So we're going to build this knowledge base, this knowledge bot, if you will, where you can ask a question about something very specific and it's going to have this information. The first thing we're going to want to do is set up your Next.js app, and then we're going to use a library like Cheerio or Puppeteer, or whatever you want to use, to scrape the web.
Speaker 1:Web scraping has been around forever. It's honestly how ChatGPT and all these other big AI companies got all that knowledge in the first place. Like, how do you think they know everything about code and literature and books and all these things? They just took it off the web. So what you're gonna do is identify some area of knowledge that you want to have available to your large language model, and you're going to scrape websites. You're gonna write some code that's gonna visit a URL, look at the information at that URL, take it, and then store it somewhere, and where we're gonna store it is a place called a vector database. Now, you can either scrape this information or you could literally copy and paste the text, it doesn't really matter, but using a web scraper is going to help you understand how to constantly update your knowledge base. If you're doing sports or stocks or something, it's really important that your knowledge base gets updated a ton. So you probably want a web scraper, some code that can go out independently, or get triggered on a nightly or hourly basis or whenever, that visits certain websites, grabs text and information from them, and stores that information somewhere, and the somewhere we're going to store it is what's called a vector database.
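To make that concrete, here's a minimal scraping sketch in TypeScript using Cheerio. The "article p" selector is an assumption; inspect your target site and point it at wherever the main text actually lives.

```typescript
import * as cheerio from "cheerio";

// Fetch a page and pull out its readable text.
// Assumes Node 18+, where fetch is available globally.
async function scrapePage(url: string): Promise<string> {
  const html = await fetch(url).then((r) => r.text());
  const $ = cheerio.load(html);
  // Grab every paragraph inside the main article; adjust per site.
  return $("article p")
    .map((_, el) => $(el).text())
    .get()
    .join("\n");
}
```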
Speaker 1:Vector databases have become the unofficial databases for large language models. Why, and what are they exactly? Vectors are essentially big arrays of numbers. So when you write something to a large language model, say, hey, my cat is sick, what should I do?, what it's doing is taking all those words and vectorizing or embedding them. It's turning them into an array full of numbers. Those numbers have tons of meaning packed into them. They encapsulate the literal meaning of the words, the emotion, all sorts of things that are a black box to us, meaning we don't know exactly what, because these models are proprietary. They were built by these companies and they will never expose their secret sauce. So what we wanna do is take our knowledge, vectorize it, and store it in a special type of database where we can add these numbers.
Speaker 1:And what a vector database does is let you look at other vectors, basically, think of lines on a graph, and find the ones that are similar to what you write. So when I write something like, my cat is sick, what should I do?, and something like, my dog is sick, what should I do?, the vector database will extract the meaning, the similarity of those two different vectors, and return the vectors in that space that match. It'll say, well, that's kind of like this and that's kind of like that. You don't need to deeply understand this, but if you do want to go deeper into how vector databases work, embeddings, linear algebra, I would watch 3Blue1Brown's series on linear algebra and on how large language models work. It'll give you a really good intuition and a starting point to understand way beyond what the average person knows about vector databases and how large language models work in the first place. And if you're using large language models as a therapist or something like that, watching those videos will probably make you reconsider whether you should be doing that, and just give you a much deeper understanding and appreciation of what's going on beneath the hood.
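That "how similar are these two vectors" idea usually boils down to cosine similarity. Here's a toy TypeScript version just to build intuition; a vector database like Pinecone runs a heavily optimized, approximate version of this search over millions of vectors.

```typescript
// Cosine similarity: ~1 means "same direction, same meaning",
// ~0 means unrelated. Assumes both vectors have the same length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}
```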
Speaker 1:So we have a vector database, I suggest you use something like Pinecone, and you have your articles. You have a web scraper and you have a place to store the embeddings, the vectorized information. Now, I'm doing this at a really high level. At Parsity, we're gonna go very deep into every single one of these topics so you can actually build a production-grade one, but I'm giving you enough information to be dangerous, so you can just go ahead and play around and have some fun with this kind of stuff. And in the show notes, because I'm feeling so generous, I'm actually gonna open source a live event I did that includes the code base and a bunch of posts, 800 posts that I've written on LinkedIn, with instructions on how to vectorize them, add them to Pinecone and build your own little RAG bot based on my LinkedIn posts. Anyway, now we have our web scraper and our Pinecone database, or choose whatever vector database you like. I like Pinecone. It's really easy to set up and use, and they have a free tier.
Speaker 1:What you want to do now, and this will cost you, let's say, five bucks, you might spend five bucks doing this project. Is it worth it? A hundred percent. If you don't want to spend five bucks trying to educate yourself on this stuff, then turn this off and good luck out there. So what you're going to do is go to OpenAI, to the API section. You can just look up OpenAI plus API, put $5 in credits, get an API key, and you're going to use the OpenAI library in TypeScript in that Next.js app.
Speaker 1:What you're going to do now with your web scraper is scrape five, ten pages or so, just as an experiment. You're going to vectorize them, you're going to embed them, using OpenAI's embeddings API. So once you install openai, the TypeScript library, you can use the embeddings method, and what you're going to do is vectorize those articles. Now, articles are large, so you might have to embed multiple pieces of each article.
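Here's a minimal sketch of that embedding call with the openai npm package. It assumes OPENAI_API_KEY is set in your environment; the model name is just one common, cheap choice.

```typescript
import OpenAI from "openai";

// The client picks up OPENAI_API_KEY from the environment.
const openai = new OpenAI();

async function embed(text: string): Promise<number[]> {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small", // inexpensive, widely used embedding model
    input: text,
  });
  return res.data[0].embedding; // an array of ~1536 numbers
}
```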
Speaker 1:What I suggest you do, because the full treatment is beyond the scope of what I can really cover on a show, is something like creating semantic chunks. Look up a way to chunk or split the pages into paragraphs, or grab X amount of words. If you want to do it really, really simply, just split the entire page of text that you're scraping into chunks of, say, 600 characters max. So every 600 characters, you create an embedding, and then you dump that embedding into Pinecone. There are way better ways of doing this, and that's way beyond the scope of what I can go into here. Again, we're going to get into all the nitty-gritty details at Parsity in the course that I'm creating, because I think that's super important to know. There are all sorts of fun little things to figure out along the way here. But this is just a really, really bare-bones implementation that's going to teach you a ton, and you'll have some fun along the way.
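A naive version of that fixed-size chunking, as a sketch; real pipelines usually split on sentence or semantic boundaries instead of raw character counts.

```typescript
// Split scraped text into ~600-character pieces, one embedding each.
// Dead simple on purpose; semantic chunking is the upgrade path.
function chunkText(text: string, maxChars = 600): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```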
Speaker 1:So pick an arbitrary number of words or characters for each chunk that you're going to embed. You're going to pass that chunk to OpenAI's embedding method, it's going to give you back a vector, and you're going to throw that vector into your document store, the vector store in Pinecone. Voila, step one done. That might take you, I don't know, a few days, maybe a week, to do. I would just start off with like 10 articles, 20, 30, 40, whatever. You don't need a ton. Honestly, you don't need a lot of information here to make this fairly useful. Once you feel pretty confident with your web scraper and your chunking and your embedding method, you might wanna even consider having this web scraper run on a nightly basis. You might wanna give it a few different websites it'll go to, and then, if that information changes, it'll update the vector store and you'll have the most up-to-date information, and then you can ask questions and get answers that are really, really relevant.
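Putting those pieces together, a sketch of the store step with Pinecone's Node SDK. The index name "knowledge-base" is made up, and this reuses the embed and chunkText helpers sketched above; stashing the raw text in metadata is what lets you hand it back to the model later.

```typescript
import { Pinecone } from "@pinecone-database/pinecone";

// Assumes PINECONE_API_KEY is set and the index already exists.
const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index("knowledge-base"); // hypothetical index name

async function storeArticle(url: string, text: string) {
  const chunks = chunkText(text); // from the chunking sketch above
  for (let i = 0; i < chunks.length; i++) {
    const values = await embed(chunks[i]); // from the embedding sketch above
    await index.upsert([
      // Keep the raw text and source in metadata for retrieval later.
      { id: `${url}#${i}`, values, metadata: { text: chunks[i], source: url } },
    ]);
  }
}
```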
Speaker 1:Now, onto the chat part of this. How are you gonna interact with this vector store? Next.js has this really cool library called AI, which is pretty wild. I think it's actually Vercel's library. Anyway, look up Vercel plus AI. Literally, their library is called ai. What a great name. And what they have is this really cool software development kit, a set of libraries and tools that you can use in Next.js that will stream text and automatically interact with an API route in your Next.js application.
Speaker 1:So here's what you're going to do in your Next.js application: you're going to create a route, call that route chat, so /api/chat. If you don't know how to make routes in Next.js, you should have gone to Parsity, or just look it up and figure out how to make an API route in Next.js. It's really not that difficult, and the tutorials on the Next.js page, which I'll have in the show notes as well, are pretty easy to follow along with. There's definitely some room for improvement there, but hey, what docs are perfect? They're good enough. So you've installed the AI SDK from Vercel, you can look that up and figure it out, and now you can have a front-end component, a page, that uses a hook called useChat, which takes in the route to your API, so something like /api/chat, wherever your Next.js route lives. And then when you type something and hit enter, you can take that query and embed it.
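As a sketch, the front end can be as small as this. The import path and the exact useChat return shape vary between AI SDK versions, so treat this as the older, widely documented API rather than gospel.

```typescript
"use client";
import { useChat } from "ai/react"; // "@ai-sdk/react" in newer SDK versions

export default function Chat() {
  // useChat wires the form to your /api/chat route and
  // streams the assistant's reply into `messages` as it arrives.
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "/api/chat",
  });

  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <p key={m.id}>
          {m.role}: {m.content}
        </p>
      ))}
      <input value={input} onChange={handleInputChange} placeholder="Ask your knowledge base..." />
    </form>
  );
}
```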
Speaker 1:Again, back to this idea of embedding or vectorizing. We're going to take whatever you type in. You're going to say, give me the best stock prices, or who was the worst player in the NBA yesterday. I don't know anything about stocks or sports, I don't know why I used those as examples. But you know, ask for some information that would be relevant to the stuff you've scraped. If it's your journal, you'd say, hey, how has my mood changed over the last week or year? What has been my best moment in the last few weeks? What's been my worst moment this year?
Speaker 1:You can get pretty deep into this if you really want to. And what you need to do is take that query, the thing you've typed, and embed it using OpenAI's embedding method. It'll return you a vector, an array full of numbers. Now what you're going to do is query Pinecone. You're going to also download that library using npm or Yarn or whatever. And what you're going to do is say, Pinecone, here is that query that I've vectorized into numbers. Give me back the top five or ten. There's this thing called topK, where you say, give me back the top X vectors that are closest to this one.
Speaker 1:It's not like a traditional query in SQL or MongoDB or some traditional database, where there's one exact thing you'll get back. It's a fuzzy match, a semantic search. It's like a feeling: give me back what feels kind of like what I'm asking. So it'll give you five different options, or 10 or 20 or whatever. Take back as many as you think is appropriate, and then you can feed those to OpenAI's messages API and say, hey, OpenAI, or ChatGPT, here is some information and here is the original query. Give me back a response to this query using this information.
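Here's a sketch of that whole retrieve-then-generate loop, reusing the hypothetical "knowledge-base" index from earlier. Model names are placeholders; swap in whatever you're using.

```typescript
import OpenAI from "openai";
import { Pinecone } from "@pinecone-database/pinecone";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index("knowledge-base"); // same hypothetical index as above

async function answer(query: string): Promise<string | null> {
  // 1. Embed the user's question.
  const emb = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });

  // 2. Semantic search: grab the topK closest chunks from Pinecone.
  const results = await index.query({
    vector: emb.data[0].embedding,
    topK: 5,
    includeMetadata: true,
  });
  const context = results.matches
    .map((m) => m.metadata?.text ?? "")
    .join("\n---\n");

  // 3. Generation: hand the retrieved context plus the original query to the model.
  const chat = await openai.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model name
    messages: [
      { role: "user", content: `Context:\n${context}\n\nQuestion: ${query}` },
    ],
  });
  return chat.choices[0].message.content;
}
```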
Speaker 1:This is retrieval augmented generation, because now you're not just asking, hey, ChatGPT, what did I write about in my diary last week? and getting back, I don't know. Now you're saying, no, no, here's what I wrote, here are the top five things that match the query, and I'm feeding that to you. And now it's going to summarize that and tell me, because just getting back the raw information is sort of useful, but summarizing it and giving it back in a human-like voice or context is even more useful. And what's even better is you don't have to just give it back like a plain API response where it all appears on the screen at once. You can stream it back, so it goes letter by letter, or sentence by sentence, line by line, and it looks basically the same way ChatGPT's interface works right now.
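For the streaming part, a sketch of the /api/chat route handler using Vercel's AI SDK. Exact function names differ a bit across SDK versions (older releases used toAIStreamResponse, for example), so check the docs for the version you install; the retrieval step is stubbed out here to keep it short.

```typescript
// app/api/chat/route.ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export async function POST(req: Request) {
  const { messages } = await req.json();

  // In a real RAG route you'd embed the latest user message, query
  // Pinecone here, and inject the matches into the system prompt.
  const context = "...retrieved chunks would go here...";

  const result = streamText({
    model: openai("gpt-4o-mini"), // placeholder model name
    system: `Answer using only this context:\n${context}`,
    messages,
  });

  // Streams tokens back so the useChat UI renders them as they arrive,
  // like the ChatGPT interface.
  return result.toDataStreamResponse();
}
```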
Speaker 1:This is a really cool way to build stuff. This is essentially what I've built, or a very, very slimmed-down version of what I've built in the last year at two different places. Now, with a lot of this, you're thinking, well, that's not even that much different from what I'm doing now, and that's the point. The point is that these new skills, using vector databases, understanding embeddings, learning how to leverage RAG, scraping the web, aren't really, really difficult things to learn.
Speaker 1:Now, a few things I didn't get into here I'm definitely gonna get into in the course that I'm creating. OpenAI has an API that you can call. It's basically a way for you to talk to ChatGPT programmatically through your code, so you can give it your own context, like the context we just spoke about from looking in your vector database, and you can add all sorts of system prompts. You can really shape the way it responds. You can even tell it the temperature to use. You can say, use 0.1 temperature. That's a setting that basically tells it, don't make stuff up. Something like 0.9 would be like, hey, go wild, be really creative in what you say back to me, take more freedom in deciding what to tell the person receiving this information. So one thing we didn't talk about yet is structured responses.
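Those two knobs, a system prompt and temperature, look like this on OpenAI's chat completions API. The strict system prompt is just one style of grounding instruction, not the one true wording.

```typescript
import OpenAI from "openai";

const openai = new OpenAI();

async function groundedAnswer(context: string, question: string) {
  const res = await openai.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model name
    temperature: 0.1, // near 0: stick to the facts; near 1: more creative freedom
    messages: [
      {
        // The system prompt steers the whole conversation.
        role: "system",
        content:
          "Answer only from the provided context. If it doesn't cover the question, say you don't know.",
      },
      { role: "user", content: `Context:\n${context}\n\nQuestion: ${question}` },
    ],
  });
  return res.choices[0].message.content;
}
```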
Speaker 1:If you're using TypeScript, you can create a schema and pass that schema into OpenAI's method, and it will return you something that follows that schema. Why do you care? Why does this matter? Well, if you use APIs as a software developer, you get structured responses, you get objects. You get an array full of objects, and they have certain keys and values. That's important, because your front end may expect to find certain keys and values. So maybe you wanna have sources for your information, something like the text and the source. You can create a schema using a library like Zod, which is really, really popular in TypeScript, and define a schema with something like text, which is a string, source, which is also a string, and confidence, which would be a number from one through nine. You can feed that to OpenAI and it will return you an object with those keys and values. That way, when you give it back to the front end, it always knows what to look for. So then you not only have a really cool response getting streamed across the screen, you can ask, what is your confidence in what you're telling me, and what is the source where you got this?
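A sketch of that schema with Zod. Here I'm enforcing it with the Vercel AI SDK's generateObject, which is one way to do what he describes; the openai package also has its own zodResponseFormat helper that plays the same role.

```typescript
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// The episode's example shape: text, source, and a 1-9 confidence score.
const AnswerSchema = z.object({
  text: z.string(),
  source: z.string(),
  confidence: z.number().min(1).max(9),
});

async function structuredAnswer(context: string, question: string) {
  const { object } = await generateObject({
    model: openai("gpt-4o-mini"), // placeholder model name
    schema: AnswerSchema,
    prompt: `Context:\n${context}\n\nQuestion: ${question}`,
  });
  // `object` is typed: { text: string; source: string; confidence: number }
  return object;
}
```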
Speaker 1:Think of the applications you could use this for. This could be used for medicine or the law, where it's not enough to just say, hey, ChatGPT, my patient has, you know, some sort of late-stage cancer, tell me what I should do, and get back some nonsense or some hallucination. If you use something like RAG, you not only get back relevant information based on tons of documents it can look through, it also tells you the source. So the doctor, the lawyer, whoever, can click on that and say, okay, this is not just a hallucination. Now I have something I can tell the patient, and I also have the source it came from, so I feel confident that what I'm saying isn't a completely made-up thing. This, my friends, is where the money's at. This is where companies are investing. These are the skills and the methods and the projects that you can build that actually make people pause and say, oh, that's interesting, that's something cool.
Speaker 1:You can go deeper into any one of these territories, whether it's web scraping, or chunking and vectorizing stuff, or the UI side. Or you might be thinking, how do I have custom UI responses? How do I have multiple agents? What if I ask for something that my vector database doesn't have? What if I have multiple vector databases? How do I orchestrate and coordinate between all those things? How do you write tests for this kind of stuff? How do you observe it and make sure it doesn't drift or say crazy things? This is where the fun stuff comes in. There are a lot of ways we can have fun with this kind of project, and this is exactly what we're going to be working on at Parsity in the very near future.
Speaker 1:I think this might be the most fun time to join ever. I cannot wait to release this, because this is really stuff that I'm doing, and I know it's going to make you a lot more desirable and attractive on the market nowadays. So maybe this could be called the AI engineering course, maybe this could be called AI full stack, I don't know. What I do know is this is the stuff you should be learning. I hope this was helpful. If nothing else, take a look at the GitHub repo that I've included at the bottom of the show notes. It might be made private soon, because I don't want to just share it forever, but use it, play with it, pick it apart, take what makes sense and see if you can take it a little bit further than I did. Break stuff, have fun with it, rip it apart, and stop building portfolio projects like it's 2023. Sincerely hope that's helpful for you. See you around. That'll do it for today's episode of the Develop Yourself podcast. If you're serious about switching careers and becoming a software developer and building complex software and wanna work directly with me and my team, go to parsity.io and if you want more information, feel