Trading Tomorrow - Navigating Trends in Capital Markets

Navigating AI Adoption in Finance with Finpilot

May 15, 2024 Numerix Season 2 Episode 21

AI technologies have sparked widespread curiosity and adoption across many industries, encouraging professionals to explore the practical applications of AI in their daily tasks. The finance industry is no different, and experts there have been experimenting with the transformative potential of integrating AI into standard tasks.

That is what the startup Finpilot is doing. Described as ChatGPT for financial questions, Finpilot uses AI to pull information out of unstructured financial data. Co-founder and CEO Lakshay Chauhan joins Jim Jockle to discuss this technology, its implications and its future.

Speaker 1:

Welcome to Trading Tomorrow: Navigating Trends in Capital Markets, the podcast where we deep dive into the technologies reshaping the world of capital markets. I'm your host, Jim Jockle, a veteran of the finance industry with a passion for the complexities of financial technologies and market trends. Because this is Trading Tomorrow: Navigating Trends in Capital Markets, where the future of capital markets unfolds. Over the past year, the rise of AI technologies such as ChatGPT and Copilot has sparked widespread curiosity and adoption across many industries, encouraging professionals to explore the practical applications of AI in their daily tasks. In finance, discussions about the transformative potential of integrating AI have never been more popular. Until recently it was just talk, but now this interesting frontier of innovation is rapidly becoming a reality, and today we're thrilled to welcome a guest whose groundbreaking startup is helping make it one.

Speaker 1:

Joining us today is Lakshay Chauhan, the co-founder and CEO of Finpilot, which is being called ChatGPT for financial questions. Currently working with public data, Finpilot uses AI to pull information out of unstructured financial data, for example, data found in SEC documents. Along with his co-founder, John Alberg, Lakshay's company received $4 million in seed financing led by Madrona, with participation from Ascend VC and angels from leading hedge funds. Lakshay was a longtime machine learning engineer in Seattle for the hedge fund industry. Lakshay, thank you so much for joining us today. It's a fascinating product, and I'm really excited to dig into this a bit further with you. So perhaps just to kick us off, where did you come up with the idea for Finpilot?

Speaker 2:

Thanks, Jim, happy to be here. It's interesting. Before starting Finpilot, I was at a hedge fund, and I was the head of machine learning there. I spent a lot of time building ML models for investing purposes, so I was really deep into financial data, trying to build prediction models with deep learning. And over a three-to-five-year period we had pretty much mined all the quantitative data we could for the fund I was working at.

Speaker 2:

And during that time, after mining all the quantitative data, we were looking at unstructured qualitative data. We knew everything about the financials, everything about momentum data, whatever we could get our hands on. But there's this quality aspect that matters a lot in investing, and we weren't really capturing it at the time. So that's where I started to dig in: what can we do to understand this unstructured, textual data? We started digging into sources like filings and transcripts and market research reports. And when I was looking into these, transformer models had come along a few years earlier, and I was playing around with them. This was way before ChatGPT or GPT-3; I think GPT-2 was out at the time.

Speaker 2:

When I was playing around with them, the fact that these models could understand language so well was very surprising to me. That was a whoa moment. The fact that these models are good at understanding long text, logic and reasoning could be more interesting for the human side of things, the analysts themselves, because they're the ones reading these long documents, while computers process data, which we were already doing. So the question became: can we do something for the humans? There are just so many analysts out there. That seemed like a very compelling opportunity, given that all knowledge workers spend a lot of time reading and synthesizing information, and given that I could see AI getting to the point of understanding these sorts of documents. That was really the starting point, and we started talking to people and kept getting more and more signal on what it could look like.

Speaker 2:

That was really the genesis of why it made sense to do it: these models got so much better, to the point where they could understand this information generally. You didn't have to program them. You didn't have to specify, this is how you extract data from an earnings call, or this is how you read sentiment. It was very general purpose, and to me the application on the human productivity side could be really, really fascinating. So yeah, that was it.

Speaker 1:

So you mentioned unstructured data, and that means a lot of things to a lot of people. Were you scraping security cameras of people walking into stores? Maybe give us a little more context around that.

Speaker 2:

Yeah, no, that's a very good question. Typically in the financial world, or the quant world, quantitative data is just numerical data: tick data, credit card data, all the numbers. Unstructured data can also include numbers, but it's basically something that has not been processed and is in a more raw form. It could be security cam footage. You've heard of hedge funds looking at parking lot images of Walmart and trying to figure out, okay, what's the traffic like?

Speaker 2:

That would be categorized as unstructured data. What I specifically mean is more textual data: PDF reports, SEC filings, transcripts, management calls at conferences, things like that. The data is raw, it's not structured, it hasn't been analyzed in any way, and it's not easy to search. That's what I mean, but the same thing applies to videos, audio and so on.

Speaker 1:

So tell me, what goes into building a product like this?

Speaker 2:

The current state of AI is interesting. When ChatGPT came along in November of 2022, you could just type anything and get something amusing back, and the technology of large language models enabled you to think about a lot of different things and quickly build something that could show you, oh, this is possible, a very cool demo. But as we started working with these models, we realized that if we want to take care of analysts, buy-side analysts for example, it would take a lot more than just putting together these APIs. These models are not good at domain-specific information. If you wanted to ask finance-related questions, they would make a lot of mistakes, both in terms of understanding the question and in terms of hallucinations, which is the technical term for making something up that doesn't exist anywhere. Language models are notorious for that. So our approach, having the ML background, was to build a retrieval system that has been fine-tuned and built for the financial domain. What that means is we have four AI models we've built in-house that understand financial documents very well, so when you ask a question, or try to understand a table, it just knows much better.

Speaker 2:

What's being asked? What is the right piece of information? What's the nuance between EBITDA and adjusted EBITDA? All these kinds of nuances that general models don't capture. And to do that we had to train our own models, run our own GPUs and run our own inference stack, and that has been a lot of fun, because you uncover these little nuances: you play around with the models and you learn where they fail and where they don't. Building it ourselves also helps us make it faster and cheaper. So that's a big part of it.

Speaker 2:

Our core thing is building a retrieval system that, when you ask a question or give it a task, can identify what piece of information it needs to find, and where, from thousands of documents. That's been the core part of what we've been doing so far. Then you can layer things on top: you can put a chat interface on it, you can put AI agents on top of it, you can do multiple things. But the core of it is being able to find the right piece of information you're looking for, with confidence and accuracy.
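As an editorial aside, the retrieval idea Lakshay describes, find the right passage from thousands of documents before generating any answer, can be sketched in a few lines. This toy Python sketch uses bag-of-words cosine similarity as a stand-in for Finpilot's fine-tuned in-house models; the document ids and text are invented for illustration, not Finpilot's actual code:

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words term counts. A production system would
    # use a neural encoder fine-tuned on financial text instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, documents, k=1):
    # Rank passages by similarity to the question; keep the source id
    # with each passage so the eventual answer can be cited.
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d["text"])), reverse=True)
    return ranked[:k]

docs = [
    {"id": "10-K p.12", "text": "Total revenue for fiscal 2023 was 4.2 billion dollars"},
    {"id": "10-K p.45", "text": "Legal proceedings related to product liability"},
]
print(retrieve("What was revenue in 2023?", docs)[0]["id"])  # → 10-K p.12
```

A chat interface or agent then answers only from the retrieved passage, which is what keeps hallucination in check and lets every claim point back to a source.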

Speaker 1:

And I think maybe you could take us a little under the covers, right? Everybody just says, oh, you've got to train the model. But for something so specific, like financial services, what goes into that training process?

Speaker 2:

We've been training our models, and you see companies like OpenAI spending hundreds of millions of dollars, potentially billions, and then there are smaller firms and newer companies like ours doing different types of training. So essentially, the way a language model is trained is you take all the text in the world you can get your hands on and feed it into this model, which does fill-in-the-blanks. You would have an English sentence, like "the cat is eating its food," and you would blank out two words and force the model to predict those words through a probability distribution over all possible words. Initially it's random, just filling in words, but as it trains, it learns what the most likely word is after a given sequence. This is called pre-training, and it's the most expensive part. When models are pre-trained, they're learning how to complete sentences, essentially, but they don't have any specific domain knowledge about how to do a financial analysis, or why NVIDIA's certain metrics are different from AMD's, or how one company defines net retention revenue differently than another. It doesn't have that nuance. It's general so far.
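The fill-in-the-blank objective described here can be illustrated with a toy model. This sketch uses a simple lookup table of word counts; real pre-training learns the same kind of next-word distribution with a neural network over enormous corpora, and the tiny corpus and two-word context here are invented for illustration only:

```python
from collections import Counter, defaultdict

# Toy illustration of the fill-in-the-blank objective: learn which word
# most often follows a short context. A real model learns this with a
# neural network, not a lookup table.
corpus = [
    "the cat is eating its food",
    "the dog is eating its food",
    "the cat is chasing the dog",
]

counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(2, len(words)):
        context = (words[i - 2], words[i - 1])  # two words of context
        counts[context][words[i]] += 1

def predict(context):
    # Fill the blank with the highest-count word seen after this context.
    return counts[tuple(context)].most_common(1)[0][0]

print(predict(["dog", "is"]))  # → eating
```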

Speaker 2:

So to incorporate that specific knowledge, for finance, or for some team or some company, you need to teach the model.

Speaker 2:

You extend that training process with fine-tuning and zone in on the specific aspects of the model that you care about. And here's what that looks like.

Speaker 2:

You create a data set of, hey, this is what I want you to do and this is the output, or these are the things I'm looking for, and you give it a lot of training examples. It's almost like teaching a young kid, but with lots and lots of examples: hey, I want you to understand the nuance between this term and this definition, or how this company calculates subscriptions or billings versus that company, whatever it is. You create this data set by having humans or expert analysts annotate it, or doing it yourself, or with some automated system. Essentially, you're giving it more examples of what you want it to do, or where it's failing. That part is what we typically mean by fine-tuning: training that last layer of the network to understand a little better what you're trying to do.
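The idea of training only the last layer can be sketched in miniature: a frozen feature extractor stands in for the pretrained model, and only a small linear head is updated on labeled domain examples. The feature function, the example sentences and the labels below are all illustrative assumptions, not Finpilot's actual training setup:

```python
# Miniature sketch of fine-tuning "the last layer": base features are
# frozen (a stand-in for a pretrained model); only the linear head learns.

def frozen_features(text):
    # Pretend pretrained representation: a domain-keyword count plus a
    # crude length feature. Never updated during training.
    words = text.lower().split()
    return [
        sum(1 for w in words if w in ("ebitda", "adjusted", "margin")),
        len(words) / 10.0,
    ]

# Labeled domain examples: does the sentence discuss profitability metrics?
examples = [
    ("adjusted ebitda rose this quarter", 1),
    ("gross margin expanded by 200 bps", 1),
    ("the company opened three new offices", 0),
    ("headcount grew across all regions", 0),
]

weights, bias, lr = [0.0, 0.0], 0.0, 0.5

def classify(text):
    x = frozen_features(text)
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0

for _ in range(50):  # perceptron-style updates on the head only
    for text, label in examples:
        err = label - classify(text)
        x = frozen_features(text)
        weights = [w + lr * err * xi for w, xi in zip(weights, x)]
        bias += lr * err

print(classify("adjusted ebitda rose this quarter"))  # → 1
print(classify("the company opened three new offices"))  # → 0
```

The same division of labor holds at scale: the expensive pretrained weights stay put, and a comparatively cheap head (or adapter) absorbs the domain-specific examples the analysts provide.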

Speaker 1:

You know, it's interesting. As a novice in this world, if you will, I was at a lecture a couple of years ago, and it was fascinating to me how CAPTCHA sells all its data. You're going to buy tickets to whatever rock concert or whatnot, but it's basically humans teaching the machine images: what is a stop sign, where is a bike, which one's an electric pole, something of that nature. And it made me feel a little stupid.

Speaker 2:

It is very similar to that, right? For us, we're trying to teach it the very nuanced take on things that expert human analysts would expect and want. Beyond just, is it good or bad, you want to understand why. If you take the top analyst in any field or any company or any industry, you want to understand how they do their reasoning and thinking and impart that to the models. That's where the challenge and the opportunity come in.

Speaker 1:

So that opens up a whole other argument and discussion of, are we losing our jobs to computers? But let's leave that for a different podcast. So let me ask you a question. Your team has two products. Let's start with the web-based AI chat tool. Can you give us a situation in which a financial professional would want to use this?

Speaker 2:

Yeah, so we have this beta, open to the public, which is this chat tool, and you can basically ask any financial question about companies. And you can specify sources: if you only want to focus on SEC filings and transcripts, you can specify that; if you want to use the web, you can specify that. Once you do, you can ask any question, and it can scour all those documents and give you the answer succinctly. And the very nice thing is that everything is cited. If it cites a number, you can click on that number and it'll take you exactly to where that's coming from, even down to a cell in a table, and that's very powerful for two reasons.

Speaker 2:

One, language models are known to hallucinate, as I mentioned before. If you go to ChatGPT and ask a bunch of financial questions, more likely than not you will find something that's not true or not present anywhere. The other reason is that in the field where we operate, accuracy is obviously the most important thing. I need to be able to trust the output; I need to know where the numbers are coming from. To build that trust and confidence for the analysts, we have spent time building this so you can cite everything, which is not that straightforward. We feel that if you get one thing wrong one time and it can't be verified, analysts are not going to use any of these tools. Trust is going to be a big part of AI adoption across any industry, but especially for us, because if you have to manually redo all the work, AI is not providing value.
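The cite-everything approach can be sketched simply: every extracted figure carries the document id and character offsets it came from, so a reader can jump straight to the source, and the system returns nothing rather than an unverifiable answer. The document, the ids and the dictionary fields below are assumed for illustration, not Finpilot's actual schema:

```python
# Sketch of answer citation: every extracted figure records its source
# document and character offsets, so each number is click-to-verify.
documents = {
    "AAPL 10-K 2023": "Net sales were $383.3 billion in fiscal 2023.",
}

def extract_with_citation(doc_id, figure):
    # Locate a figure in the source text and record exactly where it sits.
    text = documents[doc_id]
    start = text.find(figure)
    if start == -1:
        return None  # refuse to answer rather than fabricate a number
    return {"value": figure, "source": doc_id, "span": (start, start + len(figure))}

fact = extract_with_citation("AAPL 10-K 2023", "$383.3 billion")
lo, hi = fact["span"]
assert documents[fact["source"]][lo:hi] == fact["value"]  # citation round-trips
print(f'{fact["value"]} (source: {fact["source"]}, chars {lo}-{hi})')
```

The round-trip check is the point: any figure the system emits can be mechanically traced back to the exact characters in the primary source.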

Speaker 2:

So that's one use. Somebody can ask, hey, what has been going on with 3M's litigation in the last five years, they have something going on with PFAS, and you can get a very quick answer versus reading the last five years of documents. Or, hey, why have the gross margins of a certain company been falling? You can quickly get those answers, and obviously simple things like segment revenues as well.

Speaker 2:

The other thing we launched recently, and we didn't know how popular it was going to be: a lot of buy-side folks have an investment thesis, and after a quarter happens they want to know, hey, for this thesis, were there any questions and answers discussed in the Q&A section of the call? In this tool, you just put in your thesis, or whatever topic, and within two seconds you get all the relevant Q&A questions, so you don't have to dig through them manually. It turns out it's pretty popular. So those are the kinds of situations: you want quick answers, you want to dig into a company and even get some ideas from the AI analyzing it, or this Q&A analysis tool, where you just pop in your thesis and get back the most relevant questions from the call.

Speaker 1:

So forgive me, because my producer is going to hate me for this bad joke: with those hallucinations, ChatGPT clearly needs to stay away from the digital mushrooms. However, the question I have is, how do we think about this? Do we think about this as a productivity tool, in terms of saving time on research, or do we think about this as more?

Speaker 2:

Finding alpha, right. It's probably the best question one can ask at this juncture, because it's so new and the technology is still being developed; it opens the possibility for different things. I think it can be used for both, and it will start as more of a productivity tool, because for the alpha piece you need more reasoning and more systems embedded in it. For the alpha piece, my intuition is it has to be very focused and strategic. That means you need a lot of human input: humans design and come up with ideas, and get help from AI to execute those ideas at a scale and speed that humans cannot.

Speaker 2:

But the obvious and probably first step is on the productivity side. Whatever you're going to do, AI can help you do it cheaper, faster, better. That's the lowest-hanging fruit. So I think it will tap both markets. My intuition is productivity is going to happen first, and then it will fall into the alpha market a little bit. But alpha is very hard to generate, and part of it is an art in some sense; if it were a science, you would have figured it out already. Part of that is this human-AI interaction. I think people who can leverage AI, either to come up with more ideas or to execute them better than others, will find value there. But obviously the first step is to get the productivity layer fixed, and you can get a lot of value from AI on that layer initially.

Speaker 1:

You know, in preparing for today's call, one of the things I read is that Finpilot is building links into the output that reference the primary source material. Why is this unique, and why is this important?

Speaker 2:

Right, yeah, it is very, very important. For example, take the case of an analyst, investment banker, sell side, buy side, where AI can help. Let's say I'm writing a report or looking at a company and I want a head start, so I have AI do a first draft on a quarter, or some market, or the top 20 companies in an industry sector. I want consolidated information; the AI has done all the work, and I have a report. Now, at least the way the technology currently works, I can never be a hundred percent sure that the outputs generated by AI, the numbers, the facts, will always be a hundred percent correct. So you cannot simply risk sending that report to a client where even one number is wrong. Once you know mentally that AI can hallucinate, you'll think, oh, can I trust any of the other facts? So building that trust is super critical, because that's the only way to drive adoption of AI tools for analysts. Our approach is: well, we can't change how LLMs are trained, and we don't have the technology as an industry yet to have a hundred percent confidence, so what can we do that's the next best thing? And the next best thing is you link and source back pretty much everything, down to every number. If you think something's fishy, or doesn't make sense, or is very critical, you can check it in a click, and then, okay, that's good with me.

Speaker 2:

So I think it's very critical, first, to have the human analyst trust the AI output and drive adoption, and then go on to leveraging all the productivity benefits. Without it, it's like, oh, it's a cool tool and a good starting point, but I have to do all the work again because I can't trust it. That defeats the purpose. So that's why it's very critical, and unique.

Speaker 2:

We've spent a lot of time on that, because early on we figured out, as analysts, coming from the hedge fund world, if I can't trust it, I just won't use it. Being able to source everything back is challenging; you have to build the right system and optimize the whole stack so you can flow back through it to the primary source document. But to answer your question in one sentence: it's going to be very important for adoption. Otherwise AI will remain at a surface level, versus actually being embedded in workflows that give you that 30, 40, 50 percent productivity boost.

Speaker 1:

Well, and I think what you're driving at is transparency.

Speaker 2:

Exactly right.

Speaker 1:

And when I think transparency, it also makes me think of regulation. We're in a highly regulated industry. Are there any particular regulatory concerns associated with using your solution?

Speaker 2:

So, on AI and financial regulation in general: it is a big topic, and there are multiple levels to it. One is that the SEC is looking at it, and I think they're trying to understand the full range of capabilities that AI systems have; they're trying to get the map right now, and I haven't seen anything concrete yet. But there has been some talk about being very careful, especially in the advisory business. You can't have AI systems within the advisory business making recommendations right now; that is a very, very tricky path, and for good reasons, because however generally intelligent these systems are, you can't really rely on them for something that important and critical. So my intuition is there will be regulation on that front, and partly because of that, firms and enterprises have taken, not a pause, but a wait-and-see approach, listening for SEC commentary. I think they will be more cautious where AI directly interfaces with investors and where the outputs of AI go directly to investors making decisions. The other aspect is material non-public information. If you're an investment bank dealing with a company that's about to go public in three months, that's obviously super sensitive information, and risking that information with an LLM, which can get leaked, and we don't know how OpenAI uses it, is also very critical. On that front, I think you might see more private models come up, where everything is in-house for that bank and they don't send anything out. You solve that problem by figuring out what piece of technology you want and how you can bring it in-house.

Speaker 2:

As for us, right now we're focused on research analysts. We're not interfacing with retail investors or anything like that. We're focused on being a research platform and on being as transparent as possible with the AI models. It's not a black box at all; you can follow all the steps the AI took to give you a certain output. So at the moment it's not something that's blocking us. But depending on where you end up and what application of AI you go after, that will dictate how much AI regulation you have to deal with. And it is super early; I think the SEC is also trying to wrap their heads around where they should even begin.

Speaker 1:

Right. Which also begs the question: when you think of fast-moving industries, you do not think financial services. I think we started talking about movement to the cloud at the top of the hype cycle in 2012, and we're only now seeing trading and risk platforms starting to move to the cloud. How do you change that, especially with newer technology? How do you get people using, deploying, and even recommending your product?

Speaker 2:

Yeah, no, that's a very good comment, because it's true: you do not think of financial services as fast moving, because it's super regulated. It's been interesting, because I think the value proposition is so high that it's hard to ignore. It's surprising to me: when we talk to our customers, and to potential new people who haven't even heard about AI tools, and we communicate what this can do and show them, the excitement is like, oh man, we could save three hours every day for each and every analyst, or something like that. And the other type of customers are even more excited. They know ChatGPT, they've been trying to cajole it into doing what they're already trying to do, and they're giving us ideas: can you do this, please, can you do that? So we've seen somewhat of the opposite problem, where people are coming to us asking, hey, can we do this now, can we do that? So it's interesting. Now, maybe that's because we're not in the super regulated layer: we're not trading, and we don't have a fund that we're recommending.

Speaker 2:

So I think maybe that insulates us a little at this stage, but as you move within the domain to different areas, you might have to wrestle with that. One of the things we have in closed beta is called AI agents, task agents, where you can just give it a task: hey, I'm looking at these top 20 companies in this sector and I want to do XYZ analysis, which can be a pretty detailed analysis, and it just does that for all 20 companies. Or it can look at incremental quarter-to-quarter changes in certain qualitative aspects. That has been getting a lot of momentum, basically because it's so generic for the buy side: okay, I can just have it do things for me. So, where I would have been in the camp of, how are we ever going to break into this industry, the excitement has been quite the opposite.

Speaker 1:

So that's been super interesting. You know, sadly, we've made it to the final question of the podcast, because I've got about ten more questions I want to dive into. But we call this final question the trend drop. It's like a desert island question: if you could only track one trend in AI technology, what would that trend be?

Speaker 2:

One thing I am tracking closely is the latency and cost of the powerful models. The reason, and this may not be super important two or three years from now, but in the short term it really is, is that it allows you to figure out what you can do at a speed and cost that makes sense. Something that takes a week may not be as valuable as something that takes a day, or two hours. And it goes back to these AI agents that need to do multiple things, hundreds of steps, with rechecking and verification and all that, so being able to do all those steps faster and cheaper is going to be very critical. So one thing I'm looking at is these curves of how fast inference runs on these models. It goes back to algorithmic advancements, chip advancements, GPU throughput advancements, all these very low-level things, but it translates in a big way to the application layer we're operating in.

Speaker 2:

So that would be one thing I'm very keen on following. And the other thing, if I may, is the open-source models. They are getting good. They're not quite there yet, but if they can match a certain quality, that would be a huge win for the industry, especially on the private model side and the regulation side. So yeah, you asked me for one, I give you two.

Speaker 1:

I'll take two, fair enough. Well, Lakshay, I want to thank you so much for your time today and your insights, and I want to congratulate you on your success with Finpilot, as well as your future success.

Speaker 2:

Thank you so much. Thank you, Jim, I really appreciate it.

Speaker 1:

And that wraps up this season of Trading Tomorrow: Navigating Trends in Capital Markets. We appreciate your loyal listenership, and we'll be back with Season 3 after a short break. Make sure you rate, comment on and like our podcast so we can continue to bring you information and chats on the latest technology changing the financial industry.

AI in Finance
AI Chat Tool for Financial Professionals
Trust and Transparency in AI Adoption