Fire Science Show

213 - Setting up your own chatbot with Ruggiero Lovreglio and Amir Rafe

Wojciech Węgrzyński

The AI revolution has arrived, but fire safety engineers face a critical dilemma: how to leverage powerful AI tools while protecting confidential project data. 

Professor Ruggiero Rino Lovreglio from Massey University and Dr. Amir Rafe from Utah State University join us to explore the world of local Large Language Models (LLMs) - AI systems you can run privately on your own computer without sending sensitive information to the cloud. While cloud-based AI like ChatGPT raises serious privacy concerns (as Sam Altman recently admitted, user prompts could be surrendered to courts if requested), local models offer a secure alternative that doesn't compromise confidentiality.

We break down things you should know about setting up your own AI assistant: from hardware requirements and model selection to fine-tuning for fire engineering tasks. Our guests explain how even models with "just" a few billion parameters can transform your workflow while keeping your data completely private. They share their groundbreaking work developing specialized fire engineering datasets and testing these tools on real-world evacuation problems.

The conversation demystifies technical concepts like parameters, temperature settings, RAG (Retrieval-Augmented Generation), and fine-tuning - making them accessible to engineers without computer science backgrounds. Most importantly, we address why fire engineering remains resilient to AI takeover (with only a 19% risk of automation) while exploring how these tools can enhance rather than replace human expertise.

Whether you're AI-curious or AI-skeptical, this episode provides practical insights for integrating these powerful tools into your engineering practice without compromising the confidentiality that defines professional work. Download Ollama today and take your first steps toward a more efficient, AI-augmented engineering workflow that keeps your data where it belongs - on your computer.

Further reading: https://ascelibrary.org/doi/abs/10.1061/9780784486191.034

Ollama: https://ollama.com/

Hugging Face: https://huggingface.co/

Rino's Youtube with guide videos: https://www.youtube.com/@rinoandcaroline

----
The Fire Science Show is produced by the Fire Science Media in collaboration with OFR Consultants. Thank you to the podcast sponsor for their continuous support towards our mission.

Wojciech Wegrzynski:

Hello everybody, welcome to the Fire Science Show. When the ChatGPT revolution occurred, I was very happy to tell you all about it as soon as I could. I also had an episode with Mike Kinsey where we discussed the possibility of using tools like ChatGPT and setting up your own stuff that can support engineers' workflow. Fast forward a few years later. I think we got used to this technology by now. I think it's going to be the defining technology of this decade. You know, like the Internet defined the 90s, I guess Facebook defined the 2000s, Instagram and Twitter probably defined the 2010s. Maybe TikTok.

Wojciech Wegrzynski:

I'm a different generation and I think this decade will be defined through large language models, chatbots, etc. And it's just a part of our lives nowadays. But are you really using that in your engineering workflow? I'm using it for programming. I'm using it to solve pieces of problems that I work on. I find solutions to some issues much quicker with the support of chatbots, but it's not that I am really incorporating them in my workflows completely. And the real problem with them is really the privacy problem. And hallucinations, yes, but privacy would be the one that worries me the most.

Wojciech Wegrzynski:

A day or two days ago, I saw a quote from Sam Altman when he was asked what's going to happen if a court asks OpenAI to release prompts of a user in some sort of court hearing, and Sam said that they probably will have to give that to the court. So if you are talking with ChatGPT, any kind of LLM it's not that you're having a secure conversation with your computer. You're sending all of that into the internet and if you upload a file, it goes somewhere. If you upload a confidential file, well, it goes somewhere as well. So you probably don't want to do that, and that kind of limits the ability for us to work, because most of the stuff we have here is confidential. The amount of NDAs I have to sign to do anything is crazy. Therefore, the ability to use AI in my engineering workflow is limited.

Wojciech Wegrzynski:

And here comes the solution: my two guests today, Professor Ruggiero Lovreglio from Massey University and Dr Amir Rafe from Utah State University. They've been playing with this technology, but they were playing with LLMs, or small language models, that you can install locally on your computers, so you have ownership of the data that is being sent. You don't even need the internet for them to work. Quite a magical world. Instead of relying on the insane computational power of OpenAI or xAI, you can use your own computer to be your own chatbot. It comes with requirements. Not that easy. Well, it technically is easy, but it has its challenges, which you will learn in the episode. So I think this opens a new pathway where those tools can be really, really useful for fire safety engineering.

Wojciech Wegrzynski:

Enough of my rambling, because there's a lot of valuable content behind the intro, so let's spin the intro and jump into the episode. Welcome to the Fire Science Show. My name is Wojciech Węgrzyński and I will be your host.

Wojciech Wegrzynski:

The Fire Science Show is into its third year of continued support from its sponsor, OFR Consultants, who are an independent, multi-award-winning fire engineering consultancy with a reputation for delivering innovative, safety-driven solutions. As the UK's leading independent fire risk consultancy, OFR's globally established team have developed a reputation for preeminent fire engineering expertise, with colleagues working across the world to help protect people, property and the planet. Established in the UK in 2016 as a start-up business by two highly experienced fire engineering consultants, the business continues to grow at a phenomenal rate, with offices across the country in eight locations, from Edinburgh to Bath, and plans for future expansions. If you're keen to find out more or join OFR Consultants during this exciting period of growth, visit their website at ofrconsultants.com. And now back to the episode. Hello everybody, welcome to the Fire Science Show around the globe. Today I'm in my studio in Warsaw. My first guest, Dr Amir Rafe from Utah State University. Hey, Amir, nice to see you.

Wojciech Wegrzynski:

Hello, thank you for welcoming me. Thank you. Good afternoon, I guess. And my second guest, Professor Ruggiero Lovreglio from Massey University. Hey, Rino, good to see you. Good morning everyone. Good morning in New Zealand.

Ruggiero Lovreglio:

Wow, that's literally around the globe. Nice. We are in the future here and can tell you that the weather looks good.

Wojciech Wegrzynski:

I'm so glad that tomorrow looks nice. Thank you, that's what I was looking for. It's a very late evening in Warsaw. Amir, congratulations on passing your viva, I heard that it was a few days ago. So all for a good start into the episode, and it's very interesting content. We are talking about AI and how AI will change the industry. I remember, I think two years ago, I was talking with Mike Kinsey on the podcast about creating some sort of AI tools, or AI-alike tools, because Mike had some explicit tools that were not really AI. He also had AI, which, to be honest, felt like, you know, a dream for the future. It was very interesting. Today, two years later, holy crap, a lot has changed. Rino, can you summarize where we are in this madness of AI revolution today?

Ruggiero Lovreglio:

So yeah, we can say that three years ago we all had the shake-up when we tried the GPT thing, I think it was back then, and we started typing and we started seeing: oh gosh, it's answering questions, it's looking like a human, it's doing stuff that we were not expecting. And that was the big shock that the whole world had with OpenAI and their first public tool for all of us. From there, things have been going wild. You can see that there is a lot more competition on cloud services. I experienced it myself using, among them, Claude, Grok, Gemini, and they are really there trying to fight with each other over who is going to have the best results with benchmarking, some of them cheating, because they train the model on the benchmark and then they say: oh look, we're getting a really good match. It's like, of course. So there is a lot at stake, especially who is going to be the one still leading forward.

Ruggiero Lovreglio:

You've been talking, I don't know, for a year about ChatGPT, and the new one is always about to come. God knows when it's going to come, but we could see already, from GPT-3 to 4, the great advancement. The latest news in the last couple of weeks was the release of an agent function within ChatGPT, which was like wow for the world. Not much wow for me and Amir, because if you are in the field and you see all the open tools that are out there, you were prototyping a lot of that stuff yourself before ChatGPT or whoever produced those tools. So it was like: yeah, nice, let's give it a try, let's see how it works. And so now everyone has the buzzword: agentic AI, agentic AI, goodness.

Ruggiero Lovreglio:

And if you see what ChatGPT was one year ago, it had the possibility to be an agent, because it was loading a Python environment, writing the code for you, developing the charts. And as I tell my wife, it's not the language model that developed the charts in ChatGPT; it's because it has agency over a Python environment to do stuff and give back the results to you. So agentic is not new. It's a nice buzzword to do marketing, to sell fluff, but probably it's already two, three years old stuff. Amir probably can tell us more about it.

Wojciech Wegrzynski:

Yeah, I'm very happy to hear that. If we could just quickly round up, what are the popular models? You mentioned ChatGPT, Claude, Grok, Gemini. There's also Perplexity, if I'm not wrong.

Ruggiero Lovreglio:

Yeah, I'm not even mentioning Perplexity or Copilot, because they are not models. They are just AI tools that use these big models as a backbone. Those platforms are just capable of reusing something that is already there through an API and selling you something that is a bit more customized for a specific task, and that's the direction we are taking also for fire protection engineering. And the Chinese one, what was its name? Ah yes, DeepSeek, that was another shaker for the world. Partly because of the results, because it was pretty cool with the thinking capability, but also because they realized that they spent, and that's the official data, much less money than everyone else to train a model. And they were like, oh my goodness, and in China there are a lot of bans and difficulties in finding the advanced graphics cards.

Wojciech Wegrzynski:

But underneath the hood, it's all instances of a very similar concept. It was called Llama, I believe, if I'm not wrong. But, Amir, tell us more about where we are from a technical point of view and how the environment looks right now.

Amir Rafe:

Sure. First I wanted to say we have had AI, I think, since the 1940s, and after that, in 1950, Alan Turing designed a test for how machines work as a human, or how they think as a human. And I think now, in 2025, we are still working on that, because we are looking for AGI, artificial general intelligence. But on the product side, we have a lot of AI models or, as we can say, we have a lot of large language models or small language models. They can do a lot of things in the space of fire engineering or transportation. If I wanted to add just one thing to what Rino said about Perplexity: it has its own large language models, called Sonar. So they have a Sonar reasoning model as the thinking model.

Amir Rafe:

But I think when OpenAI published ChatGPT and the GPT models, after GPT-2.5 and after that 3.5, I think in 2023 if I'm correct, they changed a lot of things, because, you know, we worked with them for the usual tasks, and ChatGPT was a generative AI. It's so important for us because we can communicate with these models and we can ask, and after that the AI agents developed, and a lot of things. So after ChatGPT, the Meta company, as we know it for Facebook or WhatsApp or Instagram, published an open-source model called Llama. They changed a lot of things, because then we found we can work with open-source models, we can use the API for our tools. It's so important for us. Now, in 2025, we have a lot of open-source models, like Llama, and we can use a lot of APIs using OpenRouter or similar products like this.

Amir Rafe:

So I think the open-source models that we have on Hugging Face and, you know, Ollama are so important for us, as researchers or as engineers, to create products, to create a pipeline for our work. So I think it's a good start to talk about how we can use AI in our research or in our field, fire engineering.

Wojciech Wegrzynski:

It's an ongoing discussion, and it's a discussion in the practical aspect already, because everyone is using AI in one way or another. Even if you're doing a Google search today, you are using some sort of AI already in it. One thing that was with us from the start of this AI, GPT-fueled revolution, with the first release of the chatbot, was how deep they can go, how good answers they can make. And the immediate problem that we observed was that, with great confidence, they will give you a really bullshit answer, sometimes blatantly wrong, with 100% confidence that it's right. At this point, for me as a user of these tools, this is absolutely frustrating.

Wojciech Wegrzynski:

Not sure if you've seen the meme of an AI surgeon. It goes: "I've removed part of your body." "Shouldn't it be on the other side?" "Oh yes, you are right, it should have been on the other side. Let me remove it again." That kind of summarized this experience for me. Has anything changed in this hallucination aspect? Have things improved, changed? I have a feeling they go like a sinusoidal curve, they get better and worse, better and worse, and I can't tell which part of the cycle we're in.

Ruggiero Lovreglio:

Yeah, hallucination has been one of the major things that people have been complaining about, especially when you are trying to get something with a reference. You get these beautiful paper titles, and then you go there and you don't find them. Can we prevent it? Definitely yes. The problem is that really general models like ChatGPT have been trained to be good at answering a bit of everything, and not to actually report when the knowledge is not there. But I've been learning a lot from Amir that by using a system prompt, or by using other settings or parameters of the model, one of them is the temperature, that is the creativity level of the model, you can reduce it and make the model more deterministic, so it sticks more to the knowledge it was trained on. I always show an example where I use even a small model, like 3 billion parameters. To give you a reference point, ChatGPT-4 is 1.7 trillion parameters, and GPT-3 was ten times smaller, 175 billion. And now you can run models on your own computer: consider that one gigabyte of RAM will allow you roughly a bit more than 1 billion parameters. So we can run a small language model on our own personal PC, locally. So you unplug all the internet.

Ruggiero Lovreglio:

It's still working, and I start asking things like: who is Rino? Of course it's too small, it doesn't know me. It might know a lot about Isaac Newton, but if I don't make any change in the settings, it's going to start, most likely, telling me that I'm either a mafia boss or a musician from Naples. I don't know which one is worse, just kidding. And it's going to start fabricating a lot of stuff. But if I then add a specific prompt to say: just stay on the facts, don't make hypotheses, and reduce the temperature of the model, the same model will tell you: I don't have information about this context, please ask something else. So it's something that can be modified.

Ruggiero Lovreglio:

You have hallucination when the model doesn't have information about something. If you provide it with all the information that it needs, it will generate an answer. So that's the reason behind the hallucination. And they are probabilistic models. They don't really understand, they don't have consciousness, so they hallucinate. You can tell them: hey, if you're making some assumption, put it forward.

Ruggiero Lovreglio:

The problem with the big models that we use through the cloud is that you don't have access to all these parameters, you don't have the possibility to use system prompts. You can do some clicking on Gemini when you use it with the Google AI Studio, but most of the others are all locked. You don't even know what the system prompt is that OpenAI is using. So it's there, you can't touch it, and that's a big limitation. Hence it's much better to use those models through an API or to use open-source tools. Like Amir will say, just Google Ollama, and you will see that there are so many open solutions that are there for you, for free, with models nearly close to 1 trillion parameters. You can even download them, but good luck finding a computer that can run them, a cluster that can run them, because they are really GPU-intense, and that's another thing that we can discuss later.

Ruggiero Lovreglio:

So check whether your machine can run these models.
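As a concrete illustration of what Rino describes, here is a minimal sketch of talking to a local model through Ollama's HTTP API, with a restrictive system prompt and the temperature pinned to zero. The model tag `llama3.2:3b` and the prompt texts are assumptions for the example; any locally pulled model tag works, and nothing here leaves your machine.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_chat_request(model, system, user, temperature=0.0):
    """Request body for Ollama's /api/chat endpoint.

    A restrictive system prompt plus a low temperature is the combination
    described in the episode for making a small local model admit
    ignorance instead of hallucinating.
    """
    return {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "options": {"temperature": temperature},
    }


def ask_local_model(payload):
    """POST the request to the local Ollama server; no data leaves the machine."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


payload = build_chat_request(
    model="llama3.2:3b",  # assumed: any model tag you have pulled locally
    system="Answer only from established facts. If you don't know, say so.",
    user="Who is Isaac Newton?",
)
# answer = ask_local_model(payload)  # uncomment with the Ollama app running
```

With Ollama installed and running, the commented-out call sends the request to localhost only; there is no cloud endpoint involved.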

Wojciech Wegrzynski:

I want to ask Amir about some of the terminology you have used, because I think for our listeners, the ones who are not that technologically savvy, let's try to clean up some concepts. You've mentioned parameters, you've mentioned temperature, you've mentioned API tools. I would like to go over them in more or less this order. So perhaps, Amir: if Rino is telling me one model is like 3 billion parameters and another is a trillion parameters, what does it mean? What is a parameter in this context?

Amir Rafe:

It's a very good question, because it's a big start to working with AI models on the product side. When we call these GPT, or Generative Pre-trained Transformers, they're created based on some data, and after that they work for some tasks. So when we talk about parameters: a large language model is created based on some data, textual data or images, depending on which model we are working with.

Amir Rafe:

And it's a different concept when we are talking about a dataset created based on reinforcement learning, so that's different from the way other models are created.

Amir Rafe:

So parameters reflect the size of the data the model was created from. When we call a model, for example Gemma from Google, it has 4 billion parameters, so it was created based on data with a size on the order of 4 billion, for example text data, PDFs or books. And we have another technical term here, the context window: a model can read a lot of data, and how much it can read at once is based on the context window.

Wojciech Wegrzynski:

One more follow-up question, if I can. The simple fact that one model has more parameters does not necessarily mean that it's a better model, because if you are going for a specific use, you would probably rather have a fine-tuned model with fewer parameters, very fit to what you're trying to accomplish, than a multi-billion-parameter model trained on random stuff. And I guess when they trained the next instances of Grok or ChatGPT, they probably just let them read the entire internet as a training set.

Ruggiero Lovreglio:

As Amir was saying, with a bigger model it's like the brain is much bigger. It comes with a much bigger context length. The context length is basically the short-term memory of a language model, so it's where you put all the information, all the things that you want it to digest and process. The long-term memory is separate: it's what the model has been trained on, and it can be changed, like we can do some fine-tuning on that. So that's why really big models are really good, because the reasoning capability improves and also the amount of information that can be worked on is much bigger.
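Rino's earlier rule of thumb, roughly one gigabyte of RAM per billion parameters, corresponds to storing each weight at 8 bits. A small sketch of that arithmetic (weights only; context window and runtime overhead come on top):

```python
def estimate_ram_gb(params_billion, bits_per_param=8):
    """Memory needed for a model's weights alone, in decimal gigabytes.

    At 8-bit quantisation this reproduces the 'one gigabyte of RAM per
    billion parameters' rule of thumb from the episode; the KV-cache for
    the context window and runtime overhead are extra.
    """
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


small = estimate_ram_gb(3)          # a 3B model: ~3 GB at 8-bit
quantised = estimate_ram_gb(3, 4)   # ~1.5 GB at 4-bit quantisation
frontier = estimate_ram_gb(70, 16)  # a 70B model at 16-bit: ~140 GB
```

The last line shows why near-trillion-parameter open models are "good luck finding a computer that can run them" territory, while a few-billion-parameter model fits on an ordinary laptop.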

Wojciech Wegrzynski:

And now, could you explain that concept of decreasing temperature and those parameter settings? Because, as Rino said, you don't really play with those at all with your normal chatbots. So what do you mean by altering the parameters of the model as a user?

Amir Rafe:

I think, just about your question: everything depends on your task. Maybe in some general tasks, larger models, for example where we have a 170-billion-parameter model, work better, because they are more complex and they can answer you with more accuracy. But when you are working on the engineering side, when you want to connect the AI to documents or to some specific task, maybe a smaller model works better, because you want to create a specific brain for the AI in a specific area. So it depends on your task.

Amir Rafe:

If you want to ask about the weather, if you want to ask about scheduling something, I think the larger model works better than the smaller one. And I want to just mention another thing about the temperature, as I forgot to say it. Temperature, the creativity level of the model, is so important when you are using the API. It's so important when you are working with a RAG structure, or when you want to extract specific data from a document. It's important to set the temperature to zero, or less than 0.5, because the temperature scale for the AI model is between zero and one. So when we decrease it and move from one to zero, we decrease the creativity of the model, and, you know, it's a way of saying to the model: use our data for answering, not your brain or the data you were created from. So on the product side, temperature is so important, choosing the model is so important.
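Under the hood, temperature rescales the model's raw next-token scores before they are turned into probabilities. A self-contained sketch (the toy logits are made up, not from a real model) shows why low temperature makes output near-deterministic and high temperature makes it more random:

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw model scores (logits) into next-token probabilities.

    Dividing by a small temperature sharpens the distribution (the model
    almost always picks its top token), while a large temperature
    flattens it (more 'creative', more random choices).
    """
    if temperature <= 0:
        # Temperature 0 is usually implemented as greedy decoding:
        # all probability mass on the single best token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]                       # scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.1)   # near-deterministic
hot = softmax_with_temperature(logits, 2.0)    # much flatter, more random
```

At temperature 0.1 nearly all the probability lands on the top token; at 2.0 the three candidates become much closer, which is the "drinking a few shots" effect Rino describes next.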

Ruggiero Lovreglio:

It's like drinking a bit. You will see that if you drink a bit more, you might become more social, more chatty, and tell things that probably you don't even need to. And when you get too many shots, then you start telling a bit too much random stuff. Sometimes you need the model to be flexible in what it says, but if the temperature is too low, then it becomes a bit boring and not able to come out with something.

Wojciech Wegrzynski:

I once spent an entire evening talking in German, and I don't know German, so that must have been a very high temperature. It's possible to recreate this in some experimental setting at some conference, if you want. I guess this is also the reason why sometimes a chatbot annoys me with the language it's using, like those ridiculous, you know, texts where you can immediately say: oh, this is chatbot-generated, no one speaks like that. And probably when you go closer to zero, it just gives you more dense, simple answers, more to the point. But again, if you want something very creative, it's probably good to be higher. You've again used "API", and not everyone knows what an API is. So what's an API?

Amir Rafe:

API, it's... you know, let me keep it simple, because, you know, I'm not a computer scientist, I'm just a user of AI and language models.

Wojciech Wegrzynski:

I don't think many listeners of the Fire Science Show are computer scientists, maybe there are a few. So it's from one fire engineer to another.

Amir Rafe:

You know, the API is an application programming interface. So you can use API keys to use the AI in your code and call the large language model from a server. If you are using a commercial large language model, you can call it from their server, or if you are using an open-source model, you can call it from Hugging Face using the API.

Wojciech Wegrzynski:

So an example would be: I would be writing my code, and instead of my code solving something, I could just write a piece of code that says: go ask this to ChatGPT and post me the answer. More or less? Yes, yes. Okay, good.

Ruggiero Lovreglio:

What you need to do, you can even try with OpenAI.

Ruggiero Lovreglio:

You can go to their API platform, you can generate a key, and you can see that once you have that key, you can put it in many other user interfaces that allow you to use the API of ChatGPT or any other model, and then you can run it directly in this new user interface. And the good news about that is that you don't need to have a subscription. You pay as you go. In fact, you can see how much every model costs in terms of tokens. That's the other keyword that we need to talk about: everything runs on tokens. A token is a set of characters; when you write a prompt, it gets converted into a number of tokens that you send to the server, and the server comes back to you with an answer that is also measured in tokens, and you pay the bills as you go. So if you have a really big company and you don't want to have a subscription for every member of staff, and some of your staff don't use ChatGPT that much, a pay-as-you-go instance would probably be cheaper than using the full subscription.
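The pay-as-you-go arithmetic is simple: you are billed per token sent and per token received (a common rule of thumb is roughly four characters per token of English text). The prices below are placeholder numbers for illustration, not any provider's actual rates, which change often and differ per model:

```python
def estimate_cost_usd(prompt_tokens, completion_tokens,
                      usd_per_million_in, usd_per_million_out):
    """Pay-as-you-go API billing: pay per token sent and per token received.

    The per-million-token prices are illustrative placeholders; check the
    provider's current price list, as rates change and differ per model.
    """
    return (prompt_tokens * usd_per_million_in
            + completion_tokens * usd_per_million_out) / 1e6


# A 2,000-token prompt and a 500-token answer at hypothetical rates of
# $1.00 per million input tokens and $4.00 per million output tokens:
cost = estimate_cost_usd(2000, 500, usd_per_million_in=1.00, usd_per_million_out=4.00)
# cost is $0.004 -- a fraction of a cent per call, which is why
# occasional users can come out far ahead of a monthly subscription.
```

Multiplying one call's cost by a realistic monthly call count is exactly the comparison against a flat subscription that Rino is describing.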

Wojciech Wegrzynski:

Okay, I mean, it's a valid question for very generic use of AI, which happens seldom. You use it, let's say, once a week or twice a week.

Amir Rafe:

Is it even worth it to go for?

Ruggiero Lovreglio:

It's much cheaper.

Wojciech Wegrzynski:

Okay, good. Guys, let's bring this discussion closer to fire safety engineering, because that's what I really wanted to know. I mean, it's fascinating to observe the AI revolution happening in front of our eyes. It's absolutely crazy. But well, let's get it closer to fire safety engineering. Before we started talking, I went to one of my favorite websites, willrobotstakemyjob.com, and this website actually has fire prevention and protection engineers in it, and it gives me minimal risk. So it tells me that there's a risk of 19% that AI will overtake fire safety engineers. It also tells me the average wage of a fire safety engineer is $103,000 a year, which is very reassuring to me. What's so hard about fire safety engineering that we are at minimal risk of being taken over by AI?

Ruggiero Lovreglio:

No, this is a partial answer, because we can accelerate a lot of the work of fire protection engineers. That means a firm will have the capability to run more projects, that competition is going to be higher, and probably a company will need fewer engineers, but engineers capable of augmenting their work using AI. It's helping to speed up a lot of the work, to make more informed decisions, to have much more context when you make some decisions, but you still need the brain of humans to make the call.

Wojciech Wegrzynski:

While I agree, I would rather say that, with the ever-increasing workload, it's just going to allow us to catch up rather than decrease the number of fire engineers needed, which is also a very positive observation. But still, you know, 19%. There are jobs like data scientists which have 95%. There are jobs that are at imminent risk of extinction, you know, and we're not. Why is fire safety engineering not in imminent danger of extinction by AI? What's special about us?

Ruggiero Lovreglio:

Really complex tasks to do, and not much data on which a model could be trained. For programming, it can go on the web, there is so much code you can train a model on. But most of the work done in the engineering field stays in the engineering field. You don't write a post saying "we solved this big project challenge this way", because this is the property of the company. They maintain all this knowledge, because that's what you sell for the next project. What we have been trying to do is increase the model's knowledge with RAG, that stands for Retrieval-Augmented Generation, to make it more and more expert on the safety codes, try to make it write FDS code, and we are still at the baby stage for some tasks, and at a really advanced stage for others. We just finished a paper on that with Amir, and probably he can tell us about it.
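The RAG idea Rino mentions can be sketched in a few lines: retrieve the document chunks most relevant to the question and prepend them to the prompt, so the model answers from your material rather than its general training. Real systems rank chunks with embeddings; the keyword-overlap scoring and the sample documents here are simplified stand-ins for illustration:

```python
def _words(text):
    """Lowercase words with trailing punctuation stripped."""
    return {w.strip(".,?!") for w in text.lower().split()}


def score(query, chunk):
    """Crude keyword-overlap relevance score; real RAG uses embeddings."""
    return len(_words(query) & _words(chunk))


def build_rag_prompt(query, documents, top_k=2):
    """Retrieve the most relevant chunks and prepend them to the question,
    so the model answers from your documents, not its general training."""
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    context = "\n".join(ranked[:top_k])
    return ("Answer using ONLY the context below. If the answer is not "
            "in the context, say you don't know.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


docs = [
    "Sprinkler systems must be inspected annually per the maintenance schedule.",
    "The cafeteria menu changes every Monday.",
    "Smoke detectors in corridors are tested monthly.",
]
prompt = build_rag_prompt("How often are smoke detectors tested?", docs)
# 'prompt' now contains the detector chunk; feeding it to any chat model,
# local or cloud, is the whole RAG pattern in miniature.
```

The same "answer only from the context" instruction, combined with a low temperature, is what keeps a RAG pipeline anchored to the company's own documents.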

Wojciech Wegrzynski:

Yeah, we'll go deeper into that, but I would like to pull on that. Because, if I assume that it has been trained on the entirety of the internet, the internet has a lot of resources on fire safety engineering, and I assume it has been trained on all the books of the world, which includes every book that we would be using. Therefore, I would argue that the quantity argument is insufficient; there must be something else. I believe that the quantity of raw data is sufficient, but perhaps there were insufficient examples of turning this data into solutions, because I think that's also something it needs to train on: how did you solve a problem, an engineering problem, a programming problem, with the knowledge you had? Then it can follow the patterns, the breadcrumbs that were used. I think that's what it needs.

Ruggiero Lovreglio:

Yeah, we don't have a lot of big project solutions publicly available for everyone, so we can't use them to train. Instead, in many other fields, the results are already out there, whatever gets generated. I don't believe that this is a big limitation, because most of these models are mimicking us. They are not yet capable of going out into the real world and trying things. I will say that the models we have have improved, but they are still at the stage where they need to get real experience and hit against the wall.

Wojciech Wegrzynski:

I can give you a fire safety engineering example of how knowledge doesn't mean an answer, and it's from my PhD. So let's assume you're solving an idealized case for smoke flow in a shopping mall. First you have to solve the room of fire origin, which means you have to use some sort of plume model. Let's say you're using the Thomas plume. Then you need to get that smoke outside through the doors, so you need a model for flow out of the doors, and those are scarce.

Wojciech Wegrzynski:

There are some approximations. Then the flow under the balcony, that's completely unsolved. There's a very rough approximation by Margaret Law, there is some stuff in the NFPA documents, but ridiculously rough, and there's my PhD, which is in Polish. Then you have a plume along the wall, which is Harrison and Spearpoint, and suddenly you have five, six models that you have to connect in a very nice symphony, one fitting into another, and there is no single piece of literature that will tell you how exactly to do that for any generic case. So I mean, if AI was able to figure that out, I would be very surprised. If you ask it about the Thomas or Harrison and Spearpoint work, it's going to tell you about it. But to be able to apply that in practice, that's a hell of a challenge.
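The first link of that chain can be put into numbers. This is only an illustration of one hand calculation in that symphony, using the commonly quoted large-fire (Thomas) plume form, mass flow of about 0.188 times perimeter times height to the power 1.5; the geometry is invented for the example, and coefficients and validity limits vary between guidance documents, so treat it as a sketch, not design guidance:

```python
def thomas_plume_mass_flow(perimeter_m, z_m):
    """Entrained smoke mass flow [kg/s] from the large-fire (Thomas) plume
    correlation, m_dot ~= 0.188 * P * z**1.5, with P the fire perimeter [m]
    and z the height of the smoke layer base above the fire [m].

    Coefficient and validity limits differ between guidance documents;
    this is an illustration, not design guidance.
    """
    return 0.188 * perimeter_m * z_m ** 1.5


# First link of the chain Wojciech describes: smoke production in the room
# of fire origin for an assumed 3 m x 3 m fire (P = 12 m), clear height 2.5 m.
m_dot = thomas_plume_mass_flow(12.0, 2.5)  # roughly 8.9 kg/s
```

The point of the example is exactly Wojciech's: this single line is the easy part, and the literature is silent on how its output should feed the door-flow, balcony and wall-plume models downstream for an arbitrary geometry.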

Wojciech Wegrzynski:

I would say, we are not there yet.

Ruggiero Lovreglio:

In practice, that's a hell of a challenge; I would say we are not there yet. That's why we are saying that we don't have general intelligence, because that's something general intelligence would be capable of. Not many humans would have the critical thinking to pull this together. It requires quite a lot of training for us human beings to reach that level. In fact, you did it for your PhD, and a PhD is not like an undergrad degree: you spent three years on that. You can't go to an undergrad and say, hey, answer this, and be certain they know how to answer it.

Wojciech Wegrzynski:

By definition, if you're doing a PhD on something, you are, at that point, probably the most capable human being on the planet at doing this exact thing, unless you're doing a PhD on something that a hundred other people are pursuing, which is very unlikely in fire science, right? So, indeed, yeah.

Ruggiero Lovreglio:

Amir. We can do many steps and probably Amir can tell us more.

Amir Rafe:

You know, as a bit of background, I've been programming for more than 15 years, and I've worked with AI since 2022, with a lot of machine learning and deep learning models. I say this as background because I want to say I believe we can't use AI for generating the solution, because AI can't think right now. For that we are waiting for AGI, artificial general intelligence. And the main problem that we have with AI models is causality. They can't understand the causal relationships between the variables that we have, for example in fire simulation and fire problems, or, in my area, in transportation and evacuation problems. So, for that reason, if we want to create solutions using AI for our problems, we need to inject the causal relationships into the models. We have some thinking models, like OpenAI o3 or DeepSeek R1. They work with reinforcement learning and they tell you, yeah, this is the thinking process, but I think this is not really a thinking process; it is just reshaping the prompt, and, you know, we can shape the prompt to guide the responses. So we can use AI.

Amir Rafe:

I believe that we can use AI to create pipelines, make automations, create simulation inputs like the ones we made before, or, you know, create some consistency when you're using manuals or guidelines. Because, as you know, for example, we have NFPA 101 as a safety code. It has more than 500 pages, with a lot of complicated text and graphs, and it's hard to extract accurate data relevant to our problem. So AI can help us here. AI can parse these documents, interpret plots and tables, and, using the RAG system we mentioned, we can extract accurate data from them. So AI can help us in this field, but I believe AI can't help us to generate the solution.

Wojciech Wegrzynski:

I'll ask you a question, and I completely do not expect that you know the answer; I'm very happy if you tell me what you feel. When they were training stuff like ChatGPT or Grok, there's this library of NFPA codes, as you've mentioned. Do you think it all went into the machine of learny-learny, or did they somehow stop and say, no, we're not touching it, it's protected content? Because I have a feeling everything went in. And I wonder, if I ask the question about a specific aspect of a code, to what extent is it giving me an answer directly from the code, and to what extent is it giving me an answer based on some comments on Facebook from five years ago, from some random guy?

Ruggiero Lovreglio:

It's important to note that sometimes the answer can come from something that was trained on someone talking about the code rather than the code itself, so the information from the code has possibly been filtered through someone else.

Ruggiero Lovreglio:

Hopefully it has not been filtered in the wrong way. A good way to test it is to ask: okay, can you quote the code that you're using, once or twice? And I believe that whoever is developing those models has also put in some protections, because there are lawsuits against OpenAI, with people claiming, ah, they used my book without even asking for permission. And I believe that now you won't even get an honest answer from the model if you say, can you please give me a quote out of this book?

Wojciech Wegrzynski:

I think the filtration happened at the response layer, not at the training layer. I think they just fed it anything they could, and now they're filtering the responses to humans so as not to create lawsuits.

Ruggiero Lovreglio:

That's why data injection and RAG are the way forward for a specific field like ours. What is data injection? We talked about the general concept that the model has a long-term and a short-term memory. Data injection is when you give it all the context you want the answer based on in one prompt.

Wojciech Wegrzynski:

Okay.

Ruggiero Lovreglio:

Basically, you ask: tell me about this or that, but here is all the knowledge that you need to use to generate the answer. It works really well with a really big short-term memory, so you can inject all the information, and the model will try to generate answers based on it. Can you do it easily? Yeah, you can do it with ChatGPT. If you create your own GPT, it's basically using data injection: whenever you ask something, you tell the model, use all this additional knowledge to give me an answer in this specific context. RAG is not that accessible if you are at the ABC of generative AI, and I guess Amir will do a much better job of explaining RAG.
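For readers who want to see what the data injection Rino describes looks like in practice, here is a minimal Python sketch; the prompt wording and the document texts are illustrative, not taken from any real tool:

```python
def build_injected_prompt(question: str, documents: list[str]) -> str:
    """Data injection: pack all reference material into one prompt so the
    model answers from the injected context (its 'short-term memory')."""
    context = "\n\n".join(
        f"[Document {i}]\n{doc}" for i, doc in enumerate(documents, start=1)
    )
    return (
        "Use ONLY the documents below to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


docs = [
    "The Thomas plume correlation estimates mass flow of smoke above a fire.",
    "Balcony spill plumes require a separate entrainment model.",
]
prompt = build_injected_prompt("Which model estimates smoke mass flow?", docs)
print(prompt.startswith("Use ONLY"))  # True
```

The limiting factor is exactly the one discussed above: everything has to fit in the model's context window, which is why a big short-term memory matters.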

Amir Rafe:

You know, you mentioned hallucination before. That was the main problem we faced during 2023: the models produced wrong references and wrong answers. So after that, they developed methods to make the AI find answers based on real documents. They developed RAG, Retrieval-Augmented Generation. We tell the AI: use this real document to find answers.

Amir Rafe:

For me, we have two solutions to create an AI specialized in an area, for example in fire engineering: we can fine-tune a large language model, or we can use the RAG method. Based on my experience, the RAG method works better when we are dealing with manuals and guidelines, because we inject the data and the real documents into the AI. In the process of fine-tuning, on the other hand, you should provide some Q&A to the model and then fine-tune the model on that data. For this, I should say, for the first time we produced a big Q&A dataset for evacuation and fire safety, more than 24,000 questions and answers with the sources. Everyone can access it on Hugging Face.

Amir Rafe:

When we evaluated the fine-tuned model and the RAG system, we found that RAG works better, because with RAG you can test different methods, for example different retrieval methods, or inject the documents directly into the AI, and you can set the temperature for the model. So you can say: just use this document, without any creativity; only use your ability to combine texts to generate the responses. So for extracting accurate data, I think, based on my experience, RAG works better than the fine-tuned model. We fine-tuned the large language model Gemma 3 for fire engineering and fire science, so everyone can access it on Hugging Face and, you know, load it and use it.
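The retrieval step at the heart of RAG can be sketched in a few lines. This toy version ranks chunks by word overlap; real pipelines use embedding similarity instead, but the flow is the same, and the code chunk texts here are invented for illustration:

```python
def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank document chunks by word overlap with the query and keep the best.
    In a full RAG pipeline only the retrieved text is handed to the model,
    typically at a low temperature so it sticks to the documents."""
    clean = lambda text: set(text.lower().replace(".", "").split())
    query_words = clean(query)
    scored = sorted(
        chunks,
        key=lambda c: len(query_words & clean(c)),
        reverse=True,
    )
    return scored[:top_k]


chunks = [
    "Exit access shall be at least 1120 mm wide.",   # illustrative, not NFPA text
    "Sprinkler spacing depends on hazard classification.",
    "Travel distance limits depend on occupancy type.",
]
best = retrieve("how wide shall the exit access be", chunks, top_k=1)
print(best[0])
```

Setting the temperature near zero, as Amir describes, then makes the generation step reproduce the retrieved text faithfully rather than paraphrase creatively.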

Amir Rafe:

That model is a lightweight model, so you can use it. But in the process of RAG and fine-tuning models for extracting data from documents and treating hallucination, the main thing is privacy and cost. It's a trade-off between privacy and cost. If you don't have any concerns regarding copyright or things like that, you can use a commercial model, like Claude or ChatGPT: you can send the documents to the server and get responses through a RAG structure. But if, based on the data and the documents that you have, you do have concerns regarding copyright or privacy, you can use local models without, you know, connecting to the cloud servers.

Wojciech Wegrzynski:

Yeah, that's exactly where I wanted to get eventually in this episode. I mean, we're at the point where ChatGPT could help you a lot with your engineering work. Imagine you're writing a fire strategy and you would like to write some general summary of the building you're working on. You could technically upload the whole documentation of that building into ChatGPT, which opens the possibility that in a few years you will end up in jail; you cannot exclude that. Unfortunately, it's not something I would recommend, to send your confidential data to ChatGPT. The same if you're writing a research grant: you should really not upload your whole research grant into ChatGPT to give you a summary because you're too lazy to write it.

Wojciech Wegrzynski:

So how do we do it? I understand, observing Professor Rino over the internet, that there is this magic way where you just set up your own instance of an AI assistant of some sort that works locally on your machine. Rino mentioned you don't even need the internet; it's going to work. So tell me: basically, you become Google or Meta? You build up your own language model and use it for your needs? How does that work?

Ruggiero Lovreglio:

The good news is it's for free, because all the software that you use is open, so you just need to download it. The true answer is it's not for free, because you need a really good graphics card, especially if you want a usable model. Just to give you some context: personally, I believe that local models below 10 billion parameters are usable but not that great at generating answers; they can be not great, depending on the task. Going above that means you need a graphics card with more than 20 gigabytes of dedicated RAM, and so it's getting expensive. That's why the price of graphics cards is going up. We had a spike when Bitcoin was booming, then the system collapsed because people stopped mining coins, and now the same graphics cards we were using to play all our favorite video games in VR can be recycled for AI. So what you can do is just install on your computer one of the different user interfaces, if you want to have an experience similar to ChatGPT.

Ruggiero Lovreglio:

I put a lot of my tool links down below: LM Studio, Page Assist. My favorite is Open WebUI. It's really flexible, but it comes with a lot of headaches to install. And then you install one of those open models on your local PC.

Ruggiero Lovreglio:

My advice is to go to the Ollama list and pick something that will run on your graphics card. You can even attempt to run a really big model. For instance, on my PC I can easily install a 70-billion-parameter model, not on the dedicated RAM but on the traditional RAM that the CPU uses, and you see that the 70 billion parameters start to really crawl. So it becomes like, okay, this is not usable, unless I want an answer where I ask a question, go to sleep, and the morning after I read the answer. That's what it becomes. So if you want something usable, meaning 10-20 tokens per second, so it looks like a ChatGPT answer, then you need a model that fits on your graphics card.
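The hardware sizing Rino describes can be put into a back-of-the-envelope formula: VRAM is roughly parameter count times bytes per parameter, plus some overhead. The 0.5 bytes per parameter assumes 4-bit quantization, and the 20% overhead for cache and activations is a ballpark assumption, not a precise figure:

```python
def estimated_vram_gb(params_billions: float,
                      bytes_per_param: float = 0.5,
                      overhead: float = 1.2) -> float:
    """Very rough VRAM estimate for running (not training) a local model.
    0.5 bytes/param ~ 4-bit quantization; 2.0 would be fp16. The 20%
    overhead for KV-cache and activations is an assumption."""
    return params_billions * bytes_per_param * overhead


for billions in (4, 8, 70):
    print(f"{billions}B params, 4-bit: ~{estimated_vram_gb(billions):.1f} GB VRAM")
```

This is why a 4-billion-parameter model fits a 4 GB consumer card while a 70-billion-parameter model spills into CPU RAM and slows to a crawl.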

Wojciech Wegrzynski:

But do you have to train it from scratch? Does it come pre-trained with some abilities? Like, what's the baby stage of that software? You install it; what can it do once you install it? I guess nothing, or can it?

Ruggiero Lovreglio:

Once you have the model, you can start using it straight away.

Wojciech Wegrzynski:

What does it mean, once you have the model?

Ruggiero Lovreglio:

Once you have a model... you install the software, and then, either through the software or in other ways, you install the model that you want to use in this software. The software is just a user interface for you. But where do I get the model? The model is free from Hugging Face or the Ollama list, which will let you download whatever model you want.

Wojciech Wegrzynski:

And that model is already, like, responsive, pre-trained, able to work?

Ruggiero Lovreglio:

It's ChatGPT-style: it's already answering all the questions, and then there are different customizations that you can add, like RAG, and you can change the temperature and system prompts. You have free control over how the model is running. And the billions of parameters, do you choose that when downloading the model? It's related to the graphics card that you have. If you have a graphics card with four gigabytes of dedicated RAM, I would suggest not going above four billion parameters; otherwise it's going to run on the other RAM, the one that the CPU uses, and it's going to be incredibly slow. So of course you can think, I'll run the biggest model ever, but most likely your computer won't be able to do it unless you have a really dedicated graphics card or a cluster.
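Since temperature comes up repeatedly in this conversation, here is a small sketch of what the setting actually does to a model's token choices: the standard softmax-with-temperature, applied to toy scores for three candidate tokens:

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Low temperature sharpens the distribution toward the top token
    (good for extracting facts from documents); high temperature flattens
    it, making the output more 'creative'."""
    t = max(temperature, 1e-6)  # temperature 0 would divide by zero
    scaled = [x / t for x in logits]
    peak = max(scaled)          # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # toy scores for three candidate next tokens
cold = softmax_with_temperature(logits, 0.1)
hot = softmax_with_temperature(logits, 2.0)
print(round(cold[0], 3))  # nearly all probability on the best token
print(round(hot[0], 3))   # probability spread much more evenly
```

This is why RAG setups for code-and-standard lookups run near temperature zero: the model then almost deterministically picks the most supported continuation.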

Wojciech Wegrzynski:

And when I want to teach it some fire safety stuff, let's say I have 50 papers of my own and I want it to be extremely familiar with my papers, what do I do?

Ruggiero Lovreglio:

You have two options, right? The first is RAG. That means you process the data into a semantic database. Semantic database means that when you ask about A, the model tries to extract, from the existing knowledge, information close to the concept of A, semantically close, and you can control how close. Instead, if you do the fine-tuning, as Amir was saying, you use all this information to create questions and answers based on these contents. So there is a lot more processing of the data, but then you have a model that is already trained on this specific information, so it will run more freely. With the RAG prompt, instead, you will see a lot of delay, because the prompt gets converted into a semantic meaning, we go into the vector database, get all the information semantically related to what you asked, bring it back to the general model, and then it gives you an answer. So it might take 10-15 seconds.
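The "semantically close" lookup described here is, at its core, a nearest-neighbour search over embedding vectors. A minimal sketch with hand-made two-dimensional toy vectors follows; a real system would get its embeddings, typically hundreds of dimensions, from an embedding model:

```python
def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest(query_vec: list[float],
            store: list[tuple[str, list[float]]]) -> list[str]:
    """Return stored texts ranked by semantic closeness to the query vector.
    'store' plays the role of the vector database in a RAG pipeline."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked]


# Toy 2-d 'embeddings'; the texts and vectors are purely illustrative.
store = [
    ("smoke plume entrainment", [0.9, 0.1]),
    ("evacuation travel speed", [0.1, 0.9]),
]
print(nearest([0.8, 0.2], store)[0])
```

The 10-15 second delay Rino mentions comes from running this search over thousands of chunks, then stuffing the winners back into the prompt before generation starts.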

Wojciech Wegrzynski:

I'm patient enough. I downloaded movies from Torrent and it took a week. I'm patient enough for this. My children are definitely not patient enough for this, but we'll see where it gets in some years.

Ruggiero Lovreglio:

Yeah, we came from the internet, good.

Wojciech Wegrzynski:

So what level of utility can you get by fine-tuning that? Like, can you expect this to be truly supportive of fire safety engineers' daily work?

Amir Rafe:

Just before that, I should share some good news: Ollama yesterday published a desktop app for Mac and Windows.

Amir Rafe:

So the easy way to use local models is to go to Ollama, download the software, and choose the model that you want. For the first step, I propose you use Phi-3 from Microsoft, 3.8 billion parameters, so you can run it on almost any computer, or Gemma 3, 1 billion or 4 billion. That's the first way you can work with local models. On the fine-tuning side, it's a little hard to describe the whole process, you know, as an engineer rather than a computer scientist. But imagine you have a brain that's trained on general data, and you want to inject some specific data so it generates, for example, responses regarding fire science without using external documents. So when you fine-tune a 4-billion-parameter model for fire engineering, you don't need to upload any documents to it; instead you should create a database of questions and answers, real questions and the expected answers that you want to get from the large language model.

Amir Rafe:

So you should create this database, then inject it and train the model on it. As I said before, we created a database of 24,000 questions and answers and then fine-tuned the model. It's a little hard, because as the expert who wants to fine-tune a model, you should read the documents, extract the relevant questions, create good answers for those questions, and build the dedicated database from them. There are some methods to create this database. You can use active learning, you can use AI to generate questions and answers; there are tools for that. But based on the results of my dissertation, which I defended two days ago, the RAG system works better than the fine-tuned model for the evacuation problem. Though, given the GPU that I had, I used Gemma 3 with 4 billion parameters; maybe if I fine-tuned a bigger model, for example 12 billion parameters, the fine-tuned model would work better than RAG.
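The question-answer database Amir describes is typically stored as JSON Lines, one pair per line. A minimal sketch follows; the field names vary between fine-tuning frameworks and the example pairs are invented for illustration:

```python
import json


def qa_pairs_to_jsonl(pairs: list[tuple[str, str]]) -> str:
    """Serialize (question, answer) pairs into JSON Lines, a common input
    format for fine-tuning toolchains. Field names differ by framework."""
    return "\n".join(
        json.dumps({"question": q, "answer": a}) for q, a in pairs
    )


pairs = [
    ("What does RSET stand for?", "Required Safe Egress Time."),      # illustrative
    ("What does ASET stand for?", "Available Safe Egress Time."),     # illustrative
]
dataset = qa_pairs_to_jsonl(pairs)
print(len(dataset.splitlines()))  # 2
```

Scaling this up to the 24,000 pairs mentioned above is where the real effort lies: reading the source documents and writing expert answers, not the serialization itself.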

Ruggiero Lovreglio:

I should say that we need to acknowledge also the work done with Mike Spearpoint and Pete Lawrence from Greenwich. We have done a lot of work asking fire engineers: okay, this is the answer that you get out of the mouth of our RAG, what do you think about the quality of the answer? And in the paper that I published, they were saying, no, not that good; oh, actually, this solution is good. And we were also forcing the model to provide a comment on the reasoning behind its results. So it's good to see that for some tasks they were like, that's not bad.

Wojciech Wegrzynski:

I mean, if you're talking about a Spearpoint response, that's a very high evaluation. If he tells you it's not bad, that's probably somewhere in the top five percentile.

Ruggiero Lovreglio:

I trained Amir on Mike as well, before working with him.

Wojciech Wegrzynski:

I am not sure if the world is ready for Spearpoint GPT, but it would definitely be very, very interesting. Okay, guys, I'll try to put some papers in the show notes. I see Amir's paper on enhancing occupant evacuation simulations, the one that you were referring to. The last year paper was which one?

Amir Rafe:

No, it was the paper that we worked on together, the one you mentioned. This year we presented it at ASCE.

Ruggiero Lovreglio:

Enhancing occupant evacuation simulation using LLMs and retrieval-augmented generation.

Wojciech Wegrzynski:

Fantastic. Good, good, good, guys. It's great to catch up on modern technology, and I think we've given the fire safety community a good wrap-up of what's been happening in the last years and what the technology looks like. We've not explicitly spoken about the future and how exciting it is, but it's kind of obvious where we're heading, and this general AI, AGI, is going to be very, very interesting if it eventually happens. Even getting close will create some very interesting times, and let's hope the ripples that scatter through the world are not too damaging to fire safety engineering and actually enhance our capability to deliver great engineering. I would love for AI to take the very boring parts of my job and allow me to do the creative and interesting parts, though artists thought the same, and then Midjourney came, so I'm not so sure. Any final words of encouragement, or perhaps warnings, that you want to share at the end, Rino?

Ruggiero Lovreglio:

I believe that we need to start integrating these topics more and more within fire engineering courses. It's like not having sex education in school and figuring it out yourself: there are a lot of risks if you don't know all the possible things, bad and good, that can happen when you just go wild and explore by yourself. Like we were talking about copyrights: you don't want to end up uploading the SFPE handbook to a language model when you don't have the rights to do it. You have the right to own a copy of it and use that copy; that doesn't mean you have the right to resell the SFPE handbook or give it to another company to make use of it. So we need to start discussing all the good things and the bad things, and be open about it.

Wojciech Wegrzynski:

Yep, good. Amir, anything from your AI journey that you'd like to share with people?

Amir Rafe:

I want to just say a general thing to engineers: don't be anti-AI engineers, because AI is a tool that can make our work easier, so we can use it, but we should maintain our creativity as humans, as researchers and as engineers. For example, I don't like to generate text or reports using AI, because I want them to come from my brain and my reasoning, not from the AI. AI is just a tool, like a programming language: we can use it to make our work easier. So don't be anti-AI engineers.

Wojciech Wegrzynski:

Use these tools and enjoy them. Yeah, and I can second that, and I resonate with what you said, Rino, about ethical, efficient and just plain useful use of AI. I wonder, are you working on that at Massey? You're an academic, you say AI should be integrated in teaching. Are you integrating AI in teaching?

Ruggiero Lovreglio:

Pushing, pushing, pushing, and sometimes universities are too slow to reshuffle themselves to integrate new knowledge. I believe it is going to be a good selling point for many degrees if we start properly teaching people the basics: what is AI, what can they use, and how to optimize what used to be done with Excel or in a simple manual way. That will give people a much higher edge. And I'm afraid some universities are too afraid, too scared, to understand the technology. I understand where they're coming from, because they don't want to be sued, they don't want this and that, but I am one of the people here at Massey creating problems.

Ruggiero Lovreglio:

I say: hey, let's go, hey, let's do it, I can do it tomorrow, guys.

Wojciech Wegrzynski:

And Amir, with what you said, I think opposing AI today is like opposing sewing machines or steam engines in the 19th century. Another parallel that comes to my head is my maths teacher telling me to learn how to multiply in my head, because I'm not going to carry a calculator with me my entire life. Right, haha, joke's on you. So the future is interesting; we're living in very interesting times. I've heard that that's an old Chinese curse, so I hope the times do not become too interesting. Thank you, guys, for coming to the Fire Science Show. We at least made this hour of the day of our listeners more interesting, I hope. Thanks, guys, thank you so much, thank you.

Wojciech Wegrzynski:

Thank you so much, and that's it. Thank you for listening. Since we recorded this episode a week ago, there have been at least three new releases of large language models; every big player in the market is releasing a new model this week, it seems. What a crazy time to be alive, and it becomes quite a scattered environment really. Even within ChatGPT and OpenAI you have so many different models to pick from. I think we have to become more and more competent about which models are better for particular uses, and that's something you can get through training and trying things out.

Wojciech Wegrzynski:

One important take of this episode, and truly the reason why I wanted this episode in the Fire Science Show, is that they've shared some concepts about how to set up your own LLM running on your local computer. It's not going to run on a very basic laptop, one that you just use to write, but if you have something akin to a gaming PC or a gaming laptop, then chances are you have a good enough graphics card to run an LLM. And if you have that capability, I think it's quite interesting to try and set up your own instance of a large language model. With that instance you can play privately: you can upload your reports, your notes, your data, and see what the model can get out of them. By doing that you can start training, you can start improving, and eventually perhaps you can create something that actually helps you with your workflows. This is something that has been going through my head for a long time. I had many discussions with many colleagues about how to do it, and usually we've stopped at the point of privacy: you really cannot upload projects to the internet. Another thing that I'm curious about is the capability of this AI to read technical drawings of buildings. I'm not sure what it will do with AutoCAD drawings; I'm not sure if it can take a Revit file and understand it.

Wojciech Wegrzynski:

Perhaps a Revit-integrated AI that could run on your own computer, with privacy protection, is something that could make a big difference in the industry. Well, okay, I'm pretty sure it's being developed by someone, because it's not the most innovative idea of the year. I'm pretty sure someone's working on that, and I hope they succeed. If you're working on it, I hope you succeed and we get some interesting AI-supportive tools for our engineering workflows. That would be it for today's episode of the Fire Science Show.

Wojciech Wegrzynski:

A link to Amir's paper is in the show notes. You can use that to start and see where you get with your own large language model journey, on your own local instance, with good privacy, and without sending stuff to Sam Altman, which you probably do not want to do. Thanks for being here with me in this human-arranged, human-recorded and non-AI-created podcast, delivering imperfect human content to you every Wednesday. Yours truly, Wojciech. Cheers, see you here next week, Wednesday, for another fire science episode. Bye-bye, bye.