Code Riff
come see how ai is used for human flourishing and to have more fun.
we interview ai-pilled practitioners using ai to enhance their work and lives.
hosts:
- Eric Tan - vibecoder https://www.linkedin.com/in/erictisme/
- Yaohong Ch'ng - senior engineer, founder of superuser hq. https://www.linkedin.com/in/yaohongchng/
subscribe and join us on our journey of figuring things out together.
episode takeaways: https://coderiff.substack.com
youtube: https://www.youtube.com/@CodeRiffAI
Global FMCG Researcher Uses AI for Ingredient Analysis
Cosmetic scientists from FMCG companies spend days clicking between PubChem, INCI Decoder, and supplier sites just to figure out what's in one competitor product. Aidil is a cosmetic chemist. He does this manually in Excel and through hundreds of searches in a textbox.
We tried to solve that task with Claude Code - live, in 90 minutes. 10 products decoded, and a similarity matrix.
What you'll hear:
- Why a trained scientist still loses a day per product to manual ingredient lookups
- Aidil dictating what the tool should do - in plain English, no code
- Building a Python script live that pulls CAS numbers, molecular formulas, and chemistry from PubChem
- Tagging each ingredient's cosmetic function with INCI Decoder (surfactant, emulsifier, thickener)
- The messy bits: synonyms, rate limits, caching, and a PubChem API that doesn't have pKa
- A similarity matrix that clusters Pantene with Olaplex, and Head & Shoulders on its own
- Aidil's payoff: "it has the potential to just take a few seconds"
Hosts: Eric Tan (non-technical builder) & Yaohong Ch'ng (Founder, Superuser HQ, ex-Stashaway head of Data)
Guest: Aidil Juhari - ex-FMCG R&D scientist, cosmetic chemist
Got a problem you want us to solve live? Fill out the form:
https://forms.gle/DSyLzPAoR6x2M4Np9
We write here: https://substack.com/@coderiff
Join our community: https://chat.whatsapp.com/Dmp5eEEsAZhJTB6LjcIG3c?mode=gi_t
LEARN ALONG:
- FMCG: Fast-Moving Consumer Goods - shampoo, toothpaste, detergent. Aidil's world.
- INCI: The official scientific name for every cosmetic ingredient.
- CAS number: A passport number for molecules - unique ID for every chemical.
- PubChem: Free public chemistry database run by the US government.
- API: How one program asks another for data, instead of copying from a webpage.
- pKa: A chemistry number for how acidic or basic a molecule is. PubChem's standard property API doesn't return it, which bit us.
Connect with us:
- Eric on LinkedIn: https://www.linkedin.com/in/erictisme/
- Yaohong on LinkedIn: https://www.linkedin.com/in/yaohongchng/
- Superuser HQ: https://superuserhq.com/
- Email: code.riffs.ai@gmail.com
Code Riff - messy real-world problems, solved with AI, so you can too.
Researchers in my previous organization spent a lot of time doing things the manual way. One study for 30 ingredients will take across two to three weeks. We are saving two to three weeks of lab time by just looking at the ingredient analysis alone.
Yaohong:better than human. I mean, I wouldn't be able to ask you this kind of question.
Aidil:This is quite eyeopening for me la and will take some time to digest, but I'm very happy with the results to see that something that will have taken a long time has the potential to just take a few seconds.
Eric:Hi everyone. I'm Eric, host of the Code Riff Podcast. I'm a vibe coder, and I've been coding for the past one and a half years using English. And with me, I have Yaohong, uh, founder of Superuser HQ, who is also exploring the frontier of using AI to transform organizations. And with us, we have our very special guest, uh, Aidil. Aidil is an ex-scientist, uh, from a global FMCG company, and we're super glad that, uh, you could join us today. Aidil, uh, thanks for joining us.
Aidil:Very happy to be here. Um, I saw what you've done in your previous episodes, so I'm excited to see what we can build together today.
Eric:Yes. All right. Yeah, it is really nice, to get a wide variety of, different industries. And today we are going to dive into the world of an FMCG scientist. Aidil. Could you just explain a bit more about the problem that you face and how you hope to approach this, problem
Aidil:okay. Well, day to day our product researchers, we have to conduct competitive landscape of, other products out there. We want to understand how these ingredients are being used. What is the order of these ingredients, because that determines how much of it is inside. And lastly, also, how do these ingredients change with time as well? So to do this currently, the researchers in, my previous organization, they spent a lot of time doing things the manual way. So they would have to gather information from a chemical database that would be the primary source, so it would take about one whole day to compile all of this information into, one sheet because we have to crosscheck and get several other information as well from different websites. And I think that's what clicking, clicking from one website to another, to keep searching for all of all of this, putting it into that tiny search bar. That's what really stops us from being faster. Yeah. So
Eric:so this is,
Yaohong:I'm curious Aidil, is there like a database available out there that, because I'm sure you're not the only ones searching for this kind of stuff. Right.
Aidil:so there are, so there are databases available. Uh, they are available through APIs, so we have to create a program to call it, uh, to this.
Yaohong:API. Okay. Need to pay? No. free. Yeah. Okay.
Aidil:I have a table of which one we need to pay. Which one we, uh, is behind the paywall. Yeah,
Yaohong:Okay. Interesting. All.
Aidil:yeah,
Eric:Maybe if I were to just share my screen, and Aidil, you can talk through a bit more about what you're thinking about. Uh, I'm just sharing my screen right now. And then there were two sheets, right, that you sent me? Do you wanna go through one of them first or?
Aidil:yeah. So, um, the first one is a product ingredient list. Um, on the left you see the actual product in itself. Um, then on the right most column is the individual ingredients, um, that you get, uh, when you copy and paste this from a website. Yeah. So if
Eric:Reminds me of, like, when you eat food as well. Like you, I mean, like, if you wanna be healthier, you look at, you want the first one to be like healthier, right?
Aidil:Then you need to also see whether this, uh, particular ingredient is a, is a sneaky way to hide that actually there's a sugar, for example. Uh, what, uh, sugar cane extract. Actually it means sugar.
Eric:But, okay. Sorry. Back to this ingredient list for this, uh, Nizoral Anti-Dandruff shampoo. There's product on the left and then ingredient on the right, and then this continues for like many rows.
Aidil:For many rows, right? Yeah. So this is typically what we see and we would start to compile it as well. The step before this is that we have to copy paste it from the individual websites, and we need to go to the individual company websites as well, which will give us the full list of the ingredients.
Eric:Okay. So how is that related to this example data or is not, is it related? Yeah.
Aidil:yeah. So then from this, uh, list of ingredients, right? How it goes to this, uh, sample data is that from each ingredient that we have, we are gonna have to look at the, um, related chemical properties and the chemical structures. So all these weird, weird numbers that you see over here, uh, are actually the molecular formula, which would be helpful, uh, to the, uh, product researchers as well as the formulators to understand what are the type of behavior that the ingredient will exhibit. Okay. Uh, so that's for that. Um, and one thing as well, um, I included,
Eric:Uh, when you talk about what it will exhibit, what do you mean by what behavior it exhibits?
Aidil:So basically it will tell us how the, it will tell us how the ingredient behave. Like whether it's actually more thick, more thin, whether it is actually. Um, whether it's actually acidic.
Eric:Okay.
Aidil:uh, all of these things tell us, um, give us an indication what type, what is the function of the ingredient inside the formula itself.
Eric:And you get that through the, like example, the molecular molecular weight and all this kind of stuff. Is it
Aidil:Correct? Yes. So all of these numbers and additional information, all of these numbers, they're also helpful. Like it tells us whether this ingredient, does it dissolve in water or not? Yeah. If it dissolves in water, then the formula will be very easy to make. Like, if it doesn't dissolve, then we have, we have to find out how the competitor did it la, yeah.
Eric:So if I would've summarize the steps, there are three main steps. So you basically get a product from, your boss or something, and then you need to,
Aidil:boss would kill me if I ask, if I asked. Sorry. So we have to go to the company website itself
Eric:Okay,
Aidil:Yeah. Get the list of
Eric:so you go to your company website, you get the product, right? Then you get the list of ingredients. Um, then after that, from the list of ingredients you look at, uh, related, you extract out the, the chemical properties through sources. And in this case, the sources are called Pubchem, P-U-B-C-H-E-M, and INCI decoder, INCI decoder.
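The lookup step Eric summarizes here can be sketched in Python. This is a minimal sketch, assuming the public PubChem PUG REST endpoint; the function names are hypothetical, and the sample response is hand-written and trimmed to the shape PUG REST returns for a property request.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

def property_url(name, props=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL that looks a compound up by name."""
    return f"{PUG_REST}/compound/name/{quote(name)}/property/{','.join(props)}/JSON"

def parse_properties(payload):
    """Return the first property record from a PUG REST response, or None."""
    rows = payload.get("PropertyTable", {}).get("Properties", [])
    return rows[0] if rows else None

def fetch_properties(name):
    """Network call (raises on HTTP errors); not exercised in the example below."""
    with urlopen(property_url(name), timeout=30) as resp:
        return parse_properties(json.load(resp))

# Hand-written sample in the shape of a real response (glycerol).
sample = {"PropertyTable": {"Properties": [
    {"CID": 753, "MolecularFormula": "C3H8O3", "MolecularWeight": "92.09"}]}}

print(property_url("sodium laureth sulfate"))
print(parse_properties(sample))
```

Keeping URL building and response parsing separate from the network call makes it possible to check the parsing offline before spending any requests.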
Aidil:Yeah.
Eric:Okay. And you need to do this for how many products.
Aidil:uh, um, single competitive scanning, it would be 30 products plus minimum. It can go up to 60-100 depending on the scale of the study.
Eric:Okay. So this is part of a broader study that you typically do in your workday.
Aidil:This will usually be step zero of, uh, a lot of other steps to do. And if we look at it, um, the information itself is very useful for us, but the process of compiling it makes us wonder why there isn't a faster way to do it for now.
Eric:Okay.
Yaohong:I, I'm curious if you have done this for a while. Uh, isn't there some kind of list somewhere already that contains some of this information?
Aidil:Yes. Um,
Yaohong:So they're like cross reference or
Eric:F, control F.
Aidil:Yeah, you're right. So there is, um, usually most of the time this would be. Uh, if, if you don't mind me diving a bit deeper because you asked this question. So it will be helped by, um, expert formulators like, or senior formulators. Whereas, um, that job role for a formulator is sometimes very different from the job role of a product researcher who has to do the competitive scanning.
Yaohong:But
Eric:but
Yaohong:glycerine is a very common ingredient.
Aidil:just an example. So the reason why I use Glycerine, because it can also be found as Glycerol, uh, as a salt. So it's a very, um interesting case because, um, you have a lot of different synonyms for it, and that is what I wanted to see. Yeah. Can we actually list down what is the correct listing from a synonym of glycerin for glycerol, uh, things like that.
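The synonym problem Aidil describes can be sketched as a small lookup table. This is only an illustration: the alias table is hypothetical and hand-kept, and a real one would more likely be built from a database's synonym list or a formulator's reference sheet.

```python
# Hypothetical alias table mapping label spellings to one canonical name.
ALIASES = {
    "glycerin": "glycerol",
    "glycerine": "glycerol",
    "glycerol": "glycerol",
    "aqua": "water",
    "water": "water",
}

def canonical_name(raw):
    """Map a label spelling to its canonical name; None if it is unknown."""
    key = " ".join(raw.lower().split())  # lowercase and collapse whitespace
    return ALIASES.get(key)

for label in ["Glycerin", "GLYCERINE", " glycerol ", "Mystery Extract"]:
    print(repr(label), "->", canonical_name(label))
```

Returning None for unknown names, rather than guessing, leaves room to flag those rows for a human to check.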
Yaohong:Because if, if I was the one doing this, let's say this is my job every day, I would probably have a spreadsheet containing some of things that are quite common already. Then I can just copy and paste from there. For some of this, right?
Aidil:Um,
Yaohong:Or they're doing it from scratch all the time,
Aidil:no.
Yaohong:right?
Aidil:not so common would be, uh, things like your, uh, botanical extracts, which, uh, depending on the flavor of the flavor of the day, what the competitors, um, are looking at, they will change from time to time. Um, so the type of botanical extracts that the competitors use will be very different from what they were using, um, from the time of botanical extracts that they're using.
Yaohong:Oh, you mean the products can change in terms of the ingredients?
Aidil:Definitely,
Yaohong:the, the products will change over time, even if the same, same, same brand, same product.
Aidil:Uh, same brand. Same brand. That it will sell different variants also. Yeah. So you guys need to pay more attention to your wife's, uh, to your wife's, um, bathroom shelf. You'll know that the reason why, um, shampoo they can make money, uh, is because women, they will keep changing the flavor, the fragrance every time. Yeah. So it's always, uh, so it's always a game of, um, so it's always a game of finding the, the right fragrance, the right ingredient to get people to start buying. Or you'll see their bathroom shelf, uh, they will use halfway and then they will go and buy another bottle. Why? It is Because they want
Yaohong:Explains all the bottles in my, in my bathroom,
Aidil:Exactly. Yeah.
Yaohong:My shampoo is always the same.
Aidil:yeah. This is, um, really deeply into the consumer behavior. Yeah. So that's why the type of, uh, the, the type of thickener that we use is also very, very, uh, is also very important to, uh, to the ingredient list as well.
Yaohong:How about, let's start the ball rolling, right? Let's find a product. I'm curious now how that, what, what goes in and the attributes that we can get from, uh, your, your point of view right. By just by reading your ingredients list.
Aidil:Okay.
Yaohong:Not super curious. yeah.
Aidil:So then you have to, um, you have to, uh, forgive me if my information is not so correct because my role is adjacent to a formulator. So there are some things that I pick up which are correct, and there are some things that a formulator would know better. Yeah. Uh, do you
Yaohong:It's okay. Yeah.
Eric:Okay, cool. I'm just going to get my Claude Code up. So I'll sacrifice my computer for this, uh, and run it dangerously let's just do a, a rough prompt and then feel free to correct me. You are a researcher in. A global FMCG company and, you are to help me to automate a process. Step one is identifying a product or list of products to analyze. Step two is taking the ingredients. Uh, step two is taking out and extracting ingredients from the ingredient list of, uh, product. Step three is, uh, after we got all the ingredients we need to, um, get,
Aidil:chemical properties.
Eric:please continue.
Aidil:Yeah., To find the chemical properties as well as the function as it relates to a cosmetic product.
Eric:Let me just, uh, paste that in first. Um, I am going to paste in an example of the properties that we need, with the sources of the information in the row above. Of course, this is not the most ideal way to structure the data. I understand that. Um, but, uh, feel free to restructure the data in ways that, you know, would be more appropriate as well. Yeah. Yaohong, um, anything to add on this prompt or any critique? Yeah,
Yaohong:actually why not? Why didn't you just put the file inside? That is the drive, the folder. I
Eric:yeah.
Yaohong:might be more accurate if instead of copy and pasting. Yeah.
Eric:Okay. Yeah.
Yaohong:Then you just copy in and then you say the file is here, this is our reference. Yeah.
Eric:yep. Yeah. Yep.
Yaohong:Or, or any other files that, that is relevant to this. Yeah.
Eric:Uh, I am just going to drag in the files. Uh, please find the files here. Um, so there's the ingredient list.
Yaohong:The @ sign. You know, like in an email? Yeah. Then the ingredient list, yeah. There you go. Then press Tab.
Eric:okay.
Yaohong:Tab. Yeah. There you go. You didn't, you didn't know this.
Eric:No, I usually just drag, drag it in example
Yaohong:okay. Yeah. But you can do this
Eric:Okay. So I'm just going to give a bit of commentary. Uh, so the ingredient list is where we, based on the products, extract ingredients based on, uh, what is stated on the website. Uh, and then we list every single ingredient in a separate row. Then on the example data, uh, this is kind of the output we want, but please, uh, structure it so it looks cleaner as well. Um, we want to do this for, let's just say, 10, uh, products today to test if it works well. Uh, I'm just gonna turn on plan mode. So that's important because if you jump too fast into coding, then you might not get what you want.
Aidil:And you won't be
Yaohong:So every use, every, every, you scroll up a bit. So you see, that's why when you copy and paste, see what have been confusing. see how the multi-line broke the ingredient list, you see?
Eric:yes.
Yaohong:So that's one issue when you want to copy and paste from, uh, Excel or something. Yeah.
Eric:Yeah. Yeah. It's true. I, but I, I think I would push back also that when I do this, it can actually understand
Yaohong:Figure it
Eric:table look like and
Yaohong:there's a,
Eric:it's like the spacings and stuff like that. It, it manages to do that. And then when I do this for like simple business stuff, um, for Excel, like when I copy and paste, it's actually like perfectly accurate back into an Excel sheet. So it actually still, still kind of
Aidil:what is the kind of data structure that you would, uh, you would suggest for this? Because you were mentioning that this is not the most appropriate or the most clear.
Eric:I personally would have the, the sources in a, in a column, and then each of these kind of, maybe glycerin can repeat itself, but, have a,
Aidil:I understand. So it'll be a tall column now.
Eric:Yeah.
Aidil:That makes sense. It's like,
Eric:each row.
Aidil:Product ingredient, property value, and then source. Yeah.
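The "tall" layout Aidil lists (product, ingredient, property, value, source) can be sketched with the standard library. The column names and the sample row are assumptions for illustration; the source mapping follows the discussion, with PubChem for chemistry and INCI Decoder for cosmetic function.

```python
import csv
import io

# One illustrative wide row: one column per property.
wide_rows = [
    {"product": "Shampoo A", "ingredient": "Glycerin",
     "molecular_formula": "C3H8O3", "function": "humectant"},
]

# Which source each property column comes from.
SOURCES = {"molecular_formula": "PubChem", "function": "INCI Decoder"}

def to_long(rows):
    """Turn each wide row into one row per (ingredient, property)."""
    for row in rows:
        for prop, source in SOURCES.items():
            yield {"product": row["product"], "ingredient": row["ingredient"],
                   "property": prop, "value": row[prop], "source": source}

long_rows = list(to_long(wide_rows))

buf = io.StringIO()
writer = csv.DictWriter(
    buf, fieldnames=["product", "ingredient", "property", "value", "source"])
writer.writeheader()
writer.writerows(long_rows)
print(buf.getvalue())
```

In this layout the ingredient name repeats on every row, which is the trade-off Eric mentions: more rows, but every value carries its own source.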
Eric:Yeah. Okay, cool. How should the output be structured? Okay. Very nice. Uh, okay, So Aidil when you look at this, which two do you prefer?
Aidil:Yeah. One thing that I find suspicious is the last column, it just says PubChem. And I think like the human sense interpretation of it, is that like, just not what we discussed. That should have been the source. A lot of these researchers, they just want everything in one glance la. They would rather have a long. Which is, I think, the reason why I put it in that format in the first place. Yeah. So the single sheet, let's try the single sheet first, then see how it turns up.
Eric:Edge cases. How should we handle ingredients? Pubchem can't resolve cleanly.
Aidil:Flag it out. Flag
Eric:Flag it out. Okay, so it's the first one, right? Pre-clean and then do the best effort and flag it out. Where should this run In a Python script.
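The "best effort, then flag it out" choice can be sketched as follows. This is a hedged illustration, not the script built in the session: `resolve_all`, the status values, and the stand-in lookup table are all hypothetical.

```python
def resolve_all(ingredients, lookup):
    """Look up each ingredient; unresolved ones stay in the output, flagged."""
    results = []
    for name in ingredients:
        try:
            record = lookup(name)
        except Exception as err:  # e.g. a network error or an HTTP 404
            record, note = None, f"lookup failed: {err}"
        else:
            note = "" if record else "no match"
        results.append({"ingredient": name,
                        "status": "ok" if record else "FLAGGED",
                        "note": note,
                        **(record or {})})
    return results

# Stand-in lookup so the example runs offline (hypothetical data).
KNOWN = {"glycerin": {"molecular_formula": "C3H8O3"}}
rows = resolve_all(["glycerin", "parfum"], lambda n: KNOWN.get(n))
for row in rows:
    print(row)
```

Flagged rows are kept rather than dropped, so the researcher sees exactly which ingredients still need a manual check.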
Aidil:Uh, yeah. I have a question. Um, the expected output, after we do all of this prompt that, how did it know what's the expected output is? When we were going through the options, uh, the interview questions that Claude Code was asking us.
Yaohong:correct. Yeah.
Eric:Yeah,
Aidil:I
Yaohong:It inferred from the files that we sent on how it should look like already. Right. But then, because it was unclear, that's why there was some clarifications that you asked.
Aidil:Okay,
Eric:yeah.
Yaohong:Yeah. Pretty cool. Actually, better than human. I mean, I wouldn't be able to ask you this kind of question.
Eric:I, I was telling Aidil, we are the, we are the AI today, 'cause I have no idea what this all is. Okay. So it has written a plan.
Aidil:Oh, this is the plan.
Eric:Okay.
Yaohong:Now it generates a plan before it starts coding yeah. Right.
Eric:Okay, so, okay, I would press enter, but I think Yeah, it's good to take it slow just so we don't waste
Yaohong:today you're changing your, your style.
Eric:Yeah. Yeah. I'm changing 'cause I had another session with my cousin who's a software engineer. He, he grilled me basically for being a bit too fast. So just to summarize, uh, this is a research piece for Aidil, ex- uh, FMCG scientist, right? We want to run a test run on 10 products, uh, and X number of ingredients. This is a one-off thing, not a long-lived tool. Okay, so the decisions. Number one is a single flat sheet. Number two, there is a source column, uh, of the two sources that you mentioned. Number three, there is a function field. Number four, edge cases. So if you can't find it, what happens? Number five is the implementation of the Python script. And then in terms of the output schema and the kind of ingredient information that we want, uh, these are the lists. Uh, Aidil, feel free to stop us anytime,
Aidil:Um, I just wanted to take a look again at the second source because I saw something called derived. What
Eric:Derived INCI name.
Aidil:Yeah.
Eric:Um, so it will use the INCI Decoder page title if available. If not,
Aidil:Oh, so that's why it's making a decision for itself?
Eric:Yeah, I think it, it might have.
Aidil:fallbacks.
Eric:just means like, if you don't get the thing you want, then what's the most like reasonable thing to do? Yeah.
Aidil:Okay. Got it.
Eric:Yeah, yeah. Yeah. And then pKa is not very available with the API. Yeah.
Yaohong:Yep.
Eric:cool. Um, yeah. So, and then for pre-cleaning there, there's a pre-cleaning step, which
Yaohong:Basically, it's trying to clean up the active ingredient names now. Like, like it would take out active from here
Eric:oh, I don't like this
Yaohong:then numbers also. Actually some of it might be useful. I dunno. When it comes to the color code
Eric:Okay.
Yaohong:and all that, what does this use? Is this used?
Aidil:I would say the first two are important, right? Because sometimes this, uh,
Yaohong:First of all, you need to know the actual thing. Right?
Aidil:yeah, the database is sometimes quite sensitive. Like you have this in the, uh, you have this in the brackets and then it's going to return a negative result. And I think that's why it made the active decision to do that, because this might be, uh, how to say, like something that already know stuff. Same thing
Eric:you will you keep it?
Aidil:Yeah.
Yaohong:I, I think it should be kept this way. Uh, you scroll up again. We do, we have the pre clean name down, down, down to the columns again. Yeah.
Eric:Or column. Okay.
Aidil:imagine the researcher have to manually remove, remove, remove rather that now. Claude doing that?
Eric:Yeah. Okay. Sorry, can I just get an example of this? Like if I look at this, is this the right page or is it the other page?
Yaohong:Yeah. Go back. Go back. Yeah. This page,
Eric:so
Aidil:Yeah. So we can look at this one, the head and shoulders classically. Um, the active, so it will say that this is the
Eric:Oh, so we want this thing
Aidil:Yeah. Correct. So we want the, uh, the zinc, uh, we don't
Eric:So that's exactly what they said. Okay. Yeah. Okay, cool. Okay, so then I think, um, this will be put into a Python script, and then this is the script structure.
Yaohong:Okay to me. Yeah.
Eric:Thank you. And then there's a caching, which I guess is like saving
Yaohong:I guess you don't need to hit it again because it's the same ingredient. Yeah. Assumes there will be some. I see. Of 240 rows only 133 are.
Eric:okay, so this is the problem where, you know, you have the same, the same ingredients, right? So you don't wanna have duplicate rows, which makes sense as well. So and the rate limit is for just to make sure that
Yaohong:Five requests per second. Yeah.
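The caching and rate-limit ideas discussed here can be sketched together. This is a minimal sketch under assumptions: the class name and the injectable `clock` and `sleep` parameters are hypothetical, and the five-per-second figure is the one mentioned in the conversation.

```python
import time

class ThrottledCache:
    """Cache results per ingredient and space out the real requests."""

    def __init__(self, fetch, max_per_second=5,
                 clock=time.monotonic, sleep=time.sleep):
        self.fetch = fetch
        self.min_gap = 1.0 / max_per_second
        self.clock, self.sleep = clock, sleep
        self.cache = {}
        self.last_call = None

    def get(self, name):
        key = name.strip().lower()
        if key in self.cache:            # repeated ingredient: no new request
            return self.cache[key]
        if self.last_call is not None:   # wait out the rest of the minimum gap
            wait = self.min_gap - (self.clock() - self.last_call)
            if wait > 0:
                self.sleep(wait)
        self.last_call = self.clock()
        self.cache[key] = self.fetch(key)
        return self.cache[key]

calls = []

def fake_fetch(key):
    """Stand-in for a real API call; records which keys were requested."""
    calls.append(key)
    return {"name": key}

client = ThrottledCache(fake_fetch)
for name in ["Glycerin", "glycerin", "Aqua", "GLYCERIN"]:
    client.get(name)
print(calls)  # ['glycerin', 'aqua']
```

Four lookups turn into two requests here, which is the same effect as the deduplication Yaohong reads out from the plan.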
Eric:Yeah. Okay. Just to make sure that if you get stuck, um, then it continues so that it doesn't get stuck. Some limitations: pKa will be left blank. Um, and then PubChem CID wins.
Yaohong:Is that true? Is that true, Aidil?
Aidil:Yeah. So, um.
Yaohong:only paid Then can can can get,
Aidil:The, you're talking about the first point?
Yaohong:yeah, the first one, pKa.
Aidil:it's not, because I managed to look at the, at the website, like the PubChem website. You look into it, it will be there. It's one of the, it's one of the first-page properties already. You don't need to,
Yaohong:Oh, but the website, but the API may not have.
Aidil:the API may not have. Okay. Then. Then maybe I either, when I'm not sure. I've never personally used the the API, so we are
Yaohong:Actually we should, we should. Let's, let's verify that though.
Eric:Yeah. Okay,
Yaohong:because it is very different, uh, calling the API versus scraping the website. It's quite a different task.
Eric:so which one should we do, as in for, for the pKa thing? pKa is left blank. This is only true if we use the API. So if the API doesn't have it, please use the website, please. Like, just scrape the website. Yeah.
Yaohong:but if your website has all the data, you don't need to run to tension
Eric:Yeah. Okay.
Yaohong:script,
Eric:scrape the website.
Yaohong:Actually, uh, yeah. I'm very curious. I want, I want to go and find the API now just to verify this claim. Gimme a second. Yeah.
Eric:Same for strip in. Can I say this for the strip ingredients are fixed as well. Like should we just get it to do
Aidil:So we, I think, uh, basically anything in the parentheses can be ignored. Anything in parentheses, yeah, in brackets, can be ignored.
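The pre-cleaning rule Aidil gives here (ignore anything in parentheses or brackets, and drop the "active" label) can be sketched with regular expressions. The function name and the example label strings are made up for illustration.

```python
import re

def pre_clean(raw):
    """Strip label noise so the database lookup sees a bare ingredient name."""
    name = re.sub(r"\([^)]*\)", " ", raw)      # drop anything in parentheses
    name = re.sub(r"\[[^\]]*\]", " ", name)    # drop anything in square brackets
    name = re.sub(r"^\s*active(\s+ingredient)?s?\s*:\s*", "", name,
                  flags=re.IGNORECASE)         # drop a leading "Active ...:" label
    return " ".join(name.split()).strip(" ,.;")

examples = [
    "Active Ingredient: Pyrithione Zinc (1%)",
    "Water (Aqua)",
    "Fragrance [Parfum]",
]
for raw in examples:
    print(f"{raw!r} -> {pre_clean(raw)!r}")
```

Keeping the raw label next to the cleaned name in the output, as suggested in the review of the plan, lets a researcher see what was stripped.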
Eric:Okay. Um, and then for the, any other things for the limitations to call out? Like, we are trying to overcome the limitations now, right?
Yaohong:I got something. I got something. So Claude is right that there's no pKa property in the API, because it says that pKa is an experimental annotation rather than a computed property. So if you want to retrieve it programmatically, you must use the PUG View API. Let me put it in the chat.
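What Yaohong found can be sketched as a PUG View request plus a walk over the nested sections it returns. Treat the details as assumptions: the heading name "Dissociation Constants", the helper names, and the simplified hand-written sample record are illustrative, and real records may store values in other fields.

```python
from urllib.parse import quote

PUG_VIEW = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view"

def pka_url(cid, heading="Dissociation Constants"):
    """Build a PUG View URL limited to one annotation heading (assumed name)."""
    return f"{PUG_VIEW}/data/compound/{cid}/JSON?heading={quote(heading)}"

def find_strings(section, heading):
    """Walk nested sections and collect text stored under a given TOCHeading."""
    found = []
    if section.get("TOCHeading") == heading:
        for info in section.get("Information", []):
            for part in info.get("Value", {}).get("StringWithMarkup", []):
                found.append(part["String"])
    for child in section.get("Section", []):
        found.extend(find_strings(child, heading))
    return found

# Simplified, hand-written record in the nested shape PUG View uses.
sample = {"Record": {"RecordNumber": 753, "Section": [
    {"TOCHeading": "Chemical and Physical Properties", "Section": [
        {"TOCHeading": "Experimental Properties", "Section": [
            {"TOCHeading": "Dissociation Constants", "Information": [
                {"Value": {"StringWithMarkup": [{"String": "pKa = 14.4"}]}}
            ]}]}]}]}}

print(pka_url(753))
print(find_strings(sample["Record"], "Dissociation Constants"))
```

The value in the sample mirrors the 14.4 figure read out in the session for glycerin; the annotation comes back as free text, so a further parsing step would be needed to get a number.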
Eric:Gotcha. Gotcha.
Yaohong:It's still within the PubChem website. It's
Eric:I don't need to copy paste it. No, I, I, I have comprehended it in my brain, so
Aidil:Okay.
Eric:kudos to me. Okay. Okay. Same for Okay. PUGView API. Okay. Shall we just comment on that then?
Aidil:So that means because the API is with PUG View, then we can forget about scraping the website. Right.
Yaohong:Correct. Calling the API is always faster, 'cause when you scrape a website, you actually got to, you know, download the page itself and parse the data. Definitely slower than calling the API. Yeah. See, now you can find the PUG View. You see,
Aidil:Ah.
Yaohong:the API can retrieve it. Yeah. But you can see the values are pretty large. 1.4 megabyte, four seven compounds. Yeah, you were right. Yeah, of course I'm right. I Googled it. It returned 14.4, that matches glycerin. Yeah.
Eric:Yep.
Aidil:You, are you a computer scientist, Yaohong by training or like,
Yaohong:No, I'm a finance guy.
Aidil:oh, you're a finance guy.
Eric:and an information systems major, please.
Yaohong:I usually downplay that. So, prior to running my company, I was in a startup there for a couple years and I went to a VC and then I saw the AI wave and I decided, oh, maybe it's time to build something again with what we learned over the last couple of years with ai.
Eric:Alrighty. It is time to review the results. Okay. Now, Aidil, I need you to read this out for us and tell us your honest thoughts and opinions because we will not be able to judge this. So you are definitely, yeah. We are not taking you out of any job here.
Aidil:Okay. So what we see over here is a very, very long, uh, long Excel sheet. I think there are, um, close to 20 columns. Uh, what is in the first column is the ingredient name that we have provided. Um, second column is, uh, what it has processed that name to be so that it will return the right results in the database. Um, then what is very interesting here, um, is that it's able to, uh, pick out, uh, what are the right chemical formula names. So what I'm seeing, uh, is quite amazing. What I'm seeing on my, uh, screen right now is not just the name at the back of the bottle, but also the actual formula names we will need to have; then we can talk to our formulators. Oh, this is actually the ingredient that is being used by the, by the competitors. Uh, is it possible to synthesize it in the lab? Right? Where can we source it from? Uh, what are the properties of this? So this is very useful information. Um, what I wanted also the most, um, I think the most important column as well, is the rightmost column, the function. Yes. Function, um, will let us know, um, which ingredients are actually contributing to the kind of very thick or very flowy rheology behavior that men don't think about. Women are very, very, very sensitive to that. So the order in which they appear on the ingredient list as well is important. So, um, I was talking to my friend, um, what can we do if we have this data? So what we can think about is how can we then cluster these products according to whether they are a bit more thick, uh, a bit more thin. Without running any lab experiments, no lab studies, everything is in, uh, everything is just from the computer itself. Just what we can get from the public website. And you can imagine if that is successful, right? Then one study for 30 ingredients will take across two to three weeks. We are saving two to three weeks of lab time, lab data, by just looking at the ingredient analysis alone.
If we can automate this process la. So what I would imagine as the next step, take this and then ask, uh, push this to LLM, uh, or push this to Claude Code, whatever, right? And then ask, can we cluster the, uh, products by the, by the ingredients where their ingredient functions and the relative order in the ingredient is. So I will imagine that that's a very, very. That's a very inspiring next step I think, uh, for us that we can explore.
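The next step Aidil imagines, comparing products by ingredient function and label order, could start from a position-weighted function profile. This is one possible sketch, not something built in the session: the weight of 1 / position is an assumption, and the product and its function tags are illustrative.

```python
from collections import defaultdict

def function_profile(ingredients):
    """ingredients: list of (name, function) pairs in label order.

    Earlier positions get more weight (assumed weight = 1 / position),
    and the result is normalised so the shares sum to 1.
    """
    profile = defaultdict(float)
    for position, (_, function) in enumerate(ingredients, start=1):
        profile[function] += 1.0 / position
    total = sum(profile.values())
    return {fn: weight / total for fn, weight in profile.items()}

# Illustrative label order and function tags for one product.
product = [
    ("Water", "solvent"),
    ("Sodium Laureth Sulfate", "surfactant"),
    ("Cocamidopropyl Betaine", "surfactant"),
    ("Sodium Chloride", "thickener"),
]
profile = function_profile(product)
for function, share in profile.items():
    print(f"{function}: {share:.2f}")
```

Profiles like this give each product a small numeric fingerprint, which could then be compared or clustered across products.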
Yaohong:My business brain is working. Can we, can we turn this into a product and sell it off,
Eric:no, no. We are doing this for intellectual curiosity. Please, please hold, hold your horses.
Yaohong:because, you know, you can save two to three weeks of lab time, right? So how much, uh, can I charge for
Eric:Maybe if you were to just look does it look like trash or
Aidil:No, it's not trash.'cause this INCI decoder, um, the main terms that I was looking at is the surfactant, which is, um, if we go to the left, then it should link back to a ingredient which has a sulfate in it. So a sanity check would be, for example, number 16. the right. What it should be is a surfactant yeah. Because uh, that's the main ingredient in your
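Aidil's sanity check can be sketched as a filter over the output rows. The row fields and the sample data are hypothetical, and the second row is deliberately mis-tagged to show what the check catches.

```python
def sulfate_sanity_check(rows):
    """Return rows whose name contains 'sulfate' but lack a surfactant tag."""
    return [row for row in rows
            if "sulfate" in row["ingredient"].lower()
            and "surfactant" not in row["function"].lower()]

# Hypothetical output rows; the second tag is wrong on purpose.
rows = [
    {"ingredient": "Sodium Laureth Sulfate", "function": "surfactant/cleansing"},
    {"ingredient": "Sodium Lauryl Sulfate", "function": "emulsifier"},
    {"ingredient": "Glycerin", "function": "humectant"},
]
suspicious = sulfate_sanity_check(rows)
for row in suspicious:
    print("check this row:", row)
```

Not every sulfate is a surfactant, so a hit here is a prompt for a human to look at the row rather than proof of an error.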
Eric:surfactant. Right.
Aidil:That's my sanity check.
Eric:Okay. So what we just did here in an hour or so, um, was, uh, yeah. Maybe something that would've taken four hours-ish. So we save like three hours, I think, for this. Um,
Yaohong:one time. Yeah. So if you can just repeat this the next time. It's Yeah.
Eric:Yeah. So this one hours would, would
Yaohong:The next time would be faster because you just upload the Excel that you want to
Aidil:we already have the script right to to,
Yaohong:Correct. You just run the script again.
Aidil:It will take just a few, few seconds to get this something that will have taken the afternoon,
Eric:yeah. And I think also the next step, is to continue refining it, right? Because you would wanna freeze the columns and then take a look at whether it's accurate, and then if it's not, then you would wanna continue to iterate on it. Okay. So there's a product profile sheets, uh, which was what Aidil was saying we'll take how long? Two to three weeks. And then there's also product similarity sheets, right?
Aidil:This is something that, yeah, this is something that would be, easier for me to understand. So that means, like, for example, uh, Pantene Pro-V is very similar to olaplex.
Eric:Hmm.
Aidil:Why, why that makes sense to me is that, uh, Pantene is a daily moisture. Then, uh, the Olaplex is, uh, a maintenance shampoo. So what this tells me just from the title alone is that both are very highly conditioning. So women use them to make sure that their hair is soft and smooth. Uh, they don't use it to clean their hair. They use it to make sure that their hair is soft and smooth. Yeah. So, uh, that's why in terms of the ingredient, it's going to be very, very, very similar. So, yeah, it does make sense to me. I'm curious how, I'm curious how they come up with this similarity table. But I think that would be for another day.
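On Aidil's question of how such a similarity table might be produced: one common approach is Jaccard similarity over each product's ingredient set. The transcript does not say which method the script used, so this is only a plausible sketch, and the products and ingredient lists below are made up.

```python
from itertools import combinations

def jaccard(a, b):
    """Shared ingredients divided by all distinct ingredients across both."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Made-up ingredient sets for three hypothetical products.
products = {
    "Product A": ["water", "sodium laureth sulfate", "dimethicone", "glycerin"],
    "Product B": ["water", "sodium laureth sulfate", "dimethicone", "panthenol"],
    "Product C": ["water", "pyrithione zinc", "sodium lauryl sulfate"],
}

matrix = {(p, q): jaccard(products[p], products[q])
          for p, q in combinations(products, 2)}
for (p, q), score in matrix.items():
    print(f"{p} vs {q}: {score:.2f}")
```

Products A and B share three of five distinct ingredients and score 0.60, while C shares only water with either and scores about 0.17, so it would sit on its own in a clustering.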
Yaohong:Very cool. So technically you can find a similar shampoo that's cheaper and then replace it in your toilet
Eric:Oh
Yaohong:idea, right?
Aidil:Shampoo in your portfolio and then charge higher for it.
Yaohong:So I can scrape all the shampoos in the market and try and do an experiment like this. That's so cool. Right?
Aidil:Yeah. thanks so much, um, for the time. This is quite eyeopening for me la and will take some time to digest, but I'm very happy with the results to see that something that will have taken a long time has the potential to just take a few seconds. If we already developed the script, and I don't need to have knowledge about all this API or web scraping to get it
Eric:for us, we are not the experts in Yeah. In this, and yet we are still able to help you out, uh, with this, so, Interesting that, you could open our eyes to this like, other aspect.
Aidil:I'm happy that you, the, the two of you guys were also very patient AI as well, taking in
Eric:What, what you
Yaohong:gonna all, all, all I'm gonna do now is to find all the stuff I have scanned, the ingredients, and try to make myself, and make
Aidil:we can throw it into, uh, throw it
Yaohong:try, I'll ask Claude code to generate the similar compound from the ingredient list,
Aidil:Yeah. You, you
Yaohong:Save a lot of money. Many. If I can make like all the official products, uh, it's gonna save a lot of money.
Eric:yeah,
Aidil:try and you let me know. Okay.
Eric:yeah. Alright then. But, uh, thanks so much Aidil. Thanks Yaohong.
Yaohong:Thank you.
Eric:yeah, I think, um, it was, it was very nice to, to chat with you guys.
Aidil:Okay, bye-bye.
Yaohong:right.
Eric:Thank you so much for watching episode five of the Code Riff podcast. If you enjoyed what you've heard, uh, please subscribe to us on whichever platform that you are on, and it would also really help if you could offer us a review as that would really help more people find the podcast. Uh, once again, if you would like to come on the pod with us and share a problem, feel free to fill up the Google form. Uh, you also can join our, community on WhatsApp if you would like to just, you know, have a chat with us and see what we are up to these days. There is also a substack that you can follow along where we write our insights from the episodes that we have and understand where, you know, AI can take us eventually in different industries and, and different walks of life. Uh, but yeah, thanks for joining us on the Code Riff podcast and we'll see you on the next episode. Bye.