Pragmatic Data Scientists

Why are your aha moments wrong?

Yuzheng Sun

Hi everyone. Today, I'm going to talk about a very dangerous and very common misconception about aha moments. In doing growth for products, we have all heard about aha moments, probably from the Facebook and Dropbox examples. Aha moments are defined as the key moments in a user's journey through a product when the user realizes why they want to use the product. In the Facebook example, the aha moment is defined as a user having more than seven friends. At that point they realize: this is an app I can use to connect with my friends.

We love aha moments because they are a perfect combination of our quantitative and qualitative understanding of the users. We can actually define these events and put them up on a dashboard. Then we can spend resources and focus on driving up aha moments.

However, the problem is that they are usually wrong. Why? Because of endogeneity, latent variables, or selection bias. Here is what I mean by these terms in the setting of aha moments. When we talk about aha moments, there is usually an implicit causal statement: aha moments cause users to become active. That is how we justify all the resources and effort we spend on driving aha moments. We believe that if users experience these aha moments, they suddenly turn from inactive users into active users.

But in reality, it is usually the other way around. Users experience these aha moments because they are active users. For example, users who are likely to use Facebook are going to find their friends faster on Facebook. In monetization examples, we can similarly say that users who display certain behaviors are more likely to become paid users. But again, it is actually the users with higher paying intent who demonstrate these signals of purchase intent early on. So in the end, all the resources you spent on trying to make a difference did not actually make any difference.
You are just selecting the users who had higher intent at the beginning and bombarding them with notifications, emails, marketing, and different kinds of incentives to accelerate their journey. It didn't actually create much value for the business.

What's worse is that, in reality, this mistake is almost inevitable, for two reasons. The first reason is that our definitions of aha moments are user behaviors. When it is a user behavior, it is the user's choice, and the user's choice has a lot of latent variables behind it. One of the most important latent variables is the user's propensity to be an active user. In other words, because this is the user's choice, high-intent users are more likely to choose to do the aha moment, and low-intent users are less likely to choose to do it.

The second reason is that your aha moment is going to be highly correlated with your desired outcome. If you draw a correlation matrix between your aha moments and your desired user activity, they are going to have a high correlation. Couple that with aha moments being the user's active choice, and you have this endogeneity problem: you are never going to be able to make the causal statement that your PM wants you to make.

So knowing this, what can we do? I have two pieces of advice. The first is to accept aha moments for what they are: proxies and predictors of user intent. Use them as such, and they can still be useful. For example, your desired outcome, whether it's monetization or some other activity, can be very deep in the funnel, and aha moments can be very good early indicators that a user is going to be active. So you can use these "aha moments" (let's actually call them proxies, indicators, or early signals) to train your ML models, so your models can reach scale earlier. Or you can get more sample size to study what works and what doesn't. As early indicators, proxies, or predictors, these so-called "aha moments" are actually useful.
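To make the endogeneity point concrete, here is a minimal simulation sketch. It is not from the video, and everything in it is an illustrative assumption: the names `simulate_users` and `retention_rate`, the uniform distribution of intent, and the rule that both the aha behavior and retention depend only on that hidden intent. By construction the aha moment has zero causal effect on retention, yet the naive comparison still shows a large gap.

```python
import random


def simulate_users(n=20000, seed=0):
    """Simulate users whose hidden intent drives BOTH aha behavior and retention."""
    rng = random.Random(seed)
    users = []
    for _ in range(n):
        intent = rng.random()              # latent propensity, unobserved in practice
        hit_aha = rng.random() < intent    # high-intent users choose the aha behavior more often
        retained = rng.random() < intent   # retention depends on intent only, never on hit_aha
        users.append((hit_aha, retained))
    return users


def retention_rate(users, hit_aha):
    """Share of retained users among those with the given aha status."""
    group = [retained for aha, retained in users if aha == hit_aha]
    return sum(group) / len(group)


users = simulate_users()
naive_gap = retention_rate(users, True) - retention_rate(users, False)
print(f"retention with aha:    {retention_rate(users, True):.3f}")
print(f"retention without aha: {retention_rate(users, False):.3f}")
print(f"naive gap:             {naive_gap:.3f}")
```

The gap here comes entirely from selection: users who hit the aha moment were already the high-intent users. The same property is what makes the aha flag a reasonable predictor, even though pushing more users through it would change nothing in this toy setup.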
The second piece of advice is: if you want to make a causal claim, you have to make the treatment exogenous. This means that access to certain features or certain experiences needs to be in your control, not in the user's control. Then you actually know that assignment to the feature or the experience is exogenous to the user's intent. For example, you can give access to certain features or incentive programs randomly to a group of users, and then compare the long-term performance of people who have access to the feature or incentive against people who do not. That is the core of A/B testing and online experimentation. That is a whole new topic. We'll go into it later, but we won't go into depth today.

Let's summarize the takeaways from this video. One: if your aha moment is a choice of the users, then it is going to be endogenous, and you cannot make causal claims. It is not your aha moment causing the user to be highly active. Rather, it is highly active users who choose to experience aha moments. Two: use these so-called "aha moments" as what they actually are. They are proxies, early indicators, or predictors, so use them as such. Three: if you want to make causal claims, make the treatment exogenous. Keep it in your control and give it randomly to different people. Then you can make causal claims.

I hope you find this video useful. Next time your stakeholders push you to define some aha moments, share this video. Let them understand that aha moments are not the causal aha moments they think they are. Let me know in the comments if you have any questions. This is Pragmatic Data Scientists. See you next time. Bye.
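As a rough illustration of the randomized assignment described in the second piece of advice (a sketch that does not appear in the video), the simulation below assigns the treatment by coin flip so that it is independent of the user's hidden intent. The function name `run_experiment`, the `true_lift` parameter, and the uniform intent distribution are all hypothetical choices made for this example.

```python
import random


def run_experiment(n=20000, true_lift=0.0, seed=1):
    """Return the retention difference (treated minus control) under random assignment."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        intent = rng.random()                # latent propensity, still unobserved
        in_treatment = rng.random() < 0.5    # coin flip: assignment is exogenous to intent
        # The treatment adds true_lift to the retention probability, capped at 1.
        p_retained = min(1.0, intent + (true_lift if in_treatment else 0.0))
        retained = rng.random() < p_retained
        (treated if in_treatment else control).append(retained)
    return sum(treated) / len(treated) - sum(control) / len(control)


no_effect = run_experiment(true_lift=0.0)
real_effect = run_experiment(true_lift=0.10)
print(f"estimated lift when the feature does nothing: {no_effect:+.3f}")
print(f"estimated lift when the feature truly helps:  {real_effect:+.3f}")
```

Because both groups have the same mix of high-intent and low-intent users on average, the difference between them reflects the treatment itself rather than who chose to opt in. A real analysis would also need a significance test and a pre-defined outcome window, which this sketch leaves out.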