07/11/2023
Predicting Player Churn with Graph Learning
Author: Ivaylo Bahtchevanov
The Player Retention Problem
Every online and mobile game developer is familiar with the basic equation for successfully monetizing a game: CPI < LTV. In other words, the cost per install (CPI) must stay below a player’s lifetime value (LTV); you need to generate more revenue from your players than it costs to acquire them.
No one builds a game with the intent of losing money, but the unfortunate truth is that, for most games, the majority of players churn within the first few days. With so many games and options competing for their attention, players get bored easily and move on. Game developers spend roughly $15 billion on player acquisition annually, only to see 75% of players churn within the first 24 hours and 90% within 30 days.
Players drop off quickly in the first few days, but those who stay active are likely to play for months or years. The average game’s retention curve flattens out as new players become regular players, opening up downstream monetization opportunities. Below is an example of the retention curve for King of Thieves, which shows a common trend: past day 30, the retained users become your core player base.
Successfully engaging players during the first few sessions establishes the core loop of your game and largely determines how deeply players will engage over time. This is when retention is most actionable: an improvement of just a few percentage points here dramatically increases the average lifetime value of your players.
Common Churn Prediction Methods
Most game developers collect telemetry on players and their activity, and use this data to proactively engage players before they churn. With traditional machine learning techniques, they predict which behaviors or attributes signal risk and then apply retention tactics such as in-game offers, personalized notifications or game experiences, temporarily reduced ad load, and more.
Regression analysis can help identify the early actions and factors that correlate with churn during onboarding, but unfortunately this approach breaks easily. A player’s decision to leave often involves many complex factors; modeling it requires a very nuanced understanding of behavior, and sometimes there is no clear precedent in the activity data. And if the game changes its onboarding experience, data collection and training have to start from scratch.
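To make that workflow concrete, here is a minimal sketch of the traditional approach, assuming a handful of hypothetical handcrafted onboarding features (session counts, tutorial completion, and so on); none of the names or numbers come from a real game.

```python
# Minimal sketch of the traditional approach: a logistic regression over
# handcrafted onboarding features. All names and values are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical per-player snapshot aggregated from first-day telemetry.
players = pd.DataFrame({
    "sessions_day_1":     [5, 1, 3, 0, 7, 2],
    "minutes_played":     [42, 6, 25, 0, 90, 11],
    "tutorial_completed": [1, 0, 1, 0, 1, 0],
    "friends_invited":    [2, 0, 1, 0, 3, 0],
    "churned_day_7":      [0, 1, 0, 1, 0, 1],  # label: inactive by day 7
})

X = players.drop(columns="churned_day_7")
y = players["churned_day_7"]
model = LogisticRegression(max_iter=1000).fit(X, y)

# Score a newly onboarded player; a retention tactic fires above some threshold.
new_player = pd.DataFrame([{
    "sessions_day_1": 1, "minutes_played": 8,
    "tutorial_completed": 0, "friends_invited": 0,
}])
churn_risk = model.predict_proba(new_player)[0, 1]
print(f"churn risk: {churn_risk:.2f}")
```

In practice a gradient-boosted tree model often replaces the logistic regression, but the shape of the pipeline stays the same: aggregate features per player, fit a classifier, score new players, and act on the risk score.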
Game makers might also leverage deep learning or other more sophisticated approaches to learn from rich interactions and contextual information on players (device details, personal information, behavioral patterns, acquisition channel, behavior on other platforms). These approaches rely on handcrafted features that are time-intensive to build, introduce bias, and are difficult to scale once a game reaches millions of player interactions per day. On top of that, teams must keep spending time maintaining the resulting machine learning models.
What’s more, traditional machine learning requires a fixed-size input. Flattening asymmetric player interactions and uneven histories into such an input produces a very sparse training set.
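The toy example below illustrates that sparsity, assuming a hypothetical interaction log: flattening it into one fixed-width column per item leaves most cells at zero, and the problem only gets worse as the item catalog and player base grow.

```python
# Toy illustration of the sparsity problem (hypothetical log): one fixed-width
# column per item means most cells are zero for most players.
import pandas as pd

events = pd.DataFrame({
    "player": ["a", "a", "b", "c", "c", "c"],
    "item":   ["quest_1", "shop", "quest_1", "pvp_arena", "guild", "shop"],
})

# One column per item, one row per player; counts of interactions.
wide = pd.crosstab(events["player"], events["item"])
print(wide)
print("fraction of zero cells:", (wide == 0).to_numpy().mean())
```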
Optimal Retention Through Graph Representation Learning
These limitations are exactly where a graph representation shines. A game environment can naturally be represented as a graph, where players are entities connected to components of the platform through their interactions and relationships. Rather than treating each player as an entirely separate entity, you can capture the many indirect connections between players based on observed commonalities.
Graph learning, and more specifically graph neural networks (GNNs), has emerged as the leading machine learning approach for data that can be represented as graphs, bringing representation learning to this data type. GNNs capture complex relationships and dependencies between entities in a graph without manual feature engineering and can operate on large amounts of heterogeneous data. A prediction for a single user leverages signals from the entire graph, turning the similarities of other users into useful indicators of likely behavior. GNNs also scale more effectively than traditional machine learning methods and can process the hundreds of millions of player interactions and contextual data points generated in games. These advantages lead to more accurate churn predictions, letting you optimize retention and the lifetime value of your players at scale.
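As a rough, generic illustration of the technique (not Kumo’s architecture), the sketch below uses PyTorch Geometric’s GraphSAGE layers to classify player nodes as churn or no-churn by aggregating features from neighboring nodes; the graph, features, and labels are synthetic.

```python
# Generic GNN sketch: a two-layer GraphSAGE that classifies player nodes as
# churn / no-churn using messages passed from their neighbors.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import SAGEConv

# Synthetic graph: 6 nodes with 8 contextual features each; edges encode
# interactions such as friendships, shared guilds, or purchases.
x = torch.randn(6, 8)
edge_index = torch.tensor([[0, 1, 1, 2, 3, 4, 4, 5],
                           [1, 0, 2, 1, 4, 3, 5, 4]], dtype=torch.long)
y = torch.tensor([0, 0, 1, 1, 0, 1])            # 1 = churned
data = Data(x=x, edge_index=edge_index, y=y)

class ChurnGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden, classes=2):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden)
        self.conv2 = SAGEConv(hidden, classes)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))   # aggregate 1-hop neighbors
        return self.conv2(h, edge_index)        # aggregate 2-hop information

model = ChurnGNN(in_dim=8, hidden=16)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(50):                          # tiny training loop
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out, data.y)
    loss.backward()
    optimizer.step()

# Each node's churn probability is built from its whole neighborhood.
probs = F.softmax(model(data.x, data.edge_index), dim=1)[:, 1]
print(probs)
```

Because each node’s prediction is built from its neighborhood, a signal like “most of this player’s guildmates have gone quiet” flows into the score without anyone handcrafting a feature for it.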
You can read more about the benefits of GNNs in our deep dive blog.
GNNs bring a new set of advantages that are very well suited to the churn prediction problem in gaming, and both researchers and developers have seen performance improvements with graph learning techniques. Researchers have trained state-of-the-art AI agents using GNNs for StarCraft and other real-time strategy games. Rovio uses machine learning to predict churn and adjust game difficulty in real time for Angry Birds, a game with over ten million DAUs. Graph-based data lets the Finnish game developer leverage signals from the game environment and other players to forecast a new player’s behavior before there is any historical data for that player.
Forecasting behavior for a new player is difficult, and GNNs help developers overcome this data cold-start problem. Even if a player has no activity history yet, the graph-based model can make accurate predictions of their future behavior from the rich context and features of their neighborhood in the graph. For the average game that loses 75% of players in the first 24 hours, this fast prediction ability is critical.
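The sketch below shows why this works mechanically. Because message-passing weights are shared across nodes, the same model can score a brand-new player who has only contextual features and a couple of edges; the model here is left untrained as a stand-in for one already fit on existing players, and all shapes and edges are made up.

```python
# Cold-start sketch: score a brand-new player who has no activity history.
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class ChurnGNN(torch.nn.Module):
    def __init__(self, in_dim=8, hidden=16, classes=2):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden)
        self.conv2 = SAGEConv(hidden, classes)

    def forward(self, x, edge_index):
        return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)

model = ChurnGNN()   # stand-in for a model already trained on existing players
model.eval()

# Nodes 0-4 are known players; node 5 is the new player, who only has
# contextual features (device, acquisition channel, ...) and two edges,
# e.g. invited-by-friend and same-acquisition-campaign.
x = torch.randn(6, 8)
edge_index = torch.tensor([[0, 1, 2, 3, 0, 5, 2, 5],
                           [1, 0, 3, 2, 5, 0, 5, 2]], dtype=torch.long)

with torch.no_grad():
    probs = F.softmax(model(x, edge_index), dim=1)
print("new player churn risk:", probs[5, 1].item())
```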
The Kumo Approach
Kumo’s approach enables you to go from raw data to large-scale, GNN-powered churn predictions with minimal time-to-value. You connect your data tables directly, and Kumo builds your enterprise graph from the raw data, learning the relationships and interactions across all entities in your tables.
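Conceptually, this is the familiar tables-to-graph mapping: rows become nodes and foreign keys become edges. The generic PyTorch Geometric sketch below illustrates only that idea; it is not Kumo’s internal pipeline, and the table and column names are hypothetical.

```python
# Generic tables-to-graph sketch (hypothetical tables): rows become nodes,
# foreign keys become edges in a heterogeneous graph.
import pandas as pd
import torch
from torch_geometric.data import HeteroData

players = pd.DataFrame({"player_id": [0, 1, 2], "account_age_days": [3, 120, 45]})
purchases = pd.DataFrame({"purchase_id": [0, 1, 2, 3],
                          "player_id":   [0, 0, 2, 1],
                          "amount_usd":  [4.99, 0.99, 19.99, 4.99]})

graph = HeteroData()
graph["player"].x = torch.tensor(players[["account_age_days"]].values, dtype=torch.float)
graph["purchase"].x = torch.tensor(purchases[["amount_usd"]].values, dtype=torch.float)

# The player_id foreign key in the purchases table defines player -> purchase edges.
edge_index = torch.stack([
    torch.as_tensor(purchases["player_id"].values),
    torch.as_tensor(purchases["purchase_id"].values),
]).long()
graph["player", "made", "purchase"].edge_index = edge_index
print(graph)
```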
Kumo provides a flexible no-code interface that makes it easy to generate any number of predictions for many use cases. Imagine being able to write SQL queries to predict which players are at risk of churn while simultaneously performing scenario analysis on which future actions will result in the best possible retention for those users. You can see an example of how a game developer can build a graph and predict churn quickly in the churn-prediction walkthrough.
The same query language can also tackle other parts of the player lifecycle, such as personalizing the gaming experience to maximize ARPU, LTV, or NPS, or forecasting the future behavior of specific players. In other words, you can re-use the same platform to power the entire player ecosystem without having to build any new ML pipelines or features.
If you are interested in hearing more about Kumo, reach out to us directly!