Hybrid GNNs: Transforming Recommendation Systems with Kumo AI
Recommendation systems have been a subject of great innovation over the past few decades, starting with matrix factorization in the early 2000s and evolving into two-tower and other deep learning approaches in the 2010s. More recently, graph neural networks (GNNs) have emerged as the leading approach for powering recommender systems, enabling many well-known tech companies (such as Pinterest, Uber, and Amazon) to deliver magical customer experiences and double-digit lifts in business metrics. Kumo offers a robust architecture known as hybrid graph neural networks (hybrid GNNs), which has been empirically shown to deliver outstanding performance on both public Kaggle data science challenges and real-world production deployments.

Understanding the complexity of recommendation systems
At their core, recommendation systems are responsible for recommending content or products to users, with the goal of providing inspiration. However, developing these systems is inherently challenging due to fickle preferences and complex patterns of human behavior. Users vary significantly in their preferences: some are explorers who constantly seek new experiences, while others are repeaters who prefer familiarity. The problem becomes even harder in the face of big data, cold-start items, new users with very little interaction history, and a lack of data diversity. Because the problem is so challenging, many engineering teams have built multi-stage recommendation pipelines involving numerous candidate generation steps followed by a complex ensemble of ranking models. These systems cost tens of millions of dollars to build, in terms of human and infrastructure cost, and are difficult to maintain over time.

Introducing the hybrid GNN architecture
To address these challenges, Kumo developed a hybrid GNN approach that can produce great recommendations with a single model while capturing the nuanced behaviors of different users with remarkable accuracy. The hybrid GNN models two distinct user behaviors, repeated interactions and explorative interactions, differently within a single backbone GNN model; hence the name hybrid GNN. The hybrid GNN is the default model architecture for recommendation and personalization tasks at Kumo, and it can be fine-tuned to your specific dataset using the Model Planner.

Why GNNs are ideal for recommendations
GNNs are highly suitable for recommendation tasks because, unlike traditional models, they can leverage rich graph connectivity patterns to gain a deeper understanding of user preferences and surface insights that are often missed by other algorithms. The recommendation problem forms a bipartite graph between users and items, where nodes represent the users and items, and edges represent the user-item interactions. Edges often carry timestamps, and multiple edges may exist between the same pair of nodes, since a user may repeatedly interact with the same item (e.g., repeat ordering of the same product in e-commerce). Given the bipartite graph of past interactions, a recommendation task can be cast as a link prediction task: predicting future interactions between user nodes and item nodes.
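As a rough illustration of this formulation (a minimal sketch, not Kumo's internal data representation), interaction logs can be turned into a timestamped bipartite edge list as follows:

```python
# A hypothetical, minimal representation of interactions as a timestamped
# bipartite edge list: user nodes on one side, item nodes on the other.
def build_bipartite_edges(interaction_log):
    """Map raw (user, item, timestamp) records to integer node ids and edges."""
    user_index, item_index, edges = {}, {}, []
    for user_id, item_id, ts in interaction_log:
        u = user_index.setdefault(user_id, len(user_index))
        i = item_index.setdefault(item_id, len(item_index))
        edges.append((u, i, ts))  # parallel edges allowed: repeat interactions
    return user_index, item_index, edges

# Example: one user ordering the same product twice creates two parallel edges.
log = [("alice", "sku_1", 1), ("alice", "sku_1", 5), ("bob", "sku_2", 3)]
users, items, edges = build_bipartite_edges(log)
# Link prediction then asks: which new (user, item) edges will appear in the
# next time window (e.g., the next 7 days)?
```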
Model input: Processing data with the hybrid GNN
The hybrid GNN model is designed to capture fine-grained user behaviors by leveraging graph connectivity. Similar to a standard GNN model (e.g., GraphSAGE), the hybrid GNN processes input through a subgraph centered around each user node. For simplicity and efficiency, consider a 1-hop neighbor sampler, which gathers a user's recent interaction edges together with the item nodes they point to.
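Continuing the edge-list sketch above, a hypothetical 1-hop sampler might look like the following; the most-recent-first sampling rule is an illustrative assumption, not necessarily the strategy Kumo uses:

```python
def sample_user_subgraph(edges, user, num_neighbors=20):
    """Hypothetical 1-hop sampler: keep the user's most recent interactions.

    Returns the center user, the sampled interaction edges, and the item
    nodes those edges touch. A deeper (2-hop) sampler would repeat this
    step outward from each sampled item node.
    """
    user_edges = [e for e in edges if e[0] == user]
    user_edges.sort(key=lambda e: e[2], reverse=True)  # most recent first
    sampled = user_edges[:num_neighbors]
    return {
        "center_user": user,
        "edges": sampled,
        "items": {item for _, item, _ in sampled},
    }

# Items in this subgraph are, by construction, items the user has already
# interacted with: exactly the candidates for "repeat" behavior.
subgraph = sample_user_subgraph(edges, users["alice"])
```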
Exploring the hybrid GNN model architecture
The key innovation of the hybrid GNN is its hybrid approach to computing item scores per user, which are then sorted to produce the top-K item recommendations for the user. **Specifically, the hybrid GNN computes item scores differently based on whether or not items are sampled within the subgraph.** These differing scoring approaches are as follows:

- Items that appear inside the user's sampled subgraph, i.e., items the user has already interacted with, are scored by an MLP applied to the GNN's representations, capturing repeat behavior.
- All remaining items outside the subgraph are scored by the inner product between the user embedding and the item embeddings, capturing explorative behavior.

A learned repetition scalar balances these two types of scores when ranking all items together, as illustrated in the sketch below.
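A minimal, hypothetical sketch of this hybrid scoring step using PyTorch; the `HybridScorer` module, its layer sizes, and the additive use of the repetition scalar are illustrative assumptions rather than Kumo's actual implementation:

```python
import torch
import torch.nn as nn

class HybridScorer(nn.Module):
    """Illustrative hybrid scorer: MLP scores for in-subgraph items,
    inner-product scores for all other items, plus a learned repetition
    scalar that shifts the in-subgraph ("repeat") scores."""

    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))
        self.repetition = nn.Parameter(torch.zeros(1))  # learned scalar

    def forward(self, user_emb, item_embs, in_subgraph_mask):
        # user_emb: [dim]; item_embs: [num_items, dim]; mask: [num_items] bool
        # (1) Exploration score for every item: inner product with the user.
        explore = item_embs @ user_emb                                 # [num_items]
        # (2) Repeat score: MLP over the (user, item) pair representation.
        pair = torch.cat([user_emb.expand_as(item_embs), item_embs], dim=-1)
        repeat = self.mlp(pair).squeeze(-1) + self.repetition          # [num_items]
        # Use the repeat score wherever the item was sampled in the subgraph.
        return torch.where(in_subgraph_mask, repeat, explore)

# Ranking: sort the scores and take the top K as the recommendation list.
scorer = HybridScorer(dim=64)
scores = scorer(torch.randn(64), torch.randn(1000, 64), torch.zeros(1000, dtype=torch.bool))
topk_items = scores.topk(12).indices
```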
Training and optimization: Maximizing performance with the hybrid GNN
The hybrid GNN is trained end-to-end, optimizing both types of item scores as well as the repetition scalar together to maximize the predictive performance on future user-item interactions. This way, the hybrid GNN figures out user behaviors from the data on its own, producing highly accurate predictions that capture the complex interplay between repetition and exploration.
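A rough sketch of what such end-to-end training could look like for the scorer sketched above; the loss (cross-entropy over the item the user interacted with next) and the optimizer are assumptions made for illustration, not Kumo's documented training objective:

```python
import torch
import torch.nn.functional as F

def training_step(scorer, optimizer, user_emb, item_embs, in_subgraph_mask, target_item):
    """Both score types and the repetition scalar receive gradients through
    the single hybrid scoring function (assumed next-item cross-entropy loss)."""
    scores = scorer(user_emb, item_embs, in_subgraph_mask)      # [num_items]
    loss = F.cross_entropy(scores.unsqueeze(0), target_item)    # target: [1]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with the HybridScorer sketch from the previous section.
scorer = HybridScorer(dim=64)
optimizer = torch.optim.Adam(scorer.parameters(), lr=1e-3)
mask = torch.zeros(1000, dtype=torch.bool)
mask[:20] = True  # pretend the first 20 items were sampled in the user's subgraph
loss = training_step(
    scorer, optimizer,
    user_emb=torch.randn(64),
    item_embs=torch.randn(1000, 64),
    in_subgraph_mask=mask,
    target_item=torch.tensor([7]),
)
```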
Empirical studies: Assessing hybrid GNN performance

The hybrid GNN's performance was tested on the Kaggle H&M recommendation challenge. The challenge called for predicting the top 12 items each user would purchase in the next 7 days, with model performance measured by mean average precision (MAP) @ 12. The dataset contains two years of historical data consisting of 1.4M users, 106K items, and 31.7M interactions between them. The challenge attracted 3,000+ teams that submitted results to the public Kaggle leaderboard over the course of the 3-month competition held in 2022.

Comparing hybrid GNN results to top Kaggle competitors
The following are the results of the hybrid GNN, evaluated on the hidden test set after the competition (Kaggle allows post-competition submissions), compared to the top Kaggle competitor submissions:

| Model | MAP@12 score on Kaggle public leaderboard |
| --- | --- |
| Hybrid GNN | 0.031 |
| Kaggle top 10% | 0.024 |
| Kaggle median | 0.021 |
Ablation studies
To confirm that the hybrid GNN produces better results than traditional GNN approaches, Kumo ran an ablation study in which only one ranking technique was used at prediction time. The hybrid GNN was more than 100% better than the "inner product" approach, which is the standard approach used by two-tower recommendation models.

| Model | MAP@12 | Hybrid GNN improvement |
| --- | --- | --- |
| Approach (1): Use an MLP to score items in the sampled GNN subgraph | 0.023 | 35% better |
| Approach (2): Inner product between the user embedding and item embeddings | 0.015 | 107% better |
Real-world applications: Deploying the hybrid GNN in production
Kumo has deployed the hybrid GNN model architecture to production at many enterprises, resulting in significant boosts in model performance over internal baselines, along with improvements in revenue and customer experience. The following illustrates Kumo's recommendation performance for a large-scale local food delivery service, where the task was to recommend the restaurants each customer is most likely to order from in the next 7 days (out of 600K+ restaurants). The Kumo recommendations powered by the hybrid GNN architecture generated over $100 million in additional sales for the food delivery company.

| Model | MAP@12 score |
| --- | --- |
| Kumo hybrid GNN | 0.32 |
| Approach (1) | 0.31 |
| Approach (2) | 0.27 |
Conclusion: The power of the hybrid GNN in recommendation systems
The hybrid GNN model is a testament to the power of innovative machine learning techniques in handling the intricacies of recommendation systems. By simplifying the recommendation process into a single model that adapts to varied user behaviors, the hybrid GNN not only enhances user satisfaction but also provides a scalable, efficient solution for businesses aiming to personalize their services. As enterprises continue to seek out technologies that can deliver precise recommendations in real time, the hybrid GNN stands out as a beacon of innovation and performance in the data science community and beyond.