
09/09/2024

Why Are Shallow Embeddings Important in Recommendations?

[Figure: users and items forming a graph, as used for predictions]

By Davey Wang and Matthias Fey

In the construction of recommendation systems, embeddings play a crucial role in enhancing performance by capturing the nuanced and complex relationships between users and items. These representations help models understand not just direct interactions but also potential, implicit interactions that have not yet occurred, making recommendations more accurate and personalized.

The importance of sampling negatives

Sampling negatives is the process of selecting non-interacted or non-relevant items for training a recommendation model. In recommendation systems, positive interactions (examples where a user has shown interest in an item by clicking, purchasing, or liking it) are greatly outnumbered by negatives: the absence of an interaction.

Given the sheer quantity of negatives, it is impractical to use all of them, so “negative sampling” is employed to select a manageable subset for training the model (a minimal sketch follows the list below). There are two key reasons for sampling negatives:

  1. Handling imbalance: in most datasets, negative interactions vastly outnumber positive ones. Negative sampling balances the dataset and ensures that the model learns not only from what users like but also from what they ignore.
  2. Improving efficiency: it is computationally expensive to consider all possible negative examples (which could be millions) for each user. Sampling a representative set of negative examples reduces the computational load during training.
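
To make this concrete, here is a minimal sketch of uniform negative sampling. The interaction data, catalog, and function name are hypothetical stand-ins, not anything from Kumo's system; production systems sample from far larger catalogs:

```python
import random

# Hypothetical data: items each user has interacted with, plus the full catalog.
interactions = {"user_a": {"item_1", "item_3"}, "user_b": {"item_2"}}
catalog = {"item_1", "item_2", "item_3", "item_4", "item_5"}

def sample_negatives(user, k):
    """Uniformly sample k items the user has never interacted with."""
    candidates = list(catalog - interactions[user])
    return random.sample(candidates, min(k, len(candidates)))

print(sample_negatives("user_a", 2))  # e.g. ['item_4', 'item_5']
```

Uniform sampling is only the simplest strategy; popularity-weighted and “hard” negative sampling are common refinements.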

The role of shallow embeddings in sampling negatives

Shallow embeddings offer an efficient and scalable solution for handling the negative examples that, as noted above, make up the vast majority of the data in recommendation systems. Because a shallow embedding is essentially a lookup table that maps each user and item ID to a learned vector, the system can sample a large set of these negatives and score them cheaply during training.
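
As a rough illustration of what “shallow” means here (a sketch of the standard matrix-factorization-style setup, not a description of Kumo's internals; all sizes and names are illustrative): each ID indexes a row of a learned table, so scoring a sampled negative is a single dot product with no network forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
num_users, num_items, dim = 1_000, 50_000, 32

# Shallow embeddings: one trainable vector per user and per item.
user_emb = rng.normal(scale=0.1, size=(num_users, dim))
item_emb = rng.normal(scale=0.1, size=(num_items, dim))

def score(user_ids, item_ids):
    """User-item affinity as a dot product of the two embedding rows."""
    return np.sum(user_emb[user_ids] * item_emb[item_ids], axis=-1)

# Score one user against 512 sampled negatives in a single vectorized call.
neg_items = rng.integers(0, num_items, size=512)
neg_scores = score(np.full(512, 7), neg_items)
```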

Why performance matters

Shallow embeddings are particularly effective in handling imbalanced and sparse datasets. In most recommendation systems, a significant portion of user-item interactions are unobserved. Shallow embeddings, leveraging latent factors, can capture general patterns such as:

  • Popularity trends
  • Seasonal variations
  • User preferences for certain categories
  • Demographic appeals

By understanding these general patterns, shallow embeddings enable the model to infer potential interactions—even for negative examples—thereby enhancing the system’s performance. They allow the model to extrapolate to unseen interactions, making the recommendation process more reliable and robust.
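
One common way such latent factors are learned is a pairwise objective like BPR (Bayesian Personalized Ranking), which pushes a user's score for an observed item above that of a sampled negative. The post does not specify Kumo's training objective, so treat this as a generic, minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, lr = 32, 0.05
user_emb = rng.normal(scale=0.1, size=(100, dim))
item_emb = rng.normal(scale=0.1, size=(500, dim))

def bpr_step(u, pos, neg):
    """One SGD step on the BPR loss -log(sigmoid(s(u,pos) - s(u,neg)))."""
    u_vec = user_emb[u].copy()
    diff = item_emb[pos] - item_emb[neg]
    g = 1.0 / (1.0 + np.exp(u_vec @ diff))  # sigmoid of the negated margin
    user_emb[u]   += lr * g * diff
    item_emb[pos] += lr * g * u_vec
    item_emb[neg] -= lr * g * u_vec

bpr_step(u=7, pos=42, neg=311)  # indices are arbitrary examples
```

After training, the same dot product scores any user-item pair, including pairs never observed, which is what allows extrapolation to unseen interactions.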

Scalability in sampling negatives

One of the key benefits of shallow embeddings is their computational efficiency when sampling negative examples. Given the scale of most recommendation systems, this efficiency is critical. Shallow embeddings can process large datasets quickly without the computational burden that comes with more complex deep learning techniques, making them ideal for real-time or large-scale applications.
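
The efficiency claim is easy to see in code. With shallow embeddings, scoring an entire catalog (a hypothetical 100,000-item one below) against a user is a single matrix-vector product, and top candidates fall out of a partial sort:

```python
import numpy as np

rng = np.random.default_rng(0)
num_items, dim = 100_000, 64
item_emb = rng.normal(size=(num_items, dim)).astype(np.float32)
user_vec = rng.normal(size=dim).astype(np.float32)

# One matrix-vector product scores every item; no per-item forward pass.
scores = item_emb @ user_vec

# Top-10 candidates via partial selection, then sort just those 10.
top10 = np.argpartition(scores, -10)[-10:]
top10 = top10[np.argsort(scores[top10])[::-1]]
```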

Shallow vs. deep embeddings

Whereas shallow embeddings are powerful at capturing broad patterns and handling negative examples, pure graph neural network (GNN) embeddings have historically struggled with these signals. This is because deep GNN embeddings typically focus on local similarity—the relationships between items within a user’s direct network or sub-graph.

Shallow embeddings, on the other hand, capture both local and positional similarity (see the toy sketch after this list):

  • Local similarity refers to how items in the immediate neighborhood (or sub-graph) of a user are similar to one another.
  • Positional similarity considers how close items are to one another in the broader graph, even if they are not directly connected. 
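
To give a flavor of how a shallow method picks up positional similarity, here is a toy version of the DeepWalk/Node2Vec idea (our choice of illustration; the post does not name Kumo's method): nodes that co-occur on random walks, i.e. that are close in the graph even without a direct edge, end up with similar embeddings.

```python
import random
import numpy as np

random.seed(0)
# Hypothetical graph as adjacency lists (node -> neighbors).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

def random_walk(start, length):
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

# Count co-occurrences within a small window of each walk; nearby nodes
# co-occur often even when they are not directly connected.
n = len(graph)
cooc = np.zeros((n, n))
for node in graph:
    for _ in range(200):
        walk = random_walk(node, 5)
        for i, u in enumerate(walk):
            for v in walk[i + 1 : i + 3]:
                cooc[u, v] += 1
                cooc[v, u] += 1

# Factorizing the (log) co-occurrence matrix yields shallow embeddings
# whose distances reflect positional similarity.
u_mat, s, _ = np.linalg.svd(np.log1p(cooc))
emb = u_mat[:, :2] * np.sqrt(s[:2])
```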

Learn more about the research on shallow vs. deep embeddings.

Because of this, shallow embeddings provide a more comprehensive representation, which is why they tend to outperform deep GNNs in certain recommendation tasks.

When are shallow embeddings applied?

Kumo applies shallow embeddings depending on the item’s relationship to the user’s sub-graph. For items that belong to the sub-graph of a user, deep GNNs are used to capture the detailed local similarities. However, for items that do not belong to the sub-graph, shallow embeddings are leveraged to capture broader patterns and provide recommendations.
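
The post does not expose Kumo's internals, so the following is only a schematic of the dispatch rule it describes, with every name hypothetical: deep GNN scores inside the user's sub-graph, shallow-embedding scores outside it.

```python
def score_item(user, item, subgraph_items, gnn_score, shallow_score):
    """Hybrid scoring: deep GNN for items inside the user's sub-graph,
    shallow embeddings for everything else (schematic, not Kumo's API)."""
    if item in subgraph_items[user]:
        return gnn_score(user, item)    # local similarity via message passing
    return shallow_score(user, item)    # broad, positional similarity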

Learn more about hybrid GNNs.

Conclusion

By offering both deep and shallow embeddings, Kumo provides the key enablers of performant and scalable recommendation systems. Balancing local and positional similarity while handling negative sampling efficiently makes it easier to deliver accurate, timely, and relevant recommendations in large-scale environments.

Interested in learning more? Request a demo to get started.