Solution Background and Business Value
Item-to-item recommendations help users discover products similar to what they are currently viewing or have purchased. This technique powers features like “You might also like” and “Frequently bought together” on e-commerce platforms. These recommendations:
- Increase customer engagement by surfacing relevant products.
- Boost conversion rates by promoting co-purchased or similar items.
- Enhance personalization by learning from user behavior patterns.
Data Requirements and Schema
To train an item-to-item recommendation model, we need structured data that captures relationships between items based on user behavior.
Core Tables
- Purchase Item Pairs Table
  - Captures item co-occurrence based on user behavior (e.g., items bought in the same order or session).
  - Key attributes:
    - item_id_lhs: The primary item (left-hand side).
    - item_id_rhs: The similar/co-purchased item (right-hand side).
    - Optional: Purchase session, timestamps, user interactions.
- Items Table (LHS & RHS)
  - Represents unique items in the dataset (even if LHS and RHS contain the same data, Kumo requires them as separate tables).
  - Key attributes:
    - item_id: Unique product identifier.
    - Optional: Category, price, brand, description, image embeddings.
- Users Table (Optional)
  - Stores customer details that can improve recommendations.
  - Key attributes:
    - user_id: Unique identifier.
    - Optional: Location, demographics, past purchase patterns.
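As a rough illustration of how the tables above could be derived, the sketch below builds item pairs from a raw order log and produces separate LHS and RHS item tables. It is a minimal sketch under assumptions: the order log layout, the order_id column, and the function names build_item_pairs and build_item_tables are hypothetical and not part of any Kumo API. Only item_id_lhs, item_id_rhs, and item_id come from the schema described above.

```python
from itertools import permutations

# Hypothetical order log: one (order_id, item_id) row per purchased item.
order_lines = [
    ("o1", "shirt"),
    ("o1", "jeans"),
    ("o1", "belt"),
    ("o2", "shirt"),
    ("o2", "jeans"),
    ("o3", "socks"),
]


def build_item_pairs(order_lines):
    """Return one row per ordered pair of distinct items bought in the same order."""
    # Group items by order; a set drops duplicate items within one order.
    baskets = {}
    for order_id, item_id in order_lines:
        baskets.setdefault(order_id, set()).add(item_id)

    pairs = []
    for order_id, items in baskets.items():
        # permutations yields both (a, b) and (b, a), so every item
        # can appear on the left-hand and the right-hand side.
        for lhs, rhs in permutations(sorted(items), 2):
            pairs.append(
                {"order_id": order_id, "item_id_lhs": lhs, "item_id_rhs": rhs}
            )
    return pairs


def build_item_tables(order_lines):
    """Return two separate item tables (LHS and RHS) holding the same item ids."""
    item_ids = sorted({item_id for _, item_id in order_lines})
    items_lhs = [{"item_id": i} for i in item_ids]
    items_rhs = [{"item_id": i} for i in item_ids]
    return items_lhs, items_rhs


pairs = build_item_pairs(order_lines)
items_lhs, items_rhs = build_item_tables(order_lines)
```

Emitting both orderings of each pair is a design choice for this sketch; if only one direction matters (for example, "bought after"), the pairing rule would need to change accordingly.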
Predictive Queries
The following predictive query ranks the top 20 most relevant items for each product:
Deployment Strategy
Batch Recommendations in a Key-Value Store
A common way to serve item recommendations is to precompute them and store them in a key-value store for low-latency retrieval:
- Personalized recommendations based on the user’s session.
- Real-time filtering to exclude items already viewed by the user.
- Cache top-10 recommendations for each item in a key-value store.
- Store LHS and RHS embeddings in a vector database.
- During a session:
  - Show cached recommendations first.
  - If the user exhausts cached items, backfill recommendations dynamically using dot product similarity.
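The serving flow described above can be sketched in a few lines. This is a minimal sketch under assumptions, not a reference implementation: plain Python dicts stand in for the key-value store and the vector database, the embedding values are made up, and the names cached_top_k, lhs_embeddings, rhs_embeddings, and recommend are hypothetical. A production vector database would typically run an approximate nearest-neighbor search instead of the exhaustive scan shown here.

```python
# Hypothetical stand-in for the key-value store: item_id -> cached recommendations.
cached_top_k = {"shirt": ["jeans", "belt"]}

# Hypothetical stand-ins for the vector database (made-up 2-d embeddings).
lhs_embeddings = {"shirt": [1.0, 0.0]}
rhs_embeddings = {
    "jeans": [0.9, 0.1],
    "belt": [0.8, 0.2],
    "socks": [0.5, 0.5],
    "hat": [0.1, 0.9],
    "shirt": [1.0, 0.0],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend(item_id, viewed, n):
    """Serve cached recommendations first, then backfill by dot product score."""
    # Never recommend the item itself or anything the user already viewed.
    excluded = set(viewed) | {item_id}

    # Step 1: cached recommendations, filtered in real time.
    results = [r for r in cached_top_k.get(item_id, []) if r not in excluded][:n]

    # Step 2: if the cache is exhausted, backfill from the embeddings.
    if len(results) < n and item_id in lhs_embeddings:
        query = lhs_embeddings[item_id]
        candidates = [
            c for c in rhs_embeddings if c not in excluded and c not in results
        ]
        candidates.sort(key=lambda c: dot(query, rhs_embeddings[c]), reverse=True)
        results.extend(candidates[: n - len(results)])
    return results
```

For example, recommend("shirt", viewed=["jeans"], n=3) drops the viewed item from the cached list and fills the remaining slots from the embedding scores.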