The business problem
Search is the primary navigation mechanism for e-commerce, knowledge bases, and content platforms. By common industry estimates, around 70% of e-commerce revenue passes through the search box, so even a 1% improvement in search relevance translates directly into higher conversion rates, longer engagement, and increased revenue. The challenge: different users searching the same query want different results.
Why flat ML fails
- No personalization context: learning-to-rank (LTR) models score query-document pairs using features like BM25, TF-IDF, and click-through rate. They miss user-specific relevance: a developer and a zoologist searching "python" need different results.
- No document relationships: Documents exist in a topic graph. Understanding that Document A and B are related helps rank B when A was clicked for a similar query.
- No session context: The sequence of queries in a session refines intent. “laptop” followed by “laptop 16 inch” narrows the intent, but flat models treat each query independently.
- Cold-start documents: New documents with no click history get poor rankings. Graph connections to existing documents provide initial relevance estimates.
The relational schema
Node types:
User (id, history_emb, segment, platform)
Query (id, text_emb, intent_category)
Document (id, content_emb, category, freshness)
Edge types:
User --[issued]--> Query (timestamp)
Query --[clicked]--> Document (position, dwell_time)
Query --[skipped]--> Document (position)
Document --[related_to]--> Document (topic_similarity)
User --[bookmarked]--> Document (timestamp)

The tripartite graph connects users to queries to documents. Click and skip edges provide relevance supervision. Document-document edges capture topical relationships.
PyG architecture: SAGEConv for personalized ranking
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv, HeteroConv, Linear

class SearchGNN(torch.nn.Module):
    def __init__(self, hidden_dim=128):
        super().__init__()
        # Lazy linear layers project each node type's raw features to hidden_dim
        self.user_lin = Linear(-1, hidden_dim)
        self.query_lin = Linear(-1, hidden_dim)
        self.doc_lin = Linear(-1, hidden_dim)
        self.conv1 = HeteroConv({
            ('user', 'issued', 'query'): SAGEConv(hidden_dim, hidden_dim),
            ('query', 'clicked', 'document'): SAGEConv(hidden_dim, hidden_dim),
            ('document', 'related_to', 'document'): SAGEConv(hidden_dim, hidden_dim),
            ('user', 'bookmarked', 'document'): SAGEConv(hidden_dim, hidden_dim),
        }, aggr='sum')
        self.conv2 = HeteroConv({
            ('user', 'issued', 'query'): SAGEConv(hidden_dim, hidden_dim),
            ('query', 'clicked', 'document'): SAGEConv(hidden_dim, hidden_dim),
            ('document', 'related_to', 'document'): SAGEConv(hidden_dim, hidden_dim),
        }, aggr='sum')

    def encode(self, x_dict, edge_index_dict):
        x_dict['user'] = self.user_lin(x_dict['user'])
        x_dict['query'] = self.query_lin(x_dict['query'])
        x_dict['document'] = self.doc_lin(x_dict['document'])
        # HeteroConv returns only destination node types; 'user' is always a
        # source here, so carry its embedding through each layer explicitly.
        user_x = x_dict['user']
        x_dict = {k: F.relu(v) for k, v in
                  self.conv1(x_dict, edge_index_dict).items()}
        x_dict['user'] = user_x
        x_dict = self.conv2(x_dict, edge_index_dict)
        x_dict['user'] = user_x
        return x_dict

    def rank(self, user_emb, query_emb, doc_embs):
        # Combine user and query for personalized ranking
        joint = F.normalize(user_emb + query_emb, dim=-1)
        scores = joint @ doc_embs.T
        return scores

User and query embeddings combine for personalized ranking. Document embeddings are precomputed. Ranking is a dot product, enabling sub-10ms serving.
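At serving time the expensive GNN work is offline. A rough sketch of the online path, using only the fused dot-product scoring (the embedding dimension and corpus size are made-up example values):

```python
import torch
import torch.nn.functional as F

# Document embeddings precomputed offline (e.g., by an encoder like SearchGNN.encode)
doc_embs = F.normalize(torch.randn(10_000, 128), dim=-1)  # [num_docs, hidden_dim]

# Online: look up the user's and query's embeddings, fuse, and score with one matmul
user_emb = torch.randn(128)
query_emb = torch.randn(128)
joint = F.normalize(user_emb + query_emb, dim=-1)

scores = joint @ doc_embs.T              # [num_docs] relevance scores
top_scores, top_docs = scores.topk(10)   # indices of the top-10 documents
```

Because the online step is a single matrix-vector product plus a top-k, it can also be delegated to an approximate nearest-neighbor index for larger corpora.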
Expected performance
Search ranking is measured by NDCG@10 (Normalized Discounted Cumulative Gain), not AUROC:
- BM25 (baseline): ~0.55 NDCG@10
- LightGBM LTR (flat): ~0.65 NDCG@10
- GNN (personalized): ~0.75 NDCG@10
- KumoRFM (zero-shot): ~0.76 NDCG@10
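For reference, NDCG@10 scores a ranked list of graded relevance labels against the ideal ordering of those same labels; a minimal implementation:

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for a ranked list of graded relevance labels (rank order = list order)."""
    def dcg(rels):
        # Discounted cumulative gain: gains decay logarithmically with position
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A perfect ranking scores 1.0; burying relevant documents is penalized by the log-position discount, which is why small relevance gains at the top of the list move the metric.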
Or use KumoRFM in one line
PREDICT relevance FOR query, document
USING user, query, document, click_log

One PQL query. KumoRFM learns personalized relevance from the interaction graph for search ranking.