The business problem
Search is the primary navigation mechanism for e-commerce, knowledge bases, and content platforms. By common industry estimates, around 70% of e-commerce revenue passes through the search box, so even a 1% improvement in search relevance translates directly into higher conversion rates, longer engagement, and increased revenue. The challenge: different users searching the same query want different results.
Why flat ML fails
- No personalization context: learning-to-rank (LTR) models score query-document pairs using features like BM25, TF-IDF, and click-through rate. They miss user-specific relevance: a developer and a zoologist searching "python" need different results.
- No document relationships: Documents exist in a topic graph. Understanding that Document A and B are related helps rank B when A was clicked for a similar query.
- No session context: The sequence of queries in a session refines intent. “laptop” followed by “laptop 16 inch” narrows the intent, but flat models treat each query independently.
- Cold-start documents: New documents with no click history get poor rankings. Graph connections to existing documents provide initial relevance estimates.
The relational schema
Node types:
User (id, history_emb, segment, platform)
Query (id, text_emb, intent_category)
Document (id, content_emb, category, freshness)
Edge types:
User --[issued]--> Query (timestamp)
Query --[clicked]--> Document (position, dwell_time)
Query --[skipped]--> Document (position)
Document --[related_to]--> Document (topic_similarity)
User --[bookmarked]--> Document (timestamp)

The tripartite graph connects users to queries to documents. Click and skip edges provide relevance supervision. Document-document edges capture topical relationships.
PyG architecture: SAGEConv for personalized ranking
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv, HeteroConv, Linear

class SearchGNN(torch.nn.Module):
    def __init__(self, hidden_dim=128):
        super().__init__()
        # Lazy linear layers project each node type's raw features to hidden_dim
        self.user_lin = Linear(-1, hidden_dim)
        self.query_lin = Linear(-1, hidden_dim)
        self.doc_lin = Linear(-1, hidden_dim)
        self.conv1 = HeteroConv({
            ('user', 'issued', 'query'): SAGEConv(hidden_dim, hidden_dim),
            ('query', 'clicked', 'document'): SAGEConv(hidden_dim, hidden_dim),
            ('document', 'related_to', 'document'): SAGEConv(hidden_dim, hidden_dim),
            ('user', 'bookmarked', 'document'): SAGEConv(hidden_dim, hidden_dim),
        }, aggr='sum')
        self.conv2 = HeteroConv({
            ('user', 'issued', 'query'): SAGEConv(hidden_dim, hidden_dim),
            ('query', 'clicked', 'document'): SAGEConv(hidden_dim, hidden_dim),
            ('document', 'related_to', 'document'): SAGEConv(hidden_dim, hidden_dim),
        }, aggr='sum')

    def encode(self, x_dict, edge_index_dict):
        x_dict['user'] = self.user_lin(x_dict['user'])
        x_dict['query'] = self.query_lin(x_dict['query'])
        x_dict['document'] = self.doc_lin(x_dict['document'])
        # HeteroConv returns only destination node types; 'user' is always a
        # source here, so carry its embedding through each layer explicitly.
        user_x = x_dict['user']
        x_dict = {k: F.relu(v) for k, v in
                  self.conv1(x_dict, edge_index_dict).items()}
        x_dict['user'] = user_x
        x_dict = self.conv2(x_dict, edge_index_dict)
        x_dict['user'] = user_x
        return x_dict

    def rank(self, user_emb, query_emb, doc_embs):
        # Combine user and query for personalized ranking
        joint = F.normalize(user_emb + query_emb, dim=-1)
        scores = joint @ doc_embs.T
        return scores

User and query embeddings combine for personalized ranking. Document embeddings are precomputed. Ranking is a dot product, enabling sub-10ms serving.
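At serving time the expensive GNN work is offline. A rough sketch of the online path, using only the fused dot-product scoring (the embedding dimension and corpus size are made-up example values):

```python
import torch
import torch.nn.functional as F

# Document embeddings precomputed offline (e.g., by an encoder like SearchGNN.encode)
doc_embs = F.normalize(torch.randn(10_000, 128), dim=-1)  # [num_docs, hidden_dim]

# Online: look up the user's and query's embeddings, fuse, and score with one matmul
user_emb = torch.randn(128)
query_emb = torch.randn(128)
joint = F.normalize(user_emb + query_emb, dim=-1)

scores = joint @ doc_embs.T              # [num_docs] relevance scores
top_scores, top_docs = scores.topk(10)   # indices of the top-10 documents
```

Because the online step is a single matrix-vector product plus a top-k, it can also be delegated to an approximate nearest-neighbor index for larger corpora.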
Expected performance
Search ranking is measured by NDCG@10 (Normalized Discounted Cumulative Gain), not AUROC:
- BM25 (baseline): ~0.55 NDCG@10
- LightGBM LTR (flat): ~0.65 NDCG@10
- GNN (personalized): ~0.75 NDCG@10
- KumoRFM (zero-shot): ~0.76 NDCG@10
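For reference, NDCG@10 scores a ranked list of graded relevance labels against the ideal ordering of those same labels; a minimal implementation:

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for a ranked list of graded relevance labels (rank order = list order)."""
    def dcg(rels):
        # Discounted cumulative gain: gains decay logarithmically with position
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A perfect ranking scores 1.0; burying relevant documents is penalized by the log-position discount, which is why small relevance gains at the top of the list move the metric.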
Or use KumoRFM in one line
PREDICT relevance FOR query, document
USING user, query, document, click_log

One PQL query. KumoRFM learns personalized relevance from the interaction graph for search ranking.