Original Paper
Modeling Relational Data with Graph Convolutional Networks
Schlichtkrull et al. (2017). ESWC 2018
What RGCNConv does
RGCNConv extends GCNConv with one critical addition: separate weight matrices per relation type. For each node:
- Group incoming messages by edge type (relation)
- Apply a relation-specific weight matrix to each group
- Sum across all relation types
- Add the node's own self-loop transformation
This means the model learns that “purchased” edges transform features differently than “reviewed” edges. A customer who purchased 10 items has a different signal from one who reviewed 10 items, even if the numerical features are similar.
The math (simplified)
h_i' = W_0 · h_i + Σ_r ( 1/c_i,r · Σ_{j in N_r(i)} W_r · h_j )
Where:
W_r = weight matrix specific to relation type r
W_0 = self-loop weight matrix
N_r(i) = neighbors of i via relation type r
c_i,r = normalization constant (typically |N_r(i)|)
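The update above can be sketched directly in a few lines of PyTorch. This is a minimal dense-tensor illustration of the formula, not how PyG implements it internally; the names (`adj`, `num_nodes`, etc.) and shapes are made up for the example.

```python
import torch

# Dense sketch of the simplified RGCN update above.
torch.manual_seed(0)
num_nodes, num_relations, dim = 4, 2, 8

h = torch.randn(num_nodes, dim)            # node features h_i
W0 = torch.randn(dim, dim)                 # self-loop weight W_0
W = torch.randn(num_relations, dim, dim)   # one W_r per relation type
# adj[r, i, j] = 1 if there is an edge j -> i with relation r
adj = (torch.rand(num_relations, num_nodes, num_nodes) > 0.5).float()

out = h @ W0.T                             # W_0 · h_i (self-loop term)
for r in range(num_relations):
    c = adj[r].sum(dim=1, keepdim=True).clamp(min=1)  # c_{i,r} = |N_r(i)|
    out = out + (adj[r] @ (h @ W[r].T)) / c           # normalized sum of W_r · h_j
```

Note that the per-relation sum is normalized by each node's degree within that relation, so a node with many "purchased" neighbors does not drown out its few "reviewed" neighbors.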
With basis decomposition (reduces parameters):
W_r = Σ_b a_r,b · B_b
B_b = shared basis matrix
a_r,b = relation-specific coefficient for basis b

Each relation type gets its own transformation. Basis decomposition keeps parameters manageable when you have dozens or hundreds of relation types.
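A quick sketch of how the decomposition is assembled (shapes here are illustrative; RGCNConv does this internally when num_bases is set):

```python
import torch

# Basis decomposition: each W_r is a learned mixture of shared bases B_b.
num_relations, num_bases, dim = 50, 10, 64

B = torch.randn(num_bases, dim, dim)       # shared basis matrices B_b
a = torch.randn(num_relations, num_bases)  # coefficients a_{r,b}

# W_r = Σ_b a_{r,b} · B_b, computed for all relations at once
W = torch.einsum('rb,bij->rij', a, B)

# Parameters: 10·64·64 + 50·10 = 41,460, vs. 50·64·64 = 204,800
# for fully independent per-relation matrices.
```

Only the coefficients scale with the number of relations, so adding a new relation type costs num_bases extra parameters per layer rather than a full dim × dim matrix.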
PyG implementation
import torch
import torch.nn.functional as F
from torch_geometric.nn import RGCNConv
class RGCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels,
                 num_relations, num_bases=30):
        super().__init__()
        self.conv1 = RGCNConv(in_channels, hidden_channels,
                              num_relations=num_relations,
                              num_bases=num_bases)
        self.conv2 = RGCNConv(hidden_channels, out_channels,
                              num_relations=num_relations,
                              num_bases=num_bases)

    def forward(self, x, edge_index, edge_type):
        x = self.conv1(x, edge_index, edge_type)
        x = F.relu(x)
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index, edge_type)
        return x

# edge_type is a 1D tensor with the relation type for each edge
# e.g., 0=purchased, 1=reviewed, 2=returned
model = RGCN(in_channels=64, hidden_channels=64, out_channels=num_classes,
             num_relations=3, num_bases=10)
out = model(x, edge_index, edge_type)

edge_type is a LongTensor mapping each edge to its relation type. num_bases controls the basis decomposition (leave it at its default of None to use fully independent per-relation weights).
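To make the input format concrete, here is one way to build edge_index and edge_type from a list of typed edges. The node ids and edges below are made up for illustration; the relation ids follow the mapping in the comment above.

```python
import torch

# Each tuple is (source node, target node, relation id);
# relation ids: 0=purchased, 1=reviewed, 2=returned.
edges = [
    (0, 3, 0),  # user 0 purchased item 3
    (1, 3, 1),  # user 1 reviewed item 3
    (0, 2, 2),  # user 0 returned item 2
]
src, dst, rel = zip(*edges)
edge_index = torch.tensor([src, dst], dtype=torch.long)  # shape [2, num_edges]
edge_type = torch.tensor(rel, dtype=torch.long)          # shape [num_edges]
```

Both tensors must have the same edge ordering: edge_type[k] labels the edge stored in edge_index[:, k].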
When to use RGCNConv
- Knowledge graph completion. Predicting missing links in knowledge graphs (FB15k-237, NELL) where edges have distinct semantic types (is-a, part-of, located-in).
- Relational databases as graphs. Enterprise data has multiple table types connected by different foreign key relationships. RGCNConv is the natural first approach for learning on this structure.
- Entity classification in heterogeneous networks. Classifying nodes (users, products, transactions) in a graph with multiple relationship types.
- As a baseline for heterogeneous GNNs. Before trying HGTConv or HANConv, establish an RGCN baseline. It is simpler and the performance gap may be small for your task.
When not to use RGCNConv
1. Too many relation types
Even with basis decomposition, RGCNConv scales linearly with the number of relation types. If you have hundreds of edge types, HGTConv (which uses type-specific projections more efficiently) or HeteroConv (which wraps simpler layers per type) may be more practical.
2. When you need attention
RGCNConv treats all neighbors of the same relation type equally (like GCNConv within each type). For tasks where neighbor importance varies within a relation type, use RGATConv or HGTConv.
3. Homogeneous graphs
If all edges are the same type, RGCNConv reduces to GCNConv with extra overhead. Use GCNConv or GATConv directly.
How KumoRFM builds on this
RGCNConv is the direct ancestor of KumoRFM's relational architecture. The core idea (type-specific transformations) is identical. KumoRFM extends it with:
- Automatic schema mapping: Your database schema becomes the graph structure automatically. No manual definition of node types, edge types, or features.
- Attention within relation types: Not all purchases are equal. KumoRFM adds attention (from TransformerConv) within each relation type.
- Temporal awareness: RGCN treats all edges as static. KumoRFM encodes when each edge was created, capturing recency and temporal patterns.
- Pre-trained relational knowledge: Like a foundation model for relational data, KumoRFM transfers patterns learned from diverse enterprise datasets.