
RGCNConv: The Foundation for Heterogeneous Graph Learning

Real-world data is relational. Customers purchase products, review merchants, and return items. Each relationship means something different. RGCNConv was the first GNN layer to handle this by learning separate transformations per edge type. Here is how it works and where it fits in the heterogeneous GNN landscape.

PyTorch Geometric

TL;DR

  • RGCNConv learns a separate weight matrix for each edge type, so 'purchased' edges use different parameters than 'reviewed' edges. This is essential for knowledge graphs and relational databases.
  • Basis decomposition reduces the parameter count when you have many relation types by expressing each relation's weight matrix as a combination of shared basis matrices.
  • It is the foundational layer for heterogeneous GNNs: HGTConv, HANConv, and HeteroConv all build on RGCN's core idea of type-specific transformations.
  • Use RGCNConv as a baseline for heterogeneous graphs with typed edges. Upgrade to HGTConv when you also need attention, or HeteroConv for maximum flexibility.
  • KumoRFM's Relational Graph Transformer extends RGCN's type-specific approach with attention, temporal encodings, and automatic schema mapping from your database.

Original Paper

Modeling Relational Data with Graph Convolutional Networks

Schlichtkrull et al. (2017). ESWC 2018


What RGCNConv does

RGCNConv extends GCNConv with one critical addition: separate weight matrices per relation type. For each node:

  1. Group incoming messages by edge type (relation)
  2. Apply a relation-specific weight matrix to each group
  3. Sum across all relation types
  4. Add the node's own self-loop transformation

This means the model learns that “purchased” edges transform features differently than “reviewed” edges. A customer who purchased 10 items has a different signal from one who reviewed 10 items, even if the numerical features are similar.
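The four steps above can be sketched in a few lines of plain Python. This is a toy illustration, not PyG code: node features are scalars and the per-relation "weight matrices" are scalar weights, and the function name rgcn_update is invented for this example.

```python
# Toy sketch of RGCNConv's per-relation aggregation.
# h: list of scalar node features; edges: list of (src, dst, rel) triples.

def rgcn_update(h, edges, rel_weights, self_weight):
    out = [self_weight * hi for hi in h]            # step 4: self-loop term
    for i in range(len(h)):
        # Step 1: group incoming messages by relation type.
        by_rel = {}
        for src, dst, rel in edges:
            if dst == i:
                by_rel.setdefault(rel, []).append(h[src])
        # Steps 2-3: relation-specific transform, normalize by degree, sum.
        for rel, msgs in by_rel.items():
            out[i] += rel_weights[rel] * sum(msgs) / len(msgs)
    return out

h = [1.0, 2.0, 3.0]
edges = [(1, 0, 0), (2, 0, 1)]   # node 0 receives via relations 0 and 1
print(rgcn_update(h, edges, rel_weights={0: 0.5, 1: -1.0}, self_weight=1.0))
# → [-1.0, 2.0, 3.0]
```

Note how node 0's update mixes the two relations with different weights, while nodes 1 and 2, which receive no messages, keep only their self-loop term.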

The math (simplified)

RGCNConv formula
h_i' = W_0 · h_i + Σ_r ( 1/c_i,r · Σ_{j in N_r(i)} W_r · h_j )

Where:
  W_r     = weight matrix specific to relation type r
  W_0     = self-loop weight matrix
  N_r(i)  = neighbors of i via relation type r
  c_i,r   = normalization constant (typically |N_r(i)|)

With basis decomposition (reduces parameters):
  W_r = Σ_b a_r,b · B_b
  B_b = shared basis matrix
  a_r,b = relation-specific coefficient for basis b

Each relation type gets its own transformation. Basis decomposition keeps parameters manageable when you have dozens or hundreds of relation types.
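To make the savings concrete, here is a quick parameter-count comparison using assumed toy sizes (a 64-dimensional hidden state, 100 relation types, 10 bases; the numbers are illustrative, not from any benchmark):

```python
# Parameter count: full per-relation matrices vs. basis decomposition.
d = 64    # hidden dimension
R = 100   # number of relation types
B = 10    # number of shared basis matrices

full = R * d * d            # one d x d weight matrix per relation
basis = B * d * d + R * B   # B shared bases + R*B scalar coefficients
print(full, basis)          # → 409600 41960
```

Roughly a 10x reduction here; the gap widens as the number of relations grows because only the small coefficient table scales with R.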

PyG implementation

rgcn_model.py
import torch
import torch.nn.functional as F
from torch_geometric.nn import RGCNConv

class RGCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels,
                 num_relations, num_bases=30):
        super().__init__()
        self.conv1 = RGCNConv(in_channels, hidden_channels,
                              num_relations=num_relations,
                              num_bases=num_bases)
        self.conv2 = RGCNConv(hidden_channels, out_channels,
                              num_relations=num_relations,
                              num_bases=num_bases)

    def forward(self, x, edge_index, edge_type):
        x = self.conv1(x, edge_index, edge_type)
        x = F.relu(x)
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index, edge_type)
        return x

# edge_type is a 1D tensor with the relation type for each edge
# e.g., 0=purchased, 1=reviewed, 2=returned
model = RGCN(in_channels=64, hidden_channels=64, out_channels=num_classes,
             num_relations=3, num_bases=10)
out = model(x, edge_index, edge_type)  # x, edge_index, edge_type, num_classes from your dataset

edge_type is a LongTensor mapping each edge to its relation type. num_bases controls the basis decomposition; leave it at its default of None to skip decomposition and learn a full weight matrix per relation.

When to use RGCNConv

  • Knowledge graph completion. Predicting missing links in knowledge graphs (FB15k-237, NELL) where edges have distinct semantic types (is-a, part-of, located-in).
  • Relational databases as graphs. Enterprise data has multiple table types connected by different foreign key relationships. RGCNConv is the natural first approach for learning on this structure.
  • Entity classification in heterogeneous networks. Classifying nodes (users, products, transactions) in a graph with multiple relationship types.
  • As a baseline for heterogeneous GNNs. Before trying HGTConv or HANConv, establish an RGCN baseline. It is simpler and the performance gap may be small for your task.

When not to use RGCNConv

1. Too many relation types

Even with basis decomposition, RGCNConv scales linearly with the number of relation types. If you have hundreds of edge types, HGTConv (which uses type-specific projections more efficiently) or HeteroConv (which wraps simpler layers per type) may be more practical.

2. When you need attention

RGCNConv treats all neighbors of the same relation type equally (like GCNConv within each type). For tasks where neighbor importance varies within a relation type, use RGATConv or HGTConv.

3. Homogeneous graphs

If all edges are the same type, RGCNConv reduces to GCNConv with extra overhead. Use GCNConv or GATConv directly.

How KumoRFM builds on this

RGCNConv is the direct ancestor of KumoRFM's relational architecture. The core idea (type-specific transformations) is identical. KumoRFM extends it with:

  • Automatic schema mapping: Your database schema becomes the graph structure automatically. No manual definition of node types, edge types, or features.
  • Attention within relation types: Not all purchases are equal. KumoRFM adds attention (from TransformerConv) within each relation type.
  • Temporal awareness: RGCN treats all edges as static. KumoRFM encodes when each edge was created, capturing recency and temporal patterns.
  • Pre-trained relational knowledge: Like a foundation model for relational data, KumoRFM transfers patterns learned from diverse enterprise datasets.

Frequently asked questions

What is RGCNConv in PyTorch Geometric?

RGCNConv implements the Relational Graph Convolutional Network from Schlichtkrull et al. (2017). It extends GCNConv to handle multiple edge types by learning a separate weight matrix for each relation type. This makes it the foundational layer for knowledge graphs, relational databases, and any heterogeneous graph with typed edges.

How does RGCNConv handle multiple edge types?

RGCNConv maintains a separate weight matrix W_r for each relation type r. During aggregation, messages from neighbors connected via relation r are transformed by W_r. The node's representation is updated by summing over all relation-specific aggregations. This lets the model learn that 'purchased' edges mean something different from 'reviewed' edges.

What is basis decomposition in RGCNConv?

With many relation types, having a full weight matrix per relation causes parameter explosion. Basis decomposition addresses this by expressing each relation's weight matrix as a linear combination of a small number of shared basis matrices: W_r = Σ_b a_r,b · B_b. This reduces parameters from O(R·d²) to O(B·d² + R·B), where B is the number of bases.
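A pure-Python toy showing a single 2x2 weight matrix rebuilt from two shared bases; the basis matrices and coefficients are made up for illustration.

```python
# W_r = sum_b a_r,b * B_b, with two 2x2 bases.
B_bases = [
    [[1.0, 0.0], [0.0, 1.0]],   # basis 0: identity
    [[0.0, 1.0], [1.0, 0.0]],   # basis 1: swap
]
a_r = [0.5, 2.0]                # coefficients for relation r

W_r = [[sum(a * B[i][j] for a, B in zip(a_r, B_bases)) for j in range(2)]
       for i in range(2)]
print(W_r)   # → [[0.5, 2.0], [2.0, 0.5]]
```

Each relation stores only its coefficient vector (here two numbers) instead of a full matrix; the bases are shared across all relations.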

When should I use RGCNConv vs HGTConv?

Use RGCNConv for simpler heterogeneous graphs with a moderate number of edge types (under 20) where you want a straightforward baseline. Use HGTConv for complex heterogeneous graphs where you also need attention over neighbors and type-specific transformations for both nodes and edges. HGTConv is more expressive but more complex.

Can RGCNConv be used for link prediction in knowledge graphs?

Yes. RGCNConv is commonly used as an encoder for knowledge graph completion. It generates node embeddings that capture multi-relational structure. These embeddings are then fed to a decoder (DistMult, TransE, etc.) that scores candidate triples. This encoder-decoder approach outperforms shallow embedding methods on benchmarks like FB15k-237.
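As a minimal sketch of the decoder side, here is DistMult scoring on toy embeddings. The vectors are invented for illustration; DistMult's diagonal relation matrix is represented as a plain vector, so the score is an element-wise triple product.

```python
# DistMult: score(s, r, o) = sum_k  e_s[k] * w_r[k] * e_o[k]

def distmult_score(e_s, w_r, e_o):
    return sum(s * r * o for s, r, o in zip(e_s, w_r, e_o))

e_subject  = [1.0, 0.5]   # RGCN embedding of the subject entity
w_relation = [2.0, 1.0]   # learned diagonal for the relation
e_object   = [0.5, 2.0]   # RGCN embedding of the object entity

print(distmult_score(e_subject, w_relation, e_object))   # → 2.0
```

In the full encoder-decoder setup, the RGCN produces e_s and e_o, and candidate triples are ranked by this score; DistMult is symmetric in subject and object, which is one reason asymmetric decoders like TransE are sometimes preferred.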

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.