
RGCNConv: The Foundation for Heterogeneous Graph Learning

Real-world data is relational. Customers purchase products, review merchants, and return items. Each relationship means something different. RGCNConv was the first GNN layer to handle this by learning separate transformations per edge type. Here is how it works and where it fits in the heterogeneous GNN landscape.

PyTorch Geometric

TL;DR

  • RGCNConv learns a separate weight matrix for each edge type, so 'purchased' edges use different parameters than 'reviewed' edges. This is essential for knowledge graphs and relational databases.
  • Basis decomposition reduces the parameter count when you have many relation types by expressing each relation's weight matrix as a combination of shared basis matrices.
  • It is the foundational layer for heterogeneous GNNs: HGTConv, HANConv, and HeteroConv all build on RGCN's core idea of type-specific transformations.
  • Use RGCNConv as a baseline for heterogeneous graphs with typed edges. Upgrade to HGTConv when you also need attention, or HeteroConv for maximum flexibility.
  • KumoRFM's Relational Graph Transformer extends RGCN's type-specific approach with attention, temporal encodings, and automatic schema mapping from your database.

Original Paper

Modeling Relational Data with Graph Convolutional Networks

Schlichtkrull et al. (2017). ESWC 2018


What RGCNConv does

RGCNConv extends GCNConv with one critical addition: separate weight matrices per relation type. For each node:

  1. Group incoming messages by edge type (relation)
  2. Apply a relation-specific weight matrix to each group
  3. Sum across all relation types
  4. Add the node's own self-loop transformation

This means the model learns that “purchased” edges transform features differently than “reviewed” edges. A customer who purchased 10 items has a different signal from one who reviewed 10 items, even if the numerical features are similar.
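The four steps above can be sketched in a few lines of plain Python. This is a toy illustration, not PyG code: node features are scalars and the per-relation "weight matrices" are scalar weights, and the function name rgcn_update is invented for this example.

```python
# Toy sketch of RGCNConv's per-relation aggregation.
# h: list of scalar node features; edges: list of (src, dst, rel) triples.

def rgcn_update(h, edges, rel_weights, self_weight):
    out = [self_weight * hi for hi in h]            # step 4: self-loop term
    for i in range(len(h)):
        # Step 1: group incoming messages by relation type.
        by_rel = {}
        for src, dst, rel in edges:
            if dst == i:
                by_rel.setdefault(rel, []).append(h[src])
        # Steps 2-3: relation-specific transform, normalize by degree, sum.
        for rel, msgs in by_rel.items():
            out[i] += rel_weights[rel] * sum(msgs) / len(msgs)
    return out

h = [1.0, 2.0, 3.0]
edges = [(1, 0, 0), (2, 0, 1)]   # node 0 receives via relations 0 and 1
print(rgcn_update(h, edges, rel_weights={0: 0.5, 1: -1.0}, self_weight=1.0))
# → [-1.0, 2.0, 3.0]
```

Note how node 0's update mixes the two relations with different weights, while nodes 1 and 2, which receive no messages, keep only their self-loop term.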

The math (simplified)

RGCNConv formula
h_i' = W_0 · h_i + Σ_r ( 1/c_i,r · Σ_{j in N_r(i)} W_r · h_j )

Where:
  W_r     = weight matrix specific to relation type r
  W_0     = self-loop weight matrix
  N_r(i)  = neighbors of i via relation type r
  c_i,r   = normalization constant (typically |N_r(i)|)

With basis decomposition (reduces parameters):
  W_r = Σ_b a_r,b · B_b
  B_b = shared basis matrix
  a_r,b = relation-specific coefficient for basis b

Each relation type gets its own transformation. Basis decomposition keeps parameters manageable when you have dozens or hundreds of relation types.
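To make the savings concrete, here is a quick parameter-count comparison using assumed toy sizes (a 64-dimensional hidden state, 100 relation types, 10 bases; the numbers are illustrative, not from any benchmark):

```python
# Parameter count: full per-relation matrices vs. basis decomposition.
d = 64    # hidden dimension
R = 100   # number of relation types
B = 10    # number of shared basis matrices

full = R * d * d            # one d x d weight matrix per relation
basis = B * d * d + R * B   # B shared bases + R*B scalar coefficients
print(full, basis)          # → 409600 41960
```

Roughly a 10x reduction here; the gap widens as the number of relations grows because only the small coefficient table scales with R.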

PyG implementation

rgcn_model.py
import torch
import torch.nn.functional as F
from torch_geometric.nn import RGCNConv

class RGCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels,
                 num_relations, num_bases=30):
        super().__init__()
        self.conv1 = RGCNConv(in_channels, hidden_channels,
                              num_relations=num_relations,
                              num_bases=num_bases)
        self.conv2 = RGCNConv(hidden_channels, out_channels,
                              num_relations=num_relations,
                              num_bases=num_bases)

    def forward(self, x, edge_index, edge_type):
        x = self.conv1(x, edge_index, edge_type)
        x = F.relu(x)
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index, edge_type)
        return x

# edge_type is a 1D tensor with the relation type for each edge
# e.g., 0=purchased, 1=reviewed, 2=returned
model = RGCN(in_channels=64, hidden_channels=64, out_channels=num_classes,
             num_relations=3, num_bases=10)
out = model(x, edge_index, edge_type)  # x, edge_index, edge_type, num_classes from your dataset

edge_type is a LongTensor mapping each edge to its relation type. num_bases controls the basis decomposition; leave it at its default of None to skip decomposition and learn a full weight matrix per relation.

When to use RGCNConv

  • Knowledge graph completion. Predicting missing links in knowledge graphs (FB15k-237, NELL) where edges have distinct semantic types (is-a, part-of, located-in).
  • Relational databases as graphs. Enterprise data has multiple table types connected by different foreign key relationships. RGCNConv is the natural first approach for learning on this structure.
  • Entity classification in heterogeneous networks. Classifying nodes (users, products, transactions) in a graph with multiple relationship types.
  • As a baseline for heterogeneous GNNs. Before trying HGTConv or HANConv, establish an RGCN baseline. It is simpler and the performance gap may be small for your task.

When not to use RGCNConv

1. Too many relation types

Even with basis decomposition, RGCNConv scales linearly with the number of relation types. If you have hundreds of edge types, HGTConv (which uses type-specific projections more efficiently) or HeteroConv (which wraps simpler layers per type) may be more practical.

2. When you need attention

RGCNConv treats all neighbors of the same relation type equally (like GCNConv within each type). For tasks where neighbor importance varies within a relation type, use RGATConv or HGTConv.

3. Homogeneous graphs

If all edges are the same type, RGCNConv reduces to GCNConv with extra overhead. Use GCNConv or GATConv directly.

How KumoRFM builds on this

RGCNConv is the direct ancestor of KumoRFM's relational architecture. The core idea (type-specific transformations) is identical. KumoRFM extends it with:

  • Automatic schema mapping: Your database schema becomes the graph structure automatically. No manual definition of node types, edge types, or features.
  • Attention within relation types: Not all purchases are equal. KumoRFM adds attention (from TransformerConv) within each relation type.
  • Temporal awareness: RGCN treats all edges as static. KumoRFM encodes when each edge was created, capturing recency and temporal patterns.
  • Pre-trained relational knowledge: Like a foundation model for relational data, KumoRFM transfers patterns learned from diverse enterprise datasets.

Frequently asked questions

What is RGCNConv in PyTorch Geometric?

RGCNConv implements the Relational Graph Convolutional Network from Schlichtkrull et al. (2017). It extends GCNConv to handle multiple edge types by learning a separate weight matrix for each relation type. This makes it the foundational layer for knowledge graphs, relational databases, and any heterogeneous graph with typed edges.

How does RGCNConv handle multiple edge types?

RGCNConv maintains a separate weight matrix W_r for each relation type r. During aggregation, messages from neighbors connected via relation r are transformed by W_r. The node's representation is updated by summing over all relation-specific aggregations. This lets the model learn that 'purchased' edges mean something different from 'reviewed' edges.

What is basis decomposition in RGCNConv?

With many relation types, having a full weight matrix per relation causes parameter explosion. Basis decomposition addresses this by expressing each relation's weight matrix as a linear combination of a small number of shared basis matrices: W_r = Σ_b a_r,b · B_b. This reduces parameters from O(R·d²) to O(B·d² + R·B), where B is the number of bases.
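A pure-Python toy showing a single 2x2 weight matrix rebuilt from two shared bases; the basis matrices and coefficients are made up for illustration.

```python
# W_r = sum_b a_r,b * B_b, with two 2x2 bases.
B_bases = [
    [[1.0, 0.0], [0.0, 1.0]],   # basis 0: identity
    [[0.0, 1.0], [1.0, 0.0]],   # basis 1: swap
]
a_r = [0.5, 2.0]                # coefficients for relation r

W_r = [[sum(a * B[i][j] for a, B in zip(a_r, B_bases)) for j in range(2)]
       for i in range(2)]
print(W_r)   # → [[0.5, 2.0], [2.0, 0.5]]
```

Each relation stores only its coefficient vector (here two numbers) instead of a full matrix; the bases are shared across all relations.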

When should I use RGCNConv vs HGTConv?

Use RGCNConv for simpler heterogeneous graphs with a moderate number of edge types (under 20) where you want a straightforward baseline. Use HGTConv for complex heterogeneous graphs where you also need attention over neighbors and type-specific transformations for both nodes and edges. HGTConv is more expressive but more complex.

Can RGCNConv be used for link prediction in knowledge graphs?

Yes. RGCNConv is commonly used as an encoder for knowledge graph completion. It generates node embeddings that capture multi-relational structure. These embeddings are then fed to a decoder (DistMult, TransE, etc.) that scores candidate triples. This encoder-decoder approach outperforms shallow embedding methods on benchmarks like FB15k-237.
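As a minimal sketch of the decoder side, here is DistMult scoring on toy embeddings. The vectors are invented for illustration; DistMult's diagonal relation matrix is represented as a plain vector, so the score is an element-wise triple product.

```python
# DistMult: score(s, r, o) = sum_k  e_s[k] * w_r[k] * e_o[k]

def distmult_score(e_s, w_r, e_o):
    return sum(s * r * o for s, r, o in zip(e_s, w_r, e_o))

e_subject  = [1.0, 0.5]   # RGCN embedding of the subject entity
w_relation = [2.0, 1.0]   # learned diagonal for the relation
e_object   = [0.5, 2.0]   # RGCN embedding of the object entity

print(distmult_score(e_subject, w_relation, e_object))   # → 2.0
```

In the full encoder-decoder setup, the RGCN produces e_s and e_o, and candidate triples are ranked by this score; DistMult is symmetric in subject and object, which is one reason asymmetric decoders like TransE are sometimes preferred.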

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.