Original Paper
GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
Brockschmidt, ICML 2020 (arXiv 2019)
What FiLMConv does
FiLMConv modulates the feature transformation on a per-edge, per-feature basis:
- For each edge (j, i), generate a scale (gamma) and a shift (beta) from the target node i's features
- Apply the modulation to the transformed neighbor features: message = gamma * (W * h_j) + beta
- Aggregate the modulated messages across all neighbors of i
The math (simplified)
# Generate per-target-node modulation parameters
gamma_i = W_gamma · h_i + b_gamma  # scale (per feature)
beta_i = W_beta · h_i + b_beta  # shift (per feature)
# Modulate the transformed neighbor features
m_ij = gamma_i * (W · h_j) + beta_i
# Aggregate
h_i' = AGG({ m_ij : j in N(i) })
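In GNN-FiLM the FiLM parameters are computed from the target node's features and applied to the transformed neighbor features. That update can be sketched in plain PyTorch (illustrative tensor names and sizes, single relation, sum aggregation assumed):

```python
import torch

d_in, d_out, n, e = 8, 16, 5, 12              # illustrative sizes
h = torch.randn(n, d_in)                      # node features
src, dst = torch.randint(0, n, (2, e))        # edges j -> i

W = torch.nn.Linear(d_in, d_out, bias=False)  # message transform
film = torch.nn.Linear(d_in, 2 * d_out)       # produces [gamma, beta]

gamma, beta = film(h).chunk(2, dim=-1)        # per-node FiLM params from h_i
m = gamma[dst] * W(h[src]) + beta[dst]        # modulated message per edge

# Sum-aggregate messages into each target node i
out = torch.zeros(n, d_out).index_add_(0, dst, m)
print(out.shape)  # torch.Size([5, 16])
```

Note that gamma and beta are indexed by the *target* (`dst`), while the linear transform W is applied to the *source* (`src`).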
Comparison:
GATConv: scalar alpha_ij * (W · h_j) (1 param per edge)
FiLMConv: gamma_i * (W · h_j) + beta_i (2d params per message)
NNConv: NN(e_ij) · h_j (d*d params per edge)
FiLMConv sits between GATConv's scalar attention and NNConv's full per-edge weight matrix in both expressiveness and parameter count.
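For concreteness, here is the back-of-the-envelope count of modulation values per message for an illustrative feature dimension d = 64:

```python
d = 64
gat_per_edge = 1          # one scalar attention weight
film_per_message = 2 * d  # gamma and beta, one value per feature
nnconv_per_edge = d * d   # a full d x d weight matrix

print(gat_per_edge, film_per_message, nnconv_per_edge)  # 1 128 4096
```

The gap widens quadratically with d, which is why NNConv becomes impractical on dense graphs long before FiLMConv does.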
PyG implementation
import torch
import torch.nn.functional as F
from torch_geometric.nn import FiLMConv

class FiLMNet(torch.nn.Module):
    def __init__(self, in_channels, hidden, out_channels, num_relations=1):
        super().__init__()
        self.conv1 = FiLMConv(in_channels, hidden,
                              num_relations=num_relations)
        self.conv2 = FiLMConv(hidden, out_channels,
                              num_relations=num_relations)

    def forward(self, x, edge_index, edge_type=None):
        x = F.relu(self.conv1(x, edge_index, edge_type))
        x = self.conv2(x, edge_index, edge_type)
        return x
# Homogeneous graph (no edge types)
model = FiLMNet(64, 64, num_classes)
# Heterogeneous graph (with edge types)
model = FiLMNet(64, 64, num_classes, num_relations=5)

FiLMConv optionally takes edge_type for heterogeneous graphs. With num_relations=1, it operates on homogeneous graphs.
When to use FiLMConv
- Structured reasoning tasks. Program analysis, scene graph understanding, and logical reasoning where the relationship between nodes should modulate feature transformation.
- When you need more than scalar attention. If GATConv's single attention weight per edge is too coarse, FiLMConv provides per-feature modulation.
- Multi-relational graphs without HGTConv complexity. FiLMConv supports num_relations natively, providing a simpler alternative to full heterogeneous layers.
When not to use FiLMConv
- When scalar attention suffices. If GATConv achieves your target accuracy, FiLMConv adds unnecessary parameters.
- Very large graphs. Per-feature modulation increases memory usage. For billion-edge graphs, simpler layers with sampling are more practical.