Original Paper
Strategies for Pre-training Graph Neural Networks
Hu et al., ICLR 2020
What GINEConv does
GINEConv modifies GINConv's aggregation to incorporate edge features before summing:
- For each neighbor j, combine its features with the edge features: ReLU(h_j + e_ij)
- Sum these edge-enhanced messages across all neighbors
- Add the node's own features, scaled by (1 + ε), where ε is an optionally learnable scalar
- Pass through a multi-layer perceptron
This seemingly small change is critical for graphs where edges carry distinct information. In a molecular graph, the difference between a single bond and a double bond completely changes the molecule's properties. GINConv treats them identically; GINEConv does not.
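To make this concrete, here is a toy sketch in plain Python (made-up two-dimensional feature vectors, no MLP; not the PyG implementation): the same neighbor atom connected by a single bond versus a double bond produces identical GINConv messages but distinct GINEConv messages.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

# Same neighbor atom, two different bond types (toy one-hot edge features).
h_j = [1.0, 0.5]
e_single = [1.0, 0.0]   # single bond
e_double = [0.0, 1.0]   # double bond

# GINConv message: just the neighbor features -- the bond type is invisible.
gin_msg_single = h_j
gin_msg_double = h_j
print(gin_msg_single == gin_msg_double)   # True

# GINEConv message: ReLU(h_j + e_ij) -- the bond type changes the message.
gine_msg_single = relu(add(h_j, e_single))
gine_msg_double = relu(add(h_j, e_double))
print(gine_msg_single)   # [2.0, 0.5]
print(gine_msg_double)   # [1.0, 1.5]
```

After summation over neighbors, these distinct messages propagate the bond-type information into the node embeddings.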
The math (simplified)
# GINConv (ignores edges)
h_i' = MLP( (1 + eps) · h_i + Σ_j h_j )
# GINEConv (uses edge features)
h_i' = MLP( (1 + eps) · h_i + Σ_j ReLU(h_j + e_ij) )
Where:
- e_ij = edge feature vector for edge (i, j)
- ReLU = nonlinearity applied after combining node + edge features
- The edge features must have the same dimension as the node features (use a linear projection if they differ)

Edge features are added to neighbor features before aggregation. The ReLU ensures the combination is nonlinear, preserving expressiveness.
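As a worked instance of the GINEConv formula (toy numbers, ε = 0, and an identity MLP standing in for the real one):

```python
eps = 0.0
h_i = [1.0, -1.0]                       # center node features
neighbors = [([0.5, 0.5], [1.0, 0.0]),  # (h_j, e_ij) pairs
             ([2.0, -2.0], [0.0, 1.0])]

# Sum of edge-enhanced messages: Σ_j ReLU(h_j + e_ij)
agg = [0.0, 0.0]
for h_j, e_ij in neighbors:
    msg = [max(0.0, a + b) for a, b in zip(h_j, e_ij)]   # ReLU(h_j + e_ij)
    agg = [a + m for a, m in zip(agg, msg)]

# h_i' = MLP((1 + eps) * h_i + agg); identity MLP in this sketch
h_i_new = [(1 + eps) * a + b for a, b in zip(h_i, agg)]
print(h_i_new)   # [4.5, -0.5]
```

The first neighbor contributes ReLU([1.5, 0.5]) = [1.5, 0.5] and the second ReLU([2.0, -1.0]) = [2.0, 0.0]; note how the ReLU clips the second message's negative component before summation.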
PyG implementation
import torch
import torch.nn.functional as F
from torch_geometric.nn import GINEConv, global_add_pool

class GINE(torch.nn.Module):
    def __init__(self, node_dim, edge_dim, hidden, out_channels, num_layers=5):
        super().__init__()
        # Project raw node and edge features to a shared hidden dimension
        self.edge_proj = torch.nn.Linear(edge_dim, hidden)
        self.node_proj = torch.nn.Linear(node_dim, hidden)
        self.convs = torch.nn.ModuleList()
        for _ in range(num_layers):
            mlp = torch.nn.Sequential(
                torch.nn.Linear(hidden, hidden),
                torch.nn.ReLU(),
                torch.nn.Linear(hidden, hidden),
            )
            self.convs.append(GINEConv(mlp))
        self.classifier = torch.nn.Linear(hidden, out_channels)

    def forward(self, x, edge_index, edge_attr, batch):
        x = self.node_proj(x)
        edge_attr = self.edge_proj(edge_attr)
        for conv in self.convs:
            x = F.relu(conv(x, edge_index, edge_attr))
        x = global_add_pool(x, batch)  # graph-level readout
        return self.classifier(x)

# Usage on a molecular dataset
model = GINE(node_dim=9, edge_dim=3, hidden=64, out_channels=1)
# node features: atom type, degree, etc.
# edge features: bond type, stereochemistry, etc.

Project both node and edge features to the same hidden dimension before passing them to GINEConv: the edge_attr dimension must match the node feature dimension. (Recent PyG versions can also perform this projection internally via GINEConv's edge_dim argument.)
When to use GINEConv
- Molecular property prediction. Bond types (single, double, triple, aromatic), bond stereochemistry, and ring membership are critical features encoded on edges.
- Graph pre-training. GINEConv is the standard backbone for pre-training strategies that mask and predict both node and edge attributes.
- Knowledge graphs with typed relations. Relation types (e.g., “is-a”, “part-of”, “authored-by”) are naturally edge features.
- Transaction networks. Transaction amounts, currencies, and timestamps are edge features that distinguish otherwise identical connections.
When not to use GINEConv
- Graphs without edge features. If edges carry no attributes, use GINConv. Adding zero-valued edge features adds computation without benefit.
- When you need attention. GINEConv treats all neighbors equally (sum aggregation). If neighbor importance varies, consider TransformerConv with edge_attr or GATConv.