Why GNN explainability is hard
A tabular model’s prediction depends on features: “high transaction amount” and “unusual time” explain a fraud score. A GNN’s prediction depends on features and graph structure: “this transaction connects to a merchant that connects to 5 accounts flagged for fraud last month.” The explanation must capture relational reasoning, not just features.
GNNExplainer
PyG’s built-in GNNExplainer learns soft masks over edges and node features, optimized to maximize the mutual information between the masked subgraph and the model’s original prediction:
from torch_geometric.explain import Explainer, GNNExplainer

explainer = Explainer(
    model=model,
    algorithm=GNNExplainer(epochs=200, lr=0.01),
    explanation_type="model",
    node_mask_type="attributes",  # feature importance
    edge_mask_type="object",      # edge importance
    model_config=dict(
        mode="classification",
        task_level="node",
        return_type="log_probs",
    ),
)

# Explain prediction for node 42
explanation = explainer(data.x, data.edge_index, index=42)

# Top important edges
top_edges = explanation.edge_mask.topk(10)

# Top important features
top_features = explanation.node_mask[42].topk(5)

GNNExplainer takes 0.5-2 seconds per node. For batch explanations, run in parallel on GPU.
Interpreting edge masks
The edge mask scores each edge by importance (0 to 1). High-scoring edges form the “explanation subgraph”: the minimal set of connections that, if removed, would change the prediction.
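A minimal sketch of turning an edge mask into an explanation subgraph by thresholding. The `edge_index` and `edge_mask` tensors below are synthetic stand-ins for the Explainer outputs, and the 0.5 cutoff is an illustrative choice, not a standard value:

```python
# Sketch: extract an explanation subgraph by thresholding the edge mask.
# edge_index / edge_mask are synthetic stand-ins for Explainer outputs.
import torch

edge_index = torch.tensor([[0, 1, 2, 3, 4],
                           [1, 2, 3, 4, 0]])            # 5 directed edges
edge_mask = torch.tensor([0.91, 0.12, 0.87, 0.05, 0.66])

threshold = 0.5                                          # importance cutoff (tunable)
keep = edge_mask >= threshold
explanation_subgraph = edge_index[:, keep]               # edges the model relied on

print(explanation_subgraph)
# tensor([[0, 2, 4],
#         [1, 3, 0]])
```

In practice, sweeping the threshold (or keeping a fixed top-k) trades explanation compactness against coverage.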
Attention-based explanations
GATConv computes attention weights per edge, which naturally provide a form of explanation. But attention weights have known limitations:
- Correlative, not causal: High attention on an edge means the model weighted it heavily, not that removing it would change the prediction.
- Layer-dependent: Multi-layer GATs produce attention weights per layer. Which layer’s attention should you report? There is no clear answer.
- Unreliable for adversarial inputs: Attention can be manipulated to focus on irrelevant edges while the prediction is driven by other features.
Translating graph explanations to business language
Regulators and business stakeholders do not understand “edge mask weight 0.87 on edge (node_42, node_1337).” You need to translate graph explanations into business terms:
- Edge to relationship: “Transaction to merchant XYZ on 2026-02-15” instead of “edge (42, 1337).”
- Subgraph to narrative: “This account was flagged because 3 of its recent transactions involved merchants connected to a known fraud cluster in Eastern Europe.”
- Feature to attribute: “Transaction amount ($4,500) was unusually high for this account” instead of “feature 7 importance 0.92.”
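One way to implement the edge-to-relationship translation is a registry that maps node IDs to entity names and renders each scored edge as a sentence. The `node_registry` contents and the message template below are hypothetical, for illustration only:

```python
# Hypothetical sketch: render a scored edge as a business-readable sentence.
# node_registry and the message template are illustrative assumptions.
node_registry = {
    42: "account A-1042",
    1337: "merchant XYZ",
}

def describe_edge(src, dst, score, date):
    """Translate an (src, dst) edge with an importance score into plain language."""
    return (f"Transaction from {node_registry[src]} to {node_registry[dst]} "
            f"on {date} (importance {score:.2f})")

print(describe_edge(42, 1337, 0.87, "2026-02-15"))
# Transaction from account A-1042 to merchant XYZ on 2026-02-15 (importance 0.87)
```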
Regulatory requirements by jurisdiction
- EU AI Act: High-risk AI systems (credit scoring, fraud detection) must provide meaningful explanations to affected individuals. Graph-based reasoning must be simplified to understandable terms.
- US Reg B / ECOA: Credit adverse actions require specific reason codes. GNN explanations must map to standard reason codes (e.g., “insufficient credit history” for cold-start nodes).
- SR 11-7: Banks must validate model risk, including explainability of model decisions. GNN model documentation must include explanation methodology.
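The Reg B requirement above implies a deterministic mapping from top-ranked GNN features to standard reason codes. A hypothetical sketch of that mapping; the feature names and code table are illustrative, not an actual adverse-action code list:

```python
# Hypothetical sketch: map top-ranked GNN features to adverse-action reason codes.
# Feature names and the code table are illustrative assumptions.
REASON_CODES = {
    "account_age_days": "Insufficient credit history",
    "txn_amount": "Transaction amount outside typical range",
    "merchant_risk": "Activity involving high-risk merchants",
}

def reason_codes(top_features, limit=2):
    """Return standardized reason codes for the top-ranked feature names."""
    codes = [REASON_CODES[f] for f in top_features if f in REASON_CODES]
    return codes[:limit]

print(reason_codes(["merchant_risk", "txn_amount", "unknown_feature"]))
# ['Activity involving high-risk merchants', 'Transaction amount outside typical range']
```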
What breaks in production
- Explanation latency: GNNExplainer runs 200 optimization steps per node. At 1 second per explanation, explaining 10,000 flagged transactions takes 3 hours. Precompute explanations in batch, not at serving time.
- Explanation inconsistency: Running GNNExplainer twice on the same node can produce different explanations (the optimization is stochastic). Set seeds and average across multiple runs for stable explanations.
- Explanation-prediction disagreement: The explanation subgraph may not actually change the prediction when removed (faithfulness problem). Validate explanations by checking that removing top-importance edges changes the prediction.
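The faithfulness check in the last bullet can be sketched as: drop the top-k edges by mask weight, rerun the model, and compare the predicted class for the target node. The tiny mean-aggregation `model` below is a stand-in for a trained GNN, and the `edge_mask` values are synthetic rather than taken from a real explainer run:

```python
# Sketch of a faithfulness check: drop the top-k explanation edges and see
# whether the predicted class for the target node changes.
# The toy mean-aggregation model is a stand-in for a trained GNN.
import torch

def model(x, edge_index):
    # Toy 1-layer message passing: each node averages its in-neighbors' features.
    out = torch.zeros_like(x)
    counts = torch.zeros(x.size(0), 1)
    src, dst = edge_index
    out.index_add_(0, dst, x[src])
    counts.index_add_(0, dst, torch.ones(src.size(0), 1))
    return out / counts.clamp(min=1)

x = torch.tensor([[1.0, 0.0], [0.0, 1.0], [0.5, 1.0], [2.0, 0.0]])
edge_index = torch.tensor([[1, 2, 3], [0, 0, 0]])      # all edges point at node 0
edge_mask = torch.tensor([0.9, 0.1, 0.8])              # synthetic explainer scores

target = 0
before = model(x, edge_index)[target].argmax()

k = 2
drop = edge_mask.topk(k).indices                       # most important edges
keep = torch.ones(edge_index.size(1), dtype=torch.bool)
keep[drop] = False
after = model(x, edge_index[:, keep])[target].argmax()

print(bool(before != after))                           # True => explanation is faithful
```

If the prediction does not change, the explanation fails the check and should not be surfaced to stakeholders as the reason for the decision.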