A temporal graph is a graph where edges and features change over time. Each edge carries a timestamp indicating when the interaction occurred. A transaction between two accounts at 2:03 PM on Tuesday is a different event than one at 9:17 AM on Friday, even if it involves the same accounts and the same amount. Temporal graphs capture this distinction. Static graphs do not.
This matters because most enterprise predictions are time-sensitive. When predicting whether a transaction is fraudulent, you should only use information available before that transaction occurred. When predicting whether a customer will churn next month, you cannot use their activity from next week. Temporal graphs enforce this causal constraint by construction.
Static vs temporal: what you lose without time
A static graph collapses all interactions into a single snapshot. Two patterns that look identical in a static graph can be radically different temporally:
- Fraud: 50 transactions in 5 minutes between the same accounts vs 50 transactions over 6 months. Static graph: same edge weight. Temporal graph: obviously different.
- Churn: a customer who logged in daily for 6 months then stopped vs one who never logged in frequently. Static graph: same total login count. Temporal graph: clear declining trajectory.
- Recommendations: a user who bought running shoes last week vs one who bought them 3 years ago. Static graph: same edge. Temporal graph: very different purchase recency.
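These collapses are easy to demonstrate in code. The sketch below (with made-up account IDs and timestamps) builds two interaction histories, a 5-minute burst and a months-long trickle, that become indistinguishable once timestamps are dropped:

```python
from collections import Counter

# Hypothetical interaction logs: (src, dst, unix_timestamp).
# Burst: 5 transfers between accounts 0 and 1 within ~5 minutes.
burst = [(0, 1, 1_700_000_000 + 60 * i) for i in range(5)]
# Spread: 5 transfers between the same accounts over ~4 months.
spread = [(0, 1, 1_700_000_000 + 30 * 86_400 * i) for i in range(5)]

def to_static(events):
    """Collapse timestamped events into static edge weights."""
    return Counter((s, d) for s, d, _ in events)

# Identical static views: both collapse to {(0, 1): 5}.
assert to_static(burst) == to_static(spread)

# The temporal view still separates them, e.g. by the time span covered.
def span_seconds(events):
    ts = [t for _, _, t in events]
    return max(ts) - min(ts)

print(span_seconds(burst))   # 240 seconds
print(span_seconds(spread))  # 10368000 seconds (~120 days)
```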
TemporalData in PyG
PyTorch Geometric provides TemporalData for continuous-time interaction graphs:
```python
from torch_geometric.data import TemporalData
import torch

# Each row is a timestamped interaction
data = TemporalData(
    src=torch.tensor([0, 1, 2, 0, 3, 1]),            # source nodes
    dst=torch.tensor([1, 2, 3, 2, 0, 3]),            # destination nodes
    t=torch.tensor([100, 200, 300, 400, 500, 600]),  # timestamps
    msg=torch.randn(6, 16),                          # interaction features
)

print(data)
# TemporalData(src=[6], dst=[6], t=[6], msg=[6, 16])

# Events are ordered by time.
# At t=400, node 0 has interacted with nodes 1 (t=100) and 2 (t=400).
# Node 0 has NOT yet interacted with node 3 (that happens at t=500).
```

TemporalData stores events chronologically. Each interaction has a source, destination, timestamp, and feature vector.
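The causal constraint described earlier can be expressed directly on tensors like these. The sketch below shows the masking a temporal model must apply when making a prediction at query time t, using the same toy event stream:

```python
import torch

src = torch.tensor([0, 1, 2, 0, 3, 1])
dst = torch.tensor([1, 2, 3, 2, 0, 3])
t = torch.tensor([100, 200, 300, 400, 500, 600])

def events_before(t_query, src, dst, t):
    """Return only the interactions strictly before t_query."""
    mask = t < t_query
    return src[mask], dst[mask], t[mask]

# When scoring an event at t=500, only the first four events are visible.
s, d, ts = events_before(500, src, dst, t)
print(ts.tolist())  # [100, 200, 300, 400]
```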
Temporal Graph Networks (TGN)
The TGN architecture, available in PyG, combines three components:
- Memory module: each node maintains a running state vector that updates every time the node participates in an interaction. This captures the node's evolving history.
- Temporal neighbor sampler: when computing a node's representation at time t, only neighbors with interactions before t are sampled. Causal ordering is enforced at the sampling level.
- Message passing with time encoding: time differences between events are encoded and included in the message function, letting the model learn that recent interactions matter more than distant ones.
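The third component's time encoding can be sketched as a small module: a learned linear map of the time difference passed through a cosine, in the spirit of TGAT/TGN-style functional time encodings. This is an illustration of the idea, not PyG's exact implementation:

```python
import torch

class TimeEncoder(torch.nn.Module):
    """Map a time difference Δt to a vector of cosines with learned
    frequencies, so downstream attention can weight recent vs distant
    events differently. (Sketch of the idea, not PyG's internal module.)"""

    def __init__(self, dim: int):
        super().__init__()
        self.lin = torch.nn.Linear(1, dim)

    def forward(self, delta_t: torch.Tensor) -> torch.Tensor:
        return torch.cos(self.lin(delta_t.unsqueeze(-1)))

enc = TimeEncoder(dim=16)
delta_t = torch.tensor([0.0, 10.0, 1000.0])  # time since each neighbor event
print(enc(delta_t).shape)  # torch.Size([3, 16])
```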
```python
from torch_geometric.nn import TGNMemory, TransformerConv
from torch_geometric.nn.models.tgn import (
    LastNeighborLoader, IdentityMessage, LastAggregator,
)

# Temporal neighbor loader: respects causal ordering
neighbor_loader = LastNeighborLoader(
    num_nodes=data.num_nodes,
    size=10,  # sample up to 10 most recent neighbors
)

# TGN memory module: tracks evolving node states
memory = TGNMemory(
    num_nodes=data.num_nodes,
    raw_msg_dim=16,  # interaction feature dim
    memory_dim=64,   # memory state dim
    time_dim=16,     # time encoding dim
    message_module=IdentityMessage(16, 64, 16),
    aggregator_module=LastAggregator(),
)

# Process events chronologically:
#   memory.update_state(src, dst, t, msg)
# At inference, memory(n_id) returns each node's current memory state
# and last-update time; a GNN layer such as TransformerConv can then
# aggregate the sampled temporal neighborhood on top of these states.
```

TGN maintains per-node memory that evolves with each interaction. The neighbor loader ensures causal correctness.
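To see how per-node memory evolves, here is a simplified stand-in for the update loop, using an exponential moving average in place of TGNMemory's learned GRU-style update (the real module is the one configured above; the alpha parameter and update rule here are illustrative):

```python
import torch

num_nodes, msg_dim = 4, 16
memory_state = torch.zeros(num_nodes, msg_dim)  # stand-in for TGN memory

# Same toy event stream as before.
src = torch.tensor([0, 1, 2, 0, 3, 1])
dst = torch.tensor([1, 2, 3, 2, 0, 3])
t = torch.tensor([100, 200, 300, 400, 500, 600])
msg = torch.randn(6, msg_dim)

alpha = 0.5  # how strongly a new event overwrites the old state
for i in range(len(t)):  # strictly chronological processing
    for node in (src[i], dst[i]):  # both endpoints update their memory
        memory_state[node] = (1 - alpha) * memory_state[node] + alpha * msg[i]

# memory_state[0] now reflects node 0's events at t=100, 400, and 500,
# with more recent events weighted more heavily.
print(memory_state.shape)  # torch.Size([4, 16])
```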
Enterprise example: real-time fraud detection
A payment processor needs to score each transaction in real time. The temporal graph has account nodes connected by transaction edges, each with a timestamp, amount, and merchant category.
When transaction T arrives at time t between accounts A and B:
- The temporal neighbor sampler retrieves A's recent interactions (before t)
- The memory module provides A's current state (updated through all previous events)
- Message passing aggregates temporal neighborhood information
- The model scores T as fraudulent or legitimate
The model can learn patterns like: “Account A normally transacts with merchants in category X at $50-100. This transaction is with category Y at $5,000, and A had 12 transactions in the last hour (vs its normal 2 per day).” These temporal velocity features emerge automatically from the temporal graph structure.
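The "velocity" intuition can be computed directly from an account's timestamp history. A sketch with made-up timestamps (in seconds), where a burst in the last hour stands out against the account's overall rate:

```python
# Hypothetical transaction timestamps (seconds) for one account:
# a few spread over earlier days, then 4 in the final hour.
history = [10_000, 50_000, 96_000, 140_000,
           255_000, 255_600, 256_200, 256_800]

now = 257_000
# Transactions in the last hour (3,600-second window).
recent_hour = sum(now - 3_600 <= ts <= now for ts in history)
# Average daily rate over the whole observed span.
days_span = (now - history[0]) / 86_400
daily_rate = len(history) / days_span

print(recent_hour)            # 4 transactions in the last hour
print(round(daily_rate, 1))   # ~2.8 per day overall
```

A TGN does not need these features hand-engineered; the point of the sketch is only to show what signal lives in the raw timestamps.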
Discrete-time vs continuous-time
There are two approaches to temporal graphs:
- Discrete-time (snapshot): the graph is divided into time windows (e.g., one snapshot per day). Each snapshot is a static graph processed by a standard GNN. Simple but loses fine-grained temporal resolution.
- Continuous-time (event-based): each interaction has an exact timestamp. TGN and similar architectures operate at this granularity. More complex but captures precise temporal patterns essential for real-time use cases.
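The resolution loss in the snapshot approach is easy to demonstrate: binning timestamps into daily windows discards ordering within each window. A sketch with made-up events:

```python
events = [
    ("A", "B", 3_600),   # day 0, 01:00
    ("B", "C", 7_200),   # day 0, 02:00 (after A->B)
    ("C", "A", 90_000),  # day 1, 01:00
]

def to_snapshots(events, window=86_400):
    """Group timestamped events into per-window static edge lists."""
    snapshots = {}
    for s, d, t in events:
        snapshots.setdefault(t // window, []).append((s, d))
    return snapshots

snaps = to_snapshots(events)
print(snaps)  # {0: [('A', 'B'), ('B', 'C')], 1: [('C', 'A')]}
# Within snapshot 0, a standard GNN sees both edges simultaneously:
# the fact that A->B preceded B->C is no longer available to the model.
```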
For enterprise applications like fraud detection and real-time recommendations, continuous-time temporal graphs are strongly preferred because the exact timing and ordering of events carries critical signal.