Directed Graphs: Edges with Direction

A directed graph has edges where direction matters. A follows B is different from B follows A. A cites B is different from B cites A. Direction encodes asymmetric relationships that undirected graphs cannot represent.

TL;DR

  • A directed graph has edges with direction: A->B does not imply B->A. This captures asymmetric relationships like follows, citations, transactions, and dependencies.
  • In PyG, edge_index is inherently directed. edge_index[0] = source, edge_index[1] = target. Messages flow from source to target during message passing.
  • Adding reverse edges (T.ToUndirected()) helps when all nodes need to receive messages. For semantic direction (sender vs receiver), keep edges directed or add reverse edges as a separate type.
  • Direction is critical for fraud detection (money flows from sender to receiver), supply chains (materials flow downstream), and citation analysis (influence flows forward in time).
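  • GATConv suits directed graphs because attention is computed per directed edge, so the weight a node gives an incoming neighbor is learned independently of the reverse direction. Like every message passing layer, it aggregates only over incoming edges, so a node can weight incoming vs outgoing neighbors differently only if reverse edges are also present in the graph.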

A directed graph is a graph where edges have direction. An edge from node A to node B does not imply an edge from B to A. When Alice follows Bob on Twitter, Bob does not necessarily follow Alice. When paper X cites paper Y, paper Y usually does not cite paper X. When account A sends $1,000 to account B, account B did not send $1,000 to account A. Direction encodes asymmetry.

In many enterprise domains, edge direction carries critical semantic information. Ignoring direction (treating the graph as undirected) loses this signal. A node with 100 incoming transactions and 0 outgoing looks very different from a node with 0 incoming and 100 outgoing, but in an undirected graph they look identical.

Direction in PyG

PyG's edge_index tensor is inherently directed. The first row contains source nodes, the second row contains target nodes:

directed_edge_index.py
import torch
from torch_geometric.data import Data

# Directed edges: 0->1, 0->2, 1->2, 2->3
# Note: each edge appears only ONCE (not twice like undirected)
edge_index = torch.tensor([
    [0, 0, 1, 2],  # source
    [1, 2, 2, 3],  # target
], dtype=torch.long)

x = torch.randn(4, 16)
data = Data(x=x, edge_index=edge_index)

# During message passing:
# Node 1 receives messages from node 0
# Node 2 receives messages from nodes 0 and 1
# Node 3 receives messages from node 2
# Node 0 receives NO messages (no incoming edges)

Directed graph: messages flow from source to target. Node 0 sends but never receives.
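
To make the flow concrete, here is a minimal sketch that reproduces the comments above with plain PyTorch. It uses sum aggregation via index_add_ as a stand-in for what a message passing layer does internally; the one-dimensional feature values are made up for illustration.

```python
import torch

# Same directed edges as above: 0->1, 0->2, 1->2, 2->3
edge_index = torch.tensor([
    [0, 0, 1, 2],  # source
    [1, 2, 2, 3],  # target
], dtype=torch.long)
src, dst = edge_index

# Illustrative features chosen so each sender is easy to spot in the sums
x = torch.tensor([[1.0], [10.0], [100.0], [1000.0]])

# Sum aggregation: every target node adds up the features of its sources
received = torch.zeros_like(x)
received.index_add_(0, dst, x[src])

print(received.squeeze(1).tolist())  # [0.0, 1.0, 11.0, 100.0]
```

Node 0 ends up with zeros because no edge points at it, while node 2 sums the features of nodes 0 and 1.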

When to keep vs remove direction

The decision depends on whether direction carries task-relevant information:

  • Keep directed: fraud detection (money flow direction matters), supply chains (materials flow downstream), citation networks (influence flows forward), dependency graphs (A depends on B, not vice versa).
  • Convert to undirected: social networks where mutual connections dominate, molecular graphs where bond direction is less meaningful, co-authorship networks. Use T.ToUndirected().
  • Add reverse as separate type: the best of both worlds. Keep the original directed edges and add reverse edges as a different relation type. The model learns different weights for “sends_to” and “receives_from.”
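
reverse_edges.py
import torch_geometric.transforms as T
from torch_geometric.data import HeteroData

# `data` is the homogeneous Data object from directed_edge_index.py
forward_edges = data.edge_index  # original directed edges

# Option 1: Convert to undirected (add reverse of every edge)
transform = T.ToUndirected()
undirected_data = transform(data)

# Option 2: Add reverse edges as a separate relation (needs HeteroData)
hetero_data = HeteroData()
hetero_data['account'].x = data.x
hetero_data['account', 'sends_to', 'account'].edge_index = forward_edges
hetero_data['account', 'receives_from', 'account'].edge_index = forward_edges.flip(0)
# A heterogeneous model can now learn separate weights for each direction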

Two strategies for handling direction. Option 2 preserves directional semantics while ensuring all nodes receive messages.

Enterprise example: transaction fraud detection

A bank's transaction graph is inherently directed. Account A sends money to Account B. Direction reveals critical fraud patterns:

  • Fan-out: one account sending to many accounts (potential money laundering distribution)
  • Fan-in: many accounts sending to one account (potential mule account collecting proceeds)
  • Cycles: A sends to B, B sends to C, C sends to A (potential round-tripping to disguise fund origins)

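In an undirected graph, fan-out and fan-in look identical (both are high-degree nodes). Only by preserving direction can the model distinguish a distribution hub from a collection point. A 2-layer directed GNN can pick up these patterns: layer 1 aggregates from direct counterparties and layer 2 from counterparties of counterparties, so each node sees the flow structure within two hops. Because messages travel only from source to target, a pure sender with no incoming edges still receives nothing unless reverse edges are added.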
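
The fan-out versus fan-in distinction can be checked on a toy example. The sketch below uses two hypothetical star graphs with node 0 as the hub and counts degrees with torch.bincount; the helper name in_out_degree is invented for this illustration.

```python
import torch

# Hypothetical star graphs: node 0 is the hub, nodes 1..5 are counterparties
spokes = torch.arange(1, 6)
hub = torch.zeros(5, dtype=torch.long)
fan_out = torch.stack([hub, spokes])  # 0 -> 1..5 (distribution)
fan_in = torch.stack([spokes, hub])   # 1..5 -> 0 (collection)

def in_out_degree(edge_index, num_nodes):
    """Return (in_degree, out_degree) as integer tensors of length num_nodes."""
    out_deg = torch.bincount(edge_index[0], minlength=num_nodes)
    in_deg = torch.bincount(edge_index[1], minlength=num_nodes)
    return in_deg, out_deg

for name, edges in [("fan-out", fan_out), ("fan-in", fan_in)]:
    in_deg, out_deg = in_out_degree(edges, num_nodes=6)
    # Dropping direction merges both counts into a single degree
    undirected_deg = in_deg + out_deg
    print(name, "hub in:", in_deg[0].item(), "out:", out_deg[0].item(),
          "undirected:", undirected_deg[0].item())
```

The hub has in-degree 0 and out-degree 5 in the first graph and the opposite in the second, yet its undirected degree is 5 in both.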

In-degree vs out-degree features

For directed graphs, degree splits into in-degree (incoming edges) and out-degree (outgoing edges). These are powerful structural features:

  • High in-degree, low out-degree: popular receiver (influencer, mule account)
  • Low in-degree, high out-degree: active sender (bot, distribution point)
  • Balanced in/out-degree: normal bidirectional relationships

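Adding in-degree and out-degree as node features gives GNNs explicit access to directional structure from the first layer. In PyG, torch_geometric.utils.degree computes these: degree(edge_index[0], num_nodes) for out-degree and degree(edge_index[1], num_nodes) for in-degree. Passing num_nodes keeps the result aligned with the node feature matrix even when the highest-numbered nodes have no edges.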

Frequently asked questions

What is a directed graph?

A directed graph (digraph) is a graph where edges have direction. An edge from A to B does not imply an edge from B to A. Examples: Twitter follows (A follows B does not mean B follows A), citations (paper A cites paper B), transactions (account A sends money to account B).

How does edge direction affect message passing?

In message passing, messages flow along edge direction. In a directed graph with edge A->B, node B receives a message from A, but A does not receive a message from B (unless a reverse edge B->A also exists). This means information flow is asymmetric, which can be desirable (citations, transactions) or problematic (nodes with no incoming edges receive no messages).

Should I add reverse edges to my directed graph?

It depends on the task. For citation networks and social graphs, adding reverse edges often helps because it ensures all nodes receive messages. In PyG, use T.ToUndirected() to add reverse edges automatically. For transaction graphs where direction carries semantic meaning (sender vs receiver), keep edges directed or add reverse edges as a separate relation type.

How do I represent a directed graph in PyG?

PyG's edge_index is inherently directed: edge_index[0] contains source nodes and edge_index[1] contains target nodes. Messages flow from source to target. For undirected graphs, you store each edge twice (A->B and B->A). For directed graphs, you store each edge once in its true direction.

What GNN layers handle directed graphs well?

All PyG message passing layers work on directed graphs since edge_index is inherently directed. GATConv is a common choice because its attention weights are computed per directed edge, so the weight for A->B is learned independently of any weight for B->A. For knowledge graphs with many directed relation types, use RGCNConv with separate relation types for forward and reverse edges.
