What is a directed graph?

A directed graph (digraph) is a graph where edges have direction. An edge from A to B does not imply an edge from B to A. Examples: Twitter follows (A follows B does not mean B follows A), citations (paper A cites paper B), transactions (account A sends money to account B).

How does edge direction affect message passing?

In message passing, messages flow along edge direction. In a directed graph with edge A->B, node B receives a message from A, but A does not receive a message from B (unless a reverse edge B->A also exists). This means information flow is asymmetric, which can be desirable (citations, transactions) or problematic (nodes with no incoming edges receive no messages).

Should I add reverse edges to my directed graph?

It depends on the task. For citation networks and social graphs, adding reverse edges often helps because every node that has at least one edge then receives messages. In PyG, use T.ToUndirected() (from torch_geometric.transforms) to add reverse edges automatically. For transaction graphs where direction carries semantic meaning (sender vs receiver), keep edges directed or add reverse edges as a separate relation type.

How do I represent a directed graph in PyG?

PyG's edge_index is inherently directed: edge_index[0] contains source nodes and edge_index[1] contains target nodes. Messages flow from source to target. For undirected graphs, you store each edge twice (A->B and B->A). For directed graphs, you store each edge once in its true direction.

What GNN layers handle directed graphs well?

All PyG message passing layers work on directed graphs since edge_index is inherently directed. GATConv can be a good fit: its attention coefficients are computed per (source, target) pair and are not symmetric, so when both A->B and B->A exist, the two edges can receive different weights. For knowledge graphs with many directed relation types, use RGCNConv with separate relation types for forward and reverse edges.

Directed Graphs in GNNs: Asymmetric Edge Relationships | Kumo.ai

A directed graph is a graph where edges have direction. An edge from node A to node B does not imply an edge from B to A. When Alice follows Bob on Twitter, Bob does not necessarily follow Alice. When paper X cites paper Y, paper Y does not cite paper X. When account A sends $1,000 to account B, account B did not send $1,000 to account A. Direction encodes asymmetry.

In many enterprise domains, edge direction carries critical semantic information. Ignoring direction (treating the graph as undirected) loses this signal. A node with 100 incoming transactions and 0 outgoing looks very different from a node with 0 incoming and 100 outgoing, but in an undirected graph they look identical.
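A minimal, dependency-free sketch of that last point. The edge list is made up for illustration, and `degree_counts` is a hypothetical helper rather than a library function: a pure receiver and a pure sender are easy to tell apart by in-degree and out-degree, but become indistinguishable once direction is dropped.

```python
from collections import Counter

def degree_counts(edges):
    """Return (in_degree, out_degree) Counters for a list of (source, target) pairs."""
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    return in_deg, out_deg

# Hypothetical data: node "R" receives 100 transactions, node "S" sends 100.
edges = [(f"payer_{i}", "R") for i in range(100)]
edges += [("S", f"payee_{i}") for i in range(100)]

in_deg, out_deg = degree_counts(edges)
print("R:", in_deg["R"], "in /", out_deg["R"], "out")
print("S:", in_deg["S"], "in /", out_deg["S"], "out")

# Dropping direction: each edge counts once for both of its endpoints.
undirected_deg = Counter()
for src, dst in edges:
    undirected_deg[src] += 1
    undirected_deg[dst] += 1
print("undirected:", undirected_deg["R"], undirected_deg["S"])
```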

Direction in PyG

PyG's edge_index tensor is inherently directed. The first row contains source nodes, the second row contains target nodes:

directed_edge_index.py

import torch
from torch_geometric.data import Data

# Directed edges: 0->1, 0->2, 1->2, 2->3
# Note: each edge appears only ONCE (not twice like undirected)
edge_index = torch.tensor([
    [0, 0, 1, 2],  # source
    [1, 2, 2, 3],  # target
], dtype=torch.long)

x = torch.randn(4, 16)
data = Data(x=x, edge_index=edge_index)

# During message passing:
# Node 1 receives messages from node 0
# Node 2 receives messages from nodes 0 and 1
# Node 3 receives messages from node 2
# Node 0 receives NO messages (no incoming edges)

Directed graph: messages flow from source to target. Node 0 sends but never receives.
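The comments in the listing above can be checked without PyTorch. The following sketch mirrors the two-row edge_index layout with plain Python lists and performs one round of sum aggregation. The scalar node features and the function name `aggregate_sum` are assumptions made for illustration; this is not PyG's implementation.

```python
def aggregate_sum(edge_index, x):
    """One round of sum aggregation: each target adds up its sources' features."""
    sources, targets = edge_index
    out = [0.0] * len(x)
    for src, dst in zip(sources, targets):
        out[dst] += x[src]  # the message travels from source to target only
    return out

# Same edges as the listing above: 0->1, 0->2, 1->2, 2->3
edge_index = [
    [0, 0, 1, 2],  # source
    [1, 2, 2, 3],  # target
]
x = [1.0, 10.0, 100.0, 1000.0]  # one illustrative scalar feature per node

messages = aggregate_sum(edge_index, x)
print(messages)  # [0.0, 1.0, 11.0, 100.0]

# Nodes that never appear as a target receive nothing.
no_incoming = sorted(set(range(len(x))) - set(edge_index[1]))
print(no_incoming)  # [0]
```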

When to keep vs remove direction

The decision depends on whether direction carries task-relevant information:

Keep directed: fraud detection (money flow direction matters), supply chains (materials flow downstream), citation networks (influence flows forward), dependency graphs (A depends on B, not vice versa).
Convert to undirected: social networks where mutual connections dominate, molecular graphs where bonds have no inherent direction, co-authorship networks. Use T.ToUndirected().
Add reverse as separate type: the best of both worlds. Keep the original directed edges and add reverse edges as a different relation type. The model learns different weights for “sends_money_to” and “receives_money_from.”

reverse_edges.py

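import torch
import torch_geometric.transforms as T
from torch_geometric.data import Data, HeteroData

# Forward edges 0->1, 0->2, 1->2, 2->3 (row 0 = source, row 1 = target)
forward_edges = torch.tensor([[0, 0, 1, 2], [1, 2, 2, 3]], dtype=torch.long)

# Option 1: Convert to undirected (add reverse of every edge)
data = Data(x=torch.randn(4, 16), edge_index=forward_edges)
undirected_data = T.ToUndirected()(data)

# Option 2: Add reverse edges as a separate relation (for HeteroData)
hetero_data = HeteroData()
hetero_data['account'].x = torch.randn(4, 16)
hetero_data['account', 'sends_to', 'account'].edge_index = forward_edges
hetero_data['account', 'receives_from', 'account'].edge_index = forward_edges.flip(0)
# A relation-aware model can now learn separate weights for each direction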

Two strategies for handling direction. Option 2 preserves directional semantics while ensuring all nodes receive messages.
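To see what each option does to the edge list itself, here is a small sketch in plain Python. The helpers `reverse` and `to_undirected` are hypothetical stand-ins that only imitate the effect of flipping the two rows of edge_index and of T.ToUndirected() on (source, target) pairs; they ignore edge attributes.

```python
def reverse(edges):
    """Swap source and target of every edge (the effect of flipping the two rows)."""
    return [(dst, src) for src, dst in edges]

def to_undirected(edges):
    """Union of the edges and their reverses, without duplicates."""
    return sorted(set(edges) | set(reverse(edges)))

forward = [(0, 1), (0, 2), (1, 2), (2, 3)]

# Option 1: one merged edge set; the original direction can no longer be recovered.
merged = to_undirected(forward)
print(merged)

# Option 2: two relations kept apart, so direction stays recoverable.
relations = {
    "sends_to": forward,
    "receives_from": reverse(forward),
}

# With both relations, every node that touches an edge is a target somewhere.
targets = {dst for rel in relations.values() for _, dst in rel}
print(sorted(targets))  # [0, 1, 2, 3]
```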

Enterprise example: transaction fraud detection

A bank's transaction graph is inherently directed. Account A sends money to Account B. Direction reveals critical fraud patterns:

Fan-out: one account sending to many accounts (potential money laundering distribution)
Fan-in: many accounts sending to one account (potential mule account collecting proceeds)
Cycles: A sends to B, B sends to C, C sends to A (potential round-tripping to disguise fund origins)

In an undirected graph, fan-out and fan-in look identical (both are high-degree nodes). Only by preserving direction can the model distinguish a distribution hub from a collection point. In a directed GNN, each layer moves information one hop along edge direction: layer 1 sees direct senders, layer 2 sees the senders of those senders. Two layers are enough to expose fan-in around a collection account, but a three-account cycle such as A, B, C needs at least three layers before a signal leaving A can arrive back at A.
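One way to reason about how many layers a pattern needs is to count hops along edge direction, since each message-passing layer moves information one hop. The sketch below runs a breadth-first search over a hypothetical three-account edge list; `shortest_cycle_through` is an illustrative helper, not a library function.

```python
from collections import defaultdict, deque

def shortest_cycle_through(edges, start):
    """Length of the shortest directed cycle through `start`, or None if there is none."""
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)

    queue = deque([(start, 0)])
    seen = set()
    while queue:
        node, dist = queue.popleft()
        for nxt in successors[node]:
            if nxt == start:
                return dist + 1  # followed edges all the way back to start
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Round-tripping pattern from the list above: A -> B -> C -> A
cycle_edges = [("A", "B"), ("B", "C"), ("C", "A")]
print(shortest_cycle_through(cycle_edges, "A"))  # 3

# A pure fan-out has no path back to the sender.
fan_out_edges = [("A", "B"), ("A", "C"), ("A", "D")]
print(shortest_cycle_through(fan_out_edges, "A"))  # None
```

Here the loop closes after three hops, so information leaving A needs three rounds of message passing to arrive back at A.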

In-degree vs out-degree features

For directed graphs, degree splits into in-degree (incoming edges) and out-degree (outgoing edges). These are powerful structural features:

High in-degree, low out-degree: popular receiver (influencer, mule account)
Low in-degree, high out-degree: active sender (bot, distribution point)
Balanced in/out-degree: normal bidirectional relationships

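Adding in-degree and out-degree as node features gives GNNs explicit access to directional structure from the first layer. In PyG, torch_geometric.utils.degree computes these: degree(edge_index[0], num_nodes) for out-degree and degree(edge_index[1], num_nodes) for in-degree. Passing num_nodes keeps nodes without any edges in the result.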
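As a dependency-free illustration of what the two degree calls count, the sketch below builds an [in-degree, out-degree] pair per node from a two-row edge list. The function name `in_out_degree_features` is an assumption for this example; the explicit `num_nodes` argument matters because a node that never appears in either row would otherwise be missing from the result.

```python
def in_out_degree_features(edge_index, num_nodes):
    """Return one [in_degree, out_degree] pair per node id in range(num_nodes)."""
    sources, targets = edge_index
    features = [[0, 0] for _ in range(num_nodes)]
    for src, dst in zip(sources, targets):
        features[dst][0] += 1  # the edge arrives at dst
        features[src][1] += 1  # the edge leaves src
    return features

# Edges 0->1, 0->2, 1->2, 2->3, plus an isolated node 4
edge_index = [
    [0, 0, 1, 2],  # source
    [1, 2, 2, 3],  # target
]
features = in_out_degree_features(edge_index, num_nodes=5)
for node, (in_deg, out_deg) in enumerate(features):
    print(node, in_deg, out_deg)
```

In a PyG pipeline, the two columns would be concatenated onto the node feature matrix x before training.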

Key Takeaways

1. A directed graph has edges where A->B differs from B->A. Direction encodes asymmetric relationships: follows, citations, transactions, dependencies. Removing direction loses critical signal.
2. PyG's edge_index is inherently directed (row 0 = source, row 1 = target). For undirected graphs, each edge is stored twice. For directed graphs, each edge appears once.
3. Nodes with only outgoing edges receive no messages. Add self-loops, reverse edges, or reverse-as-separate-type to ensure all nodes learn from their neighborhood.
4. Direction is essential for fraud detection (fan-in vs fan-out), supply chains (upstream vs downstream), and citation analysis (influence direction). Never discard direction in these domains.
5. Best practice for directed enterprise graphs: add reverse edges as a separate relation type. This preserves directional semantics while ensuring full information flow in both directions.
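Takeaway 3 mentions self-loops, which the sections above do not show. A minimal sketch of the idea in plain Python follows. `ensure_self_loops` is a hypothetical helper that imitates the effect on a list of (source, target) pairs; PyG provides tensor-based utilities for this in torch_geometric.utils, such as add_self_loops.

```python
def ensure_self_loops(edges, num_nodes):
    """Append one i->i edge for every node that does not already have one."""
    existing = set(edges)
    loops = [(i, i) for i in range(num_nodes) if (i, i) not in existing]
    return list(edges) + loops

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
with_loops = ensure_self_loops(edges, num_nodes=4)

# Node 0 had no incoming edges; now it at least receives its own features.
incoming = {
    node: [src for src, dst in with_loops if dst == node]
    for node in range(4)
}
print(incoming[0])  # [0]
print(incoming[2])  # [0, 1, 2]
```

A self-loop keeps node 0's own features in the aggregation but does not bring in information from other nodes; reverse edges are needed for that.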
