
Node Type Encoding: Representing Different Node Types in the Same Graph

Customers have age and location. Products have price and category. Orders have amount and date. Node type encoding projects these different feature spaces into a shared representation where message passing can operate.


TL;DR

  1. Node type encoding projects nodes from different feature spaces (customers: 10 features, products: 25 features) into a shared hidden dimension using type-specific input projections.
  2. Each table in a relational database becomes a node type. The encoding layer is the bridge between raw heterogeneous features and the GNN's shared computation space.
  3. Three strategies: type-specific linear projections (simplest), type-specific MLPs (more expressive), and shared projections with type embeddings added (parameter efficient).
  4. In PyG, HeteroData stores each node type separately. HeteroConv and HGTConv handle type-aware message passing after the initial projection.
  5. Proper node type encoding is essential. Without it, message passing mixes features from incompatible spaces (concatenating age with product price), producing meaningless representations.

Node type encoding maps nodes of different types into a shared embedding space so they can participate in the same graph neural network computation. In a relational database converted to a graph, each table becomes a node type: customers, orders, products, merchants. Each type has a different set of features with different dimensionalities and semantics. Node type encoding bridges this gap.

The heterogeneity problem

A homogeneous GNN assumes all nodes share the same feature space. This works for citation networks (all nodes are papers) but fails for enterprise data. Consider:

  • Customer: [age, income, location, tenure] (4 features, mixed types)
  • Product: [price, weight, category, brand, description_embedding] (5 features + 128-dim embedding)
  • Order: [amount, quantity, discount, timestamp] (4 features)

You cannot feed a 4-dimensional customer vector and a 133-dimensional product vector into the same linear layer. Node type encoding solves this by giving each type its own input projection into a common d-dimensional hidden space.

Encoding strategies

Strategy 1: Type-specific linear projections

The simplest approach: each node type gets its own linear layer that projects from its native feature dimension to the shared hidden dimension.

type_specific_projection.py
import torch.nn as nn

class NodeTypeEncoder(nn.Module):
    def __init__(self, type_dims, hidden_dim):
        super().__init__()
        # One projection per node type, built from the input dimensions,
        # e.g. type_dims = {'customer': 4, 'product': 133, 'order': 4, 'merchant': 12}
        self.projections = nn.ModuleDict({
            node_type: nn.Linear(in_dim, hidden_dim)
            for node_type, in_dim in type_dims.items()
        })

    def forward(self, x_dict):
        return {
            node_type: self.projections[node_type](x)
            for node_type, x in x_dict.items()
        }

After this projection, all node types live in the same hidden_dim space and can participate in message passing.

Strategy 2: Type-specific MLPs

For richer encoding, use a small MLP per type instead of a single linear layer. This is useful when raw features need nonlinear transformations (e.g., log-scaling prices, handling categoricals).
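As a sketch, a per-type MLP encoder might look like the following. The two-layer shape and ReLU activation are illustrative choices, not something the strategy prescribes:

```python
import torch
import torch.nn as nn

class MLPTypeEncoder(nn.Module):
    # Hypothetical type-specific MLP encoder: a small two-layer MLP
    # per node type instead of a single linear projection.
    def __init__(self, type_dims, hidden_dim):
        super().__init__()
        self.mlps = nn.ModuleDict({
            node_type: nn.Sequential(
                nn.Linear(in_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim),
            )
            for node_type, in_dim in type_dims.items()
        })

    def forward(self, x_dict):
        # Same dict-in, dict-out interface as the linear encoder
        return {t: self.mlps[t](x) for t, x in x_dict.items()}
```

The interface mirrors Strategy 1, so the two encoders are drop-in replacements for each other.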

Strategy 3: Shared projection + type embedding

A parameter-efficient alternative: use a shared projection layer for all types (padding shorter feature vectors to a common length) and add a learned type embedding that tells the model which type each node belongs to.

shared_with_type_embed.py
class SharedTypeEncoder(nn.Module):
    def __init__(self, max_features, hidden_dim, num_types):
        super().__init__()
        self.shared_proj = nn.Linear(max_features, hidden_dim)
        self.type_embed = nn.Embedding(num_types, hidden_dim)

    def forward(self, x_padded, type_ids):
        # x_padded: all features zero-padded to max_features
        h = self.shared_proj(x_padded)
        h = h + self.type_embed(type_ids)  # add type signal
        return h

Shared projection uses fewer parameters but requires padding. The type embedding compensates for lost type information.
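A minimal sketch of the padding step, assuming zero-padding on the right and integer type ids (0 = customer, 1 = product):

```python
import torch
import torch.nn.functional as F

max_features = 133
customer_x = torch.randn(3, 4)    # 3 customers, 4 native features
product_x = torch.randn(2, 133)   # 2 products, 133 native features

# Zero-pad customer features on the right up to max_features
customer_padded = F.pad(customer_x, (0, max_features - customer_x.size(1)))

# Stack all nodes into one batch and record each node's type id
x_padded = torch.cat([customer_padded, product_x], dim=0)
type_ids = torch.tensor([0, 0, 0, 1, 1])  # 0 = customer, 1 = product
```

`x_padded` and `type_ids` are then the two inputs the shared encoder's forward pass expects.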

Beyond feature projection

Node type encoding involves more than dimensionality alignment:

  • Feature normalization: Different types have different value ranges. Customer age (0-100) and order amount ($0-$10,000) need separate normalization before projection.
  • Missing feature handling: Some node types have sparse features: products without reviews, customers without demographic data. Type-specific encoders can handle missingness differently per type.
  • Categorical encoding: Some types are mostly categorical (product category, customer segment). The type-specific encoder can include embedding layers for categoricals before the projection.
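For instance, a type-specific encoder for products might embed a categorical category id before the projection. This `ProductEncoder` and its dimensions are illustrative, not part of any library API:

```python
import torch
import torch.nn as nn

class ProductEncoder(nn.Module):
    # Hypothetical per-type encoder: embeds the categorical `category` id,
    # then projects it together with the numeric features.
    def __init__(self, num_categories, cat_dim, num_numeric, hidden_dim):
        super().__init__()
        self.category_embed = nn.Embedding(num_categories, cat_dim)
        self.proj = nn.Linear(num_numeric + cat_dim, hidden_dim)

    def forward(self, numeric_x, category_ids):
        cat = self.category_embed(category_ids)
        return self.proj(torch.cat([numeric_x, cat], dim=-1))
```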

In message passing

After node type encoding, all nodes live in the same d-dimensional space. Message passing then operates across types:

  1. Customer node sends its d-dim representation to connected order nodes
  2. Order node aggregates messages from its customer and product neighbors
  3. The aggregation combines information from different types in the shared space
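The steps above can be sketched with plain tensor ops, assuming sum aggregation and a simple residual update (illustrative only, not PyG's actual implementation):

```python
import torch

d = 8  # shared hidden dimension after node type encoding
customer_h = torch.randn(3, d)
product_h = torch.randn(4, d)
order_h = torch.randn(2, d)

# Edge lists as (source_index, target_index) pairs
cust_to_order = torch.tensor([[0, 1, 2], [0, 0, 1]])  # customers -> orders
prod_to_order = torch.tensor([[1, 3], [0, 1]])        # products  -> orders

def aggregate(src_h, edges, num_targets):
    # Sum each source node's d-dim representation into its target order
    out = torch.zeros(num_targets, src_h.size(1))
    out.index_add_(0, edges[1], src_h[edges[0]])
    return out

# Each order combines customer and product messages in the shared space
order_h = order_h + aggregate(customer_h, cust_to_order, 2) \
                  + aggregate(product_h, prod_to_order, 2)
```

Because both customer and product representations were projected to the same `d` dimensions, their messages can simply be summed at the order node.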

Type-aware message passing layers (HGTConv, HeteroConv) can further apply type-specific transformations during aggregation, but the initial node type encoding is what makes cross-type message passing possible at all.

Frequently asked questions

What is node type encoding?

Node type encoding is the process of projecting nodes of different types (e.g., customers with 10 features, products with 25 features, orders with 8 features) into a shared hidden dimension so they can participate in the same message passing computation. Each type gets its own input projection layer.

Why do different node types need separate encoders?

Different node types have different feature spaces. A customer node has age and location. A product node has price and category. These features have different dimensionalities, semantics, and value ranges. Separate input projections handle this heterogeneity before mapping to a common representation space.

Can you mix node types in the same GNN layer?

Yes, after projecting all node types to the same hidden dimension. Heterogeneous GNN layers like HGTConv and HeteroConv handle multi-type message passing natively. The key is that messages between different type pairs (customer-to-order vs product-to-order) use type-aware transformations.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.