
MNIST Superpixels: When Images Become Graphs

MNIST Superpixels converts the iconic handwritten digit dataset into 70,000 graphs of 75 superpixel nodes each. It demonstrates that GNNs can process visual data and provides the largest standard graph classification benchmark by number of graphs.

PyTorch Geometric

TL;DR

  • MNIST Superpixels has 70,000 graphs (60K train, 10K test), each with 75 superpixel nodes and 1 feature (pixel intensity). The task is 10-class digit classification.
  • Images are converted to graphs by SLIC superpixel segmentation. Adjacent superpixels are connected by edges. This tests GNNs on a vision task.
  • GNNs achieve roughly 90-98% accuracy depending on architecture (vs 99%+ for CNNs on raw pixels). The point is not to beat CNNs but to test GNN graph classification at scale.
  • With 70K graphs, it is the largest standard benchmark by graph count -- useful for testing training pipeline efficiency.

70,000 graphs · 75 nodes/graph · 1 feature · 10 classes

What MNIST Superpixels contains

MNIST Superpixels takes the classic MNIST dataset of 28x28 grayscale handwritten digit images and converts each image into a graph. The conversion uses SLIC superpixel segmentation to group pixels into 75 superpixel regions (nodes). Each node has 1 feature: the average pixel intensity of its superpixel region. Edges connect superpixels that are spatially adjacent in the original image. The task is to classify each graph as one of 10 digits (0-9).

The dataset preserves MNIST's original 60K/10K train/test split, giving 70,000 total graphs. This makes it the largest standard graph classification benchmark by number of graphs -- useful for testing how efficiently your training pipeline handles large datasets of small graphs.
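The conversion described above can be sketched in plain NumPy. This is an illustrative toy, not the dataset's actual preprocessing: it assumes a precomputed segmentation label map (in the real pipeline SLIC produces ~75 regions per 28x28 image; here a 2x2 image with 2 regions stands in), then derives mean-intensity node features and adjacency edges.

```python
import numpy as np

def superpixel_graph(image, labels):
    """Build node features and edges from an image and a segmentation map.

    image  : (H, W) float array of pixel intensities
    labels : (H, W) int array assigning each pixel to a superpixel id 0..K-1
    Returns (x, edge_index): mean intensity per superpixel, plus edges (in
    both directions) between superpixels that share a pixel border.
    """
    num_nodes = labels.max() + 1
    # Node feature: average intensity of each superpixel region.
    x = np.array([image[labels == k].mean() for k in range(num_nodes)])

    edges = set()
    # Two superpixels are adjacent if neighbouring pixels carry different ids.
    for a, b in [(labels[:, :-1], labels[:, 1:]),   # horizontal neighbours
                 (labels[:-1, :], labels[1:, :])]:  # vertical neighbours
        for u, v in zip(a.ravel(), b.ravel()):
            if u != v:
                edges.add((u, v))
                edges.add((v, u))
    edge_index = np.array(sorted(edges)).T  # (2, num_edges), PyG-style layout
    return x, edge_index

# Toy example: a 2x2 image split into 2 superpixels (left/right column).
image = np.array([[0.0, 1.0],
                  [0.0, 1.0]])
labels = np.array([[0, 1],
                   [0, 1]])
x, edge_index = superpixel_graph(image, labels)
print(x)           # [0. 1.]
print(edge_index)  # edges 0->1 and 1->0
```

The same idea scales directly: swap the toy label map for a SLIC segmentation and each MNIST image becomes one 75-node graph.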

Why MNIST Superpixels matters

MNIST Superpixels serves two purposes. First, it demonstrates the universality of graph representations: any structured data, including images, can be processed as a graph. This conceptual insight matters for practitioners who may not initially see how GNNs apply to their domain.

Second, it provides a large-scale graph classification benchmark with a familiar task. Molecular datasets (MUTAG, ENZYMES) require domain knowledge to interpret. Everyone understands digit classification. This makes MNIST Superpixels ideal for learning graph classification pipelines (DataLoader batching, global pooling, etc.) without domain distractions.
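The pooling step those pipelines rely on can be sketched without PyG. This is a NumPy stand-in for global mean pooling (the function name mirrors PyG's `global_mean_pool` but this is not the library implementation): node features from several graphs are stacked into one batch, and a batch-assignment vector, the layout PyG's DataLoader produces, says which graph each node belongs to.

```python
import numpy as np

def global_mean_pool(x, batch, num_graphs):
    """Average node features per graph.

    x     : (total_nodes, F) stacked node features for the whole batch
    batch : (total_nodes,) graph id of each node
    Returns (num_graphs, F): one pooled vector per graph, ready for a
    classification head.
    """
    out = np.zeros((num_graphs, x.shape[1]))
    counts = np.bincount(batch, minlength=num_graphs)
    np.add.at(out, batch, x)          # scatter-sum nodes into their graph row
    return out / counts[:, None]      # divide by node count -> mean

# Two graphs batched together: graph 0 has 2 nodes, graph 1 has 1 node.
x = np.array([[1.0], [3.0], [5.0]])
batch = np.array([0, 0, 1])
pooled = global_mean_pool(x, batch, num_graphs=2)
print(pooled)  # graph 0 pools to 2.0, graph 1 pools to 5.0
```

This scatter-mean is what turns variable-size graphs into fixed-size vectors, which is why batching 75-node graphs by the hundreds is cheap.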

Loading MNIST Superpixels in PyG

load_mnist_superpixels.py
from torch_geometric.datasets import MNISTSuperpixels
from torch_geometric.loader import DataLoader

train_dataset = MNISTSuperpixels(root='/tmp/MNIST', train=True)
test_dataset = MNISTSuperpixels(root='/tmp/MNIST', train=False)

print(f"Train graphs: {len(train_dataset)}")  # 60000
print(f"Test graphs: {len(test_dataset)}")    # 10000

loader = DataLoader(train_dataset, batch_size=128, shuffle=True)

Standard train/test split matching MNIST. Large batch sizes work well with 75-node graphs.

Common tasks and benchmarks

10-class graph classification. GCN: ~90.1%, GIN: ~96.5%, GAT: ~95.5%, GPS: ~98.1%. The benchmarks from the “Benchmarking GNNs” paper use a 500K parameter budget. GIN's strong performance reflects its ability to distinguish different graph structures, which matters because digits have distinct superpixel topologies (the loops in 8, the vertical stroke in 1).

Example: document layout analysis

The image-to-graph conversion used in MNIST Superpixels has practical applications beyond digit recognition. Document layout analysis converts page scans into graphs of text blocks, images, and tables connected by spatial adjacency. A GNN then classifies regions (header, body, figure, table) and detects reading order. This powers intelligent document processing at companies like Google (for Google Lens) and insurance companies (for claims processing).

Published benchmark results

Graph classification accuracy on MNIST Superpixels with ~500K parameter budget from the Benchmarking GNNs paper.

Method | Accuracy (%) | Year | Paper
GCN | 90.1 | 2020 | Dwivedi et al.
GAT | 95.5 | 2020 | Dwivedi et al.
GraphSage | 97.3 | 2020 | Dwivedi et al.
GIN | 96.5 | 2020 | Dwivedi et al.
GatedGCN | 97.3 | 2020 | Dwivedi et al.
GPS | ~98.1 | 2022 | Rampasek et al.

Original Paper

Benchmarking Graph Neural Networks

V. P. Dwivedi, C. K. Joshi, A. T. Luu, T. Laurent, Y. Bengio, X. Bresson (2023). Journal of Machine Learning Research


Original data source

The MNIST Superpixels graph dataset was created by Monti et al. (2017) and popularized by Dwivedi et al. (2023). The original MNIST images come from Yann LeCun's MNIST page. The superpixel conversion uses SLIC segmentation from scikit-image.

cite_mnist_superpixels.bib
@inproceedings{monti2017geometric,
  title={Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs},
  author={Monti, Federico and Boscaini, Davide and Masci, Jonathan and Rodola, Emanuele and Svoboda, Jan and Bronstein, Michael M},
  booktitle={CVPR},
  pages={5115--5124},
  year={2017}
}

BibTeX citation for the superpixel graph construction used in MNIST Superpixels.

Which dataset should I use?

MNIST Superpixels vs PATTERN/CLUSTER: All three are from the Benchmarking GNNs suite. MNIST Superpixels is graph-level classification; PATTERN/CLUSTER are node-level. Use MNIST Superpixels to test graph classification pipelines, PATTERN/CLUSTER for node classification expressiveness.

MNIST Superpixels vs ZINC: Both are graph-level tasks with ~500K parameter budgets. ZINC is regression on molecules; MNIST Superpixels is 10-class classification on vision data. ZINC is more popular for architecture comparison.

MNIST Superpixels vs MUTAG: MNIST Superpixels has 70K graphs vs MUTAG's 188. Use MNIST Superpixels for reliable statistical evaluation and for learning graph classification pipelines at scale.

From benchmark to production

Production vision-as-graphs applications use much richer representations: feature vectors from pretrained vision models (not just pixel intensity), multi-scale superpixel hierarchies, and task-specific edge construction (semantic similarity, not just spatial adjacency). The conversion pipeline is more complex, but the GNN processing remains the same.
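One of those task-specific edge constructions, similarity-based edges, can be sketched with NumPy. This is a toy k-nearest-neighbour builder over node feature vectors; in production the features would come from a pretrained vision model rather than the scalar stand-ins here.

```python
import numpy as np

def knn_edges(features, k):
    """Connect each node to its k nearest neighbours in feature space."""
    # Pairwise squared Euclidean distances between feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    dist = (diff ** 2).sum(-1)
    np.fill_diagonal(dist, np.inf)          # exclude self-loops
    nn = np.argsort(dist, axis=1)[:, :k]    # k closest nodes per row
    src = np.repeat(np.arange(len(features)), k)
    return np.stack([src, nn.ravel()])      # (2, N*k) edge_index layout

# Nodes 0 and 1 are near each other in feature space; node 2 is far away.
features = np.array([[0.0], [0.1], [5.0]])
print(knn_edges(features, k=1))  # 0->1, 1->0, 2->1
```

Note the result is directed (kNN is not symmetric), which is exactly why production pipelines make edge construction a deliberate design choice rather than reusing plain spatial adjacency.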

Frequently asked questions

What is MNIST Superpixels?

MNIST Superpixels converts the classic MNIST handwritten digit images into graphs. Each 28x28 image is segmented into 75 superpixels (nodes). Edges connect adjacent superpixels. Each node has 1 feature (average pixel intensity). The 10-class task is digit classification.

Why convert images to graphs?

Superpixel graphs test whether GNNs can handle vision tasks traditionally dominated by CNNs. The conversion also demonstrates that graph representations are universal: any structured data (images, molecules, social networks) can be expressed as a graph and processed by GNNs.

How do I load MNIST Superpixels in PyTorch Geometric?

Use `from torch_geometric.datasets import MNISTSuperpixels; dataset = MNISTSuperpixels(root='/tmp/MNIST')`. The dataset has 60K train and 10K test graphs, matching the standard MNIST split.

Do GNNs beat CNNs on MNIST Superpixels?

No. CNNs on raw MNIST pixels achieve 99%+ accuracy. GNNs on superpixel graphs achieve roughly 90-98% depending on architecture. The superpixel conversion loses spatial information. MNIST Superpixels tests GNN capabilities, not whether graphs are optimal for vision.

What is MNIST Superpixels useful for?

It is useful as a large-scale graph classification benchmark (70K graphs), as a teaching tool for graph classification concepts, and as a bridge between the vision and graph ML communities. Its familiar domain (digit recognition) makes GNN behavior intuitive to understand.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.