70,000
Graphs
75
Nodes/Graph
1
Feature
10
Classes
What MNIST Superpixels contains
MNIST Superpixels takes the classic MNIST dataset of 28x28 grayscale handwritten digit images and converts each image into a graph. The conversion uses SLIC superpixel segmentation to group pixels into 75 superpixel regions (nodes). Each node has 1 feature: the average pixel intensity of its superpixel region. Edges connect superpixels that are spatially adjacent in the original image. The task is to classify each graph as one of 10 digits (0-9).
The dataset preserves MNIST's original 60K/10K train/test split, giving 70,000 total graphs. This makes it the largest standard graph classification benchmark by number of graphs -- useful for testing how efficiently your training pipeline handles large datasets of small graphs.
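The conversion above can be sketched in miniature: given a superpixel label map, node features are per-region mean intensities, and edges link regions that touch in the image. A minimal numpy sketch (the 4-region label map here is a hand-made toy, not real SLIC output, which would come from scikit-image's segmentation tools):

```python
import numpy as np

# Toy 4x4 "image" and a hand-made superpixel label map (a real pipeline
# would produce labels via SLIC segmentation).
image = np.arange(16, dtype=float).reshape(4, 4)
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 3, 3],
                   [2, 2, 3, 3]])

n_regions = labels.max() + 1

# Node features: mean intensity of each superpixel region.
features = np.array([image[labels == r].mean() for r in range(n_regions)])

# Edges: regions that are horizontally or vertically adjacent in the image.
edges = set()
for a, b in zip(labels[:, :-1].ravel(), labels[:, 1:].ravel()):
    if a != b:
        edges.add((int(min(a, b)), int(max(a, b))))
for a, b in zip(labels[:-1, :].ravel(), labels[1:, :].ravel()):
    if a != b:
        edges.add((int(min(a, b)), int(max(a, b))))

print(features)        # one scalar feature per node
print(sorted(edges))   # undirected adjacency between touching regions
```

The same idea scales to 75 SLIC regions per MNIST digit: one intensity feature per node, edges from spatial adjacency.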
Why MNIST Superpixels matters
MNIST Superpixels serves two purposes. First, it demonstrates the universality of graph representations: any structured data, including images, can be processed as a graph. This conceptual insight matters for practitioners who may not initially see how GNNs apply to their domain.
Second, it provides a large-scale graph classification benchmark with a familiar task. Molecular datasets (MUTAG, ENZYMES) require domain knowledge to interpret. Everyone understands digit classification. This makes MNIST Superpixels ideal for learning graph classification pipelines (DataLoader batching, global pooling, etc.) without domain distractions.
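The global pooling step mentioned above, which collapses each graph's 75 node embeddings into a single graph-level vector before classification, is just a per-graph mean. A numpy sketch of what a mean-pool like PyG's `global_mean_pool` computes (function name and toy data here are illustrative):

```python
import numpy as np

def global_mean_pool(x, batch, num_graphs):
    """Average node features per graph: x is [num_nodes, dim],
    batch[i] says which graph node i belongs to."""
    out = np.zeros((num_graphs, x.shape[1]))
    np.add.at(out, batch, x)                          # sum rows per graph
    counts = np.bincount(batch, minlength=num_graphs) # nodes per graph
    return out / counts[:, None]

# Two graphs batched together: 3 nodes + 2 nodes, 2-dim embeddings.
x = np.array([[1., 1.], [3., 3.], [5., 5.], [10., 0.], [20., 2.]])
batch = np.array([0, 0, 0, 1, 1])
print(global_mean_pool(x, batch, 2))  # [[3. 3.] [15. 1.]]
```

The resulting per-graph vectors feed a final linear layer that predicts one of the 10 digit classes.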
Loading MNIST Superpixels in PyG
from torch_geometric.datasets import MNISTSuperpixels
from torch_geometric.loader import DataLoader
train_dataset = MNISTSuperpixels(root='/tmp/MNIST', train=True)
test_dataset = MNISTSuperpixels(root='/tmp/MNIST', train=False)
print(f"Train graphs: {len(train_dataset)}") # 60000
print(f"Test graphs: {len(test_dataset)}") # 10000
loader = DataLoader(train_dataset, batch_size=128, shuffle=True)
Standard train/test split matching MNIST. Large batch sizes work well with 75-node graphs.
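Under the hood, batching many small graphs means concatenating them into one disjoint graph. A numpy sketch of the index bookkeeping that PyG's DataLoader performs (toy graphs, not real MNIST Superpixels data):

```python
import numpy as np

# Two toy graphs: node counts and edge_index arrays of shape [2, num_edges].
g1_nodes, g1_edges = 3, np.array([[0, 1], [1, 2]])  # edges 0-1, 1-2
g2_nodes, g2_edges = 2, np.array([[0], [1]])        # edge 0-1

# Offset the second graph's node ids so both graphs live in one
# disjoint union, then concatenate the edge lists.
offset = g1_nodes
edge_index = np.concatenate([g1_edges, g2_edges + offset], axis=1)

# The batch vector maps each node back to its source graph;
# this is what global pooling later uses.
batch = np.concatenate([np.zeros(g1_nodes, int), np.ones(g2_nodes, int)])

print(edge_index)  # [[0 1 3] [1 2 4]]
print(batch)       # [0 0 0 1 1]
```

Because message passing only follows edges, the disjoint union behaves exactly like processing the graphs separately, which is why a batch of 128 75-node graphs is cheap.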
Common tasks and benchmarks
10-class graph classification. GCN: ~90.1%, GIN: ~96.5%, GAT: ~95.5%, GPS: ~98.1%. The benchmarks from the “Benchmarking GNNs” paper use a 500K parameter budget. GIN's strong performance reflects its ability to distinguish different graph structures, which matters because digits have distinct superpixel topologies (the loops in 8, the vertical stroke in 1).
Example: document layout analysis
The image-to-graph conversion used in MNIST Superpixels has practical applications beyond digit recognition. Document layout analysis converts page scans into graphs of text blocks, images, and tables connected by spatial adjacency. A GNN then classifies regions (header, body, figure, table) and detects reading order. This powers intelligent document processing at companies like Google (for Google Lens) and insurance companies (for claims processing).
Published benchmark results
Graph classification accuracy on MNIST Superpixels with ~500K parameter budget from the Benchmarking GNNs paper.
| Method | Accuracy (%) | Year | Paper |
|---|---|---|---|
| GCN | 90.1 | 2020 | Dwivedi et al. |
| GAT | 95.5 | 2020 | Dwivedi et al. |
| GraphSage | 97.3 | 2020 | Dwivedi et al. |
| GIN | 96.5 | 2020 | Dwivedi et al. |
| GatedGCN | 97.3 | 2020 | Dwivedi et al. |
| GPS | ~98.1 | 2022 | Rampasek et al. |
Original Paper
Benchmarking Graph Neural Networks
V. P. Dwivedi, C. K. Joshi, A. T. Luu, T. Laurent, Y. Bengio, X. Bresson (2023). Journal of Machine Learning Research
Read paper →
Original data source
The MNIST Superpixels graph dataset was created by Monti et al. (2017) and popularized by Dwivedi et al. (2023). The original MNIST images come from Yann LeCun's MNIST page. The superpixel conversion uses SLIC segmentation from scikit-image.
@inproceedings{monti2017geometric,
title={Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs},
author={Monti, Federico and Boscaini, Davide and Masci, Jonathan and Rodola, Emanuele and Svoboda, Jan and Bronstein, Michael M},
booktitle={CVPR},
pages={5115--5124},
year={2017}
}
BibTeX citation for the superpixel graph construction used in MNIST Superpixels.
Which dataset should I use?
MNIST Superpixels vs PATTERN/CLUSTER: All three are from the Benchmarking GNNs suite. MNIST Superpixels is graph-level classification; PATTERN/CLUSTER are node-level. Use MNIST Superpixels to test graph classification pipelines, PATTERN/CLUSTER for node classification expressiveness.
MNIST Superpixels vs ZINC: Both are graph-level tasks with ~500K parameter budgets. ZINC is regression on molecules; MNIST Superpixels is 10-class classification on vision data. ZINC is more popular for architecture comparison.
MNIST Superpixels vs MUTAG: MNIST Superpixels has 70K graphs vs MUTAG's 188. Use MNIST Superpixels for reliable statistical evaluation and for learning graph classification pipelines at scale.
From benchmark to production
Production vision-as-graphs applications use much richer representations: feature vectors from pretrained vision models (not just pixel intensity), multi-scale superpixel hierarchies, and task-specific edge construction (semantic similarity, not just spatial adjacency). The conversion pipeline is more complex, but the GNN processing remains the same.