| Shapes | Avg Points | Features | Part Classes |
|---|---|---|---|
| 16,881 | 2,616 | 3 (xyz) | 50 |
What ShapeNet contains
ShapeNet is a large-scale dataset of 3D shapes from 16 object categories: airplane, bag, cap, car, chair, earphone, guitar, knife, lamp, laptop, motorbike, mug, pistol, rocket, skateboard, and table. Each shape is represented as a point cloud -- a set of points in 3D space sampled from the object's surface. Each point has 3 features: its x, y, z coordinates. The average shape has 2,616 points.
The task is part segmentation: label each point with its part identity. An airplane has wings, body, tail, and engines. A chair has legs, seat, back, and arms. The 50 part categories span all 16 object categories. This requires the model to understand both the global shape (is this an airplane or a chair?) and local geometry (is this point on a wing or the body?).
Why ShapeNet matters
ShapeNet is the standard benchmark for 3D shape understanding, a field with direct applications in autonomous driving (understanding surrounding objects from LiDAR point clouds), robotics (grasping objects by understanding their parts), and manufacturing (quality inspection via 3D scanning). GNNs are particularly well-suited for point clouds because the spatial neighbor graph captures the local geometry that determines part identity.
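To make the "spatial neighbor graph" concrete, here is a minimal NumPy sketch of k-nearest-neighbor graph construction over point coordinates. In practice you would use PyG's `knn_graph` utility; this illustrative version just shows what the edge index encodes.

```python
import numpy as np

def knn_graph(pos, k):
    """Return a (2, N*k) edge index connecting each point to its k nearest neighbors."""
    # Pairwise squared distances between all points (fine for small N).
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)             # exclude self-loops
    nbrs = np.argsort(d2, axis=1)[:, :k]     # indices of the k closest points
    src = np.repeat(np.arange(len(pos)), k)
    dst = nbrs.reshape(-1)
    return np.stack([dst, src])              # edges point neighbor -> center

pos = np.random.rand(100, 3)                 # toy "point cloud": 100 points in 3D
edge_index = knn_graph(pos, k=6)
print(edge_index.shape)                      # (2, 600)
```

Message passing over these edges lets each point aggregate features from its spatial neighborhood, which is exactly the local geometry that determines part identity.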
The dataset also drove the development of key GNN layers. DGCNN (Dynamic Graph CNN) introduced EdgeConv, which recomputes the neighbor graph in learned feature space at each layer. This dynamic connectivity outperforms static KNN graphs because the most relevant neighbors for classification are not always the spatially closest points.
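The dynamic-graph idea can be sketched in a few lines of NumPy (this is an illustration of the mechanism, not PyG's `DynamicEdgeConv` implementation): each layer builds edge features `[x_i, x_j - x_i]`, max-pools over neighbors, and then the *next* layer recomputes KNN in the new feature space.

```python
import numpy as np

def knn(feats, k):
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]     # (N, k) neighbor indices

def edge_conv(x, nbrs, W):
    # Edge features [x_i, x_j - x_i] for each neighbor j of center i,
    # passed through a linear map W with ReLU, then max-pooled over neighbors.
    centers = np.repeat(x[:, None, :], nbrs.shape[1], axis=1)    # (N, k, F)
    neighbors = x[nbrs]                                          # (N, k, F)
    edge_feats = np.concatenate([centers, neighbors - centers], axis=-1)
    return np.maximum(edge_feats @ W, 0).max(axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((50, 3))             # start from xyz coordinates
for _ in range(2):                           # two "layers"
    nbrs = knn(x, k=8)                       # graph rebuilt in CURRENT feature space
    W = rng.standard_normal((2 * x.shape[1], 16)) * 0.1
    x = edge_conv(x, nbrs, W)                # features are now 16-d
print(x.shape)                               # (50, 16)
```

After the first layer, neighbors are chosen by feature similarity rather than spatial proximity, so points on semantically similar parts (e.g. both wingtips of an airplane) can become neighbors even when far apart in space.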
Loading ShapeNet in PyG
```python
from torch_geometric.datasets import ShapeNet

dataset = ShapeNet(root='/tmp/ShapeNet', categories=['Airplane'])
print(f"Shapes: {len(dataset)}")

shape = dataset[0]
print(f"Points: {shape.num_nodes}")      # ~2600
print(f"Coords: {shape.pos.shape}")      # [N, 3]
print(f"Part labels: {shape.y.shape}")   # [N] per-point labels
```

Pass a list of category names to load specific categories, or omit the parameter to load all 16. `shape.pos` holds the 3D coordinates of each point.
Common tasks and benchmarks
The core task is per-point part segmentation, evaluated by mean IoU (intersection over union) across part categories. Results have improved steadily, from PointNet's ~83.7% mIoU (2017) to PointNeXt's ~87.0% (2022); the gains reflect advances in local geometry encoding and global context aggregation.
Example: autonomous vehicle perception
Self-driving cars use LiDAR sensors that produce point clouds of the surrounding environment. Segmenting these point clouds (this cluster of points is a car, that cluster is a pedestrian, those points are the road surface) is a safety-critical task. ShapeNet trains the per-point classification capability; production autonomous driving systems apply it to real-time LiDAR data at 10-20 frames per second with millions of points.
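Bridging the scale gap between ShapeNet shapes (~2.6K points) and LiDAR frames (millions of points) typically starts with downsampling. A common approach is voxel-grid downsampling, sketched here in NumPy (real pipelines would use an optimized library implementation):

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Keep one representative point (the centroid) per occupied voxel."""
    coords = np.floor(points / voxel_size).astype(np.int64)  # voxel index per point
    # Group points by voxel and average each group.
    _, inverse, counts = np.unique(
        coords, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()
    sums = np.zeros((len(counts), points.shape[1]))
    np.add.at(sums, inverse, points)          # sum points within each voxel
    return sums / counts[:, None]             # centroid per voxel

pts = np.random.rand(10000, 3) * 10           # toy 10m x 10m x 10m scene
small = voxel_downsample(pts, voxel_size=0.5)
print(len(pts), "->", len(small))
```

Coarser voxels trade spatial detail for throughput, which is how real-time systems keep per-frame processing within their latency budget.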
Published benchmark results
Part segmentation on ShapeNet measured by mean IoU (intersection over union) across part categories. Higher is better.
| Method | mIoU (%) | Year | Paper |
|---|---|---|---|
| PointNet | 83.7 | 2017 | Qi et al. |
| PointNet++ | 85.1 | 2017 | Qi et al. |
| DGCNN | 85.2 | 2019 | Wang et al. |
| Point Transformer | 86.6 | 2021 | Zhao et al. |
| PointNeXt | 87.0 | 2022 | Qian et al. |
| PointMLP | 86.1 | 2022 | Ma et al. |
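For reference, a simplified per-class mean IoU can be computed as below. Note this is a sketch of the basic metric; published ShapeNet numbers additionally average per shape instance and restrict each object to its own part categories.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across part classes, skipping empty classes."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union > 0:                        # skip classes absent from both
            ious.append(inter / union)
    return float(np.mean(ious))

target = np.array([0, 0, 1, 1, 2, 2])        # ground-truth per-point part labels
pred   = np.array([0, 0, 1, 2, 2, 2])        # one point mislabeled
print(mean_iou(pred, target, num_classes=3))  # (1.0 + 0.5 + 2/3) / 3 = 0.722...
```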
Original Paper
ShapeNet: An Information-Rich 3D Model Repository
Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, Fisher Yu (2015). arXiv preprint
Original data source
The ShapeNet Core dataset is available from shapenet.org. The part segmentation annotations used in PyG were provided by Yi et al. (2016) and are available from the Stanford ShapeNet Part page.
BibTeX citation for the ShapeNet dataset:

```bibtex
@techreport{chang2015shapenet,
  title={ShapeNet: An Information-Rich 3D Model Repository},
  author={Chang, Angel X and Funkhouser, Thomas and Guibas, Leonidas and Hanrahan, Pat and Huang, Qixing and Li, Zimo and Savarese, Silvio and Savva, Manolis and Song, Shuran and Su, Hao and Xiao, Jianxiong and Yi, Li and Yu, Fisher},
  year={2015},
  institution={Stanford University --- Princeton University --- Toyota Technological Institute at Chicago},
  note={arXiv:1512.03012}
}
```
Which dataset should I use?
ShapeNet vs ModelNet40: ShapeNet is for part segmentation (per-point labels). ModelNet40 is for whole-shape classification (one label per object). Choose based on your task: do you need to label parts of an object, or classify the whole thing?
ShapeNet vs S3DIS: ShapeNet has clean CAD shapes. S3DIS is real indoor 3D scans with noise and occlusion. Use ShapeNet for controlled benchmarks; S3DIS for real-world robustness.
ShapeNet vs ScanObjectNN: ScanObjectNN has real scanned objects with background noise. ShapeNet has perfect synthetic shapes. ScanObjectNN is harder and more realistic.
From benchmark to production
Production point cloud processing handles much larger scenes (100K+ points per frame vs. 2.6K per shape), real-time constraints (10Hz processing for autonomous driving), and noisy sensor data (LiDAR has distance-dependent resolution and occlusion artifacts). The clean, complete shapes in ShapeNet are a starting point; production robustness requires training on noisy, partial, and dynamic data.
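One common way to close part of this gap during training is data augmentation that mimics sensor imperfections. The sketch below (illustrative, not a specific library's API) simulates noise with Gaussian jitter and occlusion with random point dropout:

```python
import numpy as np

def augment(points, jitter_std=0.01, drop_ratio=0.2, rng=None):
    """Simulate sensor noise (Gaussian jitter) and occlusion (random dropout)."""
    if rng is None:
        rng = np.random.default_rng()
    keep = rng.random(len(points)) > drop_ratio       # drop ~20% of points
    pts = points[keep]
    return pts + rng.normal(0.0, jitter_std, pts.shape)

clean = np.random.rand(2616, 3)               # a ShapeNet-sized shape
noisy = augment(clean)
print(noisy.shape)
```

Training on such perturbed shapes makes per-point predictions less brittle, though it is no substitute for eventually training on real scans like S3DIS or ScanObjectNN.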