Name: Objaverse-XL
Creator: Allen Institute for AI (AI2)
License: https://creativecommons.org/licenses/by/4.0/

Overview

Objaverse‑XL is a large, open corpus of more than 10 million 3D objects created by the Allen Institute for AI (AI2). It spans diverse, textured objects suitable for generative 3D modeling, reconstruction, and simulation. See the official pages for canonical details and updates.

Data Structure & Formats

Formats: GLB, USDZ (textures embedded or referenced)
Metadata: per‑object JSON with categories, license, and basic properties
Thumbnails: preview images for quick inspection

/objaverse-xl
├── models/       # GLB/USDZ files
├── metadata/     # JSON sidecar metadata
└── thumbnails/   # Preview images
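
The layout above can be walked with a short script. The following is a minimal sketch, not an official loader: it assumes that each model, its JSON sidecar, and its thumbnail share the same file stem (for example abc.glb, abc.json, abc.png), which the layout suggests but does not guarantee. The function name index_dataset is hypothetical.

```python
import json
from pathlib import Path

MODEL_EXTS = {".glb", ".usdz"}


def index_dataset(root):
    """Map each object ID (file stem) to its model path, metadata, and thumbnail.

    Assumption: files for one object share a stem across the three folders.
    """
    root = Path(root)
    # Look up thumbnails by stem once, instead of globbing per model.
    thumbnails = {p.stem: p for p in sorted((root / "thumbnails").glob("*"))}
    index = {}
    for model in sorted((root / "models").glob("*")):
        if model.suffix.lower() not in MODEL_EXTS:
            continue  # skip anything that is not a GLB/USDZ file
        sidecar = root / "metadata" / f"{model.stem}.json"
        index[model.stem] = {
            "model": model,
            "metadata": json.loads(sidecar.read_text()) if sidecar.exists() else None,
            "thumbnail": thumbnails.get(model.stem),
        }
    return index
```

Objects with a missing sidecar or thumbnail are kept with None in that slot, so incomplete downloads are easy to spot. If a GLB and a USDZ share a stem, the later one overwrites the earlier entry; key by full filename instead if both formats are mirrored.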

Python Quickstart

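    pip install objaverse

    import objaverse

    # Objaverse 1.0 API; the XL-specific helpers live in the objaverse.xl module
    uids = objaverse.load_uids()[:100]
    annotations = objaverse.load_annotations(uids=uids)  # uid -> metadata dict
    objects = objaverse.load_objects(uids=uids)          # uid -> local path of the downloaded file

    # Iterate and process
    for uid, path in objects.items():
        name = annotations[uid].get('name', uid)
        # do something useful...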

Tip: start with subset pulls; full mirrors are multi‑TB and slow to move.

For advanced filtering (license/category/polycount), see the dedicated guide: Subsets & Metadata.
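
As a rough illustration of filtering by license, category, and polycount, here is a minimal sketch over already-loaded metadata records. The field names license, categories, and polycount, the function filter_records, and the sample records are all assumptions made for this example; check the actual metadata schema before relying on them.

```python
def filter_records(records, licenses=None, categories=None, max_polycount=None):
    """Keep records that pass every filter that is set (None disables a filter)."""
    kept = []
    for rec in records:
        if licenses is not None and rec.get("license") not in licenses:
            continue
        if categories is not None and not set(rec.get("categories", [])) & set(categories):
            continue
        if max_polycount is not None:
            polycount = rec.get("polycount")
            # Records without a polycount cannot be verified, so they are dropped.
            if polycount is None or polycount > max_polycount:
                continue
        kept.append(rec)
    return kept


# Hypothetical sample records, for illustration only.
records = [
    {"uid": "a", "license": "CC-BY-4.0", "categories": ["furniture"], "polycount": 12_000},
    {"uid": "b", "license": "CC-BY-NC-4.0", "categories": ["furniture"], "polycount": 8_000},
    {"uid": "c", "license": "CC-BY-4.0", "categories": ["vehicle"], "polycount": 250_000},
]
subset = filter_records(
    records, licenses={"CC-BY-4.0"}, categories={"furniture"}, max_polycount=50_000
)
```

Here subset contains only the record with uid "a": "b" fails the license filter and "c" fails both the category and polycount filters.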

Benchmarks & Usage

Text‑to‑3D pretraining and supervision for single‑image 3D (e.g., Zero123 family)
Asset banks for AR/VR or simulation experiments
Long‑tail category coverage for rare class robustness

See related research and tools: threestudio • Stable DreamFusion

Storage Planning

Hot data on NVMe; archive to HDD/NAS after preprocessing
Keep paths short to stay under path-length limits, and watch inode usage when storing many small files
Checksum large transfers; prefer rsync/aria2 for robustness
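
The checksum step can be sketched with the Python standard library alone. This is one possible approach, assuming SHA-256 is an acceptable digest for your workflow; the helper names sha256_file and verify_copy are hypothetical. The file is read in chunks so multi-GB assets do not need to fit in memory.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(src, dst):
    """Return True if source and destination files have identical contents."""
    return sha256_file(src) == sha256_file(dst)
```

Calling verify_copy on each file after a transfer, or storing the digests in a manifest before archiving, makes silent corruption detectable later.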

Objaverse‑XL vs Alternatives

For broader context, see the Top 5 Dataset Comparison (Objaverse‑XL, Objaverse++, ShapeNet, GSO, ModelNet).

FAQ

Is Objaverse‑XL free to use?

Yes, but you must respect each object's license. Always check metadata or the dataset card.

Do I need Linux?

No. Linux is recommended for large jobs; macOS and Windows work fine for exploration.