The Drive&Act dataset is a state-of-the-art multi-modal benchmark for driver behavior recognition. It comprises 9.6 million frames captured from 6 views in 3 modalities (RGB, IR, and depth), annotated with frame-wise hierarchical labels and accompanied by 3D skeletons.
It offers the following key features:
- 12h of video data in 29 long sequences
- Calibrated multi-view camera system with 6 views
- Multi-modal videos: NIR, Depth and Color data
- Markerless motion capture: 3D Body Pose and Head Pose
- Model of the static interior of the car
- 83 manually annotated hierarchical activity labels:
  - Level 1: Long-running tasks (12)
  - Level 2: Semantic actions (34)
  - Level 3: Object-interaction triplets [action|object|location] (6|17|14)
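Note that the per-level vocabularies account for the 83 label classes: 12 + 34 + (6 + 17 + 14) = 83. As an illustration of the three annotation levels, here is a minimal sketch of how one frame-wise label could be represented; the class, field names, and example values are hypothetical, not the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class HierarchicalLabel:
    """Hypothetical container for one frame-wise Drive&Act annotation."""
    frame: int                      # frame index within a sequence
    task: str                       # Level 1: one of 12 long-running tasks
    action: str                     # Level 2: one of 34 semantic actions
    atomic: Tuple[str, str, str]    # Level 3: (action, object, location) triplet

# Illustrative example (values invented for demonstration)
label = HierarchicalLabel(
    frame=1042,
    task="eating_and_drinking",
    action="drinking",
    atomic=("retrieving", "bottle", "center_console"),
)

# The hierarchy totals 83 classes across all levels
total_classes = 12 + 34 + (6 + 17 + 14)
```

Keeping all three levels on each frame makes it easy to train or evaluate at any granularity without re-parsing the annotations.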
Source: Drive&Act: A Multi-Modal Dataset for Fine-Grained Driver Behavior Recognition in Autonomous Vehicles (ICCV 2019)