CHI3D | Close Interactions 3D

Subjects in contact interactions TRAIN set: 3 pairs of subjects TEST set: 2 pairs of subjects In each recording, one subject is motion tracked with a marker-based motion capture system (Vicon). This motion is very accurate and is currently released. The second person is tracked using only RGB cameras.
Multiple cameras 4 different views 900x900 resolution 50 fps Camera parameters: extrinsics intrinsics for 2 different camera models (one assuming image distortion, one ignoring it) The TEST set consists of only one random camera viewpoint per interaction scenario, to avoid simplifying the 3D Reconstruction challenge through multi-view triangulation/optimization.
Various actions in various scenarios Each pair of subjects performs the following actions under multiple interaction scenarios: Grab (20 interactions) Handshake (11 interactions) Hit (27 interactions) Holding Hands (7 interactions) Hug (10 interactions) Kick (19 interactions) Posing (8 interactions) Push (25 interactions)
GHUM and SMPLX meshes Person 1: ground-truth, well-alligned mesh - obtained by fitting the GHUM model to accurate 3d markers, multi-view image evidence and body scans Person 2: pseudo-ground-truth mesh - obtained by fitting the GHUM model to multi-view image evidence and body scans We retarget the GHUM meshes to the SMPLX topology and provide pose and shape parameters for both 50 fps
3D skeletons 25 joints (including the 17 Human3.6m joints) Person 1: ground-truth 3d skeletons Person 2: pseudo-ground-truth 3d skeletons regressed from the GHUM meshes 50fps
Contact signature annotations Each of the 631 interaction recordings contain: 1 video timestamp where contact is established and the signature is annotated 1 contact signature annotation (multiple surface correspondences: triangle ids, vertex ids, region ids) timestamps of the start and the end of the contact Due to the 4 viewpoints, this amounts to 2524 pairs of images and contact signatures.