Article

RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video

Journal

ACM TRANSACTIONS ON GRAPHICS
Volume 39, Issue 6, Pages -

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3414685.3417852

Keywords

hand tracking; hand pose estimation; hand reconstruction; two hands; monocular RGB; RGB video; computer vision

Funding

  1. ERC Consolidator Grant 4DRepLy [770784]
  2. ERC Consolidator Grant TouchDesign [772738]
  3. Spanish Ministry of Science [RTI2018-098694-B-I00 VizLearning]
  4. European Research Council (ERC) [772738] Funding Source: European Research Council (ERC)

Tracking and reconstructing the 3D pose and geometry of two hands in interaction is a challenging problem that is highly relevant for several human-computer interaction applications, including AR/VR, robotics, and sign language recognition. Existing works are either limited to simpler tracking settings (e.g., considering only a single hand or two spatially separated hands), or rely on less ubiquitous sensors, such as depth cameras. In contrast, in this work we present the first real-time method for motion capture of skeletal pose and 3D surface geometry of hands from a single RGB camera that explicitly considers close interactions. In order to address the inherent depth ambiguities in RGB data, we propose a novel multi-task CNN that regresses multiple complementary pieces of information, including segmentation, dense matchings to a 3D hand model, and 2D keypoint positions, together with newly proposed intra-hand relative depth and inter-hand distance maps. These predictions are subsequently used in a generative model fitting framework in order to estimate pose and shape parameters of a 3D hand model for both hands. We experimentally verify the individual components of our RGB two-hand tracking and 3D reconstruction pipeline through an extensive ablation study. Moreover, we demonstrate that our approach offers previously unseen two-hand tracking performance from RGB, and quantitatively and qualitatively outperforms existing RGB-based methods that were not explicitly designed for two-hand interactions. Our method even performs on par with depth-based real-time methods.
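The model-fitting idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a rigid four-point template in place of an articulated hand model, a single rotation angle plus a translation in place of pose and shape parameters, and synthetic "predictions" in place of CNN outputs. All function and variable names are hypothetical. It shows how a relative-depth term can be stacked with a 2D keypoint reprojection term and minimized with a damped Gauss-Newton (Levenberg-Marquardt style) loop using NumPy.

```python
import numpy as np

def pose_points(params, template):
    """Rotate the template about the y-axis by params[0], then translate by params[1:4]."""
    angle, t = params[0], params[1:4]
    c, s = np.cos(angle), np.sin(angle)
    rot_y = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return template @ rot_y.T + t

def project(points_3d, focal=1.0):
    """Pinhole projection of Nx3 camera-space points to Nx2 image points."""
    return focal * points_3d[:, :2] / points_3d[:, 2:3]

def residuals(params, template, keypoints_2d, rel_depth, w_depth=1.0):
    """Stack 2D reprojection residuals with relative-depth residuals."""
    posed = pose_points(params, template)
    r_2d = (project(posed) - keypoints_2d).ravel()
    # Depth of each point relative to point 0 (a hypothetical "root" keypoint).
    r_depth = w_depth * ((posed[:, 2] - posed[0, 2]) - rel_depth)
    return np.concatenate([r_2d, r_depth])

def numeric_jacobian(fn, params, eps=1e-6):
    """Forward-difference Jacobian of fn at params."""
    base = fn(params)
    jac = np.empty((base.size, params.size))
    for j in range(params.size):
        bumped = params.copy()
        bumped[j] += eps
        jac[:, j] = (fn(bumped) - base) / eps
    return jac

def fit(params0, template, keypoints_2d, rel_depth, iters=50):
    """Damped Gauss-Newton: accept a step only if it lowers the energy."""
    fn = lambda p: residuals(p, template, keypoints_2d, rel_depth)
    params = np.asarray(params0, dtype=float)
    res = fn(params)
    damping = 1e-3
    for _ in range(iters):
        jac = numeric_jacobian(fn, params)
        lhs = jac.T @ jac + damping * np.eye(params.size)
        step = np.linalg.solve(lhs, -jac.T @ res)
        cand = params + step
        cand_res = fn(cand)
        if cand_res @ cand_res < res @ res:
            params, res = cand, cand_res
            damping = max(damping * 0.1, 1e-9)
        else:
            damping *= 10.0
    return params

# Synthetic setup (assumed values): template in metres, camera looking along +z.
TEMPLATE = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                     [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])
TRUE_PARAMS = np.array([0.5, 0.05, -0.02, 0.6])  # angle, tx, ty, tz
true_points = pose_points(TRUE_PARAMS, TEMPLATE)
obs_2d = project(true_points)                        # stands in for 2D keypoints
obs_depth = true_points[:, 2] - true_points[0, 2]    # stands in for relative depth

INIT_PARAMS = np.array([0.0, 0.0, 0.0, 0.5])
estimate = fit(INIT_PARAMS, TEMPLATE, obs_2d, obs_depth)
```

In this toy setting the relative-depth residual constrains the rotation angle directly, which hints at why depth-related cues can help resolve ambiguities that 2D keypoints alone leave open. A real system would use an articulated, shape-parameterized hand model and additional data terms.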

Recommended

Article Robotics

Learn to Predict How Humans Manipulate Large-Sized Objects From Interactive Motions

Weilin Wan, Lei Yang, Lingjie Liu, Zhuoying Zhang, Ruixing Jia, Yi-King Choi, Jia Pan, Christian Theobalt, Taku Komura, Wenping Wang

Summary: This study focuses on predicting the future states of objects and humans in full-body interactions with large-sized daily objects. A large-scale dataset is collected for training and evaluation, and a graph neural network is proposed to fuse motion data and dynamic descriptors for the prediction task. The results demonstrate that the proposed network achieves state-of-the-art prediction results and is useful for human-robot collaborations.

IEEE ROBOTICS AND AUTOMATION LETTERS (2022)

Article Computer Science, Software Engineering

Estimation of Yarn-Level Simulation Models for Production Fabrics

Georg Sperl, Rosa M. Sanchez-Banderas, Manwen Li, Chris Wojtan, Miguel A. Otaduy

Summary: This paper introduces a methodology for inverse-modeling the yarn-level mechanics of cloth based on real-world fabric mechanical responses. The authors compiled a database of physical tests from different types of knitted fabrics used in the textile industry, demonstrating diverse physical properties. They then developed a system for approximating these mechanical responses with yarn-level cloth simulation and introduced an efficient pipeline for converting fabric-level data to yarn-level simulation.

ACM TRANSACTIONS ON GRAPHICS (2022)

Article Biotechnology & Applied Microbiology

Biomechanical Morphing for Personalized Fitting of Scoliotic Torso Skeleton Models

Christos Koutras, Hamed Shayestehpour, Jesus Perez, Christian Wong, John Rasmussen, Maxime Tournier, Matthieu Nesme, Miguel A. Otaduy

Summary: The method of fitting personalized models of the torso skeleton using biplanar low-dose radiographs provides an accurate and robust solution for the treatment of adolescent idiopathic scoliosis. It can be adopted as part of regular patient monitoring.

FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY (2022)

Editorial Material Computer Science, Software Engineering

Foreword to the Special Section on CEIG 2022

Ana Serrano, Jorge Posada, Miguel Otaduy

COMPUTERS & GRAPHICS-UK (2022)

Article Biotechnology & Applied Microbiology

Soft-tissue simulation of the breast for intraoperative navigation and fusion of preoperative planning

Patricia Alcaniz, Cesar Vivo de Catarina, Alessandro Gutierrez, Jesus Perez, Carlos Illana, Beatriz Pinar, Miguel A. Otaduy

Summary: Computational preoperative planning can reduce surgery time and patient risk, but its applicability is limited by deviations between preoperative and intraoperative settings, especially on soft tissues such as the breast. This work proposes a high-performance accurate simulation model of the breast to fuse preoperative information with intraoperative deformation settings. The methodology includes high-quality finite-element modeling, efficient handling of anatomical couplings, and personalized parameter estimation.

FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY (2022)

Article Computer Science, Software Engineering

Scene-Aware 3D Multi-Human Motion Capture from a Single Camera

D. C. Luvizon, M. Habermann, V. Golyanik, A. Kortylewski, C. Theobalt

Summary: This work focuses on estimating the 3D position, body shape, and articulation of multiple humans from a single RGB video with a static camera. The proposed approach leverages pre-trained models for various modalities and introduces a non-linear optimization-based method to jointly solve for the 3D position, articulated pose, individual shapes, and scene scale. The method is evaluated on benchmark datasets and demonstrates robustness to challenging in-the-wild conditions.

COMPUTER GRAPHICS FORUM (2023)

Article Computer Science, Software Engineering

IMoS: Intent-Driven Full-Body Motion Synthesis for Human-Object Interactions

Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek

Summary: In this study, we propose a framework for synthesizing the interaction between virtual characters and surrounding objects using simple instructions. Our results demonstrate that the intent-driven full-body motion generator we designed can effectively generate motion sequences for virtual characters performing specified actions.

COMPUTER GRAPHICS FORUM (2023)

Article Computer Science, Software Engineering

State of the Art in Dense Monocular Non-Rigid 3D Reconstruction

Edith Tretschk, Navami Kairanda, B. R. Mallikarjun, Rishabh Dabral, Adam Kortylewski, Bernhard Egger, Marc Habermann, Pascal Fua, Christian Theobalt, Vladislav Golyanik

Summary: This article presents the current research status and importance of 3D reconstruction of deformable scenes from 2D image observations, and classifies and compares different types of deformable objects. It also discusses the challenges in the field and the social aspects associated with the usage of the reviewed methods.

COMPUTER GRAPHICS FORUM (2023)

Article Computer Science, Artificial Intelligence

A Deeper Look into DeepCap (Invited Paper)

Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

Summary: Human performance capture is a vital computer vision problem with numerous applications. We propose an innovative deep learning approach for monocular dense human performance capture, which is trained in a weakly supervised manner without 3D ground truth annotations. Our method outperforms the state of the art in terms of quality and robustness, as shown by extensive qualitative and quantitative evaluations. This work is an extended version of [1] and provides more detailed explanations, comparisons, results, and applications.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Proceedings Paper Computer Science, Artificial Intelligence

UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture

Hiroyasu Akada, Jian Wang, Soshi Shimada, Masaki Takahashi, Christian Theobalt, Vladislav Golyanik

Summary: UnrealEgo is a new large-scale naturalistic dataset for egocentric 3D human pose estimation in stereo environments. It utilizes an innovative concept of eyeglasses equipped with fisheye cameras and provides the widest variety of human motions among existing egocentric datasets. Additionally, a novel benchmark method involving a 2D keypoint estimation module for stereo inputs is proposed to enhance 3D human pose estimation performance.

COMPUTER VISION - ECCV 2022, PT VI (2022)

Proceedings Paper Computer Science, Artificial Intelligence

BEHAVE: Dataset and Method for Tracking Human Object Interactions

Bharat Lal Bhatnagar, Xianghui Xie, Ilya A. Petrov, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

Summary: Modelling interactions between humans and objects in natural environments is crucial for various applications. The lack of a comprehensive dataset has hindered progress in this area. We introduce the BEHAVE dataset, the first dataset that includes multi-view RGBD frames, 3D SMPL and object fits, and annotated contacts between humans and objects. We use this dataset to develop a model for tracking human-object interactions in natural environments using a portable multi-camera setup.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

SNUG: Self-Supervised Neural Dynamic Garments

Igor Santesteban, Miguel A. Otaduy, Dan Casas

Summary: This article presents a self-supervised method for learning dynamic 3D deformations of garments worn by parametric human bodies. By formulating an optimization problem and using physics-based loss terms, neural networks can be trained without precomputing ground-truth data, resulting in a significant speed up in training time.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Neural Rays for Occlusion-aware Image-based Rendering

Yuan Liu, Sida Peng, Lingjie Liu, Qianqian Wang, Peng Wang, Christian Theobalt, Xiaowei Zhou, Wenping Wang

Summary: This paper presents a new neural representation called NeuRay for the task of novel view synthesis. By predicting the visibility of 3D points to input views, the rendering quality of the radiance field is significantly improved.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Playable Environments: Video Manipulation in Space and Time

Willi Menapace, Stephane Lathuiliere, Aliaksandr Siarohin, Christian Theobalt, Sergey Tulyakov, Vladislav Golyanik, Elisa Ricci

Summary: We present Playable Environments, a novel representation for interactive video generation and manipulation. It allows the user to move objects in 3D and generate videos by providing a sequence of desired actions based on a single image at inference time. The actions are learned in an unsupervised manner and the camera can be controlled to achieve the desired viewpoint. Our method builds an environment state for each frame that can be manipulated using our proposed action module and rendered back to the image space with volumetric rendering. We also introduce two large-scale video datasets with significant camera movements to set a challenging benchmark. Playable environments enable creative applications such as 3D video generation, stylization, and manipulation that were not possible with prior video synthesis works.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Article Computer Science, Information Systems

Generation of Truly Random Numbers on a Quantum Annealer

Harshil Bhatia, Edith Tretschk, Christian Theobalt, Vladislav Golyanik

Summary: This study explores how qubits in modern quantum annealers can generate truly random numbers. The researchers demonstrate how the annealing process can be used to measure thousands of random binary numbers simultaneously. These numbers can then be converted into uniformly distributed natural or real numbers within desired ranges. The study also discusses the properties of the observed qubits and various physical factors that impact the performance of the generator.

IEEE ACCESS (2022)
