Article
Engineering, Electrical & Electronic
Shaoxiang Guo, Eric Rigall, Yakun Ju, Junyu Dong
Summary: This article discusses the challenges of estimating 3D hand pose from a monocular RGB image and proposes a simple and efficient deep neural network to improve this task. By designing a feature chat block, the model is able to better handle the relationship between joint and skeleton features, resulting in improved accuracy and faster inference speed.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
Article
Computer Science, Artificial Intelligence
Sanjeev Sharma, Shaoli Huang
Summary: This study presents an end-to-end framework that robustly predicts hand prior information and accurately infers 3D hand pose using ConvNet models. It introduces a novel keypoint-based method to enhance the hand detector's robustness and two geometric constraints, inspired by the human hand's biological structure, to improve 3D coordinate prediction.
PATTERN RECOGNITION
(2021)
Article
Engineering, Electrical & Electronic
Shaoxiang Guo, Eric Rigall, Lin Qi, Xinghui Dong, Haiyan Li, Junyu Dong
Summary: The paper explores the prediction of 3D hand poses from a single RGB image, utilizing multiple feature maps, graph-based convolutional neural networks, and self-supervised modules to improve the accuracy of hand pose estimation.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2021)
Article
Engineering, Electrical & Electronic
Moran Li, Jialong Wang, Nong Sang
Summary: A novel compressed latent distribution representation is proposed to address the channel correspondence problem in 3D hand pose estimation from monocular RGB images. By interconnecting 2D and depth feature maps more directly, the proposed method effectively improves cross-dataset performance and achieves state-of-the-art results on benchmark datasets.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2021)
Article
Computer Science, Information Systems
Bekiri Roumaissa, Babahenini Mohamed Chaouki
Summary: Hand pose estimation is a significant research topic in computer vision applications. This paper proposes an end-to-end framework, the ResUnet network, that efficiently detects and estimates the position of a human hand from a monocular RGB image. The quantitative and qualitative results demonstrate that the proposed regression approach outperforms the current state-of-the-art hand pose estimation methods on three datasets.
MULTIMEDIA TOOLS AND APPLICATIONS
(2023)
Article
Computer Science, Hardware & Architecture
Yi Xiao, Hao Sha, Huaying Hao, Yue Liu, Yongtian Wang
Summary: This article introduces a method for recovering 3D hand mesh from a monocular RGB image. By integrating an analytical method into a neural network, an end-to-end learnable model named IKHand is proposed, which can generate impressive and robust 3D hand meshes under various challenging conditions.
Article
Automation & Control Systems
Qiufu Wang, Jiexin Zhou, Zhang Li, Xiaoliang Sun, Qifeng Yu
Summary: In this article, we propose a robust and accurate monocular pose tracking method for tracking objects with large pose shifts. Using an indexable sparse viewpoint model to represent the object 3D geometry, we establish a transitional view to recover motion continuity and optimize the pose using a region-based optimization algorithm. Finally, a single-rendering-based pose refinement process is used to achieve highly accurate pose results.
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS
(2023)
Article
Engineering, Electrical & Electronic
Yang Liu, Akio Namiki
Summary: This study introduces a novel method for high-frame-rate articulated object tracking using monocular cameras. By integrating dual-quaternion kinematics with a fast pixel-wise-posteriors tracking framework, the method is capable of robustly tracking articulated objects with many degrees of freedom in dynamic environments.
IEEE SENSORS JOURNAL
(2021)
Article
Computer Science, Software Engineering
Taeyun Woo, Wonjung Park, Woohyun Jeong, Jinah Park
Summary: This paper provides a comprehensive survey of state-of-the-art deep learning-based approaches for estimating hand pose in the context of hand-object interaction. It discusses various deep learning-based approaches to image-based hand tracking and reviews hand-object interaction dataset benchmarks. Deep learning has emerged as a powerful technique for solving hand pose estimation problems.
COMPUTERS & GRAPHICS-UK
(2023)
Article
Engineering, Biomedical
Nathan Louis, Luowei Zhou, Steven J. Yule, Roger D. Dias, Milisa Manojlovich, Francis D. Pagani, Donald S. Likosky, Jason J. Corso
Summary: This study proposes a novel hand pose estimation model, CondPose, which improves detection and tracking accuracy by incorporating pose prior information. The researchers also collect the Surgical Hands dataset, which provides multi-instance articulated hand pose annotations for publicly available surgical videos, including bounding boxes, pose annotations, and tracking IDs for multi-instance tracking research.
INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY
(2023)
Article
Computer Science, Artificial Intelligence
Divyansh Gupta, Bruno Artacho, Andreas Savakis
Summary: This paper presents HandyPose, a single-pass, end-to-end trainable architecture for 2D hand pose estimation using a single RGB image as input. The proposed method achieves high accuracy while maintaining manageable size complexity and modularity of the network. The advanced multi-level waterfall module and multi-scale approach contribute to the performance improvement. The results demonstrate that HandyPose is a robust and efficient architecture for 2D hand pose estimation.
PATTERN RECOGNITION
(2022)
Article
Computer Science, Information Systems
Yuan Gao, Shogo Matsuoka, Weiwei Wan, Takuya Kiyokawa, Keisuke Koyama, Kensuke Harada
Summary: This paper presents a method for estimating the 6D pose of an object grasped by a robot hand using RGB cameras on the palm and visuotactile sensors on the fingertips. By combining the two types of sensors, it can handle objects made from various materials. The method allows in-hand pose estimation without the need for preparation or specific environmental backgrounds. The proposed method includes deep-learning-based background subtraction and denoising auto-encoder-based sensor fusion. The results demonstrate the benefits of the proposed combination and mechanism, providing essential knowledge for readers considering similar configurations for pose estimation.
Article
Computer Science, Interdisciplinary Applications
Fu-Song Hsu, Te-Mei Wang, Liang-Hsun Chen
Summary: In this paper, a vision-based method for improved two-handed glove tracking is proposed, which utilizes a single camera to identify left, right, or both gloves and accurately predict glove positions. The system achieved high accuracy and speed in tracking on a validation set.
Article
Chemistry, Analytical
Jiangying Zhao, Yongbiao Hu, Mingrui Tian
Summary: This study proposes a method for estimating the pose of excavator manipulators using computer vision technology, demonstrating its feasibility through experiments and error analysis while establishing a measurement system to simulate the pose estimation process.
Article
Robotics
Jiunn-Kai Huang, William Clark, Jessy W. Grizzle
Summary: This letter introduces the concept of optimizing target shape for LiDAR point clouds to eliminate pose ambiguity and proposes a method to estimate target vertices using the target's geometry. By using the optimal shape and the global solver, high localization accuracy can be achieved even at a distance of 30 meters.
IEEE ROBOTICS AND AUTOMATION LETTERS
(2022)
Article
Robotics
Weilin Wan, Lei Yang, Lingjie Liu, Zhuoying Zhang, Ruixing Jia, Yi-King Choi, Jia Pan, Christian Theobalt, Taku Komura, Wenping Wang
Summary: This study focuses on predicting the future states of objects and humans in full-body interactions with large-sized daily objects. A large-scale dataset is collected for training and evaluation, and a graph neural network is proposed to fuse motion data and dynamic descriptors for the prediction task. The results demonstrate that the proposed network achieves state-of-the-art prediction results and is useful for human-robot collaborations.
IEEE ROBOTICS AND AUTOMATION LETTERS
(2022)
Article
Computer Science, Software Engineering
Georg Sperl, Rosa M. Sanchez-Banderas, Manwen Li, Chris Wojtan, Miguel A. Otaduy
Summary: This paper introduces a methodology for inverse-modeling the yarn-level mechanics of cloth based on real-world fabric mechanical responses. The authors compiled a database of physical tests from different types of knitted fabrics used in the textile industry, demonstrating diverse physical properties. They then developed a system for approximating these mechanical responses with yarn-level cloth simulation and introduced an efficient pipeline for converting fabric-level data to yarn-level simulation.
ACM TRANSACTIONS ON GRAPHICS
(2022)
Article
Biotechnology & Applied Microbiology
Christos Koutras, Hamed Shayestehpour, Jesus Perez, Christian Wong, John Rasmussen, Maxime Tournier, Matthieu Nesme, Miguel A. Otaduy
Summary: The method of fitting personalized models of the torso skeleton using biplanar low-dose radiographs provides an accurate and robust solution for the treatment of adolescent idiopathic scoliosis. It can be adopted as part of regular patient monitoring.
FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY
(2022)
Editorial Material
Computer Science, Software Engineering
Ana Serrano, Jorge Posada, Miguel Otaduy
COMPUTERS & GRAPHICS-UK
(2022)
Article
Biotechnology & Applied Microbiology
Patricia Alcaniz, Cesar Vivo de Catarina, Alessandro Gutierrez, Jesus Perez, Carlos Illana, Beatriz Pinar, Miguel A. Otaduy
Summary: Computational preoperative planning can reduce surgery time and patient risk, but its applicability is limited by deviations between preoperative and intraoperative settings, especially on soft tissues such as the breast. This work proposes a high-performance accurate simulation model of the breast to fuse preoperative information with intraoperative deformation settings. The methodology includes high-quality finite-element modeling, efficient handling of anatomical couplings, and personalized parameter estimation.
FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY
(2022)
Article
Computer Science, Software Engineering
D. C. Luvizon, M. Habermann, V. Golyanik, A. Kortylewski, C. Theobalt
Summary: This work focuses on estimating the 3D position, body shape, and articulation of multiple humans from a single RGB video with a static camera. The proposed approach leverages pre-trained models for various modalities and introduces a non-linear optimization-based method to jointly solve for the 3D position, articulated pose, individual shapes, and scene scale. The method is evaluated on benchmark datasets and demonstrates robustness to challenging in-the-wild conditions.
COMPUTER GRAPHICS FORUM
(2023)
Article
Computer Science, Software Engineering
Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek
Summary: In this study, we propose a framework for synthesizing the interaction between virtual characters and surrounding objects using simple instructions. Our results demonstrate that the intent-driven full-body motion generator we designed can effectively generate motion sequences for virtual characters performing specified actions.
COMPUTER GRAPHICS FORUM
(2023)
Article
Computer Science, Software Engineering
Edith Tretschk, Navami Kairanda, B. R. Mallikarjun, Rishabh Dabral, Adam Kortylewski, Bernhard Egger, Marc Habermann, Pascal Fua, Christian Theobalt, Vladislav Golyanik
Summary: This article presents the current research status and importance of 3D reconstruction of deformable scenes from 2D image observations, and classifies and compares different types of deformable objects. It also discusses the challenges in the field and the social aspects associated with the usage of the reviewed methods.
COMPUTER GRAPHICS FORUM
(2023)
Article
Computer Science, Artificial Intelligence
Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt
Summary: Human performance capture is a vital computer vision problem with numerous applications. We propose an innovative deep learning approach for monocular dense human performance capture, which is trained in a weakly supervised manner without 3D ground truth annotations. Our method outperforms the state of the art in terms of quality and robustness, as shown by extensive qualitative and quantitative evaluations. This work is an extended version of [1] and provides more detailed explanations, comparisons, results, and applications.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Hiroyasu Akada, Jian Wang, Soshi Shimada, Masaki Takahashi, Christian Theobalt, Vladislav Golyanik
Summary: UnrealEgo is a new large-scale naturalistic dataset for egocentric 3D human pose estimation in stereo environments. It utilizes an innovative concept of eyeglasses equipped with fisheye cameras and provides the widest variety of human motions among existing egocentric datasets. Additionally, a novel benchmark method involving a 2D keypoint estimation module for stereo inputs is proposed to enhance 3D human pose estimation performance.
COMPUTER VISION - ECCV 2022, PT VI
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Bharat Lal Bhatnagar, Xianghui Xie, Ilya A. Petrov, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll
Summary: Modelling interactions between humans and objects in natural environments is crucial for various applications. The lack of a comprehensive dataset has hindered progress in this area. We introduce the BEHAVE dataset, the first dataset that includes multi-view RGBD frames, 3D SMPL and object fits, and annotated contacts between humans and objects. We use this dataset to develop a model for tracking human-object interactions in natural environments using a portable multi-camera setup.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Igor Santesteban, Miguel A. Otaduy, Dan Casas
Summary: This article presents a self-supervised method for learning dynamic 3D deformations of garments worn by parametric human bodies. By formulating an optimization problem and using physics-based loss terms, neural networks can be trained without precomputing ground-truth data, resulting in a significant speedup in training time.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Yuan Liu, Sida Peng, Lingjie Liu, Qianqian Wang, Peng Wang, Christian Theobalt, Xiaowei Zhou, Wenping Wang
Summary: This paper presents a new neural representation called NeuRay for the task of novel view synthesis. By predicting the visibility of 3D points to input views, the rendering quality of the radiance field is significantly improved.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Willi Menapace, Stephane Lathuiliere, Aliaksandr Siarohin, Christian Theobalt, Sergey Tulyakov, Vladislav Golyanik, Elisa Ricci
Summary: We present Playable Environments, a novel representation for interactive video generation and manipulation. It allows the user to move objects in 3D and generate videos by providing a sequence of desired actions based on a single image at inference time. The actions are learned in an unsupervised manner and the camera can be controlled to achieve the desired viewpoint. Our method builds an environment state for each frame that can be manipulated using our proposed action module and rendered back to the image space with volumetric rendering. We also introduce two large-scale video datasets with significant camera movements to set a challenging benchmark. Playable Environments enable creative applications such as 3D video generation, stylization, and manipulation that were not possible with prior video synthesis works.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022)
(2022)
Article
Computer Science, Information Systems
Harshil Bhatia, Edith Tretschk, Christian Theobalt, Vladislav Golyanik
Summary: This study explores how qubits in modern quantum annealers can generate truly random numbers. The researchers demonstrate how the annealing process can be used to measure thousands of random binary numbers simultaneously. These numbers can then be converted into uniformly distributed natural or real numbers within desired ranges. The study also discusses the properties of the observed qubits and various physical factors that impact the performance of the generator.