Article
Computer Science, Artificial Intelligence
Qiang Nie, Ziwei Liu, Yunhui Liu
Summary: Lifting 2D human pose to 3D pose is a challenging task due to the ambiguity between 2D and 3D data and the lack of well-labeled 2D-3D pose pairs. This paper proposes a framework that leverages labeled 3D human poses to learn a 3D body concept, reducing ambiguity. By treating 2D and 3D poses as different domains, the body knowledge learned from 3D poses is applied to 2D poses, improving the network's ability to generate informative 3D imagination.
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2023)
Article
Computer Science, Artificial Intelligence
Zhongwei Qiu, Kai Qiu, Jianlong Fu, Dongmei Fu
Summary: Modern deep learning-based 3D pose estimation approaches require abundant 3D pose annotations. The lack of diversity in existing 3D datasets hinders the performance and generalization ability of current methods. This paper proposes a novel method to extract weak 3D information directly from 2D images without 3D pose supervision. By utilizing 2D pose annotations and perspective prior knowledge, the relative depth of human joints is generated, and a weakly supervised pre-training strategy is introduced to learn the depth relationship between keypoints on in-the-wild images. After fine-tuning, the proposed method achieves state-of-the-art results on two widely used benchmarks.
PATTERN RECOGNITION
(2023)
Article
Computer Science, Artificial Intelligence
Andrea D'Eusanio, Alessandro Simoni, Stefano Pini, Guido Borghi, Roberto Vezzani, Rita Cucchiara
Summary: This paper presents RefiNet, a depth-based 3D human pose refinement framework which regresses a fine 3D pose given a depth map and an initial coarse 2D human pose. The framework consists of three modules based on different data representations, and experimental evaluation shows its effectiveness compared to other approaches.
PATTERN RECOGNITION LETTERS
(2023)
Review
Chemistry, Multidisciplinary
Siqi Zhang, Chaofang Wang, Wenlong Dong, Bin Fan
Summary: Depth ambiguity is a major challenge in 3D human pose estimation, and recent strategies have made significant progress in addressing it. This survey provides a comprehensive review of the causes of and solutions for depth ambiguity, with the solutions classified into four categories: camera parameter constraints, temporal consistency constraints, kinematic constraints, and image cue constraints. It also discusses performance comparisons, challenges, main frameworks, and evaluation metrics, and suggests promising future research directions.
APPLIED SCIENCES-BASEL
(2022)
Article
Computer Science, Artificial Intelligence
Shuangjun Liu, Naveen Sehgal, Sarah Ostadabbas
Summary: This paper presents an adapted human pose estimation approach to address the issue of domain shift in 3D human pose estimation. By using synthetic data and adaptive strategies, the proposed method achieves comparable performance with models trained on large-scale real datasets, and also provides a lightweight head to improve existing models.
APPLIED INTELLIGENCE
(2022)
Article
Chemistry, Analytical
Nadav Eichler, Hagit Hel-Or, Ilan Shimshoni
Summary: RGB and depth cameras are commonly used for 3D tracking of human pose and motion. Multiple-camera setups can improve the tracking accuracy by minimizing occlusions, but require spatio-temporal calibration. This paper introduces an approach to on-the-fly spatio-temporal calibration of multiple cameras without specialized devices or equipment, and validates it using Microsoft Azure Kinect.
Article
Chemistry, Analytical
Zhaofeng Niu, Yuichiro Fujimoto, Masayuki Kanbara, Taishi Sawabe, Hirokazu Kato
Summary: In this paper, we propose a new TSDF fusion network called DFusion, which aims to minimize the influence of depth noise and pose noise on the 3D reconstruction process. The network consists of a fusion module that generates a TSDF volume and a denoising module that removes both depth and pose noise. 3D convolutional layers and a specially designed loss function exploit the 3D structural information of the TSDF volume and improve the fusion performance. Experimental results demonstrate the superiority of our method over existing methods.
Article
Computer Science, Hardware & Architecture
Shuqin Liu
Summary: The novel method proposed in this study utilizes a discrete point 3D reconstruction algorithm to estimate human poses, enhancing the generalization ability and accuracy of the model. The use of multiple cameras and the application of principal component analysis contribute to more precise pose estimation.
MICROPROCESSORS AND MICROSYSTEMS
(2021)
Article
Engineering, Electrical & Electronic
Tianlang Chen, Chen Fang, Xiaohui Shen, Yiheng Zhu, Zhili Chen, Jiebo Luo
Summary: This work proposes a new solution to 3D human pose estimation in videos. It decomposes the task into bone direction prediction and bone length prediction, drawing inspiration from human skeleton anatomy. The model performs high-accuracy bone length prediction utilizing global information and introduces a joint shift loss to bridge the training of the prediction networks. The model also utilizes an implicit attention mechanism to mitigate depth ambiguity in challenging poses, achieving superior performance compared to previous methods on benchmark datasets.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
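The decomposition described in the summary above (bone directions plus bone lengths) can be illustrated with a small sketch. This is not the authors' model: it only shows how a 3D pose could be recomposed once per-bone directions and lengths are available. The five-joint chain, the `PARENTS` table, and the function names are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical 5-joint chain: parent index per joint (-1 marks the root).
PARENTS = [-1, 0, 1, 2, 3]


def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("bone direction must be non-zero")
    return tuple(c / norm for c in v)


def compose_pose(root, directions, lengths, parents=PARENTS):
    """Rebuild 3D joints from per-bone directions and lengths.

    directions[j] and lengths[j] describe the bone from parents[j] to j;
    the root's entries are ignored. Parents must precede their children.
    """
    joints = [None] * len(parents)
    for j, p in enumerate(parents):
        if p < 0:
            joints[j] = tuple(root)
            continue
        d = normalize(directions[j])
        # Child joint = parent joint + bone length * unit bone direction.
        joints[j] = tuple(joints[p][k] + lengths[j] * d[k] for k in range(3))
    return joints


# Example input (made-up values): directions need not be pre-normalized.
dirs = [(0, 0, 0), (0, 0, 2), (0, 3, 0), (1, 0, 0), (0, 0, -1)]
lens = [0.0, 0.5, 0.25, 0.25, 0.5]
pose = compose_pose((0.0, 0.0, 0.0), dirs, lens)
```

Because each joint is placed relative to its parent, an error in one bone length shifts every joint further down the chain. That is one plausible reading of why the summary stresses accurate bone length prediction.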
Article
Anatomy & Morphology
Kai Kiwitz, Andrea Brandstetter, Christian Schiffer, Sebastian Bludau, Hartmut Mohlberg, Mona Omidyeganeh, Philippe Massicotte, Katrin Amunts
Summary: The human metathalamus plays a crucial role in processing visual and auditory information, and understanding its microanatomy is important for studying its function and involvement in pathologies. This study provides cytoarchitectonic maps of the medial geniculate body (MGB) and lateral geniculate body (LGB) in the BigBrain model, as well as probabilistic maps in reference spaces. These maps can serve as anatomical references for neuroimaging studies and facilitate data integration and modeling of thalamocortical circuits.
FRONTIERS IN NEUROANATOMY
(2022)
Article
Computer Science, Information Systems
Jinyoung Jun, Jae-Han Lee, Chul Lee, Chang-Su Kim
Summary: A novel monocular depth estimator is proposed to improve prediction accuracy on human regions by utilizing pose information. The algorithm consists of two networks, PoseNet and DepthNet, with a feature blending block and a joint training scheme to enhance depth estimation performance significantly.
Article
Chemistry, Multidisciplinary
Junuk Cha, Muhammad Saqlain, Changhwa Lee, Seongyeong Lee, Seungeun Lee, Donguk Kim, Won-Hee Park, Seungryul Baek
Summary: The paper introduces a self-supervised learning framework for 3D human pose and shape estimation that uses only single 2D images and requires no other form of supervision signal. The proposed approach demonstrates its effectiveness on 3D human pose benchmark datasets, achieving a new state of the art among weakly/self-supervised methods.
APPLIED SCIENCES-BASEL
(2021)
Article
Computer Science, Artificial Intelligence
Zerui Chen, Yan Huang, Hongyuan Yu, Liang Wang
Summary: In this work, a part-aware 3D human pose estimator is proposed to better handle heterogeneous human body parts, achieving improved performance and reduced parameters compared to previous state-of-the-art models. Through rigorous ablation experiments, the robustness and stability of the searched models are validated, advancing state-of-the-art accuracy on both single-person and multi-person 3D human pose estimation benchmarks with affordable computational cost.
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2022)
Article
Computer Science, Information Systems
Prabesh Paudel, Young-Jin Kwon, Do-Hyun Kim, Kyoung-Ho Choi
Summary: Ergonomics is crucial for industrial operation, especially in manufacturing. Incorrect working postures can lead to injuries and decreased productivity. This study proposes a new framework for analyzing the risk of workers' ergonomic postures using 3D pose estimation from video/image sequences. By comparing human body joint angles with ground truth data, the most reliable body-bending angles can be determined to ensure safe working angles that meet ergonomic requirements. The experiments reported high accuracy.
Review
Engineering, Electrical & Electronic
Zaka-Ud-Din Muhammad, Zhangjin Huang, Rashid Khan
Summary: 3D human body pose estimation and mesh recovery is a difficult and important task that has gained significant attention in real-world applications in recent years. Estimating the pose and shape of a body in motion and reconstructing its mesh have important implications for everyday life.
DIGITAL SIGNAL PROCESSING
(2022)
Editorial Material
Computer Science, Artificial Intelligence
Manuel J. Marin-Jimenez, Javier Romero, Hao Li, Gregory Rogez
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2022)
Article
Computer Science, Hardware & Architecture
Antonio Fuentes-Alventosa, Juan Gomez-Luna, Jose Maria Gonzalez-Linares, Nicolas Guil, R. Medina-Carnicer
Summary: CAVLC, a high-performance entropy coding method for video and image compression, is widely used in the H.264 standard. While hardware accelerators have been designed for it, high-performance software implementations of CAVLC, especially GPU-based ones, are limited. This paper introduces a new efficient GPU-based implementation of CAVLC called CAVLCU, which outperforms existing GPU-based implementations.
JOURNAL OF SUPERCOMPUTING
(2022)
Article
Computer Science, Artificial Intelligence
Antonio Fuentes-Alventosa, Juan Gomez-Luna, R. Medina-Carnicer
Summary: The Canny algorithm is a commonly used edge detector with superior performance in noisy environments, but it is time-consuming. To address the limitations of GPU implementations, this paper proposes a novel GPU-based unsupervised and distributed Canny edge detector, which meets real-time requirements and outperforms existing GPU and FPGA implementations.
JOURNAL OF REAL-TIME IMAGE PROCESSING
(2022)
Article
Chemistry, Analytical
Sergio Garrido-Jurado, Juan Garrido, David Jurado-Rodriguez, Francisco Vazquez, Rafael Munoz-Salinas
Summary: Square markers are commonly used for camera localization due to their robustness, accuracy, and detection speed. However, most systems do not consider the possibility of observing reflected markers, which can lead to detection errors. This research focuses on reflection-aware square marker dictionaries and presents new algorithms for generating and identifying them. The experimental results show that the proposed approach outperforms existing dictionaries in terms of inter-marker distance, and that the optimization process improves the dictionaries significantly.
Article
Computer Science, Artificial Intelligence
Nicolas Luis Fernandez-Garcia, Luis Del-Moral Martinez, Angel Carmona-Poyato, Francisco Jose Madrid-Cuevas, Rafael Medina-Carnicer
Summary: This document presents two proposals regarding the evaluation of polygonal approximations. Firstly, a new measurement called normalized compression ratio and adjustment error (NCA) is proposed to provide a fair evaluation of the performance of polygonal approximations of 2D closed curves. Secondly, a new evaluation methodology based on the optimal quality curve concept is proposed for assessing the measurements. The experiments show that NCA obtains the best results and can be used to fairly evaluate the performance of polygonal approximations.
PATTERN RECOGNITION
(2023)
Article
Chemistry, Analytical
Francisco J. Romero-Ramirez, Rafael Munoz-Salinas, Manuel J. Marin-Jimenez, Miguel Cazorla, Rafael Medina-Carnicer
Summary: This paper proposes a novel visual SLAM approach that efficiently combines keypoints and artificial markers, allowing for a substantial reduction in computing time and memory required without noticeably degrading the tracking accuracy.
Article
Computer Science, Interdisciplinary Applications
David Jurado-Rodriguez, Rafael Munoz-Salinas, Sergio Garrido-Jurado, Rafael Medina-Carnicer
Summary: This study provides a comprehensive evaluation of the most relevant marker systems, comparing them in terms of sensitivity, specificity, accuracy, computational cost, and performance under occlusion. Recommendations on which method to use based on the application requirements are offered.
Article
Chemistry, Analytical
Rafael Aguilar-Ortega, Rafael Berral-Soler, Isabel Jimenez-Velasco, Francisco J. Romero-Ramirez, Manuel Garcia-Marin, Jorge Zafra-Palma, Rafael Munoz-Salinas, Rafael Medina-Carnicer, Manuel J. Marin-Jimenez
Summary: This article introduces the use of deep learning for pose estimation in physical rehabilitation, aiming to help doctors monitor patients' recovery progress more effectively. The study evaluates and compares different pose estimation methods and examines the impact of subject position and camera viewpoint on the results, as well as the necessity of 3D estimation. The findings provide useful insights for optimizing rehabilitation monitoring.
Article
Computer Science, Interdisciplinary Applications
David Jurado-Rodriguez, Rafael Munoz-Salinas, Sergio Garrido-Jurado, Francisco J. Romero-Ramirez, Rafael Medina-Carnicer
Summary: This paper proposes a novel approach that employs an enhanced model combining edges, keypoints, and fiducial markers for robust, real-time tracking. Experimental results demonstrate that the method outperforms state-of-the-art model-based approaches and suggest that fiducial markers are a good choice for texturing models.