Article
Computer Science, Information Systems
Showmik Bhowmik, Soumyadeep Kundu, Ram Sarkar
Summary: Document layout analysis (DLA) is essential for developing a comprehensive document image processing system, aiming to segment document images and identify different regions. The proposed BINYAS system, based on connected components and pixel analysis, outperforms existing methods based on evaluations on four standard datasets.
MULTIMEDIA TOOLS AND APPLICATIONS
(2021)
Article
Medicine, Legal
Itiel E. Dror, Kyle C. Scherr, Linton A. Mohammed, Carla L. MacLean, Lloyd Cunningham
Summary: This study explored the judgments of practicing forensic document experts and found that their judgments were not biased by the nature of the case, possibly due to the fact that document examiners do not primarily work within an organizational forensic laboratory culture, leading to a lack of consistency.
FORENSIC SCIENCE INTERNATIONAL
(2021)
Article
Computer Science, Artificial Intelligence
Erik Novak, Luka Bizjak, Dunja Mladenic, Marko Grobelnik
Summary: This paper proposes a novel learning-to-rank model named LM-EMD that utilizes a multilingual BERT language model and Earth Mover's Distance (EMD) to measure the relevancy between a document and an input query. The model provides interpretable insights by analyzing the distances and identifying the contributing document tokens to the relevancy.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Communication
Eike Mark Rinke, Timo Dobbrick, Charlotte Loeb, Cacilia Zirn, Hartmut Wessler
Summary: In text-as-data studies, expert-informed topic modeling (EITM) is proposed as a flexible and efficient approach to help researchers identify and select subsets of documents addressing specific topics within large text corpora by combining external domain knowledge and probabilistic topic models.
COMMUNICATION METHODS AND MEASURES
(2022)
Article
Automation & Control Systems
Patricia Medyna Lauritzen de Lucena Drumond, Lindeberg Pessoa Leite, Teofilo E. de Campos, Fabricio Ataides Braz
Summary: The relative position of text blocks is crucial in document understanding; however, embedding layout information in a page instance representation is not easy. We introduce a new method called Layout Quadrant Tags (LayoutQT) to encode layout information in textual embedding, enhancing NLP pipelines without expensive multimodal fusion. Our experiments with an AWD-LSTM neural network on the Tobacco800 and RVL-CDIP datasets show significant improvement in page stream segmentation and document classification, achieving higher F1 scores.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2023)
Article
Engineering, Industrial
G. Rongen, O. Morales-Napoles, M. Kok
Summary: This study aims to assess the failure probabilities of Dutch dikes and compare them to model results through expert estimation. The research demonstrates that structured expert judgments can be successfully used for estimating the reliability of Dutch flood defenses, despite the presence of uncertainties and overestimated failure probabilities.
RELIABILITY ENGINEERING & SYSTEM SAFETY
(2022)
Article
Mathematical & Computational Biology
Huajie Ye, Cuifeng Li
Summary: Engineering education is based on technical science and aims to train engineers who can utilize science and technology for productive purposes. In recent years, the emergence of new technological advancements has brought about new challenges to engineering education. To meet these challenges, a shift in educational philosophy is necessary, along with a proper understanding and management of various aspects in engineering education. This study introduces the concept of engineering education certification in the context of new infrastructure and explores reform from different perspectives. Experimental analysis shows positive results in the proposed method.
COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE
(2022)
Article
Endocrinology & Metabolism
Joel A. Vanderniet, Vivian Szymczuk, Wolfgang Hogler, Signe S. Beck-Nielsen, Suma Uday, Nadia Merchant, Janet L. Crane, Leanne M. Ward, Alison M. Boyce, Craig F. Munns
Summary: Denosumab is an effective treatment for RANKL-mediated disorders in children and adolescents, although it is not curative and may be used in combination with surgical or other medical treatments. Multidisciplinary planning and expert oversight are necessary to manage the risk of mineral abnormalities. More research is needed to determine optimal treatment regimens and minimize risks.
JOURNAL OF CLINICAL ENDOCRINOLOGY & METABOLISM
(2023)
Article
Environmental Studies
Agnes Balazsi, Juliana Danhardt, Sue Collins, Oliver Schweiger, Josef Settele, Tibor Hartel
Summary: Cultural ecosystem services (CES) are nonmaterial benefits obtained from ecosystems, covering a wide range of domains. European agricultural landscapes are complex social-ecological systems where synergies and trade-offs between production and conservation determine CES values. Experts believe that interdisciplinary approaches and integrative science-policy methodologies are promising for improving the CES approach in policy and management, but practical implementation in policies targeting agricultural landscapes still lags behind.
Article
Geosciences, Multidisciplinary
Wan-Yu Shih, Leslie Mabon
Summary: The risk to health from extreme heat has gained attention in scholarship and policy, with demographic and socio-economic factors influencing an individual's susceptibility to extreme heat. Many countries still rely on expert judgments for heat vulnerability assessment, which may not always be evidence-informed and can be influenced by the experts involved.
INTERNATIONAL JOURNAL OF DISASTER RISK REDUCTION
(2021)
Article
Chemistry, Analytical
Ayaz Kafeel, Sumair Aziz, Muhammad Awais, Muhammad Attique Khan, Kamran Afaq, Sahar Ahmed Idris, Hammam Alshazly, Samih M. Mostafa
Summary: Accurate and early detection of machine faults is crucial for industrial preventive maintenance to avoid unexpected downtime and ensure equipment reliability and human safety. This study presents a fault detection system for rotating machines using vibration signal analysis, achieving high accuracy with a hybrid combination of time and spectral features classified by support vector machines.
Article
Sport Sciences
Alexandra H. Roberts, Daniel Greenwood, Mandy Stanley, Clare Humberstone, Fiona Iredale, Annette Raynor
Summary: Coaches primarily rely on intuition in talent identification in sports, which is formed through years of experience, time spent with athletes, and decision context. When selecting athletes, coaches may be more inclined to consider their own ability to improve certain athletes.
JOURNAL OF SPORTS SCIENCES
(2021)
Review
Cardiac & Cardiovascular Systems
Alaide Chieffo, Dariusz Dudek, Christian Hassager, Alain Combes, Mario Gramegna, Sigrun Halvorsen, Kurt Huber, Vijay Kunadian, Jiri Maly, Jacob Eifer Moller, Federico Pappalardo, Giuseppe Tarantini, Guido Tavazzi, Holger Thiele, Christophe Vandenbriele, Nicolas van Mieghem, Pascal Vranckx, Nikos Werner, Susanna Price
Summary: This consensus document summarizes the expert panel's views on the use of short-term percutaneous ventricular assist devices (pVADs) in various clinical settings. pVADs differ in their hemodynamic effects, management, and indications, requiring guidance based on existing evidence and best current practice.
EUROPEAN HEART JOURNAL-ACUTE CARDIOVASCULAR CARE
(2021)
Article
Computer Science, Hardware & Architecture
Dojin Choi, Hyeonbyeong Lee, Kyoungsoo Bok, Jaesoo Yoo
Summary: Researchers establish research directions in new fields through expert advice or papers, but lack expert search services. This paper presents an expert search system based on published papers, calculating expert scores to support researchers' activities.
JOURNAL OF SUPERCOMPUTING
(2021)
Article
Computer Science, Information Systems
Gyu-Hyeon Choi, Jong-Hun Shin, Yo-Han Lee, Young-Kil Kim
Summary: Considerable research has been conducted to improve translation performance by capturing contextual correlation at the document level. The proposed method shows improved translation performance in various translation tasks and benchmark machine translation tasks compared to the state-of-the-art baseline.
Article
Computer Science, Artificial Intelligence
Andres Mafla, Ruben Tito, Sounak Dey, Lluis Gomez, Marcal Rusinol, Ernest Valveny, Dimosthenis Karatzas
Summary: In this study, the task of scene text retrieval is addressed by proposing a single shot CNN architecture for predicting bounding boxes and building compact representations of spotted words. Experimental results demonstrate that the proposed model outperforms the previous state of the art while offering a significant increase in processing speed and unmatched expressiveness.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Minesh Mathew, Lluis Gomez, Dimosthenis Karatzas, C. Jawahar
Summary: This work focuses on Question Answering on handwritten document collections, proposing an approach that does not require text recognition. By projecting textual words and word images into a common sub-space, the proposed method can retrieve document snippets potentially containing answers. Results suggest that this approach is suitable for handwritten documents and historical collections.
INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Lluis Gomez, Ali Furkan Biten, Ruben Tito, Andres Mafla, Marcal Rusinol, Ernest Valveny, Dimosthenis Karatzas
Summary: The paper introduces a new model for scene text visual question answering which is based on a single attention mechanism and demonstrates competitive performance on two standard datasets. Experimental results show that the model is five times faster than previous methods at inference time.
PATTERN RECOGNITION LETTERS
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Sergi Garcia-Bordils, George Tom, Sangeeth Reddy, Minesh Mathew, Marcal Rusinol, C. Jawahar, Dimosthenis Karatzas
Summary: This paper presents RoadText-3K, a large driving video dataset with fully annotated text, which is three times bigger than its predecessor and contains data from varied geographical locations, unconstrained driving conditions, and multiple languages and scripts. The article also offers a comprehensive analysis of the limitations of state-of-the-art text detection methods and proposes a new tracking model that achieves state-of-the-art results.
DOCUMENT ANALYSIS SYSTEMS, DAS 2022
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Josep Brugues i Pujolras, Lluis Gomez i Bigorda, Dimosthenis Karatzas
Summary: Scene Text Visual Question Answering (ST-VQA) is a hot research topic in Computer Vision. Current models have limited performance on multiple languages. This study explores the possibility of obtaining bilingual and multilingual VQA models and demonstrates the performance improvement by using multilingual word embeddings during training.
DOCUMENT ANALYSIS SYSTEMS, DAS 2022
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Minesh Mathew, Viraj Bagal, Ruben Tito, Dimosthenis Karatzas, Ernest Valveny, C. Jawahar
Summary: This work explores the automatic understanding of infographic images using a Visual Question Answering technique, and presents a diverse dataset called InfographicVQA. The dataset requires methods to reason over document layout, textual content, graphical elements, and data visualizations. Two Transformer-based baselines are evaluated, but they do not perform as well as humans on the dataset. The study suggests that VQA on infographics can serve as a benchmark for evaluating machine understanding of complex document images.
2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Mohamed Ali Souibgui, Ali Furkan Biten, Sounak Dey, Alicia Fornes, Yousri Kessentini, Lluis Gomez, Dimosthenis Karatzas, Josep Llados
Summary: This paper addresses the challenge of low-resource Handwritten Text Recognition (HTR) by proposing a data generation technique based on Bayesian Program Learning (BPL). Unlike traditional methods, which require a large amount of annotated images, our method can generate human-like handwriting using only one sample of each symbol in the alphabet. Synthetic lines are then created to train state-of-the-art HTR architectures in a segmentation-free fashion. Quantitative and qualitative analyses confirm the effectiveness of the proposed method.
2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Ali Furkan Biten, Lluis Gomez, Dimosthenis Karatzas
Summary: The article discusses object bias (hallucination) in image captioning and presents three simple yet efficient training augmentation methods to reduce it without the need for new data or increased model size. The proposed methods are shown to significantly decrease object bias in the models based on hallucination metrics, and reduce dependency on visual features through experimental demonstration. All code, configuration files, and model weights are available online.
2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Ali Furkan Biten, Andres Mafla, Lluis Gomez, Dimosthenis Karatzas
Summary: The existing datasets for the image-text matching task lack the ability to accurately measure semantic relevance. This study proposes two metrics to evaluate the semantic relevance of image-text pairs and introduces a new strategy to improve model performance. Experiments show significant improvements in scenarios with limited training data.
2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022)
(2022)
Proceedings Paper
Computer Science, Information Systems
Ruben Tito, Dimosthenis Karatzas, Ernest Valveny
Summary: Current methods in Document Understanding focus on processing individual documents, while documents are typically organized in collections which provide valuable context for interpretation. To address this issue, DocCVQA introduces a new dataset and task where questions are posed over a whole collection of document images, aiming to provide answers to questions and retrieve the documents containing relevant information. Along with the dataset, a new evaluation metric and baselines are proposed to gain further insights into this new dataset and task.
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT II
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Ruben Tito, Minesh Mathew, C. Jawahar, Ernest Valveny, Dimosthenis Karatzas
Summary: The report presents the results of the ICDAR 2021 edition of the Document Visual Question Challenges, including tasks on Infographics VQA and previous tasks. The winning methods performed differently in each task, with the lowest score in the Infographics VQA task. The report also provides detailed descriptions of the datasets used, submitted methods, performance analysis, and progress made in Single Document VQA since 2020.
DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT IV
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Andres Mafla, Rafael S. Rezende, Lluis Gomez, Diane Larlus, Dimosthenis Karatzas
Summary: This paper introduces a new dataset for cross-modal retrieval involving scene-text instances, proposes approaches leveraging scene text, and conducts experiments to confirm the benefits of utilizing scene text.
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Andres Mafla, Sounak Dey, Ali Furkan Biten, Lluis Gomez, Dimosthenis Karatzas
Summary: By leveraging multi-modal content in the form of visual and textual cues, this study significantly improved the performance of fine-grained image classification and retrieval tasks. The model obtained relationship-enhanced features by learning a common semantic space between salient objects and text found in an image, outperforming the previous state of the art in two different tasks.
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Minesh Mathew, Dimosthenis Karatzas, C. Jawahar
Summary: DocVQA is a new dataset for Visual Question Answering on document images, with 50,000 questions defined on 12,000+ images. Analysis shows that existing models perform reasonably well on certain question types, but there is still a large performance gap compared to human performance. Models need to improve on questions where understanding the structure of the document is crucial.
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Klara Janouskova, Jiri Matas, Lluis Gomez, Dimosthenis Karatzas
Summary: The proposed method leverages weakly annotated images to enhance text extraction pipelines by combining imprecise text transcriptions with weak annotations to generate nearly error-free instances of scene text for training, resulting in consistent improvements in accuracy for state-of-the-art recognition models.
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)
(2021)