Article
Engineering, Industrial
Guoyan Li, Chenxi Yuan, Sagar Kamarthi, Mohsen Moghaddam, Xiaoning Jin
Summary: This study analyzes the data science and analytics (DSA) skills gap in today's manufacturing workforce, providing insights into the critical technical skills and domain knowledge required for data science and intelligent manufacturing-related jobs. These insights can inform training for the next-generation manufacturing workforce.
JOURNAL OF MANUFACTURING SYSTEMS
(2021)
Article
Computer Science, Software Engineering
Anamaria Crisan, Brittany Fiore-Gartland, Melanie Tory
Summary: Data science is a rapidly growing discipline and organizations increasingly depend on data science work. Researchers have synthesized a comprehensive model describing data science work and breaking down data scientists into nine distinct roles. They hope this will help visualization researchers discover innovative opportunities to impact data science work.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
(2021)
Article
Biochemical Research Methods
Mingzhe Xie, Ludong Yang, Gennong Chen, Yan Wang, Zhi Xie, Hongwei Wang
Summary: RiboChat is an interactive web platform for analyzing and annotating Ribo-seq data, which enables convenient decoding of translation information embedded within the data. It features a user-friendly interface and a cloud computing service, utilizing a chat conversation style to analyze data and find the best-matching analytics module.
BRIEFINGS IN BIOINFORMATICS
(2022)
Article
Business
Naif Radi Aljohani, Ahtisham Aslam, Alaa O. Khadidos, Saeed-Ul Hassan
Summary: This research provides a first-of-its-kind, data-driven analysis of the discussions on curriculum alignment in light of learned and acquired skills, illustrating the importance of bibliometric analysis in understanding scholarly contributions.
JOURNAL OF INNOVATION & KNOWLEDGE
(2022)
Article
Computer Science, Software Engineering
Yi Guo, Shunan Guo, Zhuochen Jin, Smiti Kaul, David Gotz, Nan Cao
Summary: This paper reviews the state-of-the-art visual analytics approaches for event sequence data and categorizes them based on analytical tasks and applications. The authors also identify several remaining research challenges and future research opportunities.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
(2022)
Review
Physiology
Lena Baum, Marco Johns, Maija Poikela, Ralf Moeller, Bharath Ananthasubramaniam, Fabian Prasser
Summary: Data integration, data sharing, and standardized analyses are crucial for data-driven medical research, especially in the emerging field of circadian medicine. The multimodal datasets in circadian medicine require informatics solutions for integrating and visualizing diverse data types at various temporal resolutions. Challenges include the lack of standards for representing all required data, data storage issues, integrated visualization transformations, and privacy concerns. Specialized approaches are needed for downstream analysis of circadian rhythms. Overall, circadian medicine research provides an ideal environment for developing innovative methods to address challenges related to multimodal multidimensional biomedical data.
Article
Engineering, Environmental
Edoardo Ramalli, Timoteo Dinelli, Andrea Nobili, Alessandro Stagni, Barbara Pernici, Tiziano Faravelli
Summary: Validation and analysis of experiments and models are crucial in various engineering fields. This study proposes a systematic and automated methodology that utilizes the concept of a 'data ecosystem' to provide comprehensive insights about experiments and predictive models. The methodology focuses on data assessment, model performance measurement, and behavior insight extraction through data science techniques. It can be applied to different domains where predictive models are validated against big data in chemical engineering.
CHEMICAL ENGINEERING JOURNAL
(2023)
Article
Green & Sustainable Science & Technology
Kyungho Song, Hyun Kim, Jisoo Cha, Taedong Lee
Summary: The study explored the degree of matching between green job supply and demand through big data analysis, revealing that green jobs are concentrated in the Seoul and Gyeonggi-do metropolitan areas, with a high number of water- and air-quality-related jobs. However, job searches in the water quality sector outnumbered job openings, indicating that green job creation policies need to consider timing, regional, and sectoral demand and supply data.
Article
Mathematics
Bogdan Walek, Ondrej Pektor
Summary: The article proposes an approach to mine data from job advertisements, mainly focusing on job requirements. Through verification, the system identified all job requirements in 80% of analyzed advertisements, and also created a list of the most frequent job skills.
Article
Computer Science, Software Engineering
Anton Yeshchenko, Claudio Di Ciccio, Jan Mendling, Artem Polyvyanyy
Summary: In this study, we tackle the challenge of visual analysis of drift phenomena in processes that change over time. We present a system for fine-granular process drift detection and visualizations, which outperforms state-of-the-art methods on synthetic and real-world data. Additionally, our user study confirms the usability and usefulness of our visualizations.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
(2022)
Article
Environmental Sciences
Kumari Anjali, Renji Remesan
Summary: This study provides a bibliometric review of the impact of coal mining in India, with a focus on environmental, particularly water-related, impacts. The findings show an increasing publication trend, with significant contributions from the Indian Institute of Technology (Indian School of Mines). Pollution issues related to the Jharia coalfield are a major research focus, while studies quantifying coal mining-induced changes in water regimes at river basin scales are lacking.
ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH
(2023)
Article
Computer Science, Artificial Intelligence
Fatih Gurcan
Summary: This article analyzes a data science-focused Q&A website to uncover the main topics and difficulties in the field. It identifies the most popular and challenging topics, as well as the commonly used tasks, techniques, and tools. These findings are significant for advancing data-driven architectures and promoting relevant technologies and tools.
PEERJ COMPUTER SCIENCE
(2023)
Article
Ecology
Omar Alminagorta, Charlie J. G. Loewen, Derrick T. de Kerckhove, Donald A. Jackson, Cindy Chu
Summary: Exploratory analysis of biological communities and their environmental factors requires specialized tools like parallel coordinates to visualize and explore multivariate data and generate hypotheses about causal relationships. Through two case studies in Canada, the utility and novelty of parallel coordinates in ecology were demonstrated, offering ecologists a practical alternative for visualizing and exploring multivariate data.
ECOLOGICAL INFORMATICS
(2021)
Article
Engineering, Industrial
Xiaodong Li, Yifan Fei, Tracey E. Rizzuto, Fan Yang
Summary: This study analyzed the core factors leading to chronic stress among construction project managers (CPMs). It found that organizational and social factors have significant impacts on job burnout and that high job demands can improve professional efficacy. It also explored the reverse influences of job burnout, once formed, on individual behavior and perceptions.
Article
Biochemistry & Molecular Biology
Cui-Xia Chen, Li-Na Sun, Xue-Xin Hou, Peng-Cheng Du, Xiao-Long Wang, Xiao-Chen Du, Yu-Fei Yu, Rui-Kun Cai, Lei Yu, Tian-Jun Li, Min-Na Luo, Yue Shen, Chao Lu, Qian Li, Chuan Zhang, Hua-Fang Gao, Xu Ma, Hao Lin, Zong-Fu Cao
Summary: This study presents a new genome information visualization analysis process framework based on big data mining technology for infectious disease prevention and control. Through experiments on four infectious pathogens, the framework provides insights from evolution, genome structure, virulence factors, and resistance genes, contributing to rapid pathogen identification and recommending strains for further research.
FRONTIERS IN MOLECULAR BIOSCIENCES
(2021)
Article
Computer Science, Artificial Intelligence
Yuyang Gao, Tanmoy Chowdhury, Lingfei Wu, Liang Zhao
Summary: In this paper, a novel DynAttGraph2Seq framework is proposed to model the complex dynamic transitions of a user's activities and textual information in online health forums over time, and their correspondence to the user's health stage. The proposed model includes a dynamic graph encoder, a two-level sequential encoder, and an interpretable sequence decoder to capture the semantic features and learn the mapping between user activity graphs, user posts, and target health stages. New dynamic graph regularization and hierarchical attention mechanisms are also proposed to enhance the interpretability. Experimental analysis demonstrates the effectiveness and interpretability of the proposed models for health stage prediction.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Article
Sociology
Nandana Sengupta, Madeleine Udell, Nathan Srebro, James Evans
Summary: Social science approaches to missing values typically predict missing information from dense data sets. The authors propose a matrix factorization approach that imputes missing data by identifying underlying factors and reducing their overinfluence for optimal data reconstruction. This approach is useful for sparse data sets with numerous features, such as historical sources or online surveys. The authors demonstrate the consistency of matrix factorization techniques with Rubin's multiple imputation framework and recommend their use where Boolean or categorical data are involved and a large proportion of the data is missing.
SOCIOLOGICAL METHODOLOGY
(2023)
Article
Computer Science, Artificial Intelligence
Xiang Ling, Lingfei Wu, Saizhuo Wang, Tengfei Ma, Fangli Xu, Alex X. Liu, Chunming Wu, Shouling Ji
Summary: This article proposes a multilevel graph matching network (MGMN) for computing the graph similarity between graph-structured objects. The MGMN consists of a node-graph matching network (NGMN) and a siamese GNN, which effectively learn cross-level and global-level interactions respectively. Experimental results demonstrate that the MGMN outperforms state-of-the-art baseline models on both graph-graph classification and regression tasks, and exhibits stronger robustness as the sizes of the input graphs increase.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Lingfei Wu, Yu Chen, Kai Shen, Xiaojie Guo, Hanning Gao, Shucheng Li, Jian Pei, Bo Long
Summary: Deep learning is widely used in Natural Language Processing (NLP), and there has been growing interest in applying graph neural networks (GNNs) to NLP tasks. This survey provides a comprehensive overview of GNNs for NLP, including a new taxonomy and organization of existing research in this field. The survey also presents various NLP applications that utilize GNNs, along with benchmark datasets, evaluation metrics, and open-source code. Additionally, it discusses challenges and future research directions for maximizing the potential of GNNs in NLP.
FOUNDATIONS AND TRENDS IN MACHINE LEARNING
(2023)
Article
Computer Science, Artificial Intelligence
Xiaojie Guo, Shugen Wang, Hanqing Zhao, Shiliang Diao, Jiajia Chen, Zhuoye Ding, Zhen He, Jianchao Lu, Yun Xiao, Bo Long, Han Yu, Lingfei Wu
Summary: Significant advancements have been made in automatic product description generation in the past decade. With the increasing diversification of services provided by e-commerce platforms, it is necessary to adapt the patterns of generated descriptions dynamically. The selling points of products, which are usually written by human experts, can be automatically generated by machines, reducing costs and increasing efficiency.
Article
Computer Science, Artificial Intelligence
Yanyan Zou, Xueying Zhang, Jing Zhou, Shiliang Diao, Jiajia Chen, Zhuoye Ding, Zhen He, Xueqi He, Yun Xiao, Bo Long, Mian Ma, Sulong Xu, Han Yu, Lingfei Wu
Summary: Product copywriting is essential for e-commerce recommendation platforms, aiming to attract users and enhance their experience through textual descriptions of product characteristics. This paper presents the experience of deploying an Automatic Product Copywriting Generation (APCG) system in JD.com's e-commerce recommendation platform. The system consists of a natural language generation component based on a transformer-pointer network and pretrained sequence-to-sequence model, as well as a copywriting quality control component using automatic evaluation and human screening. The APCG system has been successfully deployed on JD.com since February 2021, generating 2.53 million product descriptions and significantly improving click-through rate (CTR) and conversion rate (CVR) compared to baselines.
Article
Multidisciplinary Sciences
Feng Shi, James Evans
Summary: This study investigates the relationship between breakthroughs and surprising advances in science and technology, as well as how these breakthroughs occur. By analyzing millions of research papers and patents in life sciences, physical sciences, and patented inventions, and using a hypergraph model, the authors show that unexpected combinations of research contents and contexts can predict significant impact. These surprising breakthroughs often occur when scientists from different fields publish problem-solving results. This research helps to understand the frontier of science and technology and provides a measure of innovation in these fields.
NATURE COMMUNICATIONS
(2023)
Article
Multidisciplinary Sciences
Bruce W. Herr, Josef Hardi, Ellen M. Quardokus, Andreas Bueckle, Lu Chen, Fusheng Wang, Anita R. Caron, David Osumi-Sutherland, Mark A. Musen, Katy Borner
Summary: The Human Reference Atlas (HRA) is a comprehensive 3D atlas of all cells in the healthy human body. It is created by international experts who use standard terminologies linked to 3D reference objects to describe anatomical structures. The latest release of HRA (v1.2) provides spatial reference data and ontology annotations for 26 organs. Experts access HRA annotations through spreadsheets and view reference object models in 3D editing tools.
Article
Computer Science, Artificial Intelligence
Xiaojie Guo, Lingfei Wu, Liang Zhao
Summary: This article introduces a novel graph-translation-generative-adversarial-nets (GT-GAN) model that can transform source graphs into target output graphs. The model utilizes innovative graph convolution and deconvolution layers to learn the translation mapping and a conditional graph discriminator for classification. Extensive experiments demonstrate that GT-GAN outperforms other methods in terms of both effectiveness and scalability across various domains.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Multidisciplinary Sciences
Avinash Boppana, Sujin Lee, Rajeev Malhotra, Marc Halushka, Katherine S. Gustilo, Ellen M. Quardokus, Bruce W. Herr II, Katy Borner, Griffin M. Weber
Summary: More than 150 scientists from 17 consortia are collaborating on an international project to build a Human Reference Atlas, which maps all 37 trillion cells in the healthy adult human body. The project has released the first open and comprehensive database of the adult human blood vasculature, called the Human Reference Atlas-Vasculature Common Coordinate Framework (HRA-VCCF). It includes 993 vessels and their branching connections, 10 cell types, and 10 biomarkers.
Article
Education & Educational Research
Justin Reeves Meyer, Joe E. Heimlich, E. Elaine T. Horr, Rebecca F. Kemper, Katy Borner
Summary: The article explores the conditions under which, and the methods by which, museum visitors feel comfortable or uncomfortable sharing sensitive information. The study is grounded in literature on personal information sharing, both in person and online. Interviewing 114 science center visitors, the researchers found that age and gender play important roles in whether individuals feel comfortable sharing information in public or through digital questionnaires. The article also discusses the ethical importance of giving visitors choices in sharing their information for evaluation purposes.
JOURNAL OF MUSEUM EDUCATION
(2023)
Article
Computer Science, Artificial Intelligence
Cuiying Huo, Dongxiao He, Chundong Liang, Di Jin, Tie Qiu, Lingfei Wu
Summary: In this work, we propose a new GNN-based trust evaluation method named TrustGNN, which integrates the propagative and composable nature of trust graphs into a GNN framework for better trust evaluation. TrustGNN designs specific propagative patterns for different propagative processes of trust and distinguishes the contributions of these processes to the creation of new trust. Experiments show that TrustGNN significantly outperforms state-of-the-art methods on widely used real-world datasets.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Yu Chen, Lingfei Wu, Mohammed J. Zaki
Summary: Knowledge graph (KG) question generation aims to generate natural language questions from KGs and target answers. Previous works mostly focus on generating questions from a single KG triple, while this work focuses on generating questions from a KG subgraph and target answers. A bidirectional Graph2Seq model is proposed to encode the KG subgraph, and an enhanced RNN decoder allows direct copying of node attributes for question generation. The model achieves state-of-the-art scores on QG benchmarks and consistently benefits the question-answering (QA) task.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Nandana Sengupta, Ashwini Vaidya, James Evans
Summary: A proliferation of women's safety mobile applications in India has generated "safety maps" by crowdsourcing street safety perceptions. However, the differential access to information and communication technologies (ICTs) between men and women, as well as the distinctions in their social and cultural experiences, may affect the value and predictive ability of machine learning models utilizing crowdsourced safety perceptions data. By analyzing data from New Delhi, this study reveals significant gender differences in safety perceptions and associated vocabularies, highlighting the implications for the design of platforms relying on crowdsourced data.
PROCEEDINGS OF THE 6TH ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, FACCT 2023
(2023)
Proceedings Paper
Computer Science, Information Systems
Aehong Min, Wendy R. Miller, Luis M. Rocha, Katy Borner, Rion Brattig Correia, Patrick C. Shih
Summary: This article explores the challenges faced by patients with epilepsy and their caregivers in managing epilepsy-related information, and proposes a framework to address these issues.
PROCEEDINGS OF THE 2023 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2023
(2023)