Article

Skill discrepancies between research, education, and jobs reveal the critical need to supply soft skills for the data economy

Publisher

NATL ACAD SCIENCES
DOI: 10.1073/pnas.1804247115

Keywords

science of science; job market; data mining; visualization; market gap analysis

Funding

  1. NIH [P01 AG039347, U01CA198934]
  2. National Science Foundation (NSF) [1566393, 1839167, 1713567]
  3. Air Force Office of Scientific Research [FA9550-15-1-0162]
  4. NSF [1422902, 1158803]
  5. Direct For Computer & Info Scie & Enginr
  6. Office of Advanced Cyberinfrastructure (OAC) [1566393] Funding Source: National Science Foundation
  7. Direct For Social, Behav & Economic Scie
  8. National Center For S&E Statistics [1422902] Funding Source: National Science Foundation
  9. Division Of Mathematical Sciences
  10. Direct For Mathematical & Physical Scien [1839167] Funding Source: National Science Foundation
  11. Division Of Research On Learning
  12. Direct For Education and Human Resources [1713567] Funding Source: National Science Foundation
  13. SBE Off Of Multidisciplinary Activities
  14. Direct For Social, Behav & Economic Scie [1158803] Funding Source: National Science Foundation

Rapid research progress in science and technology (S&T) and continuously shifting workforce needs exert pressure on each other and on the educational and training systems that link them. Higher education institutions aim to equip new generations of students with skills and expertise relevant to workforce participation for decades to come, but their offerings sometimes misalign with commercial needs and new techniques forged at the frontiers of research. Here, we analyze and visualize the dynamic skill (mis-)alignment between academic push, industry pull, and educational offerings, paying special attention to the rapidly emerging areas of data science and data engineering (DS/DE). The visualizations and computational models presented here can help key decision makers understand the evolving structure of skills so that they can craft educational programs that serve workforce needs. Our study uses millions of publications, course syllabi, and job advertisements published between 2010 and 2016. We show how courses mediate between research and jobs. We also discover responsiveness in the academic, educational, and industrial system in how skill demands from industry are as likely to drive skill attention in research as the converse. Finally, we reveal the increasing importance of uniquely human skills, such as communication, negotiation, and persuasion. These skills are currently underexamined in research and undersupplied through education for the labor market. In an increasingly data-driven economy, the demand for soft social skills, like teamwork and communication, increases with greater demand for hard technical skills and tools.

Recommended

Article Computer Science, Artificial Intelligence

Modeling Health Stage Development of Patients With Dynamic Attributed Graphs in Online Health Communities

Yuyang Gao, Tanmoy Chowdhury, Lingfei Wu, Liang Zhao

Summary: In this paper, a novel DynAttGraph2Seq framework is proposed to model the complex dynamic transitions of a user's activities and textual information in online health forums over time, and their correspondence to the user's health stage. The proposed model includes a dynamic graph encoder, a two-level sequential encoder, and an interpretable sequence decoder to capture the semantic features and learn the mapping between user activity graphs, user posts, and target health stages. New dynamic graph regularization and hierarchical attention mechanisms are also proposed to enhance the interpretability. Experimental analysis demonstrates the effectiveness and interpretability of the proposed models for health stage prediction.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2023)

Article Sociology

Sparse Data Reconstruction, Missing Value and Multiple Imputation through Matrix Factorization

Nandana Sengupta, Madeleine Udell, Nathan Srebro, James Evans

Summary: Social science approaches can predict missing values in dense data sets. The authors propose a matrix factorization approach to impute missing data by identifying underlying factors and reducing their overinfluence for optimal data reconstruction. This approach is useful for sparse data sets with numerous features, such as historical sources or online surveys. The authors demonstrate the consistency of matrix factorization techniques with Rubin's multiple imputation framework and recommend their use in situations where Boolean or categorical data are involved and a large proportion of the data is missing.

SOCIOLOGICAL METHODOLOGY (2023)

Article Computer Science, Artificial Intelligence

Multilevel Graph Matching Networks for Deep Graph Similarity Learning

Xiang Ling, Lingfei Wu, Saizhuo Wang, Tengfei Ma, Fangli Xu, Alex X. Liu, Chunming Wu, Shouling Ji

Summary: This article proposes a multilevel graph matching network (MGMN) for computing the graph similarity between graph-structured objects. The MGMN consists of a node-graph matching network (NGMN) and a siamese GNN, which effectively learn cross-level and global-level interactions respectively. Experimental results demonstrate that the MGMN outperforms state-of-the-art baseline models on both graph-graph classification and regression tasks, and exhibits stronger robustness as the sizes of the input graphs increase.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Graph Neural Networks for Natural Language Processing: A Survey

Lingfei Wu, Yu Chen, Kai Shen, Xiaojie Guo, Hanning Gao, Shucheng Li, Jian Pei, Bo Long

Summary: Deep learning is widely used in Natural Language Processing (NLP) and there has been a growing interest in applying graph neural networks (GNNs) to NLP tasks. This survey provides a comprehensive overview of GNNs for NLP, including a new taxonomy and organization of existing research in this field. The survey also presents various NLP applications that utilize GNNs, along with benchmark datasets, evaluation metrics, and open-source codes. Additionally, it discusses challenges and future research directions for maximizing the potential of GNNs in NLP.

FOUNDATIONS AND TRENDS IN MACHINE LEARNING (2023)

Article Computer Science, Artificial Intelligence

Intelligent online selling point extraction and generation for e-commerce recommendation

Xiaojie Guo, Shugen Wang, Hanqing Zhao, Shiliang Diao, Jiajia Chen, Zhuoye Ding, Zhen He, Jianchao Lu, Yun Xiao, Bo Long, Han Yu, Lingfei Wu

Summary: Significant advancements have been made in automatic product description generation in the past decade. With the increasing diversification of services provided by e-commerce platforms, it is necessary to adapt the patterns of generated descriptions dynamically. The selling points of products, which are usually written by human experts, can be automatically generated by machines, reducing costs and increasing efficiency.

AI MAGAZINE (2023)

Article Computer Science, Artificial Intelligence

Automatic product copywriting for e-commerce

Yanyan Zou, Xueying Zhang, Jing Zhou, Shiliang Diao, Jiajia Chen, Zhuoye Ding, Zhen He, Xueqi He, Yun Xiao, Bo Long, Mian Ma, Sulong Xu, Han Yu, Lingfei Wu

Summary: Product copywriting is essential for e-commerce recommendation platforms, aiming to attract users and enhance their experience through textual descriptions of product characteristics. This paper presents the experience of deploying an Automatic Product Copywriting Generation (APCG) system in JD.com's e-commerce recommendation platform. The system consists of a natural language generation component based on a transformer-pointer network and pretrained sequence-to-sequence model, as well as a copywriting quality control component using automatic evaluation and human screening. The APCG system has been successfully deployed on JD.com since February 2021, generating 2.53 million product descriptions and significantly improving click-through rate (CTR) and conversion rate (CVR) compared to baselines.

AI MAGAZINE (2023)

Article Multidisciplinary Sciences

Surprising combinations of research contents and contexts are related to impact and emerge with scientific outsiders from distant disciplines

Feng Shi, James Evans

Summary: This study investigates the relationship between breakthroughs and surprising advances in science and technology, as well as how these breakthroughs occur. By analyzing millions of research papers and patents in life sciences, physical sciences, and patented inventions, and using a hypergraph model, the authors show that unexpected combinations of research contents and contexts can predict significant impact. These surprising breakthroughs often occur when scientists from different fields publish problem-solving results. This research helps to understand the frontier of science and technology and provides a measure of innovation in these fields.

NATURE COMMUNICATIONS (2023)

Article Multidisciplinary Sciences

Specimen, biological structure, and spatial ontologies in support of a Human Reference Atlas

Bruce W. Herr, Josef Hardi, Ellen M. Quardokus, Andreas Bueckle, Lu Chen, Fusheng Wang, Anita R. Caron, David Osumi-Sutherland, Mark A. Musen, Katy Borner

Summary: The Human Reference Atlas (HRA) is a comprehensive 3D atlas of all cells in the healthy human body. It is created by international experts who use standard terminologies linked to 3D reference objects to describe anatomical structures. The latest release of HRA (v1.2) provides spatial reference data and ontology annotations for 26 organs. Experts access HRA annotations through spreadsheets and view reference object models in 3D editing tools.

SCIENTIFIC DATA (2023)

Article Computer Science, Artificial Intelligence

Deep Graph Translation

Xiaojie Guo, Lingfei Wu, Liang Zhao

Summary: This article introduces a novel graph-translation-generative-adversarial-nets (GT-GAN) model that can transform source graphs into target output graphs. The model utilizes innovative graph convolution and deconvolution layers to learn the translation mapping and a conditional graph discriminator for classification. Extensive experiments demonstrate that GT-GAN outperforms other methods in terms of both effectiveness and scalability across various domains.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Multidisciplinary Sciences

Anatomical structures, cell types, and biomarkers of the healthy human blood vasculature

Avinash Boppana, Sujin Lee, Rajeev Malhotra, Marc Halushka, Katherine S. Gustilo, Ellen M. Quardokus, Bruce W. W. Herr II, Katy Borner, Griffin M. Weber

Summary: More than 150 scientists from 17 consortia are collaborating on an international project to build a Human Reference Atlas, which maps all 37 trillion cells in the healthy adult human body. The project has released the first open and comprehensive database of the adult human blood vasculature, called the Human Reference Atlas-Vasculature Common Coordinate Framework (HRA-VCCF). It includes 993 vessels and their branching connections, 10 cell types, and 10 biomarkers.

SCIENTIFIC DATA (2023)

Article Education & Educational Research

Museum Visitor Comfort When Sharing Personal Information for Evaluation

Justin Reeves Meyer, Joe E. E. Heimlich, E. Elaine T. Horr, Rebecca F. F. Kemper, Katy Borner

Summary: The article explores the conditions and methods in which museum visitors feel comfortable or uncomfortable sharing sensitive information. The study is grounded in literature on personal information sharing, both in person and online. Interviewing 114 science center visitors, the researchers found that age and gender play important roles in whether or not individuals feel comfortable sharing information in public or through digital questionnaires. The article also discusses the ethical importance of giving visitors choices in sharing their information for evaluation purposes.

JOURNAL OF MUSEUM EDUCATION (2023)

Article Computer Science, Artificial Intelligence

TrustGNN: Graph Neural Network-Based Trust Evaluation via Learnable Propagative and Composable Nature

Cuiying Huo, Dongxiao He, Chundong Liang, Di Jin, Tie Qiu, Lingfei Wu

Summary: In this work, we propose a new GNN-based trust evaluation method named TrustGNN, which integrates the propagative and composable nature of trust graphs into a GNN framework for better trust evaluation. TrustGNN designs specific propagative patterns for different propagative processes of trust, and distinguishes the contribution of different propagative processes to create new trust. Experiments show that TrustGNN significantly outperforms the state-of-the-art methods on widely-used real-world datasets.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Toward Subgraph-Guided Knowledge Graph Question Generation With Graph Neural Networks

Yu Chen, Lingfei Wu, Mohammed J. Zaki

Summary: Knowledge graph (KG) question generation aims to generate natural language questions from KGs and target answers. Previous works mostly focus on generating questions from a single KG triple, while this work focuses on generating questions from a KG subgraph and target answers. A bidirectional Graph2Seq model is proposed to encode the KG subgraph, and an enhanced RNN decoder allows direct copying of node attributes for question generation. The model achieves state-of-the-art scores on QG benchmarks and consistently benefits the question-answering (QA) task.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Proceedings Paper Computer Science, Artificial Intelligence

In her Shoes: Gendered Labelling in Crowdsourced Safety Perceptions Data from India

Nandana Sengupta, Ashwini Vaidya, James Evans

Summary: A proliferation of women's safety mobile applications in India has generated "safety maps" by crowdsourcing street safety perceptions. However, the differential access to information and communication technologies (ICTs) between men and women, as well as the distinctions in their social and cultural experiences, may affect the value and predictive ability of machine learning models utilizing crowdsourced safety perceptions data. By analyzing data from New Delhi, this study reveals significant gender differences in safety perceptions and associated vocabularies, highlighting the implications for the design of platforms relying on crowdsourced data.

PROCEEDINGS OF THE 6TH ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, FACCT 2023 (2023)

Proceedings Paper Computer Science, Information Systems

Understanding Contexts and Challenges of Information Management for Epilepsy Care

Aehong Min, Wendy R. Miller, Luis M. Rocha, Katy Borner, Rion Brattig Correia, Patrick C. Shih

Summary: This article explores the challenges faced by patients with epilepsy and their caregivers in managing epilepsy-related information, and proposes a framework to address these issues.

PROCEEDINGS OF THE 2023 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2023 (2023)
