4.1 Article

Robust semantic text similarity using LSA, machine learning, and linguistic resources

Journal

LANGUAGE RESOURCES AND EVALUATION
Volume 50, Issue 1, Pages 125-161

Publisher

SPRINGER
DOI: 10.1007/s10579-015-9319-2

Keywords

Latent semantic analysis; WordNet; Term alignment; Semantic similarity

Funding

  1. US National Science Foundation [1228198, 1250627, 0910838]
  2. Direct For Computer & Info Scie & Enginr; Division Of Computer and Network Systems [1228673] Funding Source: National Science Foundation
  3. Direct For Computer & Info Scie & Enginr; Div Of Information & Intelligent Systems [1250627, 0910838] Funding Source: National Science Foundation
  4. Division Of Computer and Network Systems; Direct For Computer & Info Scie & Enginr [1228198] Funding Source: National Science Foundation

Semantic textual similarity is a measure of the degree of semantic equivalence between two pieces of text. We describe the SemSim system and its performance in the *SEM 2013 and SemEval-2014 tasks on semantic textual similarity. At the core of our system lies a robust distributional word similarity component that combines latent semantic analysis and machine learning augmented with data from several linguistic resources. We used a simple term alignment algorithm to handle longer pieces of text. Additional wrappers and resources were used to handle task-specific challenges that include processing Spanish text, comparing text sequences of different lengths, handling informal words and phrases, and matching words with sense definitions. In the *SEM 2013 task on Semantic Textual Similarity, our best performing system ranked first among the 89 submitted runs. In the SemEval-2014 task on Multilingual Semantic Textual Similarity, we ranked a close second in both the English and Spanish subtasks. In the SemEval-2014 task on Cross-Level Semantic Similarity, we ranked first in Sentence-Phrase, Phrase-Word, and Word-Sense subtasks and second in the Paragraph-Sentence subtask.
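
The abstract does not spell out the term alignment step, so the following is only a rough sketch of how a simple alignment-based text similarity can be built on top of a word similarity function. It is not the SemSim implementation. The names `toy_word_sim`, `directional_score`, and `align_similarity` are hypothetical, and the character-bigram Dice score is a placeholder standing in for the LSA-based word similarity model the abstract mentions.

```python
from typing import Callable, List, Set


def char_bigrams(word: str) -> Set[str]:
    """Return the character bigrams of a lowercased word (the word itself if too short)."""
    w = word.lower()
    return {w[i:i + 2] for i in range(len(w) - 1)} or {w}


def toy_word_sim(a: str, b: str) -> float:
    """Placeholder word similarity: Dice coefficient over character bigrams.

    Assumption: this stands in for a distributional (for example LSA-based)
    word similarity model; it is NOT the model used in the paper.
    """
    x, y = char_bigrams(a), char_bigrams(b)
    return 2 * len(x & y) / (len(x) + len(y))


def directional_score(src: List[str], tgt: List[str],
                      sim: Callable[[str, str], float]) -> float:
    """Align each source term to its best-matching target term and average the scores."""
    if not src or not tgt:
        return 0.0
    return sum(max(sim(s, t) for t in tgt) for s in src) / len(src)


def align_similarity(text_a: str, text_b: str,
                     sim: Callable[[str, str], float] = toy_word_sim) -> float:
    """Symmetric text similarity: mean of the two directional alignment scores."""
    a, b = text_a.split(), text_b.split()
    return 0.5 * (directional_score(a, b, sim) + directional_score(b, a, sim))


if __name__ == "__main__":
    print(align_similarity("a cat sat on the mat", "the cat is sitting on a mat"))
```

Averaging both directions is one way to keep a short text from scoring highly just because all of its terms appear in a much longer one. Whether the original system combines the directions this way is an assumption here.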

Recommended

Article Multidisciplinary Sciences

Multi-qubit correction for quantum annealers

Ramin Ayanzadeh, John Dorband, Milton Halem, Tim Finin

Summary: MQC is a novel postprocessing method for quantum annealers that views the evolution in an open-system as a Gibbs sampler, reducing excited states to new synthetic states with lower energy value. Experimental results show that MQC finds samples with notably lower energy values and improves reproducibility compared to recent hardware/software advances in quantum annealing, such as spin-reversal transforms and classical postprocessing techniques.

SCIENTIFIC REPORTS (2021)

Article Computer Science, Information Systems

The SEMIOTIC Ecosystem: A Semantic Bridge between IoT Devices and Smart Spaces

Roberto Yus, Georgios Bouloukakis, Sharad Mehrotra, Nalini Venkatasubramanian

Summary: Smart space administration and application development face challenges due to the semantic gap between user requirements and IoT device capabilities. The SEMIOTIC ecosystem provides a holistic approach to IoT smart spaces, enabling application development, space management, and service provision. Using a centralized repository and the SEMIOTIC system deployed in each smart space, developers can advertise their applications and interact with them to provide required information, improving reusability and bridging the semantic gap.

ACM TRANSACTIONS ON INTERNET TECHNOLOGY (2022)

Proceedings Paper Computer Science, Artificial Intelligence

One-Shot Federated Group Collaborative Filtering

Maksim E. Eren, Manish Bhattarai, Nicholas Solovyev, Luke E. Richards, Roberto Yus, Charles Nicholas, Boian S. Alexandrov

Summary: This paper presents the first one-shot federated CF implementation, called One-FedCF, to address the privacy problem and communication bottleneck in collaborative filtering. In this approach, clients first apply local CF in parallel to build independent recommenders, then extract global item patterns through joint factorization and build local models through information retrieval transfer.

2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Knowledge Guided Two-player Reinforcement Learning for Cyber Attacks and Defenses

Aritran Piplai, Mike Anoruo, Kayode Fasaye, Anupam Joshi, Tim Finin, Ahmad Ridley

Summary: Cyber defense exercises are crucial for understanding the technical capacity of organizations in facing cyber-threats and discovering unknown vulnerabilities for better defense mechanisms. This paper introduces a two-player game-based reinforcement learning environment that improves the performance of both attacker and defender agents. The convergence of the agents is accelerated through expert knowledge from Cybersecurity Knowledge Graphs.

2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA (2022)

Article Computer Science, Information Systems

JENNER: Just-in-time Enrichment in Query Processing

Dhrubajyoti Ghosh, Peeyush Gupta, Sharad Mehrotra, Roberto Yus, Yasser Altowim

Summary: This study introduces a strategy called JENNER for interactive analytics over incoming data. JENNER progressively improves query answers by exploiting the tradeoffs between cost and quality. Experimental results show that JENNER performs significantly better than naive strategies.

PROCEEDINGS OF THE VLDB ENDOWMENT (2022)

Proceedings Paper Geosciences, Multidisciplinary

QUANTUM-ASSISTED GREEDY ALGORITHMS

Ramin Ayanzadeh, John Dorband, Milton Halem, Tim Finin

Summary: This paper demonstrates how to improve candidate selection in greedy algorithms by leveraging quantum annealers (QAs). By sampling from the ground state of a problem-dependent Hamiltonian using QAs and estimating the probability distribution of problem variables, the proposed quantum-assisted greedy algorithm (QAGA) scheme outperforms state-of-the-art techniques in quantum annealing.

2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022) (2022)

Review Computer Science, Information Systems

Computational Understanding of Narratives: A Survey

Priyanka Ranade, Sanorita Dey, Anupam Joshi, Tim Finin

Summary: Storytelling and the delivery of societal narratives are important for human communication, connection, and understanding. In today's digital age, narratives are conveyed through online mediums such as social media. This shift has made narratives more fragmented and complex, with the potential to influence cultural sentiments, geopolitical events, and more. Therefore, narratives are being used strategically to shape events and promote ideologies. It is crucial to identify and analyze these narratives in order to understand their themes and intentions.

IEEE ACCESS (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Jointly Identifying and Fixing Inconsistent Readings from Information Extraction Systems

Ankur Padia, Francis Ferraro, Tim Finin

Summary: This paper investigates the problem of errors in information extraction systems' outputs and explores methods to detect and correct these errors. The authors contrast consistency with credibility, define and explore consistency and repair tasks, and present a simple yet effective model. Evaluation on three datasets shows consistent improvement in both consistency and repair using a simple MLP model with attention and lexical features.

PROCEEDINGS OF DEEP LEARNING INSIDE OUT (DEELIO 2022): THE 3RD WORKSHOP ON KNOWLEDGE EXTRACTION AND INTEGRATION FOR DEEP LEARNING ARCHITECTURES (2022)

Proceedings Paper Computer Science, Artificial Intelligence

CAPD: A Context-Aware, Policy-Driven Framework for Secure and Resilient IoBT Operations

Sai Sree Laya Chukkapalli, Anupam Joshi, Tim Finin, Robert F. Erbacher

Summary: The Internet of Battlefield Things (IoBT) enhances the operational effectiveness of infantry units by enabling collaboration, secure information sharing, and resilience to attacks. CAPD provides a framework for data and knowledge exchange among autonomous entities, with an IoBT ontology that facilitates controlled information sharing. It enables situational awareness and mitigation of adversary actions, ensuring the resilience of IoBT systems in contested conditions.

ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS IV (2022)

Proceedings Paper Computer Science, Information Systems

SmartSPEC: Customizable Smart Space Datasets via Event-driven Simulations

Andrew Chio, Daokun Jiang, Peeyush Gupta, Georgios Bouloukakis, Roberto Yus, Sharad Mehrotra, Nalini Venkatasubramanian

Summary: This paper presents SmartSPEC, an approach to generate customizable smart space datasets using sensorized spaces. It creates a digital representation of a smart space and generates realistic simulated data. The evaluation results show that the trajectories produced by SmartSPEC are more realistic than synthetic data.

2022 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND COMMUNICATIONS (PERCOM) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

CyBERT: Contextualized Embeddings for the Cybersecurity Domain

Priyanka Ranade, Aritran Piplai, Anupam Joshi, Tim Finin

Summary: CyBERT is a domain-specific BERT model fine-tuned with cybersecurity data, providing high accuracy in performing cybersecurity tasks and offering use-cases in the field of cybersecurity.

2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Generating Fake Cyber Threat Intelligence Using Transformer-Based Models

Priyanka Ranade, Aritran Piplai, Sudip Mittal, Anupam Joshi, Tim Finin

Summary: This paper demonstrates the automatic generation of fake CTI text descriptions using transformers for data poisoning attacks. The attacks result in negative impacts such as incorrect reasoning outputs and disruption of AI-based cyber defense systems.

2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) (2021)

Article Computer Science, Information Systems

A BERT Based Approach to Measure Web Services Policies Compliance With GDPR

Lavanya Elluri, Sai Sree Laya Chukkapalli, Karuna Pande Joshi, Tim Finin, Anupam Joshi

Summary: Data confidentiality is increasingly important, with authorities creating new laws to control how web services data is handled. Web service providers face challenges in complying with evolving regulations across jurisdictions and must update their policies. Comparing web service provider privacy policies with regulatory policies is difficult due to the large and complex nature of regulatory texts.

IEEE ACCESS (2021)

Article Computer Science, Information Systems

Understanding Cybersecurity Threat Trends Through Dynamic Topic Modeling

Jennifer Sleeman, Tim Finin, Milton Halem

Summary: Cybersecurity threats are on the rise and understanding the changing vulnerabilities can help combat new threats. Analyzing cybersecurity document collections through dynamic topic modeling reveals the importance of evolving concepts. Integrating different temporal corpora and representing data in a semantic knowledge graph supports integration, inference, and discovery, enhancing the quality of models.

FRONTIERS IN BIG DATA (2021)

Article Urban Studies

Managing cybersecurity at the grassroots: Evidence from the first nationwide survey of local government cybersecurity

Donald F. Norris, Laura Mateczun, Anupam Joshi, Tim Finin

Summary: This paper examines the management of cybersecurity among local governments in the United States based on the first nationwide survey. The study shows that local governments are largely failing to effectively manage cybersecurity, despite the increasing importance of this function due to constant cyberattacks. Recommendations for improving local government cybersecurity management are provided.

JOURNAL OF URBAN AFFAIRS (2021)
