Journal
COMPUTATIONAL LINGUISTICS
Volume 40, Issue 2, Pages 403-448
Publisher
MIT PRESS
DOI: 10.1162/COLI_a_00176
Funding
- Ministry of Education in Taiwan
Linguistic steganography is concerned with hiding information in natural language text. One of the major transformations used in linguistic steganography is synonym substitution. However, few studies have examined the practical application of this approach. In this article we propose two improvements to the use of synonym substitution for encoding hidden bits of information. First, we use the Google n-gram corpus to check the applicability of a synonym in context, and we evaluate this method using data from the SemEval lexical substitution task and human-annotated data. Second, we address the problem that arises from words with more than one sense, which creates a potential ambiguity as to which bits a particular word represents. We develop a novel method in which words are the vertices in a graph, synonyms are linked by edges, and the bits assigned to a word are determined by a vertex coding algorithm. This method ensures that each word represents a unique sequence of bits, without cutting out large numbers of synonyms, and thus maintains a reasonable embedding capacity.
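The graph formulation in the abstract can be illustrated with a small sketch. The abstract does not spell out the vertex coding algorithm, so the code below substitutes a plain greedy colouring as a stand-in: each word receives exactly one code, and no two words linked by a synonym edge share a code. The function names (`build_synonym_graph`, `assign_codes`), the most-connected-first ordering, and the toy synsets are all assumptions made for illustration; they are not taken from the article.

```python
from itertools import combinations


def build_synonym_graph(synsets):
    """Words are vertices; an edge links two words that share a synset."""
    graph = {}
    for synset in synsets:
        for word in synset:
            graph.setdefault(word, set())
        for a, b in combinations(sorted(synset), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def assign_codes(graph, n_bits):
    """Greedy stand-in for a vertex coding step (an assumption, not the
    article's algorithm): each word takes the lowest code that none of
    its already-coded synonyms uses. A word with no free code gets None
    and would simply not be used for embedding."""
    n_codes = 2 ** n_bits
    numeric = {}
    # Visit the most-connected words first; break ties alphabetically
    # so that the result is deterministic.
    for word in sorted(graph, key=lambda w: (-len(graph[w]), w)):
        used = {numeric[n] for n in graph[word] if numeric.get(n) is not None}
        free = [c for c in range(n_codes) if c not in used]
        numeric[word] = free[0] if free else None
    # Render each numeric code as a fixed-width bit string.
    return {
        w: (format(c, f"0{n_bits}b") if c is not None else None)
        for w, c in numeric.items()
    }


# Toy synsets invented for this example; "splice" belongs to two senses.
synsets = [{"marry", "wed", "splice"}, {"splice", "join"}]
graph = build_synonym_graph(synsets)
codes = assign_codes(graph, n_bits=2)

for word in sorted(codes):
    print(word, codes[word])
```

Because "splice" carries a single code regardless of which sense is intended, a receiver can read its bits without disambiguating the sense, and none of the four words has to be discarded. Two words may still share a code when they are not synonyms of each other ("marry" and "join" in this toy data), since they never compete for the same slot in the text.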