4.5 Article

Collective Graph Identification

Journal

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/2818378

Keywords

Algorithm; Design; Experimentation; Performance; Entity resolution; link prediction; collective classification; semi-supervised learning

Funding

  1. National Science Foundation [0746930, IIS1218488]
  2. Direct For Computer & Info Scie & Enginr
  3. Div Of Information & Intelligent Systems [0746930] Funding Source: National Science Foundation

Ask authors/readers for more resources

Data describing networks-such as communication networks, transaction networks, disease transmission networks, collaboration networks, etc.-are becoming increasingly available. While observational data can be useful, it often only hints at the actual underlying process that governs interactions and attributes. For example, an email communication network provides insight into its users and their relationships, but is not the same as the real underlying social network. In this article, we introduce the problem of graph identification, i.e., discovering the latent graph structure underlying an observed network. We cast the problem as a probabilistic inference task, in which we must infer the nodes, edges, and node labels of a hidden graph, based on evidence. This entails solving several canonical problems in network analysis: entity resolution (determining when two observations correspond to the same entity), link prediction (inferring the existence of links), and node labeling (inferring hidden attributes). While each of these subproblems has been well studied in isolation, here we consider them as a single, collective task. We present a simple, yet novel, approach to address all three subproblems simultaneously. Our approach, which we refer to as C-3, consists of a collection of Coupled Collective Classifiers that are applied iteratively to propagate inferred information among the subproblems. We consider variants of C-3 using different learning and inference techniques and empirically demonstrate that C-3 is superior, both in terms of predictive accuracy and running time, to state-of-the-art probabilistic approaches on four real problems.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.5
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available