4.4 Article

Towards certain fixes with editing rules and master data

Journal

VLDB JOURNAL
Volume 21, Issue 2, Pages 213-238

Publisher

SPRINGER
DOI: 10.1007/s00778-011-0253-7

Keywords

Certain fix; Editing rule; Master data; Data cleaning; Data quality

Funding

  1. RSE-NSFC
  2. IBM
  3. National Basic Research Program of China (973 Program) [2012CB316200]
  4. NGFR [973 2011CB302602]
  5. NSFC [90818028, 60903149]
  6. Engineering and Physical Sciences Research Council [EP/H008063/1, EP/E029213/1] Funding Source: researchfish
  7. EPSRC [EP/H008063/1, EP/E029213/1] Funding Source: UKRI

A variety of integrity constraints have been studied for data cleaning. While these constraints can detect the presence of errors, they fall short of guiding us to correct the errors. Indeed, data repairing based on these constraints may not find certain fixes that are guaranteed correct, and worse still, may even introduce new errors when attempting to repair the data. We propose a method for finding certain fixes, based on master data, a notion of certain regions, and a class of editing rules. A certain region is a set of attributes that are assured correct by the users. Given a certain region and master data, editing rules tell us what attributes to fix and how to update them. We show how the method can be used in data monitoring and enrichment. We also develop techniques for reasoning about editing rules, to decide whether they lead to a unique fix and whether they are able to fix all the attributes in a tuple, relative to master data and a certain region. Furthermore, we present a framework and an algorithm to find certain fixes, by interacting with the users to ensure that one of the certain regions is correct. We experimentally verify the effectiveness and scalability of the algorithm.
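The idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a simplified rule form in which a rule fires only when all of its match attributes are already in the certain region, and it copies a value from master data only when the matching master rows agree on that value. The names `EditingRule` and `apply_rules`, the attribute names, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingRule:
    # Pairs (input_attribute, master_attribute) that must agree.
    match: tuple
    # Pair (input_attribute, master_attribute): value copied from master data.
    update: tuple


def apply_rules(record, certain_region, rules, master):
    """Apply rules until no further attribute can be fixed.

    Returns the updated record and the set of attributes now treated as correct.
    """
    fixed = dict(record)
    certain = set(certain_region)
    progress = True
    while progress:
        progress = False
        for rule in rules:
            target, source = rule.update
            if target in certain:
                continue  # already assured or already fixed
            if any(attr not in certain for attr, _ in rule.match):
                continue  # never match on attributes that may be wrong
            values = {
                row[source]
                for row in master
                if all(fixed[attr] == row[m_attr] for attr, m_attr in rule.match)
            }
            if len(values) == 1:  # fix only when master data is unambiguous
                fixed[target] = values.pop()
                certain.add(target)
                progress = True
    return fixed, certain


# Made-up master data and rules, for illustration only.
master = [
    {"zip": "EH8 9AB", "city": "Edinburgh", "area_code": "131"},
    {"zip": "NW1 6XE", "city": "London", "area_code": "020"},
]
rules = [
    EditingRule(match=(("zip", "zip"),), update=("city", "city")),
    EditingRule(match=(("city", "city"),), update=("area", "area_code")),
]
record = {"zip": "EH8 9AB", "city": "Ldn", "area": "020"}
fixed, certain = apply_rules(record, {"zip"}, rules, master)
print(fixed)
print(sorted(certain))
```

In this sketch the user assures only `zip`. The first rule then fixes `city`, which adds `city` to the certain region and lets the second rule fix `area`. With an empty certain region no rule fires and the record is returned unchanged, which mirrors the abstract's point that fixes are made relative to a certain region and master data.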

Recommended

Article Computer Science, Hardware & Architecture

Virtual Network Mapping in Cloud Computing: A Graph Pattern Matching Approach

Yang Cao, Wenfei Fan, Shuai Ma

COMPUTER JOURNAL (2017)

Article Computer Science, Information Systems

Bounded Query Rewriting Using Views

Yang Cao, Wenfei Fan, Floris Geerts, Ping Lu

ACM TRANSACTIONS ON DATABASE SYSTEMS (2018)

Article Computer Science, Information Systems

From Think Parallel to Think Sequential

Wenfei Fan, Yang Cao, Jingbo Xu, Wenyuan Yu, Yinghui Wu, Chao Tian, Jiaxin Jiang, Bohan Zhang

SIGMOD RECORD (2018)

Article Computer Science, Information Systems

Parallelizing Sequential Graph Computations

Wenfei Fan, Wenyuan Yu, Jingbo Xu, Jingren Zhou, Xiaojian Luo, Qiang Yin, Ping Lu, Yang Cao, Ruiqi Xu

ACM TRANSACTIONS ON DATABASE SYSTEMS (2018)

Article Computer Science, Information Systems

Dependencies for Graphs

Wenfei Fan, Ping Lu

ACM TRANSACTIONS ON DATABASE SYSTEMS (2019)

Article Multidisciplinary Sciences

Making big data small

Wenfei Fan

PROCEEDINGS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES (2019)

Article Automation & Control Systems

Bounded Evaluation: Querying Big Data with Bounded Resources

Yang Cao, Wen-Fei Fan, Teng-Fei Yuan

INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING (2020)

Article Computer Science, Information Systems

Dynamic Scaling for Parallel Graph Computations

Wenfei Fan, Chunming Hu, Muyang Liu, Ping Lu, Qiang Yin, Jingren Zhou

PROCEEDINGS OF THE VLDB ENDOWMENT (2019)

Article Computer Science, Information Systems

Block as a Value for SQL over NoSQL

Yang Cao, Wenfei Fan, Tengfei Yuan

PROCEEDINGS OF THE VLDB ENDOWMENT (2019)

Article Computer Science, Information Systems

Deducing Certain Fixes to Graphs

Wenfei Fan, Ping Lu, Chao Tian, Jingren Zhou

PROCEEDINGS OF THE VLDB ENDOWMENT (2019)

Proceedings Paper Computer Science, Information Systems

Parallel Reasoning of Graph Functional Dependencies

Wenfei Fan, Xueli Liu, Yingjie Cao

2018 IEEE 34TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE) (2018)

Proceedings Paper Computer Science, Information Systems

Catching Numeric Inconsistencies in Graphs

Wenfei Fan, Xueli Liu, Ping Lu, Chao Tian

SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (2018)

Proceedings Paper Computer Science, Information Systems

Discovering Graph Functional Dependencies

Wenfei Fan, Chunming Hu, Xueli Liu, Ping Lu

SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (2018)

Proceedings Paper Computer Science, Information Systems

Adaptive Asynchronous Parallelization of Graph Algorithms

Wenfei Fan, Ping Lu, Xiaojian Luo, Jingbo Xu, Qiang Yin, Wenyuan Yu, Ruiqi Xu

SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (2018)

Proceedings Paper Computer Science, Information Systems

Incremental Graph Computations: Doable and Undoable

Wenfei Fan, Chunming Hu, Chao Tian

SIGMOD'17: PROCEEDINGS OF THE 2017 ACM INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (2017)
