Paper Accepted for SSDBM 2019
A paper by Mohammad Mahdavi and Ziawasch Abedjan was accepted for SSDBM 2019.
Here is the abstract:
In this paper, we propose a new approach to estimate the performance of error detection strategies. Our intuition is that error detection strategies will perform similarly on similarly dirty datasets. We introduce the novel concept of dirtiness profiles, which make datasets comparable with respect to their dirtiness.
Our experiments show that our system REDS accurately estimates the performance of error detection strategies and, solely based on automatically extracted features, outperforms the semi-supervised baseline.
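To make the intuition from the abstract concrete, here is a minimal sketch of estimating a strategy's performance from similarly dirty datasets. It is not the REDS implementation: the two profile features, the function names `dirtiness_profile` and `estimate_f1`, and the nearest-neighbour averaging are all assumptions chosen for illustration, and the abstract does not specify which features or estimator the system uses.

```python
import math


def dirtiness_profile(rows):
    """Hypothetical dirtiness profile: (missing-cell ratio, distinct-value ratio).

    These two features are illustrative assumptions, not the paper's feature set.
    """
    cells = [cell for row in rows for cell in row]
    if not cells:
        return (0.0, 0.0)
    missing = sum(1 for c in cells if c is None or str(c).strip() == "")
    distinct = len({str(c) for c in cells})
    return (missing / len(cells), distinct / len(cells))


def estimate_f1(new_profile, history, k=2):
    """Estimate a strategy's F1 on a new dataset.

    `history` is a list of (profile, observed_f1) pairs from datasets on which
    the strategy was already evaluated. The estimate is the mean F1 of the k
    datasets whose profiles are closest (Euclidean distance) to `new_profile`.
    """
    if not history:
        raise ValueError("history must contain at least one evaluated dataset")
    ranked = sorted(history, key=lambda item: math.dist(new_profile, item[0]))
    nearest = ranked[:k]
    return sum(f1 for _, f1 in nearest) / len(nearest)
```

With a history of previously evaluated datasets, calling `estimate_f1(dirtiness_profile(new_rows), history)` would return an estimate without needing labels for the new dataset, which is the property the abstract highlights.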