Example-Driven Error Detection
Traditional error detection approaches require user-defined parameters and rules. Thus,
the user has to know both the error detection system and the data.
However, we can also formulate error detection as a semi-supervised
classification problem that only requires domain expertise. The
challenges for such an approach are twofold: (1) to represent the data
in a way that enables a classification model to identify various kinds
of data errors across different data types, and (2) to pick the most
promising data values for learning.
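
To make challenge (1) more concrete, the following is a minimal sketch of how a single cell value might be turned into a numeric feature vector that a classifier can consume. The function name cell_features, the four features, and the sample ZIP codes are assumptions chosen for illustration; they are not taken from the system described here.

```python
from collections import Counter


def cell_features(value, column_values):
    """Map one cell value to a small numeric feature vector (illustrative only)."""
    text = str(value)
    n = max(len(text), 1)  # avoid division by zero for empty cells
    counts = Counter(str(v) for v in column_values)
    return [
        len(text),                             # string length
        sum(ch.isdigit() for ch in text) / n,  # share of digit characters
        sum(ch.isalpha() for ch in text) / n,  # share of letter characters
        counts[text] / len(column_values),     # relative frequency in the column
    ]


# Hypothetical column: one typo (letter O instead of zero) and one missing value.
zip_codes = ["10587", "10623", "10587", "1O623", ""]
for v in zip_codes:
    print(repr(v), cell_features(v, zip_codes))
```

Features of this kind are independent of the column's data type, which is why the same representation could in principle cover typos, missing values, and rare values alike.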
We developed an active learning-based system called ED2 that
achieves state-of-the-art error detection accuracy without any
configuration while requiring only a small fraction of user labels.
The code is available at github.com/BigDaMa/ExampleDrivenErrorDetection
[1].
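
The general idea of active learning for error detection can be sketched as follows. This is a generic uncertainty-sampling loop with a tiny logistic regression, not the ED2 implementation; the functions predict, train, and active_learning, the toy feature vectors, and the simulated user are all assumptions made for illustration.

```python
import math


def predict(w, x):
    """Probability that the cell with feature vector x is erroneous."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, epochs=200, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            w[0] -= lr * err
            for j, xj in enumerate(xi):
                w[j + 1] -= lr * err * xj
    return w


def active_learning(pool, ask_user, seed, budget):
    """Ask the user to label the cells the current model is least sure about."""
    labels = {i: ask_user(i) for i in seed}
    for _ in range(budget):
        idx = sorted(labels)
        w = train([pool[i] for i in idx], [labels[i] for i in idx])
        unlabeled = [i for i in range(len(pool)) if i not in labels]
        if not unlabeled:
            break
        # Uncertainty sampling: pick the prediction closest to 0.5.
        query = min(unlabeled, key=lambda i: abs(predict(w, pool[i]) - 0.5))
        labels[query] = ask_user(query)
    idx = sorted(labels)
    return train([pool[i] for i in idx], [labels[i] for i in idx]), labels


# Hypothetical pool: [share of digits, relative frequency] per cell.
pool = [[1.0, 0.4], [1.0, 0.3], [0.8, 0.1], [0.0, 0.1],
        [1.0, 0.2], [0.6, 0.1], [1.0, 0.4], [0.2, 0.1]]
truth = [0, 0, 1, 1, 0, 1, 0, 1]  # 1 = erroneous; stands in for the domain expert

w, labels = active_learning(pool, lambda i: truth[i], seed=[0, 3], budget=3)
print("cells labeled by the user:", sorted(labels))
print("error probabilities:", [round(predict(w, x), 2) for x in pool])
```

The user only answers whether a shown value is correct, which requires domain expertise but no knowledge of the detection system. The budget parameter bounds how many labels are requested beyond the seed examples.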
Copyright TU Berlin 2008