Paper accepted for the ACM Journal of Data and Information Quality (JDIQ)
The paper "Anatomy of Metadata for Data Curation" by Larysa Visengeriyeva and Ziawasch Abedjan has been accepted for publication in the ACM Journal of Data and Information Quality (JDIQ).
Here is the abstract:
"Real-world datasets often suffer from various data quality problems. Several data cleaning solutions have been proposed so far. However, data cleaning remains a manual and iterative task that requires domain and technical expertise. Exploiting metadata promises to improve the tedious process of data preparation because data errors are detectable through metadata. This paper investigates the intrinsic connection between metadata and data errors. In this work, we establish a mapping that reflects the connection between data quality issues and extractable metadata using qualitative and quantitative techniques. Additionally, we present a taxonomy based on a closed grammar that covers all existing metadata and allows the composition of novel types of metadata. We provide a case study to show the practical application of the grammar for generating new metadata for data quality assessment."