TU Berlin

Paper Accepted for the Gongshow Presentation at CIDR 2020

The paper "CAFE: Constraint-Aware Feature Extraction from Large Databases" by Mahdi Esmailoghli and Ziawasch Abedjan was accepted for the Gongshow Presentation at CIDR 2020.

Here is the abstract:

"Feature extraction is a core step of most machine learning pipelines. If a given dataset does not provide useful features, one has to enhance it with features from additional sources. This turns into a manual and tedious search and retrieval process.

At the same time, users may want to uphold certain constraints on the new features, such as consistency of values, interpretability, or fairness.

We propose a constraint-aware feature extraction method that scans a given large database for promising features and effectively translates user-defined constraints into extraction filters. To this end, we show how the initial retrieval engine of our system can sift through millions of tables to find relevant features and how our execution planner optimizes the required, data-dependent pruning process for fast and effective feature extraction."
