TU Berlin

Fachgebiet Big Data Management

Welcome to the Big Data Management Group at the TU Berlin!

The key attributes of Big Data are often described by three (or more) "V's": Big Volume, Big Velocity, and Big Variety. In this group, we mainly focus on the last "V", the Big Variety of data.

To use and combine data from e-commerce, sensors, and social media services, integration and curation routines have to be employed. The heterogeneity of data impedes the seamless integration of different sources and requires human intervention in the form of exhaustive profiling and data preparation efforts. Hence, research on Big Data calls for scalable data profiling and integration systems that enable the curation and consumption of many large and diverse data sources.

The deployment of sophisticated analytics on data (big analytics) is closely related to the profiling and integration of large datasets described above. We are interested in systems that leverage mining and machine learning techniques to derive knowledge from dirty and poorly organized data. This includes developing sketching and summarization techniques that reduce a big dataset to its relevant core information.

News

Paper Accepted for SIGMOD 2019

5 May 2019

Our paper on a configuration-free error detection system was accepted for SIGMOD 2019.

Paper Accepted for SSDBM 2019

5 May 2019

Our paper on estimating error detection performance was accepted for SSDBM 2019.

First Prize at BTW Data Science Challenge 2019

7 March 2019

Mahdi Esmailoghli and his team won the BTW Data Science Challenge in 2019.

Short Paper accepted for EDBT 2019

30 December 2018

We are happy to announce that our short paper "Feature Engineering for Cross-Language Record Linkage" by Öykü Özlem Çakal, Mohammad Mahdavi, and Ziawasch Abedjan was accepted for presentation at EDBT 2019.

Paper accepted for ICDE 2019

17 December 2018

We are happy to announce that our research paper "Unsupervised String Transformation Learning for Entity Consolidation" was accepted for publication in the ICDE 2019 proceedings.

New Book on Data Profiling

23 November 2018

We are happy to announce that our book "Data Profiling", written by Ziawasch Abedjan, Lukasz Golab, Felix Naumann, and Thorsten Papenbrock, has been published by Morgan & Claypool and is available for purchase.

Larysa receives PAS Scholarship

21 September 2018

Larysa Visengeriyeva received a PAS scholarship, which is awarded to female researchers in the final phase of their dissertation.

Visit by Michael Stonebraker

21 September 2018

Michael Stonebraker visited the BigDaMa group at the TU Berlin and gave a talk in the BBDC Seminar.

Chapter published in eBISS 2017

27 August 2018

Prof. Abedjan contributed a chapter to the seventh volume of the eBISS series, titled "Business Intelligence and Big Data".

Paper accepted for SSDBM 2018

14 May 2018

Our paper on Metadata-Driven Error Detection was accepted for SSDBM 2018.

Paper accepted for ICDE 2018

13 February 2018

Our paper on data discovery was accepted for ICDE 2018.

Demo Paper accepted at ICDE

24 December 2017

Our proposal to demonstrate the workflow generation of Data Civilizer was accepted at ICDE 2018.

DFG Grant: Tractable Curation Workflows

3 November 2017

We are pleased to announce that the DFG is supporting our research ...

Digital Science Match Participation

15 May 2017

Prof. Abedjan joined the Digital Science Match with a presentation on data integration research.

Invited Talk at HPI Symposium on Future Trends in SOC

27 April 2017

Prof. Abedjan was invited to present his talk "Data Curation in the Wild: Limits and Challenges" at the annual HPI Symposium on Future Trends in Service-oriented Computing.

Demo Paper accepted at SIGMOD 2017

27 February 2017

The Data Civilizer demo, a collaboration with MIT, QCRI, and the University of Waterloo, was accepted at SIGMOD 2017.

Tutorial accepted at SIGMOD 2017

21 February 2017

Our tutorial on Data Profiling has been accepted for a 90-minute presentation at SIGMOD 2017.

Onwrks receives Exist Funding

6 December 2016

We congratulate the founders of Onwrks, Anatoli Kantarovich, Nimrod Knoller, and Michael Steimel, on receiving the Exist starting grant. Onwrks is a Berlin-based software startup specializing in digital tools for wind turbine data management. It is scientifically mentored by Prof. Ziawasch Abedjan.

Paper accepted for CIDR 2017

12 October 2016

Our paper titled "The Data Civilizer System" was accepted at the CIDR 2017 conference.

Talk at the AT&T Research Seminar

26 September 2016

Prof. Abedjan gave a talk on "Data Curation in the Wild: Limits and Challenges".
