I’m currently an applied scientist at Amazon’s Core Machine Learning Team in Berlin, and a guest lecturer at the Database Systems and Information Management Group of TU Berlin.
My research focuses on the intersection of data management and machine learning. It covers metadata management for end-to-end ML applications, data quality, system design for parallel data processing, scalable algorithms and, more recently, the application of data mining to domains such as the web and social networks.
I received my Ph.D. from TU Berlin, advised by Volker Markl. During my studies, I interned at IBM Research Almaden and Twitter in California. I'm also active in open source as a member of the Apache Software Foundation, where I have been a committer and PMC member in the Mahout, Giraph and Flink projects. I'm currently serving as a mentor for the Apache MXNet project during its incubation.
- Our paper “On the Ubiquity of Web Tracking: Insights from a Billion-Page Web Crawl” has been accepted by the Journal of Web Science
- Our extended abstract on “Declarative Metadata Management: A Missing Piece in End-to-End Machine Learning” has been accepted at the upcoming SysML conference
- Matei Zaharia from Stanford/Databricks will give an invited talk at our upcoming second workshop on “Data Management for End-to-End Machine Learning (DEEM)” at SIGMOD 2018
- Our paper on “Automatically Tracking Metadata and Provenance of Machine Learning Experiments” has been accepted for publication at the Workshop on Machine Learning Systems at NIPS 2017