News

Recent Publications

(2020). Technical Perspective: Query Optimization for Faster Deep CNN Explanations. ACM SIGMOD Record (Vol 49, Issue 1).

(2020). Demand Forecasting in the Presence of Privileged Information. Workshop on Advanced Analytics and Learning on Temporal Data at ECML/PKDD.

(2020). A Comparison of Supervised Learning to Match Methods for Product Search. eCommerce workshop at SIGIR.

(2020). Analyzing and Predicting Purchase Intent in E-commerce: Anonymous vs. Identified Customers. eCommerce workshop at SIGIR.

(2020). Apache Mahout: Machine Learning on Distributed Dataflow Systems. Journal of Machine Learning Research (JMLR), open source software track.

(2020). AlphaJoin: Join Order Selection à la AlphaGo. PhD workshop at VLDB.

(2020). Fairness-Aware Instrumentation of Preprocessing Pipelines for Machine Learning. Human-In-the-Loop Data Analytics workshop at ACM SIGMOD.

(2020). HDDse: Enabling High-Dimensional Disk State Embedding for Generic Failure Detection of Heterogeneous Disks in Large Data Centers. USENIX Annual Technical Conference (ATC).

Team

I am part of the Information and Language Processing Systems group led by Maarten de Rijke and the Intelligent Data Engineering Lab led by Paul Groth.


PhD Students


Mozhdeh Ariannezhad (co-supervised with Maarten de Rijke)
Olivier Sprangers (co-supervised with Maarten de Rijke)
Mariya Hendriksen (co-supervised with Maarten de Rijke)
Sami Jullien (co-supervised with Maarten de Rijke)
Arezoo Sarvi (co-supervised with Maarten de Rijke)
Sergey Redyuk, TU Berlin (co-supervised with Volker Markl)


Associated Researchers

Stefan Grafberger, TU Munich (Master Student)
Dr. Ji Zhang, Huawei (Postdoc)

Collaborations

CV

Before joining the University of Amsterdam, I was a Faculty Fellow at the Center for Data Science at New York University and a Senior Applied Scientist at Amazon Core AI in Berlin, where I worked on data management issues in machine learning applications, such as demand forecasting, metadata and provenance tracking for machine learning pipelines, and automated data quality verification.

I received my Ph.D. from TU Berlin in 2015, where I was advised by Volker Markl, head of the database systems and information management group. My co-supervisors were Klaus-Robert Müller from the machine learning group at TU Berlin and Reza Zadeh from Stanford. During my studies, I interned with the SystemML group at IBM Research Almaden and the social recommendations team at Twitter in California.

I am engaged in open source as an elected member of the Apache Software Foundation, where I currently mentor the Apache TVM project on behalf of the Apache Incubator. In the past, I have been involved in the Apache Mahout, Apache Flink, Apache Giraph and Apache MXNet projects. I currently contribute actively to deequ, a library for ‘unit-testing’ large datasets with Apache Spark, and to recoreco, a fast item-to-item recommender written in Rust.

Service

I am the founder and chair of the workshop series on Data Management for End-To-End Machine Learning (DEEM) at ACM SIGMOD, which started in 2017.

I regularly review submissions to top-tier data management conferences. I have served on the program committees of SIGMOD 2017, 2019-2021, VLDB 2021, ICDE 2018-2020, EDBT 2017 & 2021, CIKM’20, the workshop on Exploiting Artificial Intelligence Techniques for Data Management at SIGMOD 2019, the Large-Scale Recommender Systems workshop at ACM RecSys 2013-2015, the workshop on Applied AI for Database Systems and Applications at VLDB’20, and Provenance Week’20. Additionally, I have reviewed journal submissions for IEEE TKDE, ACM TIST, IEEE TPDS, IEEE TNNLS, VLDB Journal, the journal track of ECML/PKDD and the open source track of JMLR. I have also been a reviewer for the Amazon Research Awards.

At the University of Amsterdam, I coordinate the honors program of the bachelor's program in AI.

Contact

I’m reachable via email at s.schelter[at]uva.nl. I’m also very active on Twitter as @sscdotopen. Most of the research code that I write is available under an open source license on my GitHub account. Last but not least, I also have a profile on Google Scholar.