Coresets for Deletion-Robust k-Center Clustering
Published in CIKM, 2024
The \(k\)-center clustering problem is of fundamental importance for a broad range of machine learning and data science applications. In this paper, we study the deletion-robust version of the problem. Specifically, we aim to extract a small subset of a given data set, referred to as a coreset, that contains a provably good set of \(k\) centers even after an adversary deletes up to \(z\) arbitrarily chosen points from the data set. We propose a 4-approximation algorithm that provides a coreset of size \(O(k z)\). To our knowledge, this is the first algorithm for deletion-robust \(k\)-center clustering with a theoretical guarantee. Moreover, we accompany our theoretical results with extensive experiments, demonstrating that our algorithm achieves significantly better robustness than non-trivial baselines against three heuristic gray-box and white-box adversarial deletion attacks.