PhD, Mathematics, 2025 (exp.)
Technische Universität Berlin
MSc, Mathematics in Data Science, 2020
Technical University of Munich
BSc, Mathematics, 2017
University of Tirana
We study the problem of finding optimal sparse, manifold-aligned counterfactual explanations for classifiers. Canonically, this can be formulated as an optimization problem with multiple non-convex components, including classifier loss functions and manifold alignment (or plausibility) metrics. Enforcing sparsity, i.e., favoring shorter explanations, complicates the problem further. Existing methods often focus on specific models and plausibility measures, relying on convex l_1 regularizers to enforce sparsity. In this paper, we tackle the canonical formulation using the accelerated proximal gradient (APG) method, a simple yet efficient first-order procedure capable of handling smooth non-convex objectives and non-smooth l_p regularizers (0 <= p < 1). This enables our approach to seamlessly incorporate various classifiers and plausibility measures while producing sparser solutions. Our algorithm requires only differentiable data-manifold regularizers and supports box constraints for bounded feature ranges, ensuring that the generated counterfactuals remain actionable. Finally, experiments on real-world datasets demonstrate that our approach effectively produces sparse, manifold-aligned counterfactual explanations while remaining close to the factual data and computationally efficient.
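A minimal sketch of an accelerated proximal gradient loop for this kind of counterfactual search, assuming a user-supplied gradient grad_smooth of the smooth terms (classifier loss, distance, differentiable plausibility regularizer). The l_1 soft-thresholding prox and the post-hoc box clipping are illustrative simplifications of the paper's l_p (0 <= p < 1) regularizer and exact constraint handling; all names and parameters are hypothetical.

import numpy as np

def soft_threshold(z, tau):
    # Prox of tau * ||.||_1, used here as an illustrative stand-in for
    # the non-smooth l_p (0 <= p < 1) sparsity regularizer.
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def apg_counterfactual(x_fact, grad_smooth, lam, lower, upper,
                       step=1e-2, n_iter=500):
    # x_fact: factual instance; grad_smooth(x): gradient of the smooth
    # part of the objective; lam: sparsity weight; lower/upper: feature
    # box constraints keeping the counterfactual in valid ranges.
    x_prev = x_fact.copy()
    y = x_fact.copy()
    t = 1.0
    for _ in range(n_iter):
        z = y - step * grad_smooth(y)              # forward (gradient) step
        # Backward (proximal) step on the perturbation x - x_fact,
        # followed by clipping to the feature box (a simplification of
        # an exact joint proximal step with the box indicator).
        x = x_fact + soft_threshold(z - x_fact, step * lam)
        x = np.clip(x, lower, upper)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x + ((t - 1.0) / t_next) * (x - x_prev)  # Nesterov momentum
        x_prev, t = x, t_next
    return x_prev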
In this paper, we present an algorithm that simultaneously generates group-wise sparse attacks within semantically meaningful areas of an image. The core operation of each iteration is the optimization of a quasinorm adversarial loss: for a number of steps the algorithm employs the 1/2-quasinorm proximal operator, a method tailored to non-convex programming, and subsequently transitions to projected Nesterov’s accelerated gradient descent with 2-norm regularization applied to the perturbation magnitudes. We rigorously evaluate the efficacy of our novel attack in both targeted and non-targeted scenarios on the CIFAR-10 and ImageNet datasets. Compared to state-of-the-art methods, our attack consistently achieves a remarkable increase in group-wise sparsity, e.g., 50.9% on CIFAR-10 and 38.4% on ImageNet (average case, targeted attack), while maintaining lower perturbation magnitudes. Notably, this performance is complemented by significantly faster computation times and a 100% attack success rate.
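The 1/2-quasinorm proximal operator mentioned above admits a known closed form (often called half thresholding). The sketch below is an elementwise illustration under the convention prox_lam(y) = argmin_x 0.5*(x - y)^2 + lam*|x|^(1/2); it is not the paper's full group-wise procedure, and the function name is hypothetical.

import numpy as np

def prox_half_quasinorm(y, lam):
    # Elementwise proximal operator of lam * |x|^(1/2):
    #   argmin_x 0.5 * (x - y)^2 + lam * |x|^(1/2)
    # Entries with |y| <= 1.5 * lam^(2/3) are set exactly to zero; the
    # remaining entries follow the trigonometric closed form.
    y = np.asarray(y, dtype=float)
    out = np.zeros_like(y)
    thresh = 1.5 * lam ** (2.0 / 3.0)
    mask = np.abs(y) > thresh
    ym = np.abs(y[mask])
    phi = np.arccos((3.0 * np.sqrt(3.0) * lam) / (4.0 * ym ** 1.5))
    out[mask] = np.sign(y[mask]) * (2.0 / 3.0) * ym * (
        1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))
    return out

Roughly, such an operator can be applied to the perturbation after each gradient step, driving most entries (or, in the group-wise setting, whole groups) exactly to zero before a projected accelerated gradient phase refines the surviving coordinates.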
We present a concise optimal-control optimization approach to continuous-depth deep learning models, discussing ideas and algorithms derived from the optimality conditions of Pontryagin’s Maximum Principle. These emerging constant-memory-cost models, however, remain vulnerable to adversarial attacks. Besides theoretically highlighting the inconsistency of neural networks, we experiment with adversarial deformations of neural ordinary differential equations on MNIST and compare our results to convolutional neural network based architectures.
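As a rough illustration of what a continuous-depth (constant-memory) block looks like, the sketch below integrates a simple learned vector field dz/dt = tanh(W z + b) with fixed-step Euler updates; the choice of field, solver, and all names are assumptions made for illustration, not the architecture studied in the paper.

import numpy as np

def neural_ode_forward(z0, W, b, t1=1.0, n_steps=20):
    # Continuous-depth block: integrate dz/dt = tanh(W z + b) from t = 0
    # to t = t1 with explicit Euler steps; "depth" is the number of
    # solver steps rather than a stack of discrete layers.
    z = z0.copy()
    dt = t1 / n_steps
    for _ in range(n_steps):
        z = z + dt * np.tanh(W @ z + b)
    return z

# In the optimal-control reading, z(t) is the state and the network
# parameters play the role of controls; Pontryagin's Maximum Principle
# yields the adjoint (backward) equation behind constant-memory gradients.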