Mateusz Pach

I am a PhD researcher at the Technical University of Munich, where I work on interpreting and adapting multimodal generative models.

Previously, I completed my Bachelor's and Master's degrees at Jagiellonian University in Kraków, where I worked with the GMUM research group.

I have also done internships at Amazon and G-Research.

Selected Publications

The Latent Color Subspace: Emergent Order in High-Dimensional Chaos

Mateusz Pach*, Jessica Bader*, Quentin Bouniot, Serge Belongie, Zeynep Akata
ICML 2026

Inside FLUX’s VAE latent space, we uncover a structured color subspace aligned with Hue, Saturation, and Lightness, enabling training-free color control via closed-form latent manipulation.

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models

Mateusz Pach, Shyamgopal Karthik, Quentin Bouniot, Serge Belongie, Zeynep Akata
NeurIPS 2025

With the proposed Monosemanticity Score, we show that SAEs in VLMs discover monosemantic, interpretable features, enabling fine-grained control over learned representations.

Stitch: Training-Free Position Control in Multimodal Diffusion Transformers

Jessica Bader*, Mateusz Pach*, Maria Bravo, Serge Belongie, Zeynep Akata
In Review

Stitch is a training-free method for controlling spatial positioning in modern text-to-image models via attention constraints, splitting generation into sub-regions and stitching them together.

LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision

Mateusz Pach, Dawid Rymarczyk, Koryna Lewandowska, Jacek Tabor, Bartosz Zieliński
ICLR 2025

LucidPPN introduces an interpretable prototypical parts-based method that provides clear, unambiguous visual explanations by disentangling color from all other features used by the network.

TORE: Token Recycling in Vision Transformers for Efficient Active Visual Exploration

Jan Olszewski, Dawid Rymarczyk, Piotr Wójcik, Mateusz Pach, Bartosz Zieliński
WACV 2025

TORE is a token recycling mechanism that improves efficiency in active visual exploration by reusing informative tokens across processing steps.