I'm a PhD student in Artificial Intelligence at MICC, University of Florence, working under the guidance of Prof. Andrew D. Bagdanov and Prof. Marco Bertini. With a background in Computer Engineering and AI, my research focuses on pushing the boundaries of Multimodal Vision-Language Models (like CLIP) and their real-world applications.
My work has been published in top-tier venues including CVPR 2026, ICLR 2026, ICLR 2025, ECCV 2024, and NeurIPS 2023 (workshop).
I recently completed an Applied Scientist Internship at Amazon (RufusX Team, London), where I worked on foundational research and development in Generative AI and Multimodal Large Language Models (MLLMs) as part of the Amazon Rufus initiative.
For more information, feel free to visit my website: marcomistretta.github.io
- IsoCLIP: Decomposing CLIP Projectors for Efficient Intra-modal Alignment. CVPR 2026 (main conference). Authors: Magistri S., Goswami D., Mistretta M., Twardowski B., van de Weijer J., Bagdanov A. D.
- SpectralGCD: Spectral Concept Selection and Cross-modal Representation Learning for Generalized Category Discovery. ICLR 2026 (main conference). Authors: Caselli L., Mistretta M., Magistri S., Bagdanov A. D. Code: GitHub Repository
- Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion. ICLR 2025 (main conference). Authors: Mistretta M.*, Baldrati A.*, Agnolucci L.*, Bertini M., Bagdanov A. D. Code: GitHub Repository
- Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation. ECCV 2024 (main conference). Authors: Mistretta M.*, Baldrati A.*, Bertini M., Bagdanov A. D. Code: GitHub Repository
- RE-tune: Incremental Fine Tuning of Biomedical Vision-Language Models for Multi-label Chest X-ray Classification. NeurIPS 2023, Medical Imaging meets NeurIPS Workshop. Authors: Mistretta M., Bagdanov A. D.
July 2025 – December 2025
- Worked on Generative AI and Multimodal Large Language Models (MLLMs) within the Amazon Rufus initiative.
- Fine-tuned, evaluated, and deployed large-scale multimodal models serving millions of customers.
- Collaborated with scientists and engineers to advance real-world multimodal reasoning and generation.
- Multimodal Learning: Combining visual and language data for richer model understanding.
- Prompt Learning: Tuning learnable parameters to maximize VLM performance.
- Contrastive Self-Supervised Learning: Finding patterns in unlabeled data.
- Incremental Learning: Allowing AI models to keep learning without forgetting.
- Few-Shot Adaptation: Quickly adapting AI to new tasks with minimal examples.
- Programming Languages: Python, Java, C++, MATLAB, R
- Frameworks & Tools: PyTorch, TensorFlow, NumPy, OpenCV, Git, Docker
