Tao Hu

Ommer Lab.


Computer Vision & Learning Group

Akademiestr. 7, Munich

Ludwig Maximilian University of Munich

I am a Postdoctoral Research Fellow with Björn Ommer at Ommer-Lab (the Stable Diffusion Lab), where I explore the scalability and generalization ability of diffusion models in the nxtaim project. I completed my PhD at VISLab, University of Amsterdam.

I am recruiting Bachelor's, Master's, and PhD students for supervision in Munich and globally. If you are interested in collaborating, feel free to send me an email.


My research focuses on introducing inductive biases into neural networks to achieve data efficiency, e.g., through few-shot learning and generative models. I am convinced that generative modelling is the future of discriminative modelling.

Publication | GitHub | CV (updated Nov. 2023) | LinkedIn | Research Note | Chat with me | WeChat

news

Dec 10, 2024 [MASK] is All You Need on arXiv. Two papers, including DepthFM, accepted by AAAI 2025.
Dec 08, 2024 Distillation of Diffusion Features for Semantic Correspondence accepted by WACV 2025. Scaling Image Tokenizers with Grouped Spherical Quantization on arXiv.
Dec 06, 2024 NeurIPS 2024 Excellent Reviewer.
Jul 01, 2024 Two papers (including ZigMa) accepted by ECCV! ZigMa: A DiT-style Mamba-based diffusion model was also accepted as an oral at the ICML Workshop on Long Context Foundation Models (LCFM). ✨
Jun 03, 2024 Gave talks at Adobe Research and A-Star introducing our work on ZigMa.

selected publications

  1. [MASK] is All You Need
    Vincent Tao Hu and Björn Ommer
    In arXiv, 2024
  2. DepthFM: Fast Monocular Depth Estimation with Flow Matching
    Ming Gui, Johannes S. Fischer, Ulrich Prestel, and 6 more authors
    In AAAI, 2025
    An exploration of flow matching for blazingly fast, zero-shot depth estimation.
  3. Does VLM Classification Benefit from LLM Description Semantics?
    Pingchuan Ma, Lennart Rietdorf, Dmytro Kotovenko, and 2 more authors
    In AAAI, 2025
  4. Distillation of Diffusion Features for Semantic Correspondence
    Frank Fundel, Johannes Schusterbauer, Vincent Tao Hu, and 1 more author
    In WACV, 2025
  5. Scaling Image Tokenizers with Grouped Spherical Quantization
    Jiangtao Wang, Zhen Qin, Yifan Zhang, and 4 more authors
    In arXiv, 2024
  6. Diffusion Models and Representation Learning: A Survey
    Michael Fuest, Pingchuan Ma, Ming Gui, and 3 more authors
    In arXiv, 2024
    The interplay between diffusion models and representation learning.
  7. ZigMa: A DiT-style Zigzag Mamba Diffusion Model
    Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, and 4 more authors
    In ECCV, 2024
    A DiT-style Mamba-based diffusion model.
  8. Boosting Latent Diffusion with Flow Matching
    Johannes S. Fischer, Ming Gui, Pingchuan Ma, and 4 more authors
    In ECCV, 2024
    Flow matching for super-resolution.
  9. Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
    Stefan Andreas Baumann, Felix Krause, Michael Neumayr, and 3 more authors
    In arXiv, 2024
    Prompt editing in T2I models.
  10. Guided Flow Vision Transformer from Self-Supervised Diffusion Features
    Vincent Tao Hu, Yunlu Chen, Mathilde Caron, and 3 more authors
    In arXiv, 2024
  11. Motion Flow Matching for Human Motion Synthesis and Editing
    Vincent Tao Hu, Wenzhe Yin, Pingchuan Ma, and 7 more authors
    In arXiv, 2024
  12. Training Class-Imbalanced Diffusion Model Via Overlap Optimization
    Divin Yan, Lu Qi, Vincent Tao Hu, and 2 more authors
    In arXiv, 2024
  13. Latent Space Editing in Transformer-based Flow Matching
    Vincent Tao Hu, David W Zhang, Pascal Mettes, and 3 more authors
    In AAAI, 2024. Also appeared in the ICML 2023 Workshop on New Frontiers in Learning, Control, and Dynamical Systems.
  14. ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model
    Eslam Mohamed BAKR, Liangbing Zhao, Vincent Tao Hu, and 3 more authors
    In arXiv, 2024
  15. Generative Data Augmentation Improves Scribble-supervised Semantic Segmentation
    Jacob Schnell, Jieke Wang, Lu Qi, and 2 more authors
    In SyntaGen CVPR workshop, 2024
    Explores diffusion models for data augmentation in segmentation tasks.
  16. Flow Matching for Conditional Text Generation in a Few Sampling Steps
    Vincent Tao Hu, Di Wu, Yuki M. Asano, and 4 more authors
    In EACL, 2024
    Flow matching for text generation.
  17. Self-Guided Diffusion Models
    Tao Hu*, David W Zhang*, Yuki M. Asano, and 2 more authors
    In CVPR, 2023
    A bridge between the communities of self-supervised learning and diffusion models. A short version appeared in the NeurIPS 2022 Workshop on Score-Based Methods and the NeurIPS 2022 Workshop on Self-Supervised Learning: Theory and Practice.