publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2025

  1. DepthFM: Fast Monocular Depth Estimation with Flow Matching
    Ming Gui, Johannes S. Fischer, Ulrich Prestel, and 6 more authors
    In AAAI, 2025
    An exploration of flow matching for blazing fast and zero-shot depth estimation
  2. Does VLM Classification Benefit from LLM Description Semantics?
    Pingchuan Ma, Lennart Rietdorf, Dmytro Kotovenko, and 2 more authors
    In AAAI, 2025
  3. Distillation of Diffusion Features for Semantic Correspondence
    Frank Fundel, Johannes Schusterbauer, Vincent Tao Hu, and 1 more author
    In WACV, 2025

2024

  1. [MASK] is All You Need
    Vincent Tao Hu and Björn Ommer
    In Arxiv, 2024
  2. Scaling Image Tokenizers with Grouped Spherical Quantization
    Jiangtao Wang, Zhen Qin, Yifan Zhang, and 4 more authors
    2024
  3. Diffusion Models and Representation Learning: A Survey
    Michael Fuest, Pingchuan Ma, Ming Gui, and 3 more authors
    In Arxiv, 2024
    The interplay between diffusion models and representation learning
  4. ZigMa: A DiT-style Zigzag Mamba Diffusion Model
    Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, and 4 more authors
    In ECCV, 2024
    A DiT-style Mamba-based diffusion model
  5. Boosting Latent Diffusion with Flow Matching
    Johannes S. Fischer, Ming Gui, Pingchuan Ma, and 4 more authors
    In ECCV, 2024
    Flow matching for super-resolution
  6. Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
    Stefan Andreas Baumann, Felix Krause, Michael Neumayr, and 3 more authors
    In Arxiv, 2024
    Prompt Editing in T2I models
  7. Guided Flow Vision Transformer from Self-Supervised Diffusion Features
    Vincent Tao Hu, Yunlu Chen, Mathilde Caron, and 3 more authors
    In Arxiv, 2024
  8. Motion Flow Matching for Human Motion Synthesis and Editing
    Vincent Tao Hu, Wenzhe Yin, Pingchuan Ma, and 7 more authors
    In Arxiv, 2024
  9. Training Class-Imbalanced Diffusion Model Via Overlap Optimization
    Divin Yan, Lu Qi, Vincent Tao Hu, and 2 more authors
    In Arxiv, 2024
  10. Latent Space Editing in Transformer-based Flow Matching
    Vincent Tao Hu, David W Zhang, Pascal Mettes, and 3 more authors
    In AAAI 2024. Also appeared in ICML 2023 Workshop, New Frontiers in Learning, Control, and Dynamical Systems, 2024
  11. ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model
    Eslam Mohamed BAKR, Liangbing Zhao, Vincent Tao Hu, and 3 more authors
    In Arxiv, 2024
  12. Generative Data Augmentation Improves Scribble-supervised Semantic Segmentation
    Jacob Schnell, Jieke Wang, Lu Qi, and 2 more authors
    In SyntaGen CVPR workshop, 2024
    Explores diffusion models for data augmentation in segmentation tasks.
  13. Flow Matching for Conditional Text Generation in a Few Sampling Steps
    Vincent Tao Hu, Di Wu, Yuki M. Asano, and 4 more authors
    In EACL, 2024
    Flow Matching for text generation

2023

  1. On the Few-Shot Generalization of Learning on Implicit Neural Representations
    Vincent Tao Hu, David W Zhang, Yunlu Chen, and 6 more authors
    In ICCV NeRF4ADR Workshop, 2023
    Explore few-shot generalization of INR on images.
  2. Query by Activity Video in the Wild
    Vincent Tao Hu, William Thong, Pascal Mettes, and 1 more author
    In ICIP, 2023
    Few-shot video retrieval
  3. Self-Guided Diffusion Models
    Tao Hu*, David W Zhang*, Yuki M. Asano, and 2 more authors
    In CVPR, 2023
    A bridge between the community of self-supervised learning and diffusion models. Short version to appear in NeurIPS 2022 Workshop on Score-Based Methods and NeurIPS 2022 Workshop Self-Supervised Learning Theory and Practice.

2021

  1. Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
    Martine Toering, Ioannis Gatopoulos, Maarten Stol, and 1 more author
    In WACV, 2021
    Improves video representations by contrasting prototypical features.

2020

  1. Localizing the Common Action Among a Few Videos
    Pengwan Yang*, Tao Hu*, Pascal Mettes, and 1 more author
    In European Conference on Computer Vision (ECCV), 2020
    Localizing the temporal extent of an action in a long untrimmed video by attention techniques.
  2. Pointmixup: Augmentation for point clouds
    Yunlu Chen*, Tao Hu*, Efstratios Gavves, and 4 more authors
    In European Conference on Computer Vision (ECCV), 2020
    A simple augmentation method based on MixUp that boosts performance on point cloud tasks.
  3. Interactivity proposals for surveillance videos
    Shuo Chen, Pascal Mettes, Tao Hu, and 1 more author
    In International Conference on Multimedia Retrieval (ICMR), 2020

2019

  1. Attention-based Multi-Context Guiding for Few-Shot Semantic Segmentation
    Tao Hu, Pengwan Yang, Chiliang Zhang, and 3 more authors
    In AAAI, 2019
    Solves the few-shot segmentation problem by applying attention at multiple scales.
  2. SILCO: Show a Few Images, Localize the Common Object
    Tao Hu, Pascal Mettes, Jia-Hong Huang, and 1 more author
    In International Conference on Computer Vision (ICCV), 2019
    Designs a graph network and applies attention to it to solve the problem of common object localization.

2018

  1. Dense In Dense: Training Segmentation from Scratch
    Tao Hu
    In Asian Conference on Computer Vision (ACCV), 2018
  2. Sobel heuristic kernel for aerial semantic segmentation
    Tao Hu, Yao Wang, Yisong Chen, and 2 more authors
    In IEEE International Conference on Image Processing (ICIP), 2018
  3. Accelerating convolutional neural networks with dynamic channel pruning
    Chiliang Zhang, Tao Hu, Yingda Guan, and 1 more author
    In Data Compression Conference (DCC), 2018