Motion Flow Matching for Human Motion Synthesis and Editing

1 CompVis Group, LMU Munich; 2 University of Amsterdam; 3 CMU; 4 A-STAR


Human motion synthesis is a fundamental task in computer animation. Recent methods based on diffusion models or GPT-style architectures perform well but suffer from slow sampling and error accumulation.

In this paper, we propose "Motion Flow Matching", a novel generative model for human motion generation that samples efficiently and is effective in motion editing applications. Our method reduces the sampling complexity from a thousand steps in previous diffusion models to just ten steps, while achieving comparable performance on text-to-motion and action-to-motion generation benchmarks. Notably, our approach establishes a new state-of-the-art Fréchet Inception Distance on the KIT-ML dataset. Moreover, we introduce a straightforward motion editing paradigm named "sampling trajectory rewriting" that leverages ODE-style generative models, and we apply it to various editing scenarios including motion prediction, motion in-between prediction, motion interpolation, and upper-body editing. Our code will be released.
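To make the idea of few-step ODE sampling with "sampling trajectory rewriting" more concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the authors' implementation: the abstract does not give the update rule, so the code assumes a straight-line flow from noise (t = 0) to motion (t = 1), a plain Euler solver with ten steps, and a rewriting rule that overwrites the known frames at each step with their point on that straight path. The names `sample_with_rewriting` and `toy_velocity`, the array sizes, and the toy velocity field that stands in for a trained network are all hypothetical.

```python
import numpy as np

def sample_with_rewriting(velocity_fn, noise, known=None, mask=None, num_steps=10):
    """Euler-integrate dx/dt = velocity_fn(x, t) from t=0 (noise) to t=1 (motion).

    If `mask` marks known frames, those frames are overwritten at every step.
    This rewriting rule is an assumption made for illustration.
    """
    x = noise.copy()
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        if mask is not None:
            # Assumed rule: place known frames on the straight path between
            # their initial noise and their known values at time t.
            x[mask] = (1.0 - t) * noise[mask] + t * known[mask]
        # One Euler step along the velocity field.
        x = x + dt * velocity_fn(x, t)
    if mask is not None:
        # At t=1 the known frames are restored exactly.
        x[mask] = known[mask]
    return x

# Toy setup: 8 frames with 3 features each (hypothetical sizes).
rng = np.random.default_rng(0)
target = rng.normal(size=(8, 3))
noise = rng.normal(size=(8, 3))

def toy_velocity(x, t):
    # Stand-in for a trained network: the velocity field whose flow
    # carries every sample to one fixed target motion at t=1.
    return (target - x) / (1.0 - t)

# Motion prediction: the first 3 frames are known, the rest are generated.
known = np.zeros((8, 3))
mask = np.zeros(8, dtype=bool)
mask[:3] = True

out = sample_with_rewriting(toy_velocity, noise, known=known, mask=mask)
```

Under these assumptions, switching between editing scenarios only changes the mask: a prefix of frames for motion prediction, the first and last frames for motion in-between, or a subset of feature dimensions for upper-body editing.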

Motion In-Between

The green color represents known motion, while the blue color represents unknown (or generated) motion.

"a person jumps sideways to their left several times, then several times to the right."

"a person does a funky line dance and then exits to their left."

"a person walks diagonally and raises arms in a t pose and seems to be balancing on a wide beam. then stops and drops arms to side."

Motion Prediction

The green color represents known motion, while the blue color represents unknown (or generated) motion.

"a man walks in a curved line."

"a person turns to throw something with his left hand."

"a person walks out, turns, walks back and turns and walks out again and turns."

Text-to-Motion Generation

"a man is standing with feet wide apart and arms out swinging different motions acting like a monkey."

"a man is doing jumping jacks."

"a person at a standstill starts running, then stops."

"a person walked by making the circle"


        title = {Motion Flow Matching for Human Motion Synthesis and Editing},
        author = {Hu, Vincent Tao and Yin, Wenzhe and Ma, Pingchuan and Chen, Yunlu and Fernando, Basura and Asano, Yuki M. and Gavves, Efstratios and Mettes, Pascal and Ommer, Björn and Snoek, Cees G.M.},
        year = {2024},
        booktitle = {arXiv},