Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes


We present a novel method for placing a 3D human animation into a 3D scene while maintaining any human-scene interactions in the animation. We use the notion of computing the most important meshes in the animation for the interaction with the scene, which we call “keyframes.” These keyframes allow us to better optimize the placement of the animation into the scene such that interactions in the animations (standing, laying, sitting, etc.) match the affordances of the scene (e.g., standing on the floor or laying in a bed). We compare our method, which we call PAAK, with prior approaches, including POSA, PROX ground truth, and a motion synthesis method, and highlight the benefits of our method with a perceptual study. Human raters preferred our PAAK method over the PROX ground truth data 64.6% of the time. Additionally, in direct comparisons, the raters preferred PAAK over competing methods including 61.5% compared to POSA. Our project website is available at

IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023