LASR: Learning Articulated Shape Reconstruction from a Monocular Video (2021-05-06T00:00:00.000000Z)

TL;DR

This work introduces a template-free approach to learn 3D shapes from a single video with an analysis-by-synthesis strategy that forward-renders object silhouette, optical flow, and pixel values to compare with video observations, which generates gradients to adjust the camera, shape and motion parameters.

Abstract

Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images. However, reconstructing nonrigid structures from RGB inputs remains challenging because the problem is under-constrained. While template-based approaches, such as parametric shape models, have achieved great success in modeling the "closed world" of known object categories, they do not handle the "open world" of novel object categories or outlier shapes well. In this work, we introduce a template-free approach to learn 3D shapes from a single video. It adopts an analysis-by-synthesis strategy that forward-renders object silhouette, optical flow, and pixel values to compare with video observations, which generates gradients to adjust the camera, shape and motion parameters. Without using a category-specific shape template, our method faithfully reconstructs nonrigid 3D structures from videos of humans, animals, and objects of unknown classes. Our code is available at lasr-google.github.io.
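
The analysis-by-synthesis loop described in the abstract can be illustrated with a deliberately small toy. The sketch below is not the LASR implementation: it assumes a 2D soft disk in place of an articulated 3D mesh, renders only a silhouette (no optical flow or pixel values), and uses hand-derived gradients with plain gradient descent. All names (`render`, `loss_and_grad`, `fit_disk`, `SIZE`, `TAU`) and the learning rate and step count are hypothetical choices made for illustration.

```python
import math

SIZE = 16   # toy image is SIZE x SIZE pixels (assumed resolution)
TAU = 1.0   # edge softness of the rendered silhouette (assumed value)


def soft_pixel(x, y, cx, cy, r):
    """Soft occupancy of pixel (x, y) for a disk; returns (value, distance)."""
    d = math.sqrt((x - cx) ** 2 + (y - cy) ** 2 + 1e-12)  # epsilon avoids d == 0
    return 1.0 / (1.0 + math.exp(-(r - d) / TAU)), d


def render(cx, cy, r):
    """Forward-render the soft silhouette as a flat, row-major list."""
    return [soft_pixel(x, y, cx, cy, r)[0]
            for y in range(SIZE) for x in range(SIZE)]


def loss_and_grad(params, observed):
    """Mean squared silhouette error and its analytic gradient."""
    cx, cy, r = params
    n = SIZE * SIZE
    loss, g_cx, g_cy, g_r = 0.0, 0.0, 0.0, 0.0
    for y in range(SIZE):
        for x in range(SIZE):
            s, d = soft_pixel(x, y, cx, cy, r)
            diff = s - observed[y * SIZE + x]
            loss += diff * diff / n
            # Chain rule through the sigmoid: ds/du = s * (1 - s), u = (r - d) / TAU
            common = 2.0 * diff * s * (1.0 - s) / (TAU * n)
            g_cx += common * (x - cx) / d
            g_cy += common * (y - cy) / d
            g_r += common
    return loss, (g_cx, g_cy, g_r)


def fit_disk(observed, init, lr=5.0, steps=400):
    """Adjust (cx, cy, r) by gradient descent so the render matches `observed`."""
    params = list(init)
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(params, observed)
        losses.append(loss)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params, losses


if __name__ == "__main__":
    observed = render(9.0, 7.0, 4.0)  # stands in for a video observation
    params, losses = fit_disk(observed, init=(7.0, 8.0, 3.0))
    print("estimated (cx, cy, r):", [round(p, 3) for p in params])
    print("loss: first %.5f, last %.2e" % (losses[0], losses[-1]))
```

The structure mirrors the loop in the abstract: render from the current parameters, compare with the observation, and use the resulting gradients to update the parameters. The soft (sigmoid) edge is what makes the silhouette differentiable with respect to position and size in this toy.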

Authors

Ce Liu

Huiwen Chang

W. Freeman
