Surface normal estimation is the task of predicting the surface orientation of the objects present in a scene. Refer to Designing Deep Networks for Surface Normal Estimation (Wang et al.) for an overview of the design choices that led to the development of CNN-based surface normal estimators.
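Before deep learning, the standard baseline for this task on point clouds was local plane fitting: the normal at a point is the direction of least variance among its nearest neighbors. A minimal sketch of that classical PCA baseline (the function name and the brute-force neighbor search are illustrative, not from any paper listed here):

```python
import numpy as np

def estimate_normal_pca(points, query_idx, k=8):
    """Estimate the surface normal at one point of a point cloud by
    fitting a plane to its k nearest neighbors (classical PCA baseline)."""
    q = points[query_idx]
    # k nearest neighbors by Euclidean distance (brute force for clarity;
    # a k-d tree would be used in practice).
    dists = np.linalg.norm(points - q, axis=1)
    nbrs = points[np.argsort(dists)[:k]]
    # The normal is the eigenvector of the neighborhood covariance
    # with the smallest eigenvalue (direction of least variance).
    cov = np.cov((nbrs - nbrs.mean(axis=0)).T)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    n = eigvecs[:, 0]
    return n / np.linalg.norm(n)

# Points sampled from the plane z = 0 should yield a normal ~ (0, 0, ±1).
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(-1, 1, (50, 2)), np.zeros(50)])
normal = estimate_normal_pca(pts, 0)
```

Note the sign ambiguity: plane fitting alone cannot distinguish a normal from its negation, which is why practical pipelines orient normals toward a viewpoint afterwards.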
These leaderboards are used to track progress in Surface Normals Estimation.
Use these libraries to find Surface Normals Estimation models and implementations.
By introducing a spherical exponential mapping on n-spheres at the regression output, this work obtains well-behaved gradients, leading to stable training, and shows how spherical regression can be applied to several computer vision challenges: viewpoint estimation, surface normal estimation, and 3D rotation estimation.
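The core idea is to replace plain L2 normalization of the regression output with an exponential mapping onto the unit sphere. A minimal sketch of such a mapping (this handles only the positive orthant; the paper pairs it with a separate branch that predicts coordinate signs, and the function name here is illustrative):

```python
import numpy as np

def spherical_exp(o):
    """Map unconstrained regression outputs o in R^n onto the unit
    (n-1)-sphere: exponentiate, then L2-normalize. Subtracting max(o)
    before exponentiating is for numerical stability only and does not
    change the result."""
    e = np.exp(o - np.max(o))
    return e / np.linalg.norm(e)

# Any raw output vector lands on the unit sphere with all-positive
# coordinates, and gradients of the mapping stay bounded.
p = spherical_exp(np.array([2.0, -1.0, 0.5]))
```

Because every coordinate of `exp(o)` is strictly positive, the denominator never vanishes, which is what keeps the gradients well behaved compared to normalizing raw outputs that can approach the zero vector.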
This work adapts a recently proposed real-time semantic segmentation network, modifying it to further reduce the number of floating-point operations, and incorporates the raw predictions of the network into the SemanticFusion framework for dense 3D semantic reconstruction of the scene.
This paper presents an end-to-end differentiable algorithm for robust and detail-preserving surface normal estimation on unstructured point-clouds. We utilize graph neural networks to iteratively parameterize an adaptive anisotropic kernel that produces point weights for weighted least-squares plane fitting in local neighborhoods. The approach retains the interpretability and efficiency of traditional sequential plane fitting while benefiting from adaptation to data set statistics through deep learning. This results in a state-of-the-art surface normal estimator that is robust to noise, outliers and point density variation, preserves sharp features through anisotropic kernels and equivariance through a local quaternion-based spatial transformer. Contrary to previous deep learning methods, the proposed approach does not require any hand-crafted features or preprocessing. It improves on the state-of-the-art results while being more than two orders of magnitude faster and more parameter efficient.
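The weighted least-squares step at the heart of this approach can be sketched directly: given per-point weights (produced by a graph neural network in the paper, but simply supplied as input here), the fitted plane's normal is the smallest-eigenvalue eigenvector of the weighted covariance. Function and variable names below are illustrative, not from the paper's code:

```python
import numpy as np

def weighted_plane_normal(points, weights):
    """Weighted least-squares plane fit over a local neighborhood.
    The normal is the eigenvector of the weighted covariance matrix
    with the smallest eigenvalue."""
    w = weights / weights.sum()
    centroid = (w[:, None] * points).sum(axis=0)
    d = points - centroid
    # Weighted covariance: sum_i w_i * d_i d_i^T  (shape 3x3).
    cov = (w[:, None, None] * d[:, :, None] * d[:, None, :]).sum(axis=0)
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues
    n = eigvecs[:, 0]
    return n / np.linalg.norm(n)

# Plane z = 0 plus one gross outlier: a near-zero weight on the outlier
# lets the fit recover the true normal ~ (0, 0, ±1), illustrating why
# learned per-point weights make the estimator robust to outliers.
rng = np.random.default_rng(1)
plane = np.column_stack([rng.uniform(-1, 1, (50, 2)), np.zeros(50)])
cloud = np.vstack([plane, [[0.0, 0.0, 5.0]]])
w = np.append(np.ones(50), 1e-6)
normal = weighted_plane_normal(cloud, w)
```

With uniform weights the same outlier would tilt the fitted plane noticeably; driving its weight toward zero is exactly the adaptive behavior the learned kernel provides.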
This work addresses the lack of sufficient ground-truth normal data by leveraging existing 3D datasets, remodelling them via rendering, and training a deep convolutional neural network on the task of monocular 360° surface normal estimation.
This work outputs triangle meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine unmodified, and introduces a differentiable formulation of the split sum approximation of environment lighting to efficiently recover all-frequency lighting.
iDisc, a new module, implements a continuous-discrete-continuous bottleneck to learn internal discretized representations without supervision, and sets a new state of the art with significant improvements on NYU-Depth v2 and KITTI, outperforming all published methods on the official KITTI benchmark.
The Nesti-Net method builds on a new local point cloud representation which consists of multi-scale point statistics (MuPS), estimated on a local coarse Gaussian grid, which is a suitable input to a CNN architecture.
We present the Creative Flow+ Dataset, the first diverse multi-style artistic video dataset richly labeled with per-pixel optical flow, occlusions, correspondences, segmentation labels, normals, and depth. Our dataset includes 3000 animated sequences rendered using styles randomly selected from 40 textured line styles and 38 shading styles, spanning the range between flat cartoon fill and wildly sketchy shading. Our dataset includes 124K+ train set frames and 10K test set frames rendered at 1500x1500 resolution, far surpassing the largest available optical flow datasets in size. While modern techniques for tasks such as optical flow estimation achieve impressive performance on realistic images and video, today there is no way to gauge their performance on non-photorealistic images. Creative Flow+ poses a new challenge to generalize real-world Computer Vision to messy stylized content. We show that learning-based optical flow methods fail to generalize to this data and struggle to compete with classical approaches, and invite new research in this area. Our dataset and a new optical flow benchmark will be publicly available at: www.cs.toronto.edu/creativeflow/. We further release the complete dataset creation pipeline, allowing the community to generate and stylize their own data on demand.
Four insights are proposed that significantly improve the performance of deep learning models predicting surface normals and semantic labels from a single RGB image, with consistently improved state-of-the-art results demonstrated on several datasets.