3260 papers • 126 benchmarks • 313 datasets
These leaderboards are used to track progress in geometric matching.
Use these libraries to find geometric matching models and implementations.
This work proposes a convolutional neural network architecture for geometric matching based on three main components that mimic the standard steps of feature extraction, matching and simultaneous inlier detection and model parameter estimation, while being trainable end-to-end.
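The three-stage pipeline described above can be sketched in a few lines of numpy. This is a toy illustration, not the paper's implementation: the feature maps are random stand-ins for CNN activations, and the final linear map is a hypothetical placeholder for the trained regression head that predicts transformation parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 4, 4, 8  # toy feature-map size


def extract_features(_image):
    # Stand-in for a CNN feature extractor: random (H, W, C) features,
    # L2-normalized per spatial location so correlations are cosine similarities.
    f = rng.standard_normal((H, W, C))
    return f / np.linalg.norm(f, axis=-1, keepdims=True)


feat_a = extract_features(None)
feat_b = extract_features(None)

# Matching step: a correlation layer scoring every location in image A
# against every location in image B, giving an (H*W, H*W) match matrix.
corr = feat_a.reshape(-1, C) @ feat_b.reshape(-1, C).T

# Hypothetical regression head: a fixed linear map from the flattened
# correlation volume to 6 affine transformation parameters. In the trained
# network this stage also implicitly down-weights outlier matches.
w_reg = rng.standard_normal((corr.size, 6)) * 0.01
theta = corr.reshape(-1) @ w_reg
print(theta.shape)  # 6 affine parameters
```

Because every stage is a differentiable tensor operation, gradients flow from the predicted parameters back to the feature extractor, which is what makes the pipeline trainable end-to-end.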
A new fully learnable Characteristic-Preserving Virtual Try-On Network (CP-VTON) addresses the real-world challenges of this task and achieves state-of-the-art virtual try-on performance both qualitatively and quantitatively.
This work aims to estimate a dense flow field relating two images, coupled with a robust pixel-wise confidence map indicating the reliability and accuracy of the prediction, and develops a flexible probabilistic approach that jointly learns the flow prediction and its uncertainty.
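One common way to learn flow jointly with its uncertainty is to have the network predict a per-pixel Gaussian (mean flow plus log-variance) and minimize the negative log-likelihood. The numpy sketch below illustrates that general recipe with toy data; the specific loss form and the confidence definition here are illustrative assumptions, not the paper's exact probabilistic model.

```python
import numpy as np

rng = np.random.default_rng(1)
H, W = 3, 3

# Toy ground-truth flow and a noisy "prediction" (H, W, 2),
# plus a hypothetical predicted log-variance map (H, W).
flow_gt = rng.standard_normal((H, W, 2))
flow_pred = flow_gt + 0.1 * rng.standard_normal((H, W, 2))
log_var = np.full((H, W), -2.0)

# Per-pixel Gaussian negative log-likelihood (constants dropped):
#   ||f - f_gt||^2 / (2 sigma^2) + (1/2) log sigma^2
# Minimizing this trains the flow and its uncertainty together: the model
# can lower the loss on unreliable pixels only by predicting higher variance.
sq_err = np.sum((flow_pred - flow_gt) ** 2, axis=-1)
nll = 0.5 * sq_err * np.exp(-log_var) + 0.5 * log_var

# One possible confidence map: probability that the true flow lies within
# a 1-pixel radius of the prediction under the isotropic Gaussian.
confidence = 1.0 - np.exp(-0.5 * np.exp(-log_var))
print(float(nll.mean()))
```

At test time the confidence map lets downstream tasks (e.g. pose estimation from matches) discard pixels where the predicted flow is unreliable.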
This work proposes a universal network architecture that is directly applicable to all the aforementioned dense correspondence problems, and achieves both high accuracy and robustness to large displacements by investigating the combined use of global and local correlation layers.
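The distinction between global and local correlation layers can be made concrete with a small numpy sketch (random features standing in for CNN activations). A global layer compares every source location against every target location, so it can capture large displacements; a local layer only searches a small window around each position, which is cheaper and more precise for small offsets.

```python
import numpy as np

rng = np.random.default_rng(2)
H, W, C = 6, 6, 4
feat_a = rng.standard_normal((H, W, C))
feat_b = rng.standard_normal((H, W, C))

# Global correlation: all pairs of locations, shape (H, W, H, W).
# Handles arbitrarily large displacements at O((H*W)^2) cost.
global_corr = np.einsum('ijc,klc->ijkl', feat_a, feat_b)

# Local correlation: each source location against a (2r+1)x(2r+1) window
# centred on the same position in the target, shape (H, W, 2r+1, 2r+1).
r = 2
pad = np.pad(feat_b, ((r, r), (r, r), (0, 0)))
local_corr = np.empty((H, W, 2 * r + 1, 2 * r + 1))
for i in range(H):
    for j in range(W):
        window = pad[i:i + 2 * r + 1, j:j + 2 * r + 1]  # (2r+1, 2r+1, C)
        local_corr[i, j] = window @ feat_a[i, j]
```

The centre of each local window reproduces the corresponding diagonal entry of the global volume, which is why the two layer types can be combined in one architecture: global correlation for coarse, long-range matching and local correlation for fine refinement.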
The proposed GOCor module is a fully differentiable dense matching module, acting as a direct replacement for the feature correlation layer, capable of effectively learning spatial matching priors to resolve further matching ambiguities.
Correlation Verification Networks (CVNet), a novel image retrieval re-ranking network built from deeply stacked 4D convolutional layers, gradually compresses dense feature correlation into image similarity while learning diverse geometric matching patterns from various image pairs to enable cross-scale matching.
Experimental results show that the proposed C-VTON approach produces photo-realistic and visually convincing results and significantly improves on existing state-of-the-art techniques.
This paper proposes a unified learning framework that leverages and aggregates cross-modality contextual information, including visual context from high-level image representations and geometric context from 2D keypoint distributions, and introduces an effective N-pair loss that eschews empirical hyper-parameter search and improves convergence.
This work proposes a fully self-supervised approach towards learning depth-aware keypoints from unlabeled videos by incorporating a differentiable pose estimation module that jointly optimizes the keypoints and their depths in a Structure-from-Motion setting.