1
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
2
Towards Language-Driven Video Inpainting via Multimodal Large Language Models
3
Flow-Guided Diffusion for Video Inpainting
4
ProPainter: Improving Propagation and Transformer for Video Inpainting
5
Deficiency-Aware Masked Transformer for Video Inpainting
6
FishDreamer: Towards Fisheye Semantic Completion via Unified Image Outpainting and Segmentation
7
Imagen Video: High Definition Video Generation with Diffusion Models
8
Flow-Guided Transformer for Video Inpainting
9
Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion
10
Error Compensation Framework for Flow-Guided Video Inpainting
11
Rethinking Alignment in Video Super-Resolution Transformers
13
Optical Camera Communication in Vehicular Applications: A Review
14
MaskViT: Masked Visual Pre-Training for Video Prediction
15
FisheyeEX: Polar Outpainting for Extending the FoV of Fisheye Lens
16
Surround-View Fisheye Camera Perception for Automated Driving: Overview, Survey & Challenges
17
Review on Panoramic Imaging and Its Applications in Scene Understanding
18
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
19
Cylin-Painting: Seamless 360° Panoramic Image Outpainting and Beyond
20
Towards An End-to-End Framework for Flow-Guided Video Inpainting
21
MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting
22
PanoFlow: Learning 360° Optical Flow for Surrounding Temporal Understanding
23
How Do Vision Transformers Work?
24
MaskGIT: Masked Generative Image Transformer
25
VRT: A Video Restoration Transformer
26
Generalised Image Outpainting with U-Transformer
27
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
28
Image-Adaptive Hint Generation via Vision Transformer for Outpainting
29
Generative Adversarial Networks
30
Transfer Beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation
31
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D
32
Resolution-robust Large Mask Inpainting with Fourier Convolutions
33
FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting
34
Structured Denoising Diffusion Models in Discrete State-Spaces
35
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
36
FitVid: Overfitting in Pixel-Level Video Prediction
37
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
38
FDAN: Flow-guided Deformable Alignment Network for Video Super-Resolution
39
BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
40
InOut: Diverse Image Outpainting via GAN Inversion
41
CDFI: Compression-Driven Network Design for Frame Interpolation
42
Painting Outside as Inside: Edge Guided Image Outpainting via Bidirectional Rearrangement with Progressive Step Learning
43
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
44
Flow-edge Guided Video Completion
45
Learning Joint Spatial-Temporal Transformations for Video Inpainting
46
Design of a panoramic annular lens with ultrawide angle and small blind area
47
BiFuse: Monocular 360 Depth Estimation via Bi-Projection Fusion
48
Language Models are Few-Shot Learners
49
Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting
50
RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
51
MaskFlownet: Asymmetric Feature Matching With Learnable Occlusion Mask
52
Adversarial Video Generation on Complex Datasets
53
Axial Attention in Multidimensional Transformers
54
Copy-and-Paste Networks for Deep Video Inpainting
55
Boundless: Generative Adversarial Networks for Image Extension
56
AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation
57
Can we PASS beyond the Field of View? Panoramic Annular Semantic Segmentation for Real-World Surrounding Perception
58
Generating Diverse High-Fidelity Images with VQ-VAE-2
59
Wide-Context Semantic Image Extrapolation
60
Deep Flow-Guided Video Inpainting
62
Free-Form Video Inpainting With 3D Gated Convolution and Temporal PatchGAN
63
Deformable ConvNets V2: More Deformable, Better Results
64
YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
65
Painting Outside the Box: Image Outpainting with GANs
66
Video-to-Video Synthesis
67
Learning Blind Video Temporal Consistency
68
OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas
69
Video Inpainting by Jointly Learning Temporal Structure and Spatial Details
70
Free-Form Image Inpainting With Gated Convolution
71
A comparative review of plausible hole filling strategies in the context of scene depth image completion
72
Image Inpainting for Irregular Holes Using Partial Convolutions
73
Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View
74
Neural Discrete Representation Learning
75
Optical system design of space fisheye lens and performance analysis
76
Globally and locally consistent image completion
77
Temporally coherent completion of dynamic video
78
Optical Flow Estimation Using a Spatial Pyramid Network
79
A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation
80
Context Encoders: Feature Learning by Inpainting
81
A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation
82
Rear obstacle detection system with fisheye stereo camera using HCT
83
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
84
Simplified compact fisheye lens challenges and design
85
FlowNet: Learning Optical Flow with Convolutional Networks
86
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
87
Video Inpainting of Complex Scenes
88
Flow and Color Inpainting for Video Completion
89
Microsoft COCO: Common Objects in Context
90
A 360-degree panoramic video system design
91
PatchMatch: a randomized correspondence algorithm for structural image editing
92
Design of a panoramic annular lens with a long focal length
93
Simultaneous structure and texture image inpainting
96
Focal Attention for Long-Range Interactions in Vision Transformers
97
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
98
Learnable Gated Temporal Shift Module for Deep Video Inpainting
99
360° snapshot imaging with a convex array of long-wave infrared cameras
100
Generative Adversarial Nets
101
architectures following
102
The newly introduced 3D-Decoupled Cross Attention (DDCA) and Mix Fusion Feed Forward Network (MixF3N) are seamlessly integrated into the FlowLens architecture, further boosting its performance
103
Through extensive experiments and user studies
106
for the small model. The input features of the are split into 7 × 7 overlapping patches with 3 × 3
107
propose FlowLens, a novel clip-recurrent transformer framework designed to enhance scene visibility and perception beyond the field of view in real-time,