Event-Based Video Reconstruction aims to generate a sequence of intensity frames from the asynchronous stream of events (per-pixel brightness-change signals) produced by an event camera.
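Since events arrive asynchronously, most learned reconstruction methods first accumulate them into a dense tensor before feeding them to a network. The sketch below shows one common choice, a spatio-temporal voxel grid with linearly interpolated temporal bins; the function name and signature are illustrative, not from any particular library.

```python
import numpy as np

def events_to_voxel_grid(xs, ys, ts, ps, num_bins, height, width):
    """Accumulate an event stream into a spatio-temporal voxel grid,
    a common input representation for reconstruction networks.

    xs, ys : int arrays of pixel coordinates
    ts     : float array of timestamps (sorted ascending)
    ps     : array of polarities in {-1, +1}
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    # Normalize timestamps to [0, num_bins - 1].
    t_norm = (num_bins - 1) * (ts - ts[0]) / max(ts[-1] - ts[0], 1e-9)
    left = np.floor(t_norm).astype(int)
    right = np.clip(left + 1, 0, num_bins - 1)
    # Split each event between its two nearest temporal bins.
    w_right = t_norm - left
    w_left = 1.0 - w_right
    np.add.at(voxel, (left, ys, xs), w_left * ps)
    np.add.at(voxel, (right, ys, xs), w_right * ps)
    return voxel
```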
These leaderboards are used to track progress in Event-Based Video Reconstruction.
Use these libraries to find Event-Based Video Reconstruction models and implementations.
This work presents the High Quality Frames (HQF) dataset, containing events and ground-truth frames from a DAVIS240C that are well exposed and minimally motion-blurred, and presents strategies for improving the training data of event-based CNNs that yield a 20-40% performance boost when existing state-of-the-art (SOTA) video reconstruction networks are retrained with them.
The proposed algorithm includes a frame-augmentation pre-processing step that deblurs and temporally interpolates frame data using events, and it outperforms state-of-the-art methods in both absolute intensity error and image similarity indices.
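The link between frames and events that makes such deblurring and interpolation possible is the event generation model: each event at a pixel signals a fixed change c in log intensity, so summing polarity-weighted events between two timestamps propagates a log-intensity frame through time. Below is a minimal sketch of that propagation step, assuming a known contrast threshold c; the paper's actual augmentation is more involved (it also handles motion blur).

```python
import numpy as np

def propagate_frame(log_frame, xs, ys, ps, c=0.2):
    """Advance a log-intensity frame in time by integrating the
    polarity-weighted events that occurred since it was captured.
    Per the standard event generation model, each event at (x, y)
    changes log intensity by +/- c (the contrast threshold).

    Pass only the events with timestamps between the frame's capture
    time and the target time; c = 0.2 is an arbitrary placeholder.
    """
    out = log_frame.copy()
    np.add.at(out, (ys, xs), c * ps)
    return out
```

To interpolate an intermediate frame, one would integrate only the events up to the target timestamp and exponentiate the result back to intensity space.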
This paper proposes a unified evaluation methodology and introduces EVREAL, an open-source framework for comprehensively benchmarking and analyzing event-based video reconstruction methods from the literature, providing valuable insights into their performance under varying settings, challenging scenarios, and downstream tasks.
Event cameras, which output events by detecting spatio-temporal brightness changes, bring a novel paradigm to image sensors, offering high dynamic range and low latency. Previous works have achieved impressive performance on event-based video reconstruction by introducing convolutional neural networks (CNNs). However, the intrinsic locality of convolutional operations cannot model the long-range dependencies that are crucial to many vision tasks. In this paper, we present a hybrid CNN-Transformer network for event-based video reconstruction (ET-Net), which merits the fine local information from CNN and global contexts from Transformer. In addition, we propose a Token Pyramid Aggregation strategy to implement multi-scale token integration for relating internal and intersected semantic concepts in the token space. Experimental results demonstrate that our proposed method achieves superior performance over state-of-the-art methods on multiple real-world event datasets. The code is available at https://github.com/WarranWeng/ET-Net.
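As a rough illustration of the hybrid design, the sketch below runs a small CNN over an event tensor, flattens the resulting feature map into tokens, and applies a Transformer encoder for global context. All layer names and sizes are assumptions for illustration; this is not ET-Net's actual architecture (see the linked repository for that).

```python
import torch
import torch.nn as nn

class ConvTransformerBlock(nn.Module):
    """Illustrative hybrid block: a CNN extracts local features, which
    are flattened into tokens and passed through a Transformer encoder
    to capture long-range dependencies."""
    def __init__(self, in_ch=5, dim=64, heads=4):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):                      # x: (B, in_ch, H, W)
        f = self.cnn(x)                        # (B, dim, H/4, W/4)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (B, h*w, dim) token sequence
        tokens = self.transformer(tokens)      # global context via attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)

# Usage: y = ConvTransformerBlock()(torch.randn(1, 5, 64, 64))
# where the input is a 5-bin event voxel grid.
```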
This paper proposes EVSNN, a novel event-based video reconstruction framework built on a fully spiking neural network that utilizes Leaky-Integrate-and-Fire (LIF) and Membrane Potential (MP) neurons, and finds that spiking neurons can store useful temporal information (memory) to complete such time-dependent tasks.
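The "memory" mentioned above comes from the neuron's membrane potential, which persists between time steps. A minimal discrete-time LIF update (with decay and threshold values chosen arbitrarily for illustration) looks like:

```python
import numpy as np

def lif_step(v, input_current, decay=0.9, threshold=1.0):
    """One discrete update of a Leaky-Integrate-and-Fire neuron.
    The membrane potential v leaks toward zero, integrates its input,
    and emits a binary spike when it crosses the threshold, after
    which it is reset. The retained potential between steps is what
    lets spiking networks carry temporal state across event frames.
    """
    v = decay * v + input_current                 # leak + integrate
    spikes = (v >= threshold).astype(np.float32)  # fire
    v = v * (1.0 - spikes)                        # hard reset where fired
    return v, spikes
```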
This study proposes HyperE2VID, a dynamic neural network architecture for event-based video reconstruction that uses hypernetworks to generate per-pixel adaptive filters guided by a context fusion module that combines information from event voxel grids and previously reconstructed intensity images.
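The core idea of such dynamic filtering is that filter weights are predicted at inference time from the input rather than fixed after training. The sketch below shows a generic per-pixel dynamic convolution in PyTorch, where a small "hypernetwork" convolution predicts a k x k kernel for every pixel from a context tensor; this illustrates the general technique only, not HyperE2VID's exact module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerPixelDynamicConv(nn.Module):
    """Illustrative dynamic filtering: a hypernetwork predicts a k x k
    filter for each pixel from a context tensor, and the filter is
    applied to that pixel's neighborhood in the feature map."""
    def __init__(self, ctx_ch, feat_ch, k=3):
        super().__init__()
        self.k = k
        self.hyper = nn.Conv2d(ctx_ch, k * k, 3, padding=1)  # filter generator

    def forward(self, feat, context):          # feat: (B, C, H, W)
        b, c, h, w = feat.shape
        filters = self.hyper(context)          # (B, k*k, H, W), one kernel per pixel
        filters = F.softmax(filters, dim=1)    # normalize each predicted kernel
        patches = F.unfold(feat, self.k, padding=self.k // 2)  # (B, C*k*k, H*W)
        patches = patches.view(b, c, self.k * self.k, h * w)
        filters = filters.view(b, 1, self.k * self.k, h * w)
        out = (patches * filters).sum(dim=2)   # weighted sum over each window
        return out.view(b, c, h, w)
```

In this setup the same module adapts its effective filters to each input, which is the property the hypernetwork-based design exploits.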