DESIRE, a Deep Stochastic IOC RNN Encoder-decoder framework for predicting the futures of multiple interacting agents in dynamic scenes, significantly improves prediction accuracy over baseline methods.
An end-to-end, multi-task learning system utilizing rich visual features about human behavioral information and interaction with their surroundings is proposed, providing the first empirical evidence that joint modeling of paths and activities benefits future path prediction.
An approach for pixel-level future prediction given an input image of a scene is presented, built on the observation that a scene is composed of distinct entities that undergo motion; the approach is empirically validated against alternate representations and ways of incorporating multi-modality.
This work addresses questions of temporal extent, scaling, and level of semantic abstraction with a flexible multi-granular temporal aggregation framework and shows that it is possible to achieve state of the art in both next action and dense anticipation with simple techniques such as max-pooling and attention.
An uncertainty-based accident anticipation model with spatio-temporal relational learning is proposed that sequentially predicts the probability of traffic accident occurrence from dashcam videos; it uses graph convolution and recurrent networks for relational feature learning and leverages Bayesian neural networks to address the intrinsic variability of latent relational representations.
This paper presents MultiPath++, a future prediction model that achieves state-of-the-art performance on popular benchmarks; it reconsiders the choice of pre-defined static anchors and develops a way to learn latent anchor embeddings end-to-end in the model.
A region-based relation learning paradigm is proposed that models social interactions via region-wise dynamics of joint states, i.e., the changes in the density of crowds, and diverse prediction is shown to benefit from region-based relation learning.
To the best of the authors' knowledge, this is the first end-to-end trainable network architecture with motion and content separation to model the spatiotemporal dynamics for pixel-level future prediction in natural videos.
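As a rough illustration of the max-pooling idea mentioned in the multi-granular temporal aggregation summary, the sketch below pools a sequence of per-frame feature vectors at several temporal granularities. It is not the paper's implementation: the function names `max_pool_segments` and `multi_granular_pool`, the near-equal contiguous segmentation, and the default scales are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def max_pool_segments(frames: Sequence[Vector], num_segments: int) -> List[Vector]:
    """Split frames into contiguous chunks and max-pool each chunk.

    Hypothetical helper: the near-equal split is an assumption,
    not a detail taken from any of the listed papers.
    """
    n = len(frames)
    if not 1 <= num_segments <= n:
        raise ValueError("num_segments must be between 1 and len(frames)")
    pooled = []
    for k in range(num_segments):
        start = k * n // num_segments
        end = (k + 1) * n // num_segments
        chunk = frames[start:end]
        # Element-wise maximum over the frames in this chunk.
        pooled.append([max(values) for values in zip(*chunk)])
    return pooled


def multi_granular_pool(
    frames: Sequence[Vector], scales: Sequence[int] = (1, 2, 4)
) -> Dict[int, List[Vector]]:
    """Pool the same sequence at several granularities (assumed scales)."""
    return {s: max_pool_segments(frames, s) for s in scales}


# Four frames with two feature dimensions each (made-up values).
frames = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [2.0, 5.0]]
summary = multi_granular_pool(frames)
```

Coarse scales summarize the whole observed window, while finer scales keep more temporal detail. A downstream model could then combine the pooled vectors, for example with attention.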