3260 papers • 126 benchmarks • 313 datasets
Action parsing is the task of, given a video or still image, assigning each frame or image a label describing the action in that frame or image.
(Image credit: Papersgraph)
These leaderboards are used to track progress in action-parsing-9
No benchmarks available.
Use these libraries to find action-parsing-9 models and implementations
No subtasks available.
A novel bilinear pooling operation is proposed, which is used in intermediate layers of a temporal convolutional encoder-decoder net and is learnable and hence can capture more complex local statistics than the conventional counterpart.
The proposed bilinear form outperforms the previous state-of-the-art methods on the challenging temporal action segmentation task, enabling to represent high-order information in complex deep models effectively and efficiently.
Adding a benchmark result helps the community track progress.