Spatial Token Mixer (STM) is a module for vision transformers that aims to make token mixing more efficient. It is a depthwise convolution applied along the spatial dimensions of the tokens, and it is designed as a drop-in replacement for the token-mixing layers in vision transformers.
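To make the idea of spatial token mixing by depthwise convolution concrete, here is a minimal NumPy sketch. It is not the STM reference implementation: the function name `depthwise_token_mixer`, the `(k, k, C)` kernel layout, zero padding, and the row-major token order are all assumptions chosen for illustration.

```python
import numpy as np


def depthwise_token_mixer(tokens, kernel, grid_hw):
    """Mix tokens spatially with one k x k filter per channel.

    Hypothetical helper for illustration only.
    tokens:  (N, C) array, N = H * W tokens in row-major order (assumed)
    kernel:  (k, k, C) array with odd k, one filter per channel (assumed layout)
    grid_hw: (H, W) shape of the token grid
    """
    H, W = grid_hw
    N, C = tokens.shape
    k = kernel.shape[0]
    assert N == H * W, "token count must match the grid size"
    assert kernel.shape == (k, k, C) and k % 2 == 1

    pad = k // 2
    # Restore the 2D layout so that neighbouring tokens are adjacent.
    grid = tokens.reshape(H, W, C)
    # Zero padding keeps the output grid the same size as the input.
    padded = np.pad(grid, ((pad, pad), (pad, pad), (0, 0)))

    out = np.zeros((H, W, C), dtype=np.result_type(tokens, kernel, np.float32))
    # Cross-correlation, as in deep learning "convolutions": each channel
    # only ever sees its own filter, so channels are never mixed.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + H, dx:dx + W, :] * kernel[dy, dx, :]
    return out.reshape(N, C)


# Usage: a 14 x 14 grid of 8-channel tokens mixed with a 3 x 3 filter.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(14 * 14, 8))
kernel = rng.normal(size=(3, 3, 8))
mixed = depthwise_token_mixer(tokens, kernel, (14, 14))
```

The output has the same `(N, C)` shape as the input, which is what lets a layer like this stand in for another token-mixing layer. The parameter count of the mixer is k * k * C and does not depend on the number of tokens.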