A new task for testing the long-sequence modeling capabilities and efficiency of language models.
(Image credit: SCROLLS: Standardized CompaRison Over Long Language Sequences)
These leaderboards are used to track progress in Long-Range Modeling.
Use these libraries to find Long-Range Modeling models and implementations.
The Structured State Space sequence model (S4) is proposed based on a new parameterization for the SSM, and it is shown that it can be computed much more efficiently than prior approaches while preserving their theoretical strengths.
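To make the summary concrete, here is a minimal NumPy sketch of the generic discrete SSM that S4 parameterizes, computed both as a sequential recurrence and as a causal convolution. The toy matrices and function names are illustrative assumptions, not S4's actual parameterization, whose contribution is computing this kernel far more efficiently for structured state matrices.

```python
import numpy as np

# Illustrative discrete SSM (not the actual S4 parameterization):
#   x_k = A x_{k-1} + B u_k,   y_k = C x_k
def ssm_recurrence(A, B, C, u):
    """Run the SSM as a sequential recurrence over an input sequence u."""
    x, ys = np.zeros(A.shape[0]), []
    for u_k in u:
        x = A @ x + B * u_k          # state update
        ys.append(C @ x)             # readout
    return np.array(ys)

def ssm_convolution_kernel(A, B, C, L):
    """Unroll the same SSM into a length-L kernel K = (CB, CAB, CA^2B, ...);
    computing K efficiently for structured A is what S4 contributes."""
    K, Ak_B = [], B.copy()
    for _ in range(L):
        K.append(C @ Ak_B)
        Ak_B = A @ Ak_B
    return np.array(K)

rng = np.random.default_rng(0)
N, L = 4, 16
A = rng.normal(size=(N, N)) * 0.3    # toy, roughly stable state matrix
B, C = rng.normal(size=N), rng.normal(size=N)
u = rng.normal(size=L)
K = ssm_convolution_kernel(A, B, C, L)
y_conv = np.convolve(u, K)[:L]       # causal convolution with kernel K
assert np.allclose(ssm_recurrence(A, B, C, u), y_conv)
```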
A simple method to disentangle multi-scale graph convolutions and a unified spatial-temporal graph convolutional operator named G3D are presented, and a powerful feature extractor named MS-G3D is developed; built on it, the model outperforms previous state-of-the-art methods on three large-scale datasets.
A systematic and unified benchmark, LRA, specifically focused on evaluating model quality under long-context scenarios, is proposed, paving the way toward a better understanding of this class of efficient Transformer models.
This paper introduces Mega, a simple, theoretically grounded, single-head gated attention mechanism equipped with an (exponential) moving average to incorporate the inductive bias of position-aware local dependencies into the position-agnostic attention mechanism.
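The sketch below shows the damped exponential moving average at the core of Mega's inductive bias. This is a single-channel toy with made-up parameter values; Mega learns multi-dimensional, per-channel alpha and delta and feeds the smoothed sequence into its gated attention.

```python
import numpy as np

# Toy damped EMA of the form used in Mega (single channel, illustrative):
#   h_t = alpha * x_t + (1 - alpha * delta) * h_{t-1}
# where alpha is the EMA weight and delta a damping factor (learned in Mega).
def damped_ema(x, alpha=0.5, delta=1.0):
    h, out = 0.0, []
    for x_t in x:
        h = alpha * x_t + (1.0 - alpha * delta) * h
        out.append(h)
    return np.array(out)

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 6, 50)) + 0.1 * rng.normal(size=50)
smoothed = damped_ema(x, alpha=0.3)  # position-aware local summary of the input
```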
A new SSM layer, H3, is proposed that is explicitly designed for language modeling; it achieves promising initial results, with lower perplexity than Transformers, and outperforms Transformers in zero- and few-shot learning on a majority of tasks in the SuperGLUE benchmark.
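A rough structural sketch of how such a layer can be wired, under the assumption that it computes Q * SSM_diag(SSM_shift(K) * V), i.e. a shift SSM over keys, elementwise combination with values, a diagonal SSM, and a query gate. The toy SSMs below are scalar single-pole stand-ins, not the paper's parameterization.

```python
import numpy as np

def shift_ssm(x):
    """Toy shift SSM: delays the sequence by one step."""
    return np.concatenate([[0.0], x[:-1]])

def diag_ssm(x, lam=0.9):
    """Toy diagonal SSM: a single-pole recurrence h_t = lam*h_{t-1} + x_t."""
    h, out = 0.0, []
    for x_t in x:
        h = lam * h + x_t
        out.append(h)
    return np.array(out)

def h3_layer(q, k, v):
    # Shift-then-gate structure lets the layer recall and compare nearby
    # tokens the way attention does, without computing attention scores.
    return q * diag_ssm(shift_ssm(k) * v)

rng = np.random.default_rng(0)
L = 12
q, k, v = (rng.normal(size=L) for _ in range(3))
y = h3_layer(q, k, v)   # one output per position, computed recurrently
```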
This work introduces SCROLLS, a suite of tasks that require reasoning over long texts; it examines existing long-text datasets and handpicks ones where the text is naturally long, prioritizing tasks that involve synthesizing information across the input.
This work shows that one can match the performance of S4 even without the low-rank correction, and thus with diagonal state matrices, and proposes a new diagonal state space model (DSS) that is conceptually simpler and straightforward to implement.
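To illustrate why diagonal state matrices simplify things, here is a small NumPy sketch (illustrative, not the exact DSS parameterization): with A = diag(lambda), the kernel entries K_k = C A^k B = sum_n C_n * lambda_n^k * B_n are just elementwise powers, so the whole length-L kernel is a single Vandermonde matrix-vector product.

```python
import numpy as np

def diagonal_ssm_kernel(lam, B, C, L):
    """Kernel of an SSM with diagonal state matrix A = diag(lam):
    a (L, N) Vandermonde matrix times the elementwise product B * C."""
    powers = lam[None, :] ** np.arange(L)[:, None]   # powers[k, n] = lam_n**k
    return powers @ (B * C)                          # K_k for k = 0..L-1

rng = np.random.default_rng(0)
N, L = 8, 32
lam = rng.uniform(0.5, 0.99, size=N)                 # toy stable diagonal entries
B, C = rng.normal(size=N), rng.normal(size=N)
K = diagonal_ssm_kernel(lam, B, C, L)

# Sanity check against the dense computation with A = diag(lam):
A = np.diag(lam)
K_dense = np.array([C @ np.linalg.matrix_power(A, k) @ B for k in range(L)])
assert np.allclose(K, K_dense)
```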
This work systematically describes various design choices in parameterizing and computing diagonal SSMs, and performs a controlled empirical study ablating the effects of these choices.
A simple yet effective Spatial Calibration Module (SCM) is introduced for accurate WSOL, incorporating the semantic similarities of patch tokens and their spatial relationships into a unified diffusion model, with a learnable parameter to dynamically adjust the semantic correlations and spatial context intensities for effective information propagation.
A state space layer, S5, is proposed that leverages efficient and widely implemented parallel scans, allowing S5 to match the computational efficiency of S4 while achieving state-of-the-art performance on several long-range sequence modeling tasks.
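A brief sketch of why such a linear recurrence admits a parallel scan: the update x_k = a_k * x_{k-1} + b_k is an affine map, and affine maps compose associatively, so the pairs (a, b) can be combined in any bracketing by a tree-structured parallel scan. The reference scan below is sequential for clarity, and all names are illustrative.

```python
import numpy as np

def combine(e1, e2):
    """Compose two affine updates x -> a*x + b; apply e1 first, then e2.
    This operator is associative, which is what enables a parallel scan."""
    a1, b1 = e1
    a2, b2 = e2
    return (a2 * a1, a2 * b1 + b2)

def scan(elems):
    """Sequential reference scan; a parallel implementation (as in S5)
    combines elements tree-style using the same associative operator."""
    out, acc = [], (1.0, 0.0)        # identity element of the composition
    for e in elems:
        acc = combine(acc, e)
        out.append(acc[1])           # accumulated state x_k (with x_0 = 0)
    return np.array(out)

rng = np.random.default_rng(0)
a, b = rng.uniform(0.5, 1.0, 16), rng.normal(size=16)
x_scan = scan(list(zip(a, b)))

# Check against the plain sequential recurrence:
x, x_loop = 0.0, []
for a_k, b_k in zip(a, b):
    x = a_k * x + b_k
    x_loop.append(x)
assert np.allclose(x_scan, np.array(x_loop))
```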