3260 papers • 126 benchmarks • 313 datasets
The goal of Paraphrase Identification is to determine whether a pair of sentences have the same meaning (a minimal sketch of the task follows the overview below).
Source: Adversarial Examples with Difficult Common Words for Paraphrase Identification
Image source: On Paraphrase Identification Corpora
These leaderboards are used to track progress in Paraphrase Identification
Use these libraries to find Paraphrase Identification models and implementations
No subtasks available.
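The sketch referenced above is a minimal illustration of the task interface using a BERT-style cross-encoder. It assumes the HuggingFace Transformers library; "bert-base-uncased" is only a pretrained backbone, so the pair-classification head below is randomly initialized and would need fine-tuning on a paraphrase corpus (e.g. MRPC or QQP) before its scores are meaningful.

```python
# Minimal sketch: scoring a sentence pair with a BERT-style cross-encoder.
# Assumes the HuggingFace Transformers library; "bert-base-uncased" is only a
# pretrained backbone, so the classification head is randomly initialized and
# must first be fine-tuned on a paraphrase corpus (e.g. MRPC or QQP).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # paraphrase vs. not paraphrase
)

s1 = "How do I learn Python quickly?"
s2 = "What is the fastest way to pick up Python?"

# Encode the two sentences jointly so the model sees them as a single pair.
inputs = tokenizer(s1, s2, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)  # class probabilities; label order depends on the fine-tuning data
```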
BERT is a new language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
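A minimal sketch of the bidirectional pre-training signal described above, assuming the HuggingFace Transformers fill-mask pipeline and the public "bert-base-uncased" checkpoint: the model predicts a masked token from context on both its left and its right.

```python
# Sketch of BERT's bidirectional pre-training signal: fill in a masked token
# using context on both sides. Assumes HuggingFace Transformers.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill_mask("The two sentences convey the [MASK] meaning.")[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```

Fine-tuning for paraphrase identification then adds only a single classification layer on top of this pretrained encoder, as in the task sketch above.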
XLNet is a generalized autoregressive pretraining method that enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order, overcoming the limitations of BERT thanks to its autoregressive formulation.
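A simplified sketch of the core idea: sample a factorization order and restrict attention so that each token only sees tokens that come earlier in that order. XLNet's two-stream attention and segment recurrence are omitted, so this is an illustration of the permutation objective rather than the paper's full mechanism.

```python
# Simplified sketch of permutation language modeling: sample a factorization
# order and build an attention mask so each token may only attend to tokens
# that precede it in that order. (Two-stream attention and other XLNet
# details are omitted.)
import torch

seq_len = 6
order = torch.randperm(seq_len)             # a random factorization order
rank = torch.empty(seq_len, dtype=torch.long)
rank[order] = torch.arange(seq_len)         # rank[i] = position of token i in the order

# attend[i, j] is True if token i may attend to token j.
attend = rank.unsqueeze(1) > rank.unsqueeze(0)
print(order)
print(attend.int())
```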
The FNet model is significantly faster: when compared to the “efficient Transformers” on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs).
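A minimal sketch of FNet's token-mixing sublayer as described in the paper: an unparameterized 2D Fourier transform over the sequence and hidden dimensions, keeping only the real part, in place of self-attention (the surrounding layer norms and feed-forward blocks are omitted).

```python
# Sketch of FNet's mixing sublayer: a 2D FFT over the sequence and hidden
# dimensions, keeping only the real part, in place of self-attention.
import torch

def fnet_mixing(x: torch.Tensor) -> torch.Tensor:
    # x: (batch, seq_len, hidden)
    return torch.fft.fft(torch.fft.fft(x, dim=-1), dim=-2).real

x = torch.randn(2, 8, 16)
print(fnet_mixing(x).shape)  # same shape as the input, no learned mixing weights
```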
This work proposes a bilateral multi-perspective matching (BiMPM) model under the "matching-aggregation" framework that achieves state-of-the-art performance on all evaluated tasks.
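A minimal sketch of one building block of BiMPM, multi-perspective cosine matching: each of the l perspectives element-wise reweights both vectors with a learned weight vector before taking cosine similarity. The full model applies this in both matching directions with several matching strategies and a BiLSTM aggregator, all omitted here.

```python
# Sketch of multi-perspective cosine matching, one building block of BiMPM:
# each perspective reweights both vectors with a learned weight vector
# before taking cosine similarity.
import torch
import torch.nn.functional as F

def multi_perspective_match(v1, v2, W):
    # v1, v2: (hidden,)   W: (l, hidden) learned perspective weights
    return F.cosine_similarity(W * v1, W * v2, dim=-1)  # one value per perspective

hidden, l = 16, 4
W = torch.randn(l, hidden)
m = multi_perspective_match(torch.randn(hidden), torch.randn(hidden), W)
print(m.shape)  # (4,)
```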
Data2vec is a framework that uses the same learning method for speech, NLP, and computer vision: it predicts latent representations of the full input data from a masked view of the input in a self-distillation setup using a standard Transformer architecture.
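A simplified sketch of the self-distillation loop described above: a teacher that is an exponential moving average of the student produces latent targets from the full input, and the student regresses those targets at masked positions. The real data2vec averages several top teacher layers and normalizes the targets; those details are omitted, and the tiny Transformer layer here is only a stand-in.

```python
# Simplified sketch of data2vec-style self-distillation: an EMA teacher
# produces latent targets from the full input, the student predicts them
# from a masked view. (Layer averaging and target normalization omitted.)
import copy
import torch
import torch.nn as nn

student = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

def ema_update(teacher, student, tau=0.999):
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.data.mul_(tau).add_(ps.data, alpha=1 - tau)

x = torch.randn(2, 10, 32)                  # embedded input
mask = torch.rand(2, 10) < 0.15             # positions to mask
x_masked = x.masked_fill(mask.unsqueeze(-1), 0.0)

with torch.no_grad():
    targets = teacher(x)                    # latent targets from the full input
preds = student(x_masked)                   # student only sees the masked view
loss = nn.functional.mse_loss(preds[mask], targets[mask])
loss.backward()
ema_update(teacher, student)
```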
This work presents a general Attention Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences and proposes three attention schemes that integrate mutual influence between sentences into CNNs; thus, the representation of each sentence takes into consideration its counterpart.
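A minimal sketch of the attention matrix that couples the two sentences' feature maps, using a 1/(1 + Euclidean distance) match score (one scoring choice associated with ABCNN); the convolution and pooling layers that consume this matrix are omitted.

```python
# Sketch of an ABCNN-style attention matrix between two sentences' feature
# maps: A[i, j] scores how strongly unit i of sentence 1 relates to unit j of
# sentence 2, and is then used to re-weight each sentence's representation.
import torch

def attention_matrix(f1, f2):
    # f1: (len1, dim), f2: (len2, dim) -- per-position feature maps
    dist = torch.cdist(f1.unsqueeze(0), f2.unsqueeze(0)).squeeze(0)  # (len1, len2)
    return 1.0 / (1.0 + dist)

f1, f2 = torch.randn(7, 16), torch.randn(5, 16)
A = attention_matrix(f1, f2)
print(A.shape)  # (7, 5): mutual influence between the two sentences
```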
A Multi-Task Deep Neural Network (MT-DNN) learns representations across multiple natural language understanding (NLU) tasks and allows domain adaptation with substantially fewer in-domain labels than pre-trained BERT representations.
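A minimal sketch of the multi-task setup: a shared encoder feeds one lightweight head per task, and each mini-batch updates the shared parameters through one task's head. The small feed-forward encoder here is only a stand-in for the BERT encoder used in the paper.

```python
# Sketch of the MT-DNN idea: a shared encoder feeds several task-specific
# heads; each mini-batch updates the shared encoder through one task's head.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, hidden=64, tasks=None):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(300, hidden), nn.ReLU())  # stand-in for BERT
        self.heads = nn.ModuleDict({
            name: nn.Linear(hidden, n_labels) for name, n_labels in (tasks or {}).items()
        })

    def forward(self, x, task):
        return self.heads[task](self.encoder(x))

model = MultiTaskModel(tasks={"paraphrase": 2, "nli": 3, "sentiment": 2})
batch = torch.randn(8, 300)
logits = model(batch, task="paraphrase")   # shared encoder, task-specific head
print(logits.shape)
```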
A new learning framework for robust and efficient fine-tuning of pre-trained models that attains better generalization performance and outperforms the state-of-the-art T5 model, the largest pre-trained model with 11 billion parameters, on GLUE.
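The summary above does not spell out the mechanism, so the sketch below shows a generic ingredient of robustness-oriented fine-tuning: a smoothness (consistency) penalty that keeps predictions stable under small perturbations of the input embeddings. Here `model` is any callable mapping embeddings to logits; this is an illustration of the general idea, not necessarily the paper's exact objective.

```python
# Illustrative sketch of a smoothness / consistency regularizer for robust
# fine-tuning: penalize divergence between predictions on the clean input
# embeddings and on a slightly perturbed copy. Generic version of the idea,
# not necessarily the paper's exact objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

def smoothness_penalty(model, embeds, eps=1e-3):
    noise = eps * torch.randn_like(embeds)
    p = F.log_softmax(model(embeds), dim=-1)
    q = F.log_softmax(model(embeds + noise), dim=-1)
    # symmetric KL between clean and perturbed predictions
    return 0.5 * (F.kl_div(q, p, log_target=True, reduction="batchmean")
                  + F.kl_div(p, q, log_target=True, reduction="batchmean"))

# Tiny stand-in classifier: mean-pool token embeddings, then a linear layer.
classifier = nn.Linear(32, 2)
def model(embeds):
    return classifier(embeds.mean(dim=1))

embeds = torch.randn(4, 10, 32)
print(smoothness_penalty(model, embeds).item())
```

In frameworks of this kind, such a penalty is typically added to the ordinary task loss with a weighting coefficient.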
The approach extends BERT by masking contiguous random spans, rather than random tokens, and training the span boundary representations to predict the entire content of the masked span, without relying on the individual token representations within it.
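A minimal sketch of contiguous span masking, assuming whitespace-tokenized input: runs of adjacent tokens are masked instead of isolated tokens. The paper draws span lengths from a geometric distribution and adds a span-boundary objective on top; both are simplified away here.

```python
# Sketch of contiguous span masking as used in SpanBERT-style pre-training:
# mask runs of adjacent tokens rather than isolated tokens. (Geometric span
# lengths and the span-boundary objective are simplified away.)
import random

def mask_spans(tokens, mask_ratio=0.15, max_span=5, mask_token="[MASK]"):
    tokens = list(tokens)
    budget = max(1, int(mask_ratio * len(tokens)))
    while budget > 0:
        length = min(random.randint(1, max_span), budget)
        start = random.randrange(0, len(tokens) - length + 1)
        for i in range(start, start + length):
            tokens[i] = mask_token
        budget -= length
    return tokens

print(mask_spans("the quick brown fox jumps over the lazy dog".split()))
```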
A novel Transformer distillation method specially designed for knowledge distillation (KD) of Transformer-based models is proposed; by leveraging this new KD method, the abundant knowledge encoded in a large "teacher" BERT can be effectively transferred to a small "student" TinyBERT.
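A minimal sketch of one component of such Transformer distillation: an MSE loss that pulls the student's hidden states toward the teacher's through a learned linear projection when the hidden sizes differ (768 and 312 are illustrative dimensions). The full recipe also distills embeddings, attention maps, and prediction logits, all omitted here.

```python
# Sketch of one piece of Transformer-to-Transformer distillation: an MSE loss
# that pulls a small student's hidden states toward a large teacher's, through
# a learned projection when the hidden sizes differ.
import torch
import torch.nn as nn

teacher_hidden, student_hidden = 768, 312
proj = nn.Linear(student_hidden, teacher_hidden)   # learned mapping from student to teacher space

def hidden_state_loss(student_h, teacher_h):
    # student_h: (batch, seq, student_hidden), teacher_h: (batch, seq, teacher_hidden)
    return nn.functional.mse_loss(proj(student_h), teacher_h)

loss = hidden_state_loss(torch.randn(2, 10, student_hidden),
                         torch.randn(2, 10, teacher_hidden))
print(loss.item())
```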