Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering (2023-05-24)

TL;DR

This work proposes pre-training a generic multi-document model with a novel cross-document question answering objective. The resulting multi-document QA formulation directs the model to better recover cross-text informational relations and introduces a natural augmentation that artificially increases the pre-training data.

Abstract

The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks. In this work, we propose extending this idea by pre-training a generic multi-document model from a novel cross-document question answering pre-training objective. To that end, given a set (or cluster) of topically-related documents, we systematically generate semantically-oriented questions from a salient sentence in one document and challenge the model, during pre-training, to answer these questions while “peeking” into other topically-related documents. In a similar manner, the model is also challenged to recover the sentence from which the question was generated, again while leveraging cross-document information. This novel multi-document QA formulation directs the model to better recover cross-text informational relations, and introduces a natural augmentation that artificially increases the pre-training data. Further, unlike prior multi-document models that focus on either classification or summarization tasks, our pre-training objective formulation enables the model to perform tasks that involve both short text generation (e.g., QA) and long text generation (e.g., summarization). Following this scheme, we pre-train our model, termed QAmden, and evaluate its performance across several multi-document tasks, including multi-document QA, summarization, and query-focused summarization, yielding improvements of up to 7% and significantly outperforming zero-shot GPT-3.5 and GPT-4.
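The construction of one pre-training instance described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the salient sentence is chosen, how questions are generated, or how inputs are serialized. The word-overlap salience score, the `<mask>` and `<doc-sep>` tokens, the choice to hide the anchor sentence in the input, and the names `pick_salient_sentence` and `build_instance` are all hypothetical. The question-answer pair is assumed to come from an external question generator and is passed in by hand here.

```python
from dataclasses import dataclass

MASK = "<mask>"        # hypothetical placeholder for the hidden anchor sentence
DOC_SEP = "<doc-sep>"  # hypothetical separator between documents


@dataclass
class PretrainInstance:
    source: str  # question followed by the cluster, anchor sentence hidden
    target: str  # answer followed by the hidden anchor sentence


def overlap_salience(sentence, other_docs):
    """Toy salience score (an assumption): share of the sentence's
    words that also occur somewhere in the other documents."""
    words = set(sentence.lower().split())
    if not words:
        return 0.0
    other_words = set()
    for doc in other_docs:
        for sent in doc:
            other_words.update(sent.lower().split())
    return len(words & other_words) / len(words)


def pick_salient_sentence(cluster):
    """Return (doc_index, sent_index) of the highest-scoring sentence."""
    best, best_score = (0, 0), -1.0
    for d, doc in enumerate(cluster):
        others = cluster[:d] + cluster[d + 1:]
        for s, sent in enumerate(doc):
            score = overlap_salience(sent, others)
            if score > best_score:
                best, best_score = (d, s), score
    return best


def build_instance(cluster, question, answer, anchor):
    """Hide the anchor sentence so the model must rely on the other
    documents to answer the question and recover the sentence."""
    d, s = anchor
    hidden = cluster[d][s]
    docs = []
    for i, doc in enumerate(cluster):
        sents = [MASK if (i, j) == (d, s) else sent
                 for j, sent in enumerate(doc)]
        docs.append(" ".join(sents))
    joiner = " " + DOC_SEP + " "
    source = question + joiner + joiner.join(docs)
    target = answer + " " + hidden
    return PretrainInstance(source, target)


# Each document is a list of sentences; the cluster is a list of documents.
cluster = [
    ["The mayor opened the new bridge on Monday.", "Traffic was light."],
    ["Officials said the mayor opened the bridge.", "It cost ten million dollars."],
]
anchor = pick_salient_sentence(cluster)
# The question-answer pair would come from a question generator (not shown).
instance = build_instance(cluster, "Who opened the bridge?", "the mayor", anchor)
print(instance.source)
print(instance.target)
```

Generating several questions from the same anchor sentence would yield several instances per cluster, which is one plausible reading of the "natural augmentation" the abstract mentions.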

Authors

Arman Cohan

Ido Dagan

J. Goldberger
