Instruction Following

3260 papers • 126 benchmarks • 313 datasets

Benchmarks

These leaderboards are used to track progress in instruction following.

Dataset: IFEval

Libraries

Use these libraries to find instruction-following models and implementations.

opengvlab/llama-adapter

3 papers 5,502

Datasets

Subtasks

visual instruction following

Most implemented papers

Self-Instruct: Aligning Language Models with Self-Generated Instructions

Noah A. Smith, Hannaneh Hajishirzi, Alisa Liu, Yizhong Wang, Daniel Khashabi, Swaroop Mishra, Yeganeh Kordi•Mon Dec 19 2022

Self-Instruct is a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. The pipeline generates instruction, input, and output samples from a language model, then filters out invalid or similar ones before using the remainder to finetune the original model.
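The filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`is_valid`, `too_similar`, `filter_generated`), the word-count bounds, the 0.7 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def is_valid(instruction, min_words=3, max_words=150):
    """Hypothetical validity check: reject very short or very long instructions."""
    n_words = len(instruction.split())
    return min_words <= n_words <= max_words


def too_similar(candidate, pool, threshold=0.7):
    """True if the candidate closely matches any instruction already in the pool.

    SequenceMatcher is a stand-in similarity measure (an assumption).
    """
    cand = candidate.lower()
    return any(
        SequenceMatcher(None, cand, kept.lower()).ratio() >= threshold
        for kept in pool
    )


def filter_generated(seed_pool, generated, threshold=0.7):
    """Keep generated instructions that are valid and not near-duplicates."""
    pool = list(seed_pool)  # accepted instructions join the pool as we go
    accepted = []
    for cand in generated:
        cand = cand.strip()
        if not is_valid(cand) or too_similar(cand, pool, threshold):
            continue
        pool.append(cand)
        accepted.append(cand)
    return accepted


if __name__ == "__main__":
    seeds = ["Translate the sentence into French."]
    candidates = [
        "Translate the sentence into French!",  # near-duplicate of a seed
        "Write a haiku about autumn rain.",
        "Hi",  # too short to be a usable instruction
        "Write a haiku about autumn rain.",  # exact repeat
    ]
    print(filter_generated(seeds, candidates))
```

Because accepted candidates are appended to the pool immediately, later generations are compared against both the seed instructions and earlier accepted ones, so repeats within a single batch are also dropped.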

2 papers 10,918

2 papers 10,918

2 papers 289

2 papers 63

daniel-furman/sft-demos

2 papers 56

HomeroRR/rmm

2 papers 10

Alpaca Data Galician

Tamil Alpaca

Tamil Alpaca Orca

CIDAR

2882

0

Habitat: A Platform for Embodied AI Research

Devi Parikh, Dhruv Batra, V. Koltun, Julian Straub, Jitendra Malik, M. Savva, Abhishek Kadian, Oleksandr Maksymets, Yili Zhao, Erik Wijmans, Bhavana Jain, Jia Liu•Mon Apr 01 2019

The comparison between learning and SLAM approaches from two recent works is revisited, and evidence is found that learning outperforms SLAM if scaled to an order of magnitude more experience than in previous investigations. The first cross-dataset generalization experiments are also conducted.

1692 0

QLoRA: Efficient Finetuning of Quantized LLMs

Luke Zettlemoyer, Ari Holtzman, Tim Dettmers, Artidoro Pagnoni•Mon May 22 2023

QLoRA finetuning on a small, high-quality dataset leads to state-of-the-art results, even when using smaller models than the previous SoTA. The authors also find that current chatbot benchmarks cannot be trusted to accurately evaluate chatbot performance levels.

3843 0

Visual Instruction Tuning

Chunyuan Li, Haotian Liu, Yong Jae Lee, Qingyang Wu•Sun Apr 16 2023

This paper presents LLaVA (Large Language and Vision Assistant), an end-to-end trained large multimodal model that connects a vision encoder and an LLM for general-purpose visual and language understanding. It also introduces GPT-4-generated visual instruction-tuning data, and the data, model, and code base are made publicly available.

7668 0

Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Noah A. Smith, Hannaneh Hajishirzi, Yejin Choi, Yizhong Wang, Daniel Khashabi, Kuntal Kumar Pal, Swaroop Mishra, Chitta Baral, Shailaja Keyur Sampat, Pegah Alipoormolabashi, Maitreya Patel, Neeraj Varshney, Mihir Parmar, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, H. Lai, I. Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, M. Moradshahi, Mirali Purohit, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen•Fri Apr 15 2022

Tk-Instruct is a transformer model trained to follow a variety of in-context instructions (plain-language task definitions or k-shot examples). It outperforms existing instruction-following models such as InstructGPT by over 9% on the authors' benchmark despite being an order of magnitude smaller.
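To make the idea of in-context instructions concrete, here is a small sketch that assembles a prompt from a plain-language task definition and k demonstration examples. The function name `build_prompt` and the field labels ("Definition:", "Example i input:", and so on) are hypothetical; the exact template used by the authors is not given in this listing.

```python
def build_prompt(definition, examples, query, k=2):
    """Assemble a prompt from a task definition and up to k worked examples.

    `examples` is a list of (input, output) pairs. The layout below is an
    illustrative assumption, not the template from the paper.
    """
    lines = [f"Definition: {definition}"]
    for i, (example_input, example_output) in enumerate(examples[:k], start=1):
        lines.append(f"Example {i} input: {example_input}")
        lines.append(f"Example {i} output: {example_output}")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # the model is expected to continue from here
    return "\n".join(lines)


if __name__ == "__main__":
    demo = build_prompt(
        "Classify the sentiment as positive or negative.",
        [("I loved it", "positive"), ("Terrible", "negative")],
        "Not bad at all",
    )
    print(demo)
```

Setting `k=0` yields a definition-only prompt, which corresponds to the zero-shot case where the model sees the task description but no demonstrations.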

1027 0

LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

Hongsheng Li, Pan Lu, Renrui Zhang, Shilin Yan, Peng Gao, Y. Qiao, Jiaming Han, Aojun Zhou, Xiangfei Hu•Mon Mar 27 2023

A zero-initialized attention mechanism with zero gating is proposed, which adaptively injects new instructional cues into LLaMA while effectively preserving its pre-trained knowledge. Results on traditional vision and language tasks demonstrate the superior generalization capacity of the approach.
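The effect of zero gating can be seen in a toy sketch. It assumes a simplified form of the idea: attention over ordinary tokens and attention over adapter prompts are normalized separately, and the prompt contribution is scaled by a scalar gate that starts at zero. The function names and this exact formulation are illustrative assumptions, not the authors' code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_attention(token_scores, token_values, prompt_scores, prompt_values, gate):
    """Attention output with a gated contribution from adapter prompts.

    With gate == 0.0 the prompts contribute nothing, so the output equals
    plain attention over the ordinary tokens (the pre-trained behaviour).
    """
    token_weights = softmax(token_scores)
    prompt_weights = [gate * w for w in softmax(prompt_scores)]
    dim = len(token_values[0])
    output = [0.0] * dim
    pairs = list(zip(token_weights, token_values)) + list(zip(prompt_weights, prompt_values))
    for weight, value in pairs:
        for i in range(dim):
            output[i] += weight * value[i]
    return output


if __name__ == "__main__":
    token_scores = [0.2, 1.0]
    token_values = [[1.0, 0.0], [0.0, 1.0]]
    prompt_scores = [0.5]
    prompt_values = [[10.0, 10.0]]
    print(gated_attention(token_scores, token_values, prompt_scores, prompt_values, gate=0.0))
    print(gated_attention(token_scores, token_values, prompt_scores, prompt_values, gate=0.5))
```

Starting the gate at zero means training begins from the unmodified model's outputs, and the instructional signal is introduced only as the gate moves away from zero.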

948 0

Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction

Yoav Artzi, Dipendra Misra, Andrew Bennett, Valts Blukis, Eyvind Niklasson, Max Shatkhin•Fri Aug 31 2018

A model is designed that maps raw visual observations to goals using LINGUNET, a language-conditioned image generation network, and then generates the actions required to complete them.

197 0

Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following

P. Heng, Hongsheng Li, Renrui Zhang, Peng Gao, Ziyu Guo, Ke Chen, Jiaming Han, Xiangyang Zhu, Yiwen Tang, Xianzheng Ma, Xianzhi Li•Thu Aug 31 2023

Point-LLM is presented as the first 3D large language model (LLM) to follow 3D multi-modal instructions. It injects the semantics of Point-Bind into pre-trained LLMs such as LLaMA, requires no 3D instruction data, and exhibits superior 3D and multi-modal question-answering capacity.

192 0

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

Hongsheng Li, Conghui He, Pan Lu, Renrui Zhang, Peng Gao, W. Zhang, Xiangyu Yue, Y. Qiao, Jiaming Han, Aojun Zhou, Ziyi Lin, Shijie Geng•Thu Apr 27 2023

This work augments LLaMA-Adapter by unlocking more learnable parameters and proposes an early fusion strategy that feeds visual tokens only into the early LLM layers, contributing to better visual knowledge incorporation. It achieves strong multi-modal reasoning with only a small-scale image-text and instruction dataset.

714 0
