3260 papers • 126 benchmarks • 313 datasets
Prompt engineering is the process of designing and refining the prompts used to generate text from language models such as GPT-3. The goal is to improve the quality and relevance of the generated text by crafting prompts that elicit the desired responses from the model. In practice this can involve several steps: selecting an appropriate model architecture and parameters, designing the prompt format and structure, choosing the task and training data, and, where the approach calls for it, fine-tuning the model with the selected prompts and data. Because prompts strongly influence the quality of a model's responses, careful design and refinement can improve the accuracy and relevance of the output across a wide range of applications, including chatbots, language translation, and content creation.
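As a rough illustration of the "prompt format and structure" step, the sketch below assembles a few-shot prompt from an instruction, some worked examples, and a new query. The `PromptTemplate` class, its field names, and the translation examples are hypothetical and chosen only for illustration; no particular model or API is assumed, and the resulting string would simply be passed to whatever language model is in use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PromptTemplate:
    """Hypothetical helper that renders a few-shot prompt as plain text."""
    instruction: str
    examples: List[Tuple[str, str]]  # (input, expected output) pairs
    input_label: str = "Input"
    output_label: str = "Output"

    def render(self, query: str) -> str:
        # Start with the task instruction.
        parts = [self.instruction.strip()]
        # Add each worked example in a fixed, repeated format.
        for source, target in self.examples:
            parts.append(
                f"{self.input_label}: {source}\n{self.output_label}: {target}"
            )
        # End with the new query and an open output slot for the model to fill.
        parts.append(f"{self.input_label}: {query}\n{self.output_label}:")
        return "\n\n".join(parts)


template = PromptTemplate(
    instruction="Translate the English sentence into French.",
    examples=[("Good morning.", "Bonjour."), ("Thank you.", "Merci.")],
)
prompt = template.render("See you tomorrow.")
print(prompt)
```

Keeping the instruction, examples, and labels as separate fields makes it easy to vary one element at a time and compare the model's responses, which is the kind of refinement loop the description refers to.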
It is demonstrated that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.
Context Optimization (CoOp) is proposed as a simple approach for adapting CLIP-like vision-language models to downstream image recognition; it achieves superb domain generalization performance compared with the zero-shot model using hand-crafted prompts.
A system is presented for easily mapping any natural language task into a human-readable prompted form, and a pretrained encoder-decoder model is fine-tuned on the resulting multitask mixture covering a wide variety of tasks.
A novel method, P-Tuning, is proposed that employs trainable continuous prompt embeddings in concatenation with discrete prompts; it stabilizes training by minimizing the gap between various discrete prompts and improves performance by a sizeable margin on a wide range of NLU tasks, including LAMA and SuperGLUE.
Conditional Context Optimization (CoCoOp) is proposed, which extends CoOp by further learning a lightweight neural network that generates an input-conditional token (vector) for each image and yields stronger domain generalization performance.
This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning for large-scale Transformer models in vision and shows that VPT achieves significant performance gains compared to other parameter-efficient tuning protocols.
This work develops an understanding of effective prompt formats and proposes using weak supervision, a procedure for combining noisy predictions, to produce the final predictions for the inputs of a large language model.
This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering.
This research documents an experimental evaluation of OpenAI's `text-davinci-003` model, often referred to as GPT-3.5, on the multistate multiple-choice (MBE) section of the Bar Exam; the authors believe the results strongly suggest that an LLM will pass the MBE component of the bar exam in the near future.
A total of 30 advanced MLLMs are comprehensively evaluated on MME, the first comprehensive MLLM evaluation benchmark; the results suggest that existing MLLMs still have considerable room for improvement and also reveal potential directions for subsequent model optimization.
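The idea of combining noisy predictions obtained from several prompt formats, mentioned in the weak-supervision entry above, can be sketched in a few lines. This is only a simplified stand-in: it uses a plain majority vote rather than the weak-supervision procedure that work proposes, and the function name, input identifiers, and labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(predictions: List[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    # Scan in original order so that tie-breaking is deterministic.
    for label in predictions:
        if counts[label] == top:
            return label
    raise AssertionError("unreachable: some label must reach the top count")


# Hypothetical outputs of three different prompt formats for two inputs.
votes_per_input: Dict[str, List[str]] = {
    "review_1": ["positive", "positive", "negative"],
    "review_2": ["negative", "positive", "negative"],
}

final_predictions = {
    name: majority_vote(votes) for name, votes in votes_per_input.items()
}
print(final_predictions)
```

A majority vote treats every prompt as equally reliable; a weak-supervision approach would instead estimate how accurate and how correlated the individual prompts are before aggregating them.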