3260 papers • 126 benchmarks • 313 datasets
Few-Shot Learning is an example of meta-learning, where a learner is trained on several related tasks during the meta-training phase, so that it can generalize well to unseen (but related) tasks with just a few examples during the meta-testing phase. An effective approach to the Few-Shot Learning problem is to learn a common representation for various tasks and train task-specific classifiers on top of this representation. Source: Penalty Method for Inversion-Free Deep Bilevel Optimization
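As a concrete illustration of this representation-sharing approach, the sketch below fits a small task-specific linear classifier on top of a frozen shared feature extractor using only the few labelled support examples of a new task. The `backbone` function and the tensor shapes are assumptions for illustration, not the setup of any particular paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fit_task_head(backbone, support_x, support_y, n_classes, steps=100, lr=1e-2):
    """Train a task-specific linear classifier on top of frozen shared features."""
    with torch.no_grad():
        feats = backbone(support_x)                # [n_support, feat_dim], backbone stays fixed
    head = nn.Linear(feats.shape[1], n_classes)    # small per-task classifier
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    for _ in range(steps):
        loss = F.cross_entropy(head(feats), support_y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return head                                    # used to classify the task's query examples
```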
These leaderboards are used to track progress in Few-Shot Learning
Use these libraries to find Few-Shot Learning models and implementations
An algorithm for meta-learning is proposed that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning.
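A hedged sketch of that idea on a toy linear regressor follows: an inner gradient step adapts a shared initialization to each task's support set, and an outer step updates the initialization so that the adapted model performs well on the task's query set. The `sample_task()` generator and the model form are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def forward(params, x):                       # toy linear model: y = x @ W + b
    W, b = params
    return x @ W + b

def maml_step(params, tasks, inner_lr=0.01, outer_lr=0.001, loss_fn=F.mse_loss):
    meta_grads = [torch.zeros_like(p) for p in params]
    for (xs, ys, xq, yq) in tasks:            # (support_x, support_y, query_x, query_y)
        # inner loop: one gradient step adapted to the task's support set
        inner_loss = loss_fn(forward(params, xs), ys)
        grads = torch.autograd.grad(inner_loss, params, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(params, grads)]
        # outer objective: evaluate the adapted parameters on the query set
        outer_loss = loss_fn(forward(adapted, xq), yq)
        task_grads = torch.autograd.grad(outer_loss, params)
        meta_grads = [m + g for m, g in zip(meta_grads, task_grads)]
    # meta-update the shared initialization and return it as fresh leaf tensors
    return [(p - outer_lr * g / len(tasks)).detach().requires_grad_(True)
            for p, g in zip(params, meta_grads)]

# Hypothetical usage for 1-D regression tasks:
# params = [torch.zeros(1, 1, requires_grad=True), torch.zeros(1, requires_grad=True)]
# for _ in range(num_meta_iterations):
#     params = maml_step(params, [sample_task() for _ in range(4)])
```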
GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
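The few-shot behaviour referred to here is in-context: a handful of solved examples are concatenated into the prompt and the model completes the pattern, with no gradient updates. The snippet below only builds such a prompt; the translation task, formatting, and examples are illustrative assumptions.

```python
def build_few_shot_prompt(demonstrations, query):
    """Concatenate solved (input, output) pairs before the new query."""
    demos = "\n\n".join(f"English: {en}\nFrench: {fr}" for en, fr in demonstrations)
    return f"{demos}\n\nEnglish: {query}\nFrench:"

prompt = build_few_shot_prompt(
    [("cheese", "fromage"), ("house", "maison"), ("dog", "chien")],
    "bicycle",
)
# `prompt` would then be sent to a large language model, which is expected to
# continue the pattern with the translation of the final query.
```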
LLaMA, a collection of foundation language models ranging from 7B to 65B parameters, is introduced and it is shown that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets.
This work proposes Prototypical Networks for few-shot classification, and provides an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning.
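A minimal sketch of the prototype idea, assuming an embedding network `embed` and an N-way episode with labelled support and query tensors (shapes are assumptions): each class is represented by the mean of its support embeddings, and queries are classified by distance to these prototypes.

```python
import torch
import torch.nn.functional as F

def prototypical_loss(embed, support_x, support_y, query_x, query_y, n_way):
    z_support = embed(support_x)                        # [N*K, d]
    z_query = embed(query_x)                            # [Q, d]
    # class prototype = mean embedding of that class's support examples
    prototypes = torch.stack([z_support[support_y == c].mean(0)
                              for c in range(n_way)])   # [N, d]
    # classify queries by negative squared Euclidean distance to the prototypes
    logits = -torch.cdist(z_query, prototypes) ** 2     # [Q, N]
    return F.cross_entropy(logits, query_y)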
This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.
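Roughly, the read-out of such a network can be sketched as cosine-similarity attention from each query embedding over the support embeddings, with the predicted label distribution given by the attention-weighted sum of support labels. The full-context embeddings used in the paper are omitted here, and the tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def matching_predict(z_support, y_support_onehot, z_query):
    # cosine-similarity attention between each query and every support example
    attn = F.softmax(F.cosine_similarity(
        z_query.unsqueeze(1), z_support.unsqueeze(0), dim=-1), dim=-1)  # [Q, S]
    # predicted label distribution = attention-weighted sum of support labels
    return attn @ y_support_onehot                                       # [Q, N]
```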
This article introduces BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), a domain-specific language representation model pre-trained on large-scale biomedical corpora that largely outperforms BERT and previous state-of-the-art models on a variety of biomedical text mining tasks.
A conceptually simple, flexible, and general framework for few-shot learning is proposed, in which a classifier must learn to recognise new classes given only a few examples from each; the framework is easily extended to zero-shot learning.
A family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates, including Reptile, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task.
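The Reptile outer loop described here is simple enough to sketch directly: run ordinary SGD on a sampled task for a few steps, then move the initialization a fraction of the way toward the adapted weights. The `sample_task_batches` sampler, the model, and the batch format are placeholders and assumptions.

```python
import torch
import torch.nn.functional as F

def reptile_step(model, sample_task_batches, inner_steps=5,
                 inner_lr=0.01, meta_lr=0.1):
    # remember the current initialization
    init = {name: p.detach().clone() for name, p in model.named_parameters()}
    opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
    batches = sample_task_batches()                 # hypothetical per-task data iterator
    for _ in range(inner_steps):                    # ordinary SGD on the sampled task
        x, y = next(batches)
        loss = F.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # interpolate: theta <- theta + meta_lr * (theta_task - theta)
    with torch.no_grad():
        for name, p in model.named_parameters():
            p.copy_(init[name] + meta_lr * (p - init[name]))
```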
The results reveal that reducing intra-class variation is an important factor when the feature backbone is shallow, but not as critical when using deeper backbones, and a baseline method with a standard fine-tuning practice compares favorably against other state-of-the-art few-shot learning algorithms.
A simple process, meta-learning over a whole-classification pre-trained model on its evaluation metric, achieves competitive performance with state-of-the-art methods on standard benchmarks and sheds some light on the trade-offs between the meta-learning objective and the whole-classification objective in few-shot learning.