The Universal Transformer (UT), a parallel-in-time self-attentive recurrent sequence model which can be cast as a generalization of the Transformer model and which addresses issues of parallelizability and global receptive field, is proposed.
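The defining feature described above is that the same transition function is applied recurrently over depth. A minimal sketch of that weight-tying idea, using a toy NumPy self-attention step (the attention form and dimensions here are illustrative assumptions, not the paper's exact architecture):

```python
import numpy as np

def shared_transition(x, W):
    """One toy self-attention step followed by a projection.

    The same weights W are reused at every depth step, which is the
    Universal Transformer's departure from a standard Transformer stack.
    """
    scores = x @ x.T / np.sqrt(x.shape[1])                    # attention logits
    attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    return attn @ x @ W                                       # attend, then project

def universal_transformer(x, W, depth=4):
    # Recur the shared block over depth instead of stacking distinct layers.
    for _ in range(depth):
        x = shared_transition(x, W)
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))        # 5 tokens, hidden dim 8
W = rng.normal(size=(8, 8)) * 0.1
y = universal_transformer(x, W)
print(y.shape)                     # (5, 8): shape is preserved across steps
```

Because every step shares parameters, depth can in principle vary per input (the paper pairs this with adaptive computation time), which standard fixed-depth stacks cannot do.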
A generalist learner is constructed that effectively incorporates knowledge captured by specialist models, and a series of improvements to the input representation, training regime and processor architecture over CLRS are presented, improving average single-task performance by over 20% from prior art.
A learned conditional masking mechanism is proposed, which enables the model to strongly generalize far outside of its training range with near-perfect accuracy on a variety of algorithms.
A novel GNN architecture, the Instruction Pointer Attention Graph Neural Networks (IPA-GNN), is introduced, which achieves improved systematic generalization on the task of learning to execute programs using control flow graphs.
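A control flow graph of the kind the IPA-GNN consumes simply maps each instruction to its possible successors. A minimal sketch over a hypothetical three-opcode instruction set (the opcodes and encoding are illustrative assumptions, not the paper's representation):

```python
def control_flow_graph(instrs):
    """Map each instruction index to its possible successor indices.

    `instrs` is a list of (op, arg) pairs; 'jmp' and 'br' targets are
    instruction indices in this simplified, hypothetical instruction set.
    """
    cfg = {}
    for i, (op, arg) in enumerate(instrs):
        if op == "jmp":                  # unconditional jump: one successor
            cfg[i] = [arg]
        elif op == "br":                 # conditional branch: taken or fall-through
            cfg[i] = [arg, i + 1]
        elif i + 1 < len(instrs):        # straight-line successor
            cfg[i] = [i + 1]
        else:
            cfg[i] = []                  # program exit
    return cfg

prog = [("inc", None), ("br", 3), ("jmp", 0), ("halt", None)]
print(control_flow_graph(prog))  # {0: [1], 1: [3, 2], 2: [0], 3: []}
```

A model attending along these successor edges can learn to propagate an "instruction pointer" distribution through branches and loops rather than reading the program as flat text.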
The Program-guided Transformer (ProTo) is proposed, which integrates both semantic and structural guidance of a program by leveraging cross-attention and masked self-attention to pass messages between the specification and routines in the program.
This work introduces Learning to Execute (L2E), which leverages information contained in approximate plans to learn universal policies conditioned on those plans, and which shows increased performance over pure RL, pure planning, and baseline methods that combine learning and planning.
Surprisingly, it is shown that the model can also predict the location of the error, despite being trained only on labels indicating the presence/absence and kind of error.
This paper extends the Minecraft Corpus Dataset by annotating all builder utterances into eight types, including clarification questions, and proposes a new builder agent model capable of determining when to ask or execute instructions.
This work proposes the CLRS Algorithmic Reasoning Benchmark, covering classical algorithms from the Introduction to Algorithms textbook, performs extensive experiments demonstrating how several popular algorithmic reasoning baselines perform on these tasks, and consequently highlights links to several open challenges.
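Underlying all of the papers above is the same supervised setup: pair a program's source with the value it produces, and train a model to predict the latter. A minimal sketch of generating such (program, output) pairs for toy arithmetic programs (the generator and its parameters are illustrative assumptions, in the spirit of the original Learning to Execute task):

```python
import random

def sample_program(rng, max_ops=3):
    """Generate a tiny arithmetic program and the output it prints.

    A model trained on these pairs reads the source text as input and
    must predict the printed value, i.e. learn to execute the program.
    """
    ops = ["+", "-", "*"]
    expr = str(rng.randint(0, 9))
    for _ in range(rng.randint(1, max_ops)):
        expr = f"({expr} {rng.choice(ops)} {rng.randint(0, 9)})"
    program = f"print({expr})"
    target = str(eval(expr))       # ground-truth output, computed by actually executing
    return program, target

rng = random.Random(0)
src, out = sample_program(rng)
print(src, "->", out)
```

Difficulty is controlled by the nesting depth and operand range, which is how such benchmarks probe length and complexity generalization.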