A new benchmark, gSCAN, is introduced for evaluating compositional generalization in models of situated language understanding, taking inspiration from standard models of meaning composition in formal linguistics and defining a language grounded in the states of a grid world.
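To make the setup concrete, here is a hypothetical gSCAN-style example as a plain Python dictionary: a natural-language command is paired with a symbolic grid-world state, and the supervision target is an action sequence. All field names are illustrative, not the dataset's actual schema.

```python
# Hypothetical gSCAN-style example; field names are illustrative only.
example = {
    "command": "walk to the small red circle",
    "grid": {
        "size": 6,
        "agent": {"position": (0, 0), "direction": "east"},
        "objects": [
            {"shape": "circle", "color": "red", "size": "small", "position": (3, 2)},
            {"shape": "square", "color": "blue", "size": "big", "position": (5, 5)},
        ],
    },
    "target_actions": ["walk", "walk", "walk", "turn right", "walk", "walk"],
}
```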
This work argues for the importance of learning to segment and represent objects jointly, and demonstrates that, starting from the simple assumption that a scene is composed of multiple entities, it is possible to learn to segment images into interpretable objects with disentangled representations.
A diagnostic benchmark suite, named CLUTRR, is introduced to clarify some key issues related to the robustness and systematicity of NLU systems, and highlights substantial performance gaps in state-of-the-art NLU models.
This work introduces Prioritized Level Replay, a general framework for estimating the future learning potential of a level given the current state of the agent's policy, and finds that temporal-difference errors, previously used to selectively sample past transitions, are also effective for scoring the future learning potential of the entire episodes an agent would experience when replaying a level.
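The scoring idea admits a compact sketch. Below is a minimal, hypothetical Python rendering of the core mechanism: each completed episode scores its level by the mean absolute TD error, and replay levels are then sampled with rank-based prioritization. Class and method names are illustrative, and the full method additionally mixes in a staleness term that is omitted here.

```python
import numpy as np

class PrioritizedLevelReplay:
    """Minimal sketch of TD-error-based level scoring (hypothetical API)."""

    def __init__(self, num_levels, beta=0.1):
        self.scores = np.zeros(num_levels)   # learning-potential score per level
        self.seen = np.zeros(num_levels, dtype=bool)
        self.beta = beta                     # rank-prioritization temperature

    def update_score(self, level_id, td_errors):
        # Score a finished episode by its mean absolute TD error, a proxy
        # for how much the agent can still learn from this level.
        self.scores[level_id] = np.mean(np.abs(td_errors))
        self.seen[level_id] = True

    def sample_replay_level(self, rng):
        # Rank-based prioritization: the highest-scoring seen level gets
        # rank 1, and replay probability decays with rank.
        seen_ids = np.flatnonzero(self.seen)
        ranks = np.argsort(np.argsort(-self.scores[seen_ids])) + 1
        weights = (1.0 / ranks) ** (1.0 / self.beta)
        return rng.choice(seen_ids, p=weights / weights.sum())
```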
Surprisingly, it is found that an explicitly compositional Neural Module Network model also generalizes poorly on CLOSURE, even when it has access to the ground-truth programs at test time.
It is argued that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL, while dramatically reducing the computational resources required to gather a large amount of experience.
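As a usage note, the NetHack Learning Environment exposes these tasks through the Gym interface; a minimal random-agent loop looks roughly like the following (assuming the `nle` package and the classic four-tuple Gym step API).

```python
import gym
import nle  # noqa: F401 -- importing registers the NetHack tasks with Gym

# "NetHackScore-v0" is one of the registered NLE tasks.
env = gym.make("NetHackScore-v0")
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random policy, for illustration only
    obs, reward, done, info = env.step(action)
env.close()
```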
This paper introduces a sequential extension to Slot Attention that is trained to predict optical flow for realistic-looking synthetic scenes, and shows that conditioning the model's initial state on a small set of hints is sufficient to significantly improve instance segmentation.
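For orientation, the Slot Attention update that this sequential model builds on can be sketched in a few lines of NumPy. This is a simplified, single-iteration rendering (the learned GRU/MLP slot refinement is omitted), not the authors' implementation.

```python
import numpy as np

def slot_attention_step(slots, inputs, w_q, w_k, w_v, eps=1e-8):
    """One simplified Slot Attention iteration (NumPy sketch).

    slots:  (num_slots, d)   current slot representations
    inputs: (num_inputs, d)  encoded image features
    w_q, w_k, w_v: (d, d)    learned projections (here: plain matrices)
    """
    d = slots.shape[-1]
    q = slots @ w_q                    # queries come from the slots
    k = inputs @ w_k                   # keys/values come from the inputs
    v = inputs @ w_v
    logits = q @ k.T / np.sqrt(d)      # (num_slots, num_inputs)
    # Attention is normalized over the slot axis, so slots compete
    # to explain each input location.
    attn = np.exp(logits - logits.max(axis=0, keepdims=True))
    attn = attn / attn.sum(axis=0, keepdims=True)
    # Each slot then aggregates the inputs by a weighted mean.
    weights = attn / (attn.sum(axis=1, keepdims=True) + eps)
    updates = weights @ v              # (num_slots, d)
    # The full model refines slots with a learned GRU + MLP; omitted here.
    return updates
```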
It is suggested that models can successfully one-shot generalize to novel concepts and compositions through semantic linking, either inductively or deductively, and it is demonstrated that prior knowledge plays a key role as well.
This model is the first to significantly outperform the provided baseline and reach state-of-the-art performance on grounded SCAN (gSCAN), a grounded natural language navigation dataset designed to require systematic generalization in its test splits.