[1] Hopular: Modern Hopfield Networks for Tabular Data
[2] History Compression via Language Models in Reinforcement Learning
[3] A Globally Convergent Evolutionary Strategy for Stochastic Constrained Optimization with Applications to Reinforcement Learning
[4] CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
[5] Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER
[6] Cross-Domain Few-Shot Learning by Representation Fusion
[7] Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations
[8] Playing Minecraft with Behavioural Cloning
[9] Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
[10] Retrospective Analysis of the 2019 MineRL Competition on Sample Efficient Reinforcement Learning
[11] Hierarchical Deep Q-Network with Forgetting from Imperfect Demonstrations in Minecraft
[12] PyTorch: An Imperative Style, High-Performance Deep Learning Library
[13] Continuous Deep Maximum Entropy Inverse Reinforcement Learning using online POMDP
[14] Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance
[15] MineRL: A Large-Scale Dataset of Minecraft Demonstrations
[16] Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
[17] Watch, Try, Learn: Meta-Learning from Demonstrations and Reward
[18] SQIL: Imitation Learning via Regularized Behavioral Cloning
[19] Successor Options: An Option Discovery Framework for Reinforcement Learning
[20] The MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors
[21] Go-Explore: a New Approach for Hard-Exploration Problems
[22] Experience Replay for Continual Learning
[23] Inverse reinforcement learning for video games
[24] RUDDER: Return Decomposition for Delayed Rewards
[25] Hierarchical Imitation and Reinforcement Learning
[26] Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning
[27] IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
[28] Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations
[29] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
[30] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
[31] Eigenoption Discovery through the Deep Successor Representation
[32] Meta Learning Shared Hierarchies
[33] Rainbow: Combining Improvements in Deep Reinforcement Learning
[34] Overcoming Exploration in Reinforcement Learning with Demonstrations
[35] Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
[36] One-Shot Visual Imitation Learning via Meta-Learning
[37] Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
[38] Proximal Policy Optimization Algorithms
[39] Deep Q-learning From Demonstrations
[40] Learning from Demonstrations for Real World Reinforcement Learning
[41] One-Shot Imitation Learning
[42] Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction
[43] FeUdal Networks for Hierarchical Reinforcement Learning
[44] Time delays, competitive interdependence, and firm performance
[45] The Option-Critic Architecture
[46] Probabilistic inference for determining options in reinforcement learning
[47] Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay
[48] Successor Features for Transfer in Reinforcement Learning
[49] Generative Adversarial Imitation Learning
[50] Exploration from Demonstration for Interactive Reinforcement Learning
[51] Learning from Demonstration for Shaping through Inverse Reinforcement Learning
[52] Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
[53] Adaptive Skills Adaptive Partitions (ASAP)
[54] Mastering the game of Go with deep neural networks and tree search
[55] Reinforcement Learning from Demonstration through Shaping
[57] Learning from Limited Demonstrations
[58] Compositional Planning Using Optimal Option Models
[59] Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega
[60] Unified Inter and Intra Options Learning Using Policy Gradient Methods
[61] Integrating reinforcement learning with human demonstrations of varying ability
[62] Robot Programming by Demonstration
[63] A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
[64] Optimal policy switching algorithms for reinforcement learning
[65] Efficient Reductions for Imitation Learning
[66] Effects of feedback delay on learning
[67] Active Learning for Reward Estimation in Inverse Reinforcement Learning
[68] Robot Programming by Demonstration
[69] Search-based structured prediction
[70] Biopython: freely available Python tools for computational molecular biology and bioinformatics
[71] Sequence Comparison: Theory and Methods
[72] Maximum Entropy Inverse Reinforcement Learning
[73] A Game-Theoretic Approach to Apprenticeship Learning
[74] Exact finite approximations of average-cost countable Markov decision processes
[75] Large-scale kernel machines
[76] Matplotlib: A 2D Graphics Environment
[77] Clustering by Passing Messages Between Data Points
[78] Apprenticeship learning via inverse reinforcement learning
[79] MUSCLE: multiple sequence alignment with high accuracy and high throughput
[80] Learning Options in Reinforcement Learning
[81] Approximately Optimal Approximate Reinforcement Learning
[82] T-Coffee: A novel method for fast and accurate multiple sequence alignment
[83] Algorithms for Inverse Reinforcement Learning
[84] Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
[85] Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition
[86] Highly specific protein sequence motifs for genome analysis
[88] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
[89] Learning from Demonstration
[90] LSTM can Solve Hard Long Time Lag Problems
[91] Learning to Take Actions
[92] Learning from delayed rewards
[93] On the Complexity of Multiple Sequence Alignment
[94] CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
[95] Building symbolic representations of intuitive real-time skills from performance data
[96] PROSITE: recent developments
[97] Markov Decision Processes: Discrete Stochastic Dynamic Programming
[98] Improving Generalization for Temporal Difference Learning: The Successor Representation
[99] Amino acid substitution matrices from protein blocks
[100] Efficient Training of Artificial Neural Networks for Autonomous Navigation
[101] Basic local alignment search tool
[102] Statistical Composition of High-Scoring Segments from Molecular Sequences
[103] Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes
[104] Multiple sequence alignment with hierarchical clustering
[105] An improved algorithm for matching biological sequences
[106] Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli
[107] Identification of common molecular subsequences
[108] Cases in which Parsimony or Compatibility Methods will be Positively Misleading
[109] A linear space algorithm for computing maximal common subsequences
[110] Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning
[111] Hopfield Networks is All You Need
[112] XAI and Strategy Extraction via Reward Redistribution
[113] Align-RUDDER: Learning from Few Demonstrations by Reward Redistribution
[114] Modern Hopfield Networks and Attention for Immune Repertoire Classification
[116] mazelab: A customizable framework to create maze and gridworld environments
[118] Active Imitation Learning: Formal and Practical Reductions to I.I.D. Learning
[120] Reinforcement Learning: An Introduction
[121] DIALIGN: multiple DNA and protein sequence alignment at BiBiServ
[122] Untersuchungen zu dynamischen neuronalen Netzen
[123] Cognitive models from subcognitive skills
[124] Atlas of protein sequence and structure
[125] Few-shot learning by dimensionality reduction in gradient space