Distant supervision allows obtaining labeled training corpora for low-resource settings where only limited hand-annotated data exists. However, to be used effectively, the distant supervision must be easy to obtain. In this work, we present ANEA, a tool to automatically annotate named entities in text based on entity lists. It spans the whole pipeline, from obtaining the lists to analyzing the errors of the distant supervision. A tuning step allows the user to improve the automatic annotation with their linguistic insights without having to manually label or check all tokens. In six low-resource scenarios, we show that the F1-score can be increased by 18 points on average using the distantly supervised data obtained with ANEA.
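To make the core idea concrete, the following is a minimal sketch of list-based distant supervision for named entities: token spans are matched against per-type entity lists and assigned BIO tags, preferring the longest match. The function name `annotate` and the example lists are purely illustrative; ANEA's actual matching heuristics and tuning step are more involved.

```python
from typing import Dict, List


def annotate(tokens: List[str], entity_lists: Dict[str, List[List[str]]]) -> List[str]:
    """Assign BIO tags by matching token spans against per-type entity lists."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        match_len, match_type = 0, None
        for ent_type, entries in entity_lists.items():
            for entry in entries:
                n = len(entry)
                # Prefer the longest entry that matches starting at position i.
                if n > match_len and tokens[i:i + n] == entry:
                    match_len, match_type = n, ent_type
        if match_type is not None:
            tags[i] = f"B-{match_type}"
            for j in range(i + 1, i + match_len):
                tags[j] = f"I-{match_type}"
            i += match_len
        else:
            i += 1
    return tags


# Hypothetical entity lists applied to a single sentence.
lists = {"LOC": [["New", "York"]], "PER": [["Ada", "Lovelace"]]}
print(annotate(["Ada", "Lovelace", "visited", "New", "York", "."], lists))
# -> ['B-PER', 'I-PER', 'O', 'B-LOC', 'I-LOC', 'O']
```

Labels produced this way are noisy (e.g., ambiguous surface forms are tagged regardless of context), which is why an error analysis and a user-guided tuning step, as described above, matter before training on the distantly supervised data.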